Cavit Erginsoy
511425797f
Improve efficiency and reduce plural bias in word ladder generation
- Precomputed sorted word lists for each word length (stored in self.words_lists) to avoid redundant sorting on every _generate_word_pair call.
- Updated _generate_word_pair to utilize the cached sorted list, significantly improving computational efficiency.
- Implemented weighted random sampling for 5-letter words, giving words ending with 'S' a lower weight (0.5) to reduce bias without completely filtering them out.
2025-02-01 14:37:21 +00:00
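A minimal sketch of the change described in commit 511425797f, for illustration only. The names `self.words_lists` and `_generate_word_pair` and the 0.5 weight for 5-letter words ending in 'S' come from the commit message. The class name `WordPairSampler`, the helper `_weights`, the constructor signature, and the uppercase normalisation are assumptions, not the repository's code. The sketch only draws two distinct words; it does not check that a ladder exists between them.

```python
import random
from collections import defaultdict


class WordPairSampler:
    """Hypothetical stand-in for the word ladder dataset class (name assumed)."""

    def __init__(self, words, seed=None):
        self.rng = random.Random(seed)
        by_length = defaultdict(set)
        for word in words:
            by_length[len(word)].add(word.upper())
        # Sort once per word length and cache the result, so that
        # _generate_word_pair never has to re-sort on each call.
        self.words_lists = {n: sorted(ws) for n, ws in by_length.items()}

    def _weights(self, length):
        """Sampling weights for one length; None means uniform (assumed helper)."""
        if length != 5:
            return None
        # Down-weight likely plurals instead of filtering them out entirely.
        return [0.5 if w.endswith("S") else 1.0 for w in self.words_lists[length]]

    def _generate_word_pair(self, length):
        words = self.words_lists[length]  # cached, already sorted
        if len(words) < 2:
            raise ValueError("need at least two words of the requested length")
        weights = self._weights(length)
        start = self.rng.choices(words, weights=weights, k=1)[0]
        end = start
        while end == start:  # redraw until the two words differ
            end = self.rng.choices(words, weights=weights, k=1)[0]
        return start, end


sampler = WordPairSampler(["stone", "store", "cares", "crane", "bird"], seed=0)
print(sampler._generate_word_pair(5))
```

Sorting the cached list also keeps sampling reproducible for a fixed seed, since `random.choices` then always sees the candidates in the same order.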
Cavit Erginsoy
fce0c4fa3f
refactor: Clarify word ladder question
2025-02-01 14:27:06 +00:00
Cavit Erginsoy
d57a7947a4
INIT
2025-01-30 21:32:46 +00:00
Cavit Erginsoy
4f14a20725
INIT
2025-01-30 19:42:58 +00:00
Andreas Koepf
7b3cc45bbd
add newline to word sorting template
2025-01-27 16:57:49 +01:00
Andreas Koepf (aider)
b7029fc5df
feat: Clarify word sorting instructions with ASCII/Unicode ordering and output format
2025-01-26 22:29:57 +01:00
Andreas Koepf (aider)
557fac66bf
refactor: Change word sorting answer format from list string to comma-separated string
2025-01-26 22:23:18 +01:00
Andreas Koepf
c3b6af35f0
min python 3.11 to support StrEnum
2025-01-26 22:17:43 +01:00
Andreas Koepf
ad9f0d265c
fix unit tests, lower python dependency to 3.9
2025-01-26 16:55:17 +01:00
Andreas Koepf (aider)
56f422d74e
fix: Import missing 're' module for regex word extraction
2025-01-26 16:14:23 +01:00
Andreas Koepf (aider)
7df974a753
feat: Add word sorting task generation with text transformations
2025-01-26 16:14:10 +01:00
Andreas Koepf (aider)
187df2bf7b
feat: Add word sorting dataset with configurable text transformations
2025-01-26 16:11:32 +01:00
Andreas Koepf (aider)
8e92025cf7
refactor: Update default sentence length constraints to 3-20 words
2025-01-26 15:57:02 +01:00
Andreas Koepf (aider)
65c60b2afa
refactor: Update sentence extraction regex to preserve ending punctuation
2025-01-26 15:56:03 +01:00
Andreas Koepf (aider)
028d5ccf96
refactor: Rename num_of_words_in_sentence and add max_words_in_sentence config
2025-01-26 15:46:21 +01:00
Andreas Koepf
d3fe900889
refactor: Update sentence reordering prompt to be more descriptive
2025-01-26 15:46:19 +01:00
Andreas Köpf
add29c2fcb
Merge branch 'main' into koko/scramble
2025-01-26 15:41:25 +01:00
Andreas Koepf
cdf08d9d5b
rename word_reversal.py -> word_sequence_reversal.py
2025-01-26 11:57:50 +01:00
Andreas Koepf (aider)
e9ac50a6fc
refactor: Update import for word sequence reversal module
2025-01-26 11:53:48 +01:00
Andreas Koepf (aider)
be00e0bab2
fix: Correct WordReversalConfig references to WordSequenceReversalConfig
2025-01-26 11:52:25 +01:00
Andreas Koepf (aider)
c641b25508
refactor: Rename WordReversalDataset to WordSequenceReversalDataset
2025-01-26 11:52:15 +01:00
Andreas Koepf (aider)
4d582387de
feat: Add SpellBackward imports and exports to algorithmic package
2025-01-26 11:48:18 +01:00
Andreas Koepf (aider)
9b93ac2a1f
feat: Add spell_backward.py module for word reversal task generation
2025-01-26 11:46:07 +01:00
Andreas Koepf (aider)
696d60dd2b
refactor: Move SpellBackwardDataset to separate file
2025-01-26 11:44:27 +01:00
Andreas Koepf (aider)
b18bede2bf
feat: Add SpellBackwardDataset with word reversal and length filtering
2025-01-26 11:40:47 +01:00
abdulhakeem
3fabb319ab
Make more tiny correction
2025-01-25 23:25:55 -06:00
abdulhakeem
b13d0762d6
Correct logic for number of words in sentence
2025-01-25 23:22:16 -06:00
abdulhakeem
4d50cfd514
Add parameters to _generate_sentence_dataset
2025-01-25 23:17:39 -06:00
abdulhakeem
384a00ec71
Ensure only words are considered
2025-01-25 23:08:41 -06:00
abdulhakeem
c7c12269ad
Add assertion to ensure number of words in sentence is positive
2025-01-25 23:02:17 -06:00
abdulhakeem
a72629c28f
Add sentence reordering and unit tests to validate it
2025-01-25 22:52:35 -06:00
Andreas Koepf
6af2231d84
formatting
2025-01-25 18:51:28 +01:00
Andreas Koepf
66666cf2bf
remove old files
2025-01-25 18:51:07 +01:00
Andreas Koepf (aider)
f185fba00e
refactor: Rename UnscrambleWordsDataset to LetterJumbleDataset
2025-01-25 18:37:42 +01:00
Andreas Koepf (aider)
7ec77bf619
feat: Add consecutive words option and ensure minimum word swap in UnscrambleWords
2025-01-25 18:29:02 +01:00
Andreas Koepf (aider)
d57a88a5b3
feat: Add unscramble_words dataset with configurable word scrambling
2025-01-25 18:21:31 +01:00
Andreas Koepf (aider)
2108a2c3e9
feat: Add Caesar cipher import to algorithmic module
2025-01-25 18:05:06 +01:00
Andreas Koepf (aider)
a11c6bfb26
feat: Add Caesar cipher dataset generator with encryption and decryption tasks
2025-01-25 18:00:13 +01:00
Andreas Koepf
519e411fa5
add reasoning_gym.create_dataset({name}, ...) global factory function
2025-01-25 00:58:34 +01:00
Andreas Koepf
0d2d8ba6a0
pass config to ProceduralDataset base
2025-01-25 00:23:05 +01:00
Andreas Koepf
5c5d46b4bd
formatting, cleanup
2025-01-24 17:12:42 +01:00
Andreas Koepf (aider)
d413717eff
refactor: Inherit NumberSortingDataset from ProceduralDataset
2025-01-24 11:18:10 +01:00
Andreas Koepf (aider)
190c2aafa4
refactor: Inherit WordReversalDataset from ProceduralDataset
2025-01-24 11:16:15 +01:00
Andreas Koepf (aider)
981ff73ed7
refactor: Inherit NumberFilteringDataset and LetterCountingDataset from ProceduralDataset
2025-01-24 11:13:32 +01:00
Andreas Koepf (aider)
3dbbfaf330
refactor: Inherit BaseConversionDataset from ProceduralDataset
2025-01-24 11:12:29 +01:00
Andreas Koepf
aaabc05ace
formatting
2025-01-24 10:34:07 +01:00
Andreas Koepf
0e9250bce0
Rename ArithmeticDataset to BasicArithmeticDataset
2025-01-24 10:31:26 +01:00
Andreas Koepf (aider)
ecb7d1bca1
feat: Add NumberSortingDataset to algorithmic package with configuration and tests
2025-01-23 23:28:15 +01:00
Andreas Koepf (aider)
562dfb1813
refactor: Rename chain_sum to chain_sum_dataset for consistency
2025-01-23 22:27:48 +01:00
Andreas Koepf (aider)
cc16eaf60c
feat: Add arithmetic dataset functions to algorithmic package __init__
2025-01-23 22:24:59 +01:00