Commit graph

257 commits

Author SHA1 Message Date
joesharratt1229
4b60c32978
Curr exp (#487)
* began curr exp

* added holdout words

* updated config

* added context

* updated base curriculum

* updaed

* updated curriculum

* updated

* updated

* updated automatic flag

* updated ray trainer

* update
2025-07-25 20:38:47 +01:00
Oliver Stanley
1a727ecf4e
support python 3.10 (#450)
* support python 3.10

* add 3.10 to tests

* new StrEnum
2025-06-04 10:34:01 +01:00
joesharratt1229
d0ef136d5b
Feat/intragen experiments (#414)
* added curriculum

* readapted readme

* corrected small errors

* Delete eval/eval/r1/algorithmic/word_sorting.json

* removed redundant argument

* added spell

* removed duplicated fit

* changed config

* added composite changes

* added composite changes

* updated yaml

* added spell backward

* updated read me

* added qwen2.5

* added

* Add files via upload

* updated missing trainer func

* updated curr

* updated spell back

* updated correctness score func

* updated configs

* added local evals

* added updates

* updated datasets

* added fsdp to hf utility

* added algorithmic qwen 3b yaml

* updated read me

* updated configs

* added preappend token

* updated with thinking token

* updated test score board

* resolved comments

* added evaluation scripts

* removed results from pr

* added config

* added partial reward scoring

* added evaluation composites

* added training configs

* added games eval

* added rubriks cube

* resolved merge cinflicts

* added games config

* added latest eval configs

* updated strucutre

* Delete training/evaluations/eval_graphs_composite.yaml

---------

Co-authored-by: joesharratt1229 <joesharrat1229@gmail.com>
2025-04-16 08:04:52 +02:00
Zafir Stojanovski
290bfc4fdd
(evals): Medium configs (#415)
* updated medium configs

* fix problematic curriculum values / small issues causing exceptions to be raised

* optimus alpha config

* all configs so far

* fix tests
2025-04-14 08:25:31 +02:00
Zafir Stojanovski
dced3bfc45
fix(curriculum): Make boundaries in curriculum more sensible (#407)
* init

* fix tests

* unify codeio

* filtered for libraries not present in reasoning-gym

* fix more bounds

* puzzle24

* knight swap curriculum

* fix number sorting

* fix attributes

* add validation of config in creation of dataset

* dry run for instantiating and validating the datasets

* remove unused imports

* fix curriculum tests to reference newly updated attribute names
2025-04-04 20:24:14 +02:00
joesharratt1229
43c739cb3e
Feat/curr adj (#394)
2025-04-02 06:39:14 +01:00
Zafir Stojanovski
ce0a6c4878
fix(envs): Add source dataset and index to metadata (#388)
* add source dataset and index to metadata

* fix typo

* fix coach class and its test
2025-03-20 11:12:14 +00:00
Oliver Stanley
7475a20700
include ranges rather than sampled values in difficulty metadata dicts (#387)
* update difficulty metadata for logic datasets

* update difficulty metadata for graph datasets

* update difficulty metadata for geometry datasets

* update difficulty metadata for games datasets

* update difficulty metadata for cognition datasets

* update difficulty metadata for arithmetic datasets

* update difficulty metadata for arc datasets

* update difficulty metadata for algorithmic datasets

* update difficulty metadata for algebra datasets

* use tuples

* update tests

* update tests
2025-03-20 10:27:03 +01:00
Andreas Koepf
1511c5e301
don't pass answer value to eval
2025-03-17 23:13:53 +01:00
Jean Kaddour
d6aad5a329
fix: add score_answer() to number_sorting (#380)
* fix: add score_answer() to number_sorting

* chore: run pre-commit

* fix: use json.loads()

* fix: run isort()
2025-03-17 23:04:13 +01:00
Andreas Köpf
d2c895f1d3
Refactor Curriculum Attributes (#335)
* remove min_value from AttributeDefinition
* remove type from AttributeDefinition
* Add CurriculumContext
* add ensure_interval option for RangeAttributes
* docs: Add legend explaining curriculum indicators in dataset gallery
* update GALLERY.md
2025-03-16 15:40:28 +01:00
Adefioye
8a0cacc054
Add jugs curriculum (#369)
2025-03-14 18:04:33 +01:00
Rich Jones
e7d05d6510
GoL-Halt Curricula (#366)
* GoL-Halt Curricula

* trivial
2025-03-14 16:15:45 +01:00
Oliver Stanley
b5651e5e2c
add word ladder curriculum (#361)
* add word ladder curriculum

* add to __init__.py
2025-03-14 16:10:52 +01:00
Adefioye
adea7a255e
Add gol curriculum (#354)
* Add gol curriculum

* Add difficulty

* Make levels of grid size of x and y be valid
2025-03-13 21:09:09 +01:00
Adefioye
ec3e414a8c
Cryptarithm curriculum (#346)
* Add curriculum for cryptarithm
* Add difficulty to metadata
2025-03-13 21:03:57 +01:00
Adefioye
4ec1154b47
Add curriculum to ab dataset (#345)
* Add curriculum to ab dataset

* Add difficulty to metadata
2025-03-13 21:03:02 +01:00
Zafir Stojanovski
aa6ccf1946
number filtering curriculum (#333)
2025-03-11 23:56:06 +01:00
Zafir Stojanovski
f204a848d9
spell backward curriculum (#327)
Co-authored-by: Andreas Köpf <andreas.koepf@xamla.com>
2025-03-11 00:22:28 +01:00
Zafir Stojanovski
a23c8c3d4e
sentence reordering curriculum (#326)
2025-03-11 00:21:41 +01:00
Zafir Stojanovski
9aeef4ebb0
palindrome generation curriculum (#322)
2025-03-11 00:19:11 +01:00
Zafir Stojanovski
ad48c551f9
feat(env): Number Sorting Curriculum (#321)
* number sorting curriculum

* metadata
2025-03-11 00:18:20 +01:00
Zafir Stojanovski
0bce1a6ae1
feat(env): Letter Jumble Curriculum (#319)
* base curriculum

* tests
2025-03-11 00:16:05 +01:00
Rich Jones
2b8f21c502
Correct Graph Coloring Difficulty (#318)
* correct gcolor difficulty

* refactor test
2025-03-11 00:14:38 +01:00
Rich Jones
d9ef4f4d14
Fix GoL-Halt Determinism (#317)
* test alt case

* fix determinism of gol-halt
2025-03-11 00:13:40 +01:00
Andreas Koepf
a49463c323
use file stem name of palindrome_generation dataset
2025-03-10 00:39:29 +01:00
Zafir Stojanovski
a1dc28aa73
feat(env): String Synthesis Curriculum (#308)
* string synthesis curriculum

* difficulty metadata
2025-03-10 00:27:03 +01:00
Zafir Stojanovski
037905667e
string splitting curriculum (#307)
2025-03-10 00:25:56 +01:00
Zafir Stojanovski
83cd34e21b
letter counting curriculum (#312)
2025-03-10 00:24:42 +01:00
Zafir Stojanovski
b88cadf75a
feat(env): Word Sequence Reversal curriculum (#313)
* word sequence reversal curriculum

* metadata
2025-03-10 00:24:05 +01:00
Zafir Stojanovski
54b216a5dc
string manipulation curriculum (#306)
2025-03-09 18:12:35 +01:00
Zafir Stojanovski
925283f342
string insertion curriculum (#305)
2025-03-09 18:11:29 +01:00
vncntt
af6120c095
add metadata for caesar cipher, graph coloring, decimal arithmetic (#304)
* add metadata for caesar cipher, graph coloring, decimal arithmetic

* delete comma

* clean up variables
2025-03-09 18:08:56 +01:00
vncntt
fc908d4cf4
Caesar cipher curriculum (#302)
* caesar cipher curriculum + tests
2025-03-09 08:23:32 +01:00
vncntt
e0f8ef061d
graph color curriculum (#303)
2025-03-09 08:20:47 +01:00
Zafir Stojanovski
2fca962847
ransom note curriculum (#300)
Co-authored-by: Andreas Köpf <andreas.koepf@xamla.com>
2025-03-08 21:00:13 +01:00
Zafir Stojanovski
bfa3a58829
palindrome partitioning curriculum (#299)
Co-authored-by: Andreas Köpf <andreas.koepf@xamla.com>
2025-03-08 20:58:59 +01:00
Zafir Stojanovski
194f08cad2
pool matrix curriculum (#298)
2025-03-08 20:57:22 +01:00
Zafir Stojanovski
5963cbd59e
rotten oranges curriculum (#297)
2025-03-08 20:56:46 +01:00
Zafir Stojanovski
6270e835bb
spiral matrix curriculum (#296)
2025-03-08 20:56:08 +01:00
Andreas Köpf
6615d8e662
Show curricula (#295)
* feat: Add debug_curricula.py script to generate CURRICULA.md with dataset curriculum details
2025-03-08 14:21:50 +01:00
Zafir Stojanovski
edab0389b6
rotate matrix curriculum (#294)
2025-03-08 01:58:54 +01:00
Zafir Stojanovski
8d4e9030c0
manipulate matrix curriculum (#293)
2025-03-08 01:57:37 +01:00
Zafir Stojanovski
e69ed78c26
feat(env): Isomorphic Strings Curriculum (#292)
* isomorphic strings curriculum

---------

Co-authored-by: Andreas Köpf <andreas.koepf@xamla.com>
2025-03-08 01:56:14 +01:00
joesharratt1229
af5a6533c8
added word sort curriculum (#289)
2025-03-08 01:50:13 +01:00
Zafir Stojanovski
2d05a48f9b
feat(env): Group Anagrams Curriculum (#288)
* group anagrams curriculum
2025-03-08 01:49:12 +01:00
Zafir Stojanovski
9fc9cf4597
feat(env): Count Primes Curriculum (#287)
* count primes curriculum
2025-03-08 01:48:00 +01:00
Zafir Stojanovski
adf8cd8f6d
base conversion curriculum (#286)
2025-03-08 01:46:32 +01:00
Zafir Stojanovski
25b8e35589
feat(env): Binary Matrix Curriculum (#279)
* binary matrix curriculum

* register BinaryMatrixCurriculum

---------

Co-authored-by: Andreas Koepf <andreas.koepf@provisio.com>
2025-03-07 22:58:47 +01:00
Zafir Stojanovski
a8e920b552
feat(env): Binary Alternation Curriculum (#278)
* binary alternation

---------

Co-authored-by: Andreas Koepf <andreas.koepf@provisio.com>
2025-03-07 22:44:32 +01:00