rishabhranawat
2d57beb517
commit formatting
2025-02-10 22:05:45 -08:00
rishabhranawat
6e3d049fed
[eval-v1] benchmark with 50 samples
2025-02-10 22:05:09 -08:00
rishabhranawat
06cabcfdee
[eval-v1] add a simple readme with some details
2025-02-10 21:57:00 -08:00
rishabhranawat
615c63d2f9
[eval-v1] pre commit formatting
2025-02-10 21:50:22 -08:00
rishabhranawat
88c875c00f
[eval-v1] add timer
2025-02-10 21:48:44 -08:00
rishabhranawat
be3d04e7cb
[eval-v1] async to speed up inference/evaluation
2025-02-10 21:35:46 -08:00
joesharratt1229
a3ea4449d1
added r1 evaluation logic
2025-02-11 03:46:56 +00:00
tohskai
7bad77b426
Improve support for multivariate polynomials
2025-02-11 01:58:07 +01:00
Dragan Jovanović
55b0226ccf
fix for isort
2025-02-11 00:20:46 +01:00
Dragan Jovanović
328d744780
initial draft for circuit_logic dataset generator
2025-02-11 00:09:00 +01:00
Andreas Koepf
eb25ab9656
update gallery, lower default config values for PowerFunctionDataset
2025-02-10 22:42:04 +01:00
Andreas Köpf
898dc0754a
Merge pull request #100 from zafstojano/env/matrix-manipulation
...
Matrix Manipulation Dataset
2025-02-10 22:37:37 +01:00
Zafir Stojanovski
f255831f1c
add more config params
2025-02-10 22:30:36 +01:00
Zafir Stojanovski
3e42d9588e
count bits (#101)
2025-02-10 22:12:50 +01:00
Andreas Koepf
690dc03131
add chain_sum curriculum unit test
2025-02-10 22:09:18 +01:00
Zafir Stojanovski
178895ab1b
Power Function (#102)
...
* power function dataset + tests
2025-02-10 22:04:58 +01:00
Zafir Stojanovski
696fdf8be7
Merge branch 'main' of https://github.com/open-thought/reasoning-gym into env/matrix-manipulation
2025-02-10 20:40:41 +01:00
Andreas Koepf
357a89fe8c
Add attributes for curriculum
...
Co-authored-by: EduardDurech <39579228+EduardDurech@users.noreply.github.com>
2025-02-10 18:58:07 +01:00
Adefioye
767c34297f
Add score_answer method to word_ladder (#93)
...
* Add score_answer method to word_ladder
* add unit test for WordLadderDataset::score_answer()
---------
Co-authored-by: Andreas Koepf <andreas.koepf@provisio.com>
2025-02-10 15:15:23 +01:00
Zafir Stojanovski
111f4c9170
matrix manipulation
2025-02-10 13:51:39 +01:00
Andreas Köpf
3150f9d9aa
Merge pull request #97 from rishabhranawat/eval-v1
...
[eval-basic] initial scripts for evaluating models on reasoning gym
2025-02-10 11:59:49 +01:00
rishabhranawat
03f87dbc07
[eval-basic] remove large results files, add gitignore, only leave summary
2025-02-09 22:52:10 -08:00
rishabhranawat
2308ed99fb
[eval-basic] run precommit formatting
2025-02-09 22:40:45 -08:00
rishabhranawat
94f07ed35d
[eval-basic] initial scripts for evaluating models on reasoning gym
2025-02-09 22:36:27 -08:00
Oliver
a53073278a
Remove rng param
2025-02-09 21:26:03 +00:00
Oliver
0627f2b02d
Greatly speed up solver
2025-02-09 21:23:53 +00:00
Andreas Koepf
c59db00196
reduce default zero probability for binary matrix
2025-02-09 20:05:56 +01:00
Andreas Köpf
0605c0cbe4
Merge pull request #91 from zafstojano/env/binary-matrix
...
Binary Matrix
2025-02-09 19:55:36 +01:00
Andreas Köpf
7444f774c6
Merge pull request #92 from rishabhranawat/poly-reward
...
Add score_answer() for PolynomialEquationsDataset
2025-02-09 19:30:24 +01:00
rishabhranawat
04b3323844
[poly-reward] run pre-commit hooks
2025-02-09 07:30:18 -08:00
Zafir Stojanovski
e5862371ed
update instruction and shuffle numbers
2025-02-09 13:00:46 +01:00
Andreas Köpf
70c731e9fb
Merge pull request #94 from zafstojano/fix/prime-factorization-scoring
...
fix(env): Prime Factorization scoring
2025-02-09 12:02:11 +01:00
Zafir Stojanovski
6cc5d0dd63
normalize answer and partial reward
2025-02-09 11:13:23 +01:00
rishabhranawat
7a6f7ea9da
[poly-reward] minor updates to the docstrings
2025-02-08 21:41:18 -08:00
rishabhranawat
0f4ab53bd3
Merge branch 'main' of https://github.com/rishabhranawat/reasoning-gym into poly-reward
2025-02-08 21:37:21 -08:00
rishabhranawat
0dd4c05897
[poly-reward] add a greedy strategy scoring function for polynomial equations
2025-02-08 21:36:21 -08:00
Zafir Stojanovski
89fd56f8e9
RotateMatrix typo
2025-02-09 01:11:06 +01:00
Zafir Stojanovski
f7836e17d0
binary matrix
2025-02-09 01:10:57 +01:00
Andreas Koepf
0c7fbb5001
bump version
2025-02-09 00:39:48 +01:00
Andreas Koepf
04bffd8f59
update GALLERY.md after merging knight_swap
2025-02-09 00:35:56 +01:00
Andreas Köpf
17ea7dd975
Merge pull request #89 from JeanKaddour/feat-swap-knights-puzzles
...
Feat swap knights puzzles
2025-02-09 00:33:48 +01:00
Andreas Köpf
f5a6dabb8b
Merge pull request #90 from open-thought/arc_agi_1_dataset
...
ARC-AGI-1 dataset with augmentations
2025-02-09 00:19:20 +01:00
Andreas Koepf (aider)
ec8036c099
feat: Add configurable rotation and mirror augmentation variants
2025-02-09 00:16:41 +01:00
Andreas Koepf
b73040b066
refactor: Reorganize ArcAgiConfig class attributes for better readability
2025-02-09 00:12:59 +01:00
Andreas Koepf
e56316ebb2
formatting
2025-02-09 00:04:42 +01:00
Andreas Koepf (aider)
8d8d85e6b2
fix: Add missing Callable import to arc_agi.py
2025-02-08 23:59:53 +01:00
Andreas Koepf (aider)
cdb9d8d8f8
feat: Add configurable augmentations to ArcAgiDataset with consistent application
2025-02-08 23:59:45 +01:00
Andreas Koepf
1795cd815c
add rotate, mirror & color-mapping augmentation functions
2025-02-08 23:51:38 +01:00
Andreas Koepf (aider)
f72bd8d6a5
test: Add comprehensive unit tests for ArcAgiDataset
2025-02-08 23:20:45 +01:00
Andreas Koepf
4e49806d22
add ArcAgiDataset class, fix score_entry() metadata params
2025-02-08 23:18:18 +01:00