rishabhranawat
fb40c8ca55
[eval-v1] add a simple readme with some details
2025-02-10 21:57:00 -08:00
rishabhranawat
9e4870125d
[eval-v1] pre commit formatting
2025-02-10 21:50:22 -08:00
rishabhranawat
df5438498e
[eval-v1] add timer
2025-02-10 21:48:44 -08:00
rishabhranawat
247464a47d
[eval-v1] async to speed up inference/evaluation
2025-02-10 21:35:46 -08:00
Andreas Koepf
4abcd1f1df
update gallery, lower default config values for PowerFunctionDataset
2025-02-10 22:42:04 +01:00
Andreas Köpf
51949fdee2
Merge pull request #100 from zafstojano/env/matrix-manipulation
Matrix Manipulation Dataset
2025-02-10 22:37:37 +01:00
Zafir Stojanovski
a0a5de3658
add more config params
2025-02-10 22:30:36 +01:00
Zafir Stojanovski
ed10111834
count bits (#101)
2025-02-10 22:12:50 +01:00
Zafir Stojanovski
a8c39ddcfb
Power Function (#102)
* power function dataset + tests
2025-02-10 22:04:58 +01:00
Zafir Stojanovski
ecdc85f2c2
Merge branch 'main' of https://github.com/open-thought/reasoning-gym into env/matrix-manipulation
2025-02-10 20:40:41 +01:00
Adefioye
bea9e6d96a
Add score_answer method to word_ladder (#93)
* Add score_answer method to word_ladder
* add unit test for WordLadderDataset::score_answer()
---------
Co-authored-by: Andreas Koepf <andreas.koepf@provisio.com>
2025-02-10 15:15:23 +01:00
Zafir Stojanovski
3d66cc6a7f
matrix manipulation
2025-02-10 13:51:39 +01:00
Andreas Köpf
f6060f4d97
Merge pull request #97 from rishabhranawat/eval-v1
[eval-basic] initial scripts for evaluating models on reasoning gym
2025-02-10 11:59:49 +01:00
rishabhranawat
0657222a8f
[eval-basic] remove large results files, add gitignore, only leave summary
2025-02-09 22:52:10 -08:00
rishabhranawat
c214724a46
[eval-basic] run precommit formatting
2025-02-09 22:40:45 -08:00
rishabhranawat
75cfd31ec2
[eval-basic] initial scripts for evaluating models on reasoning gym
2025-02-09 22:36:27 -08:00
Andreas Koepf
8c4400b18a
reduce default zero probability for binary matrix
2025-02-09 20:05:56 +01:00
Andreas Köpf
1472de02ea
Merge pull request #91 from zafstojano/env/binary-matrix
Binary Matrix
2025-02-09 19:55:36 +01:00
Andreas Köpf
7bd841d640
Merge pull request #92 from rishabhranawat/poly-reward
Add score_answer() for PolynomialEquationsDataset
2025-02-09 19:30:24 +01:00
rishabhranawat
40e5a7cffa
[poly-reward] run pre-commit hooks
2025-02-09 07:30:18 -08:00
Zafir Stojanovski
18cf71a4a7
update instruction and shuffle numbers
2025-02-09 13:00:46 +01:00
Andreas Köpf
ed37eae559
Merge pull request #94 from zafstojano/fix/prime-factorization-scoring
fix(env): Prime Factorization scoring
2025-02-09 12:02:11 +01:00
Zafir Stojanovski
ef2a412c8b
normalize answer and partial reward
2025-02-09 11:13:23 +01:00
rishabhranawat
8c6c7f9ca7
[poly-reward] minor updates to the docstrings
2025-02-08 21:41:18 -08:00
rishabhranawat
adfcf52bca
Merge branch 'main' of https://github.com/rishabhranawat/reasoning-gym into poly-reward
2025-02-08 21:37:21 -08:00
rishabhranawat
1cc55b3f96
[poly-reward] add a greedy strategy scoring function for polynomial equations
2025-02-08 21:36:21 -08:00
Zafir Stojanovski
7273ebb590
RotateMatrix typo
2025-02-09 01:11:06 +01:00
Zafir Stojanovski
afde78196c
binary matrix
2025-02-09 01:10:57 +01:00
Andreas Koepf
1f9d9d27ab
bump version
2025-02-09 00:39:48 +01:00
Andreas Koepf
72b37eba5a
update GALLERY.md after merging knight_swap
2025-02-09 00:35:56 +01:00
Andreas Köpf
8132cd6d90
Merge pull request #89 from JeanKaddour/feat-swap-knights-puzzles
Feat swap knights puzzles
2025-02-09 00:33:48 +01:00
Andreas Köpf
910889ea2c
Merge pull request #90 from open-thought/arc_agi_1_dataset
ARC-AGI-1 dataset with augmentations
2025-02-09 00:19:20 +01:00
Andreas Koepf (aider)
3137e0f433
feat: Add configurable rotation and mirror augmentation variants
2025-02-09 00:16:41 +01:00
Andreas Koepf
40f418bfb9
refactor: Reorganize ArcAgiConfig class attributes for better readability
2025-02-09 00:12:59 +01:00
Andreas Koepf
39b5599f40
formatting
2025-02-09 00:04:42 +01:00
Andreas Koepf (aider)
e8e918c9de
fix: Add missing Callable import to arc_agi.py
2025-02-08 23:59:53 +01:00
Andreas Koepf (aider)
f8e76b8048
feat: Add configurable augmentations to ArcAgiDataset with consistent application
2025-02-08 23:59:45 +01:00
Andreas Koepf
492570ff5c
add rotate, mirror & color-mapping augmentation functions
2025-02-08 23:51:38 +01:00
Andreas Koepf (aider)
1209e9df72
test: Add comprehensive unit tests for ArcAgiDataset
2025-02-08 23:20:45 +01:00
Andreas Koepf
127f505798
add ArcAgiDataset class, fix score_entry() metadata params
2025-02-08 23:18:18 +01:00
Andreas Koepf
60effc6e7a
move arc_1d from cognition into arc folder
2025-02-08 19:37:26 +01:00
Andreas Koepf
4d9afcaba2
clarify number_filtering task
2025-02-08 19:32:45 +01:00
Andreas Koepf
f562737eef
update gallery spiral_matrix
2025-02-08 19:15:26 +01:00
Andreas Köpf
28e3545cf9
Merge pull request #85 from zafstojano/env/spiral-matrix
Spiral Matrix
2025-02-08 19:14:02 +01:00
Andreas Koepf
63cbb8722d
remove unnecessary newline from arc prompt
2025-02-08 19:12:41 +01:00
Andreas Koepf
d0ee809757
re-arc cleanup
2025-02-08 19:07:28 +01:00
Zafir Stojanovski
500bd12b61
single digit numbers, better explanation, max_cols == max_rows == max_n
2025-02-08 18:53:25 +01:00
Zafir Stojanovski
3f5cfeed95
Merge branch 'main' of https://github.com/open-thought/reasoning-gym into env/spiral-matrix
2025-02-08 18:52:45 +01:00
Andreas Köpf
1108594518
Merge pull request #88 from joesharratt1229/feat/re-arc
Feat/re arc
2025-02-08 18:20:17 +01:00
Andreas Köpf
307a031146
Merge pull request #87 from zafstojano/env/rotate-matrix
Rotate Matrix k times
2025-02-08 17:46:58 +01:00