Andreas Koepf (aider)
|
e48c1f82cd
|
docs: Update installation instructions in eval README
|
2025-02-25 15:37:09 +01:00 |
|
Andreas Koepf (aider)
|
a1b0a0414e
|
docs: Add dependency installation step to eval README setup instructions
|
2025-02-25 15:19:38 +01:00 |
|
Andreas Koepf
|
574edb5c5b
|
remove eval results from main repo
|
2025-02-25 11:02:02 +01:00 |
|
Andreas Koepf (aider)
|
205174c532
|
docs: Add info about reasoning-gym-eval repository for evaluation results
|
2025-02-25 10:53:21 +01:00 |
|
Andreas Köpf
|
a4b767fa0e
|
Merge pull request #197 from open-thought/notice_txt_first_version
docs: Add NOTICE.txt file to project
|
2025-02-24 15:30:28 +01:00 |
|
Andreas Koepf
|
0bea658c94
|
docs: Add NOTICE.txt file to project
|
2025-02-24 12:57:28 +01:00 |
|
Andreas Köpf
|
3c589f99bd
|
Merge pull request #195 from open-thought/fix/eval
pinned provider to nebius fixes #192
|
2025-02-24 08:34:45 +01:00 |
|
joesharratt1229
|
cffbff935c
|
pinned provider to nebius
|
2025-02-24 05:01:22 +00:00 |
|
Andreas Koepf
|
b5f6f7d753
|
bump version, update gallery
|
2025-02-23 22:36:39 +01:00 |
|
Andreas Köpf
|
d115655f0a
|
Merge pull request #191 from zafstojano/env/shortest-path
feat(env): Shortest Path
|
2025-02-23 22:28:43 +01:00 |
|
Andreas Koepf
|
45e452bff6
|
reduce size of default shortest_path maze grid
|
2025-02-23 22:27:17 +01:00 |
|
Zafir Stojanovski
|
c5f37d5e9f
|
predict actual path
|
2025-02-23 18:24:23 +01:00 |
|
Andreas Köpf
|
eaa8f5253b
|
Merge pull request #194 from open-thought/190_fix_arc_1d_out_of_range
minor arc_1d tweaks
|
2025-02-23 16:40:09 +01:00 |
|
Andreas Koepf
|
469934d9b7
|
minor arc_1d tweaks
|
2025-02-23 16:37:40 +01:00 |
|
Andreas Köpf
|
8e4ed9bae9
|
Merge pull request #193 from open-thought/190_fix_arc_1d_out_of_range
Fix index out of range for arc_1d dataset
|
2025-02-23 13:20:08 +01:00 |
|
Andreas Koepf
|
ec3050a4f6
|
remove unnecessary checks, use tuples
|
2025-02-23 13:17:48 +01:00 |
|
Andreas Koepf
|
ba56aa0092
|
add arc_1d size range test
|
2025-02-23 12:58:51 +01:00 |
|
Andreas Koepf
|
7a45b14a49
|
fix index out of range of arc_1d dataset (#190)
|
2025-02-23 12:51:41 +01:00 |
|
Zafir Stojanovski
|
97b3097984
|
shortest path
|
2025-02-23 11:25:00 +01:00 |
|
Andreas Koepf
|
e4102a44f6
|
dev minor version one ahead of PyPI released version
|
2025-02-22 16:54:05 +01:00 |
|
Andreas Köpf
|
7a1e387d6e
|
Merge pull request #176 from olliestanley/codeio-experiments
Experiments with CodeI/O techniques for synthesising reasoning data
|
2025-02-22 16:24:17 +01:00 |
|
Oliver
|
081f84dec6
|
Add steps to synthesize CoTs with DeepSeekV3
|
2025-02-21 23:36:19 +00:00 |
|
Oliver
|
cce6002c70
|
Improve prompt for better LLM adherence
|
2025-02-21 23:00:48 +00:00 |
|
Andreas Koepf
|
eeb9fa31d5
|
more native type hints
|
2025-02-21 21:23:14 +01:00 |
|
Andreas Köpf
|
90a1181285
|
Merge pull request #185 from joesharratt1229/feat/emoji-mystery
Implements #173
|
2025-02-21 21:09:26 +01:00 |
|
Andreas Koepf
|
51808210aa
|
add markdown triple backtick code block for emoji_mystry hint
|
2025-02-21 21:06:07 +01:00 |
|
Andreas Köpf
|
c56045b9a7
|
Merge branch 'main' into feat/emoji-mystery
|
2025-02-21 20:58:39 +01:00 |
|
Oliver
|
cb1f634078
|
Prompt tweak
|
2025-02-21 18:34:13 +00:00 |
|
joesharratt1229
|
1fb73011f8
|
added answer format spec in prompt
|
2025-02-21 18:03:05 +00:00 |
|
joesharratt1229
|
9b9554e489
|
added tests
|
2025-02-21 17:58:13 +00:00 |
|
joesharratt1229
|
5e64d1c24c
|
added emoji dataset
|
2025-02-21 17:57:41 +00:00 |
|
Oliver
|
a0ccfa5144
|
Merge branch 'main' into codeio-experiments
|
2025-02-21 17:25:08 +00:00 |
|
Andreas Koepf
|
97b30f5f53
|
update GALLERY.md
|
2025-02-21 17:30:33 +01:00 |
|
Andreas Köpf
|
1c6359f1f3
|
Merge pull request #181 from open-thought/rich/bitwise
Add Bitwise Arithmetic
|
2025-02-21 17:27:45 +01:00 |
|
Andreas Köpf
|
2947038557
|
Merge pull request #182 from zafstojano/env/binary-alternation
feat(env): Binary Alternation
|
2025-02-21 17:27:16 +01:00 |
|
Andreas Koepf (aider)
|
af4d79e947
|
fix: Handle negative hex number prefix variations in bitwise arithmetic test
|
2025-02-21 17:23:50 +01:00 |
|
Andreas Koepf (aider)
|
e846c53347
|
test: Update bitwise arithmetic difficulty levels to [1, 2, 3]
|
2025-02-21 17:22:36 +01:00 |
|
Andreas Koepf (aider)
|
660f7e6f03
|
test: Add comprehensive unit tests for BitwiseArithmeticDataset
|
2025-02-21 17:21:00 +01:00 |
|
Andreas Koepf (aider)
|
bae97aa795
|
docs: Add comment explaining automatic base detection in int() conversion
|
2025-02-21 17:16:11 +01:00 |
|
Andreas Koepf (aider)
|
5ff957a766
|
docs: Add detailed comments for BitwiseArithmeticConfig and BitwiseArithmeticDataset
|
2025-02-21 17:14:00 +01:00 |
|
Andreas Koepf
|
44f4cc08eb
|
refactor: Update type hints and remove unused imports in bitwise_arithmetic.py
|
2025-02-21 17:13:36 +01:00 |
|
Andreas Koepf (aider)
|
c91d13bd08
|
feat: Add typing hints and improve difficulty parameter documentation in bitwise_arithmetic.py
|
2025-02-21 17:11:40 +01:00 |
|
Rich Jones
|
1cf6821f17
|
lint
|
2025-02-21 17:09:19 +01:00 |
|
Rich Jones
|
c1b26cf184
|
ensure arbitrary bit depth and signed values
|
2025-02-21 16:52:26 +01:00 |
|
Andreas Köpf
|
700aab6114
|
Merge pull request #180 from Adefioye/list-functions
Add induction-based tasks for list functions
|
2025-02-21 16:20:49 +01:00 |
|
Andreas Köpf
|
bad2abf63e
|
Merge pull request #184 from AhmedSaif2/main
fix parameter name in compute_decimal_reward docstring
|
2025-02-21 16:15:05 +01:00 |
|
AhmedSaif2
|
5d3bfda677
|
fix parameter name in compute_decimal_reward docstring
|
2025-02-21 17:01:59 +02:00 |
|
abdulhakeem
|
a5e88dbd2e
|
clean up
|
2025-02-21 08:54:55 -06:00 |
|
Andreas Köpf
|
ef33dbc077
|
Merge pull request #183 from open-thought/rich/rdmepolish
Enhance README friendliness
|
2025-02-21 15:45:56 +01:00 |
|
Andreas Köpf
|
88474a98d9
|
Merge pull request #179 from joesharratt1229/fix/prop_logix
Fix/prop logix
|
2025-02-21 15:43:13 +01:00 |
|