Andreas Köpf
850c1cf6f4
Eval script consolidation (#238)
...
The script now supports:
- YAML and JSON configurations
- Dataset-specific parameters
- Overriding configuration via command line
- Detailed logging and error handling
2025-02-27 17:39:14 +01:00
Andreas Köpf
8a66d2a216
Merge pull request #237 from open-thought/rich/richmorevalfixes2
...
Fix graph color example template
2025-02-27 16:08:23 +01:00
Rich Jones
a6c90f40a1
rm typo
2025-02-27 13:44:33 +01:00
Rich Jones
1b95cd3206
fix graph color example template
2025-02-27 13:43:01 +01:00
Andreas Köpf
c98cc5fcd6
Merge pull request #220 from open-thought/rich/cubeinstructions
...
Make Rubiks Cube Output Format More Explicit
2025-02-27 12:16:09 +01:00
Rich Jones
253e49aecf
sm fixes
2025-02-27 11:54:04 +01:00
Rich Jones
633a1aa1ba
expand more
2025-02-27 10:41:30 +01:00
Andreas Koepf (aider)
941da618d8
feat: Add comprehensive unit tests for parse_string_to_complex() method
2025-02-26 21:44:32 +01:00
Andreas Koepf
6511725711
add markdown triple backticks around tsumego board
2025-02-26 19:39:05 +01:00
Andreas Koepf
f97bf94caa
fix & simplify score_answer() of TsumegoDataset
2025-02-26 19:04:30 +01:00
Andreas Koepf
72233fc2ea
bump version, pypi release of 0.1.12
2025-02-26 18:25:16 +01:00
Oliver Stanley
ac4ce13369
Merge pull request #188 from olliestanley/codeio-sampler
...
Procedural dataset for generating reasoning problems from CodeI/O-style data
2025-02-26 16:51:45 +00:00
Andreas Köpf
5e1594da16
Merge pull request #231 from AhmedSaif2/count-primes
...
Fix primes representation in count_primes dataset metadata
2025-02-26 17:49:50 +01:00
Andreas Köpf
e351d302a3
Merge pull request #219 from open-thought/rich/fix_ccc
...
Fix Cube Rotation Scoring
2025-02-26 17:41:18 +01:00
AhmedSaif2
dcdc38b15d
Fix primes representation in count_primes dataset metadata
2025-02-26 14:58:21 +02:00
Rich Jones
f0ca949aaf
support expanded notation anyway
2025-02-26 13:17:03 +01:00
Rich Jones
285e2b20cc
rubiks cube instructions
2025-02-26 13:07:17 +01:00
Rich Jones
229086131a
fix CCC scoring
2025-02-26 12:54:40 +01:00
Oliver
5fa06c961f
Fix
2025-02-26 11:17:23 +00:00
Andreas Köpf
48f082663a
Fix PoolMatrixConfigs::score_answer(), add unit tests (#215)
2025-02-26 00:43:18 +01:00
Andreas Koepf
bba128ffd0
fix score_answer of pool_matrix (if -> elif), remove print
2025-02-25 23:43:29 +01:00
Andreas Koepf
f9e8f8b064
add try-except to GraphColorDataset.score_answer()
2025-02-25 23:43:29 +01:00
Andreas Koepf
65d17b9850
add None/empty check to score_answer of cryptarithm
2025-02-25 23:43:29 +01:00
Oliver
aa6759c160
Merge branch 'main' into codeio-sampler
2025-02-25 22:41:47 +00:00
Oliver
81c77a495d
Add note on code execution to CodeIODataset
2025-02-25 22:39:06 +00:00
Oliver
0252dd905f
Move data file & load into memory on first object creation
2025-02-25 22:36:38 +00:00
vncntt
5f01049607
Add KnightsKnavesDataset (knights_knaves)
...
Adapted code from https://github.com/AlphaPav/mem-kk-logic/blob/main/data_prep/lib_kk.py
---------
Co-authored-by: Andreas Koepf (aider) <andreas.koepf@provisio.com>
2025-02-25 20:15:38 +01:00
Oliver
fe502d5eb2
Register CodeIODataset
2025-02-24 18:28:35 +00:00
Oliver
43daec67ea
Initial scoring algo for codeio
2025-02-24 18:27:53 +00:00
Oliver
1795c8ea7a
Add tiny sample dataset & efficient sampling
2025-02-24 17:58:31 +00:00
Oliver
7b5a12a92c
Remove outdated comment
2025-02-23 22:24:13 +00:00
Oliver
e07287e1f9
Add validation
2025-02-23 22:23:45 +00:00
Andreas Koepf
b5f6f7d753
bump version, update gallery
2025-02-23 22:36:39 +01:00
Andreas Köpf
d115655f0a
Merge pull request #191 from zafstojano/env/shortest-path
...
feat(env): Shortest Path
2025-02-23 22:28:43 +01:00
Andreas Koepf
45e452bff6
reduce size of default shortest_path maze grid
2025-02-23 22:27:17 +01:00
Oliver
342902683f
Merge branch 'main' into codeio-sampler
2025-02-23 20:28:06 +00:00
Oliver
f787069fd2
Add input prediction
2025-02-23 20:27:27 +00:00
Zafir Stojanovski
c5f37d5e9f
predict actual path
2025-02-23 18:24:23 +01:00
Andreas Koepf
469934d9b7
minor arc_1d tweaks
2025-02-23 16:37:40 +01:00
Andreas Koepf
ec3050a4f6
remove unnecessary checks, use tuples
2025-02-23 13:17:48 +01:00
Andreas Koepf
7a45b14a49
fix index out of range of arc_1d dataset (#190)
2025-02-23 12:51:41 +01:00
Zafir Stojanovski
97b3097984
shortest path
2025-02-23 11:25:00 +01:00
Andreas Koepf
e4102a44f6
dev minor version one ahead of PyPI released version
2025-02-22 16:54:05 +01:00
Oliver
e718168428
Draft CodeIO-derived reasoning problems dataset
2025-02-22 00:56:52 +00:00
Oliver
563480329e
Outline CodeIO dataset classes
2025-02-22 00:21:17 +00:00
Andreas Koepf
eeb9fa31d5
more native type hints
2025-02-21 21:23:14 +01:00
Andreas Koepf
51808210aa
add markdown triple backtick code block for emoji_mystery hint
2025-02-21 21:06:07 +01:00
Andreas Köpf
c56045b9a7
Merge branch 'main' into feat/emoji-mystery
2025-02-21 20:58:39 +01:00
joesharratt1229
1fb73011f8
added answer format spec in prompt
2025-02-21 18:03:05 +00:00
joesharratt1229
5e64d1c24c
added emoji dataset
2025-02-21 17:57:41 +00:00