Commit graph

32 commits

Author SHA1 Message Date
Ramiro R. C.
de2e89d21d
Codeio prompt fix (#513)
* prompt fix to request more specific JSON responses

* corrected gallery examples too
2025-11-13 11:48:20 +01:00
Zafir Stojanovski
71787c6a0e
fix(env): Remove custom score function in bf (#418)
* remove custom score function in bf

* pre commit
2025-04-14 11:26:18 +02:00
Zafir Stojanovski
dafdee621e
fix(env): Unify CodeIO datasets (#405)
* unify codeio

* filtered for libraries not present in reasoning-gym
2025-04-02 22:40:03 +02:00
Zafir Stojanovski
ce0a6c4878
fix(envs): Add source dataset and index to metadata (#388)
* add source dataset and index to metadata

* fix typo

* fix coach class and its test
2025-03-20 11:12:14 +00:00
Andreas Köpf
d2c895f1d3
Refactor Curriculum Attributes (#335)
* remove min_value from AttributeDefinition
* remove type from AttributeDefinition
* Add CurriculumContext
* add ensure_interval option for RangeAttributes
* docs: Add legend explaining curriculum indicators in dataset gallery
* update GALLERY.md
2025-03-16 15:40:28 +01:00
Oliver Stanley
f14662e213
Add a few new CodeI/O samples, resolve numeric answer scoring bug (#332)
* add handful of codeio samples

* scoring fix
2025-03-11 23:55:33 +01:00
Rich Jones
e62b45d61c
BF Curricula and More (#309)
* bf curricula
* modulo grid curricula
* minor changes to how difficulty is stored

---------

Co-authored-by: Andreas Koepf <andreas.koepf@provisio.com>
2025-03-09 18:22:22 +01:00
Oliver Stanley
f490b9f760
Tolerant scoring for CodeI/O based on edit distances (#277)
* add zss dep

* codeio edit distance-based scoring

* edit distance tweaks
2025-03-07 22:49:35 +01:00
Oliver Stanley
d1e505a8e9
First version of CodeI/O reasoning data (#264)
* notebook for prepping first set of raw code files
* updated codeio processing notebook for repo-level processing
* fix for edge case in codeio scoring
* Add reformat notebook
* filtering pass
* add non-determinism filtering
* Tweak CodeIODataset & include first real data
* add basic codeio test, metadata
2025-03-05 22:34:11 +01:00
Andreas Köpf
5d7fbac0ad
Minor question template & score_answer improvements (#261)
* math prompt improvements
* ignore brackets in complex_arithmetic results
* improve additional instruction in prompt of polynomial_equations
* more strict tests for score_answer in polynomial_equations
* simplify special reward handling
* fix test_intermediate_integration
* fix sokoban dataset
* add common dataset score_answer consistency test
2025-03-04 21:55:09 +01:00
Oliver
5fa06c961f Fix 2025-02-26 11:17:23 +00:00
Oliver
81c77a495d Add note on code execution to CodeIODataset 2025-02-25 22:39:06 +00:00
Oliver
0252dd905f Move data file & load into memory on first object creation 2025-02-25 22:36:38 +00:00
Oliver
fe502d5eb2 Register CodeIODataset 2025-02-24 18:28:35 +00:00
Oliver
43daec67ea Initial scoring algo for codeio 2025-02-24 18:27:53 +00:00
Oliver
1795c8ea7a Add tiny sample dataset & efficient sampling 2025-02-24 17:58:31 +00:00
Oliver
7b5a12a92c Remove outdated comment 2025-02-23 22:24:13 +00:00
Oliver
e07287e1f9 Add validation 2025-02-23 22:23:45 +00:00
Oliver
f787069fd2 Add input prediction 2025-02-23 20:27:27 +00:00
Oliver
e718168428 Draft CodeIO-derived reasoning problems dataset 2025-02-22 00:56:52 +00:00
Oliver
563480329e Outline CodeIO dataset classes 2025-02-22 00:21:17 +00:00
Andreas Koepf
3e7ff3b084 use native types List->list, Dict->dict, Set->set, Tuple->tuple 2025-02-21 15:15:38 +01:00
Oliver
eb708e88b3 Constrain reward 2025-02-17 19:20:45 +00:00
Oliver
1d0cad46f2 Formatting/scoring improvements for BF & family 2025-02-17 19:08:15 +00:00
Andreas Köpf
3f6b2fc807
Add Coaching & ScoreBoard class (result tracking) (#72)
* feat: Add Coach and ScoreBoard classes for performance tracking and difficulty adjustment
* feat: Add GroupedScores class to wrap aggregated scores
* refactor: Create ScoreStats class with tuple-based score statistics
* feat: Add unit test for Coach with CompositeDataset and multiple datasets
* fix: Add difficulty metadata to leg counting dataset
* feat: Add clear() method to ScoreBoard to reset all stored data
* feat: Add __len__ method to ScoreBoard to return number of scores
* feat: Add update_dataset_config method to CompositeDataset
* cleanup __init__ & imports
2025-02-06 23:15:28 +01:00
Andreas Koepf
ebb88e6c6a lint 2025-01-30 22:55:04 +01:00
Rich Jones
2f9224127d docstrings 2025-01-30 17:20:53 +01:00
Rich Jones
2d9b916f8b rm bad copypaste 2025-01-30 17:16:37 +01:00
Rich Jones
9d4f896329 init definitions 2025-01-30 17:15:48 +01:00
Rich Jones
2393ae0525 difficulty levels 2025-01-30 16:24:28 +01:00
Rich Jones
574df8de23 add contrib 2025-01-30 15:42:11 +01:00
Rich Jones
99bf648989 initial bf working, contrib not committed 2025-01-30 15:38:03 +01:00