reasoning-gym/reasoning_gym
Zafir Stojanovski 49b1dbbcce
Fix misleading instruction in shortest_path asking for "length" instead of path (#523)
The prompt asked to "find the length of the shortest path" but the expected
answer is a sequence of directions. This caused models to answer with a number
instead of directions, degrading evaluation results.

Closes #522

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-25 13:02:23 +01:00
algebra make task entries json serializable (#443) 2025-06-02 08:57:15 +02:00
algorithmic Fix/cryptarithm multiple solutions (#517) 2026-03-15 13:53:26 +01:00
arc (evals): Medium configs (#415) 2025-04-14 08:25:31 +02:00
arithmetic [fix #484] resolve basic_arithmetic fails when size is large (#485) 2025-07-07 09:46:23 +01:00
coaching Curr exp (#487) 2025-07-25 20:38:47 +01:00
code Codeio prompt fix (#513) 2025-11-13 11:48:20 +01:00
cognition fix color_cubes answer strings, update gallery with latest envs (#464) 2025-06-08 13:16:54 +02:00
data fix encoding to be able to run on win (#502) 2025-08-18 09:19:45 +01:00
games Add assertion for maze constraints and limit _random_floor_cell attempts (#515) 2026-01-16 09:56:39 +01:00
geometry make task entries json serializable (#443) 2025-06-02 08:57:15 +02:00
graphs Fix misleading instruction in shortest_path asking for "length" instead of path (#523) 2026-03-25 13:02:23 +01:00
induction fix(envs): Add source dataset and index to metadata (#388) 2025-03-20 11:12:14 +00:00
logic feat(curriculum): Knights and Knaves configs (#488) 2025-07-31 10:18:05 +02:00
probability Add probability dataset (initial: Coin Flip dataset + curriculum) (#505) 2025-09-06 15:59:23 +01:00
__init__.py fix: Register missing coin_flip (#507) 2025-09-15 14:23:30 +02:00
composite.py Feat/curr adj (#394) 2025-04-02 06:39:14 +01:00
dataset.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
factory.py Feat: expose score_answer function without needing to instantiate a dataset (#422) 2025-04-18 10:36:44 +02:00
utils.py support python 3.10 (#450) 2025-06-04 10:34:01 +01:00
version_manager.py use native types List->list, Dict->dict, Set->set, Tuple->tuple 2025-02-21 15:15:38 +01:00