reasoning-gym

mirror of https://github.com/open-thought/reasoning-gym.git synced 2026-04-19 12:58:07 +00:00

Author	SHA1	Message	Date
Andreas Köpf	b59ccdefa2	Merge pull request #178 from olliestanley/feature/unsloth-train: Add minimal working GRPO training example with Unsloth	2025-02-21 15:37:24 +01:00
Andreas Koepf	3e7ff3b084	use native types List->list, Dict->dict, Set->set, Tuple->tuple	2025-02-21 15:15:38 +01:00
Oliver	e26161713e	Answer scoring fixes to address edge cases	2025-02-20 22:04:01 +00:00
Andreas Koepf	3f98afe47d	more tolerant parsing of futoshiki answers	2025-02-16 14:23:40 +01:00
Oliver	b14ab8ad6f	Add more instruction to generated questions	2025-02-15 13:47:54 +00:00
Oliver	fe55ea21dc	formatting	2025-02-13 19:00:18 +00:00
Oliver	c018e3697a	Correct string formatting	2025-02-13 18:52:48 +00:00
Oliver	8daebcd1a8	Remove rng param	2025-02-09 21:26:03 +00:00
Oliver	dce5d9367d	Greatly speed up solver	2025-02-09 21:23:53 +00:00
Oliver	145ceeb109	Finish first draft futoshiki solver/gen	2025-02-07 00:09:35 +00:00
Oliver	238a41db43	Revert "Experiment with alternative solving/generation approach". This reverts commit `06c75ce1a9`.	2025-02-07 00:06:38 +00:00
Oliver	06c75ce1a9	Experiment with alternative solving/generation approach	2025-02-06 23:58:09 +00:00
Oliver	a369e26448	Initial draft of Futoshiki generator	2025-02-04 17:42:57 +00:00