reasoning-gym

mirror of https://github.com/open-thought/reasoning-gym.git synced 2026-04-25 17:10:51 +00:00

Andreas Köpf 28dc0932c4 Merge pull request #178 from olliestanley/feature/unsloth-train: Add minimal working GRPO training example with Unsloth	2025-02-21 15:37:24 +01:00
OpenRLHF	use native types List->list, Dict->dict, Set->set, Tuple->tuple	2025-02-21 15:15:38 +01:00
trl	docs: Update TRL README with GRPO example details and usage instructions (#76)	2025-02-07 07:56:22 +01:00
unsloth	Better progress tracking	2025-02-20 23:32:54 +00:00
veRL	use native types List->list, Dict->dict, Set->set, Tuple->tuple	2025-02-21 15:15:38 +01:00
word_ladder	use native types List->list, Dict->dict, Set->set, Tuple->tuple	2025-02-21 15:15:38 +01:00