reasoning-gym

Mirror of https://github.com/open-thought/reasoning-gym.git (last synced 2026-04-27 17:23:19 +00:00)

joesharratt1229 a8e11e71be Test training with trl (#70)	2025-02-07 07:42:32 +01:00
* first trl grpo implementation
* added config yaml file
* added read me and dependencies
* updated reward format func
OpenRLHF	lint, seed & size for figlet	2025-01-30 00:58:34 +01:00
trl	Test training with trl (#70)	2025-02-07 07:42:32 +01:00
veRL	reduce veRL example size	2025-02-01 23:56:11 +00:00
word_ladder	lint	2025-02-03 11:35:30 +00:00