reasoning-gym/examples
joesharratt1229 a8e11e71be
Test training with trl (#70)
* first trl grpo implementation
* added config yaml file
* added README and dependencies
* updated reward format func
2025-02-07 07:42:32 +01:00
OpenRLHF     lint, seed & size for figlet   2025-01-30 00:58:34 +01:00
trl          Test training with trl (#70)   2025-02-07 07:42:32 +01:00
veRL         reduce veRL example size       2025-02-01 23:56:11 +00:00
word_ladder  lint                           2025-02-03 11:35:30 +00:00