reasoning-gym/examples/trl
joesharratt1229 a8e11e71be
Test training with trl (#70)
* first trl grpo implementation
* added config yaml file
* added read me and dependencies
* updated reward format func
2025-02-07 07:42:32 +01:00
config
grpo_config.py
main_grpo_reward.py
README.md
requirements.txt

  1. Install the dependencies listed in requirements.txt:
pip install -r requirements.txt
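
After installing, it can help to confirm that the packages actually landed in the active environment. The following is a minimal sketch, not part of this example's code: the helper names `parse_requirement_names` and `missing_packages` are hypothetical, and the parser assumes simple `name`, `name==version`, or `name>=version` lines rather than the full requirements-file syntax (URLs and VCS references are not handled).

```python
import re
from importlib.metadata import PackageNotFoundError, version


def parse_requirement_names(lines):
    """Extract bare distribution names from simple requirements lines.

    Hypothetical helper: handles 'name', 'name==x', 'name>=x' and strips
    comments; skips blank lines and pip options such as '-r other.txt'.
    """
    names = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # blank line or pip option
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names):
    """Return the names that have no installed distribution metadata."""
    missing = []
    for name in names:
        try:
            version(name)  # raises if the distribution is not installed
        except PackageNotFoundError:
            missing.append(name)
    return missing
```

Run from this directory, something like `missing_packages(parse_requirement_names(open("requirements.txt")))` should return an empty list once the install has succeeded; any names it returns are the ones to reinstall.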