Mirror of https://github.com/open-thought/reasoning-gym.git (synced 2026-04-26 17:13:17 +00:00)
* first TRL GRPO implementation
* added config YAML file
* added README and dependencies
* updated reward format func

| Name |
|---|
| config |
| grpo_config.py |
| main_grpo_reward.py |
| README.md |
| requirements.txt |

- Install the dependencies listed in `requirements.txt`:

  `pip install -r requirements.txt`