reasoning-gym/examples/OpenRLHF
2025-01-28 14:40:06 +00:00
.gitignore            add first example with OpenRLHF  2025-01-28 14:40:06 +00:00
custom_reward.py      add first example with OpenRLHF  2025-01-28 14:40:06 +00:00
custom_reward_ppo.sh  add first example with OpenRLHF  2025-01-28 14:40:06 +00:00