reasoning-gym/examples/veRL/chain_sum
Oliver Stanley bd13b1b92a
Fix chain sum veRL example for latest veRL (#371)
* fixes for latest verl

* add balance_batch config

* 1 -> 2 GPUs

* tweaks

* also add raw ids to server script
2025-03-14 20:15:54 +01:00
config Fix chain sum veRL example for latest veRL (#371) 2025-03-14 20:15:54 +01:00
launch_on_2gpu_server.sh Basic curriculum (#198) 2025-03-07 11:22:12 +01:00
launch_on_4gpu.sh Basic curriculum (#198) 2025-03-07 11:22:12 +01:00
main_ppo_custom_reward.py Fix chain sum veRL example for latest veRL (#371) 2025-03-14 20:15:54 +01:00
main_ppo_custom_reward_server.py Fix chain sum veRL example for latest veRL (#371) 2025-03-14 20:15:54 +01:00
train_grpo.sh Basic curriculum (#198) 2025-03-07 11:22:12 +01:00
train_grpo_server.sh Basic curriculum (#198) 2025-03-07 11:22:12 +01:00
train_ppo.sh Basic curriculum (#198) 2025-03-07 11:22:12 +01:00