reasoning-gym/examples/veRL/chain_sum/config
Oliver Stanley bd13b1b92a
Fix chain sum veRL example for latest veRL (#371)
* fixes for latest verl

* add balance_batch config

* 1 -> 2 gpu

* tweaks

* also add raw ids to server script
2025-03-14 20:15:54 +01:00
grpo_trainer.yaml Fix chain sum veRL example for latest veRL (#371) 2025-03-14 20:15:54 +01:00
ppo_trainer.yaml Fix chain sum veRL example for latest veRL (#371) 2025-03-14 20:15:54 +01:00