reasoning-gym/training/configs
2025-03-26 21:05:37 +00:00
llama3.1_1b_grpo.yaml    initial verl training codebase (#389)    2025-03-20 15:04:57 +00:00
qwen2.5_1.5b_grpo.yaml   initial verl training codebase (#389)    2025-03-20 15:04:57 +00:00
qwen2.5_3b_grpo.yaml     allow longer outputs                     2025-03-26 21:05:37 +00:00