reasoning-gym/training/qwen-math/recipes/Qwen2.5-3B-Instruct/grpo
Zafir Stojanovski 0cda6b1205
qwen math training code (#435)
* qwen math training code

* pre-commit
2025-05-16 13:19:19 +02:00
model_curated_rg_math.yaml qwen math training code (#435) 2025-05-16 13:19:19 +02:00