mirror of
https://github.com/open-thought/reasoning-gym.git
synced 2026-04-24 17:05:03 +00:00
* fixes for latest verl
* composite dataset training experiment
* use stateful dataloaders to match verl changes
* training readme
* add formatting reward
* length reward impl
* standalone reasoning_gym config section
* curriculum learning, new length reward, more config
3 lines
75 B
Python
from .ray_grpo_trainer import RayGRPOTrainer

__all__ = ["RayGRPOTrainer"]