initial verl training codebase (#389)

* fixes for latest verl
* composite dataset training experiment
* use stateful dataloaders to match verl changes
* training readme
* add formatting reward
* length reward impl
* standalone reasoning_gym config section
* curriculum learning, new length reward, more config
2026-04-26 17:13:17 +00:00 · 2025-03-20 15:04:57 +00:00 · 2025-03-20 15:04:57 +00:00 · eb69916c1b
commit eb69916c1b
parent ce0a6c4878
8 changed files with 910 additions and 0 deletions
--- a/training/utils/__init__.py
+++ b/training/utils/__init__.py
@@ -0,0 +1,3 @@
+from .datasets import ReasoningGymDataset, make_dataset
+
+__all__ = ["ReasoningGymDataset", "make_dataset"]