reasoning-gym

open-thought/reasoning-gym

mirror of https://github.com/open-thought/reasoning-gym.git synced 2026-04-19 12:58:07 +00:00

Commit graph

Author	SHA1	Message	Date
Oliver Stanley	eb69916c1b	initial verl training codebase (#389) * fixes for latest verl * composite dataset training experiment * use stateful dataloaders to match verl changes * training readme * add formatting reward * length reward impl * standalone reasoning_gym config section * curriculum learning, new length reward, more config	2025-03-20 15:04:57 +00:00

1 commit