mirror of https://github.com/open-thought/reasoning-gym.git (synced 2026-04-19 12:58:07 +00:00)
cfg
parent 374760577e
commit f211b8544f
2 changed files with 2 additions and 2 deletions
@@ -38,7 +38,7 @@ First, activate the virtual environment you prepared.
 Example GRPO training usage:
 
 ```bash
-python3 -u train_grpo.py --config-path configs/external_generalisation --config-name math_curriculum_qwen_7b $@ 2>&1 | tee verl_output.log
+python3 -u train_grpo.py --config-path configs/external_generalisation --config-name math_qwen_3b $@ 2>&1 | tee verl_output.log
 ```
 
 Then, having saved this as a bash script such as `train.sh`, run it:
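For reference, the `train.sh` wrapper that the README hunk describes might be sketched as follows. This is a minimal illustration, not the repository's script: the `train` function name and the `CONFIG_NAME` and `DRY_RUN` environment variables are hypothetical conveniences. It also quotes `"$@"`, unlike the README line, so that forwarded arguments containing spaces stay intact.

```bash
#!/usr/bin/env bash
# Hypothetical train.sh sketch. CONFIG_NAME and DRY_RUN are illustrative
# names introduced here; they are not defined by the repository.
set -euo pipefail

train() {
    # Default to the config name introduced by this commit.
    local config_name="${CONFIG_NAME:-math_qwen_3b}"
    local cmd=(python3 -u train_grpo.py
               --config-path configs/external_generalisation
               --config-name "$config_name"
               "$@")
    if [[ "${DRY_RUN:-0}" == "1" ]]; then
        # Print the command instead of launching training.
        printf '%s\n' "${cmd[*]}"
    else
        # Mirror stdout and stderr into the log file, as in the README.
        "${cmd[@]}" 2>&1 | tee verl_output.log
    fi
}
```

With `train "$@"` as the last line of the script, `DRY_RUN=1 ./train.sh` would print the command for inspection, and `CONFIG_NAME=math_curriculum_qwen_7b ./train.sh` would select the previous config without editing the file.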
@@ -54,7 +54,7 @@ data:
 actor_rollout_ref:
   hybrid_engine: True
   model:
-    path: Qwen/Qwen2.5-7B-Instruct
+    path: Qwen/Qwen2.5-3B-Instruct
     external_lib: null
     override_config: { }
     enable_gradient_checkpointing: True