Oliver 2025-04-29 19:15:33 +01:00
parent 374760577e
commit f211b8544f
2 changed files with 2 additions and 2 deletions

@@ -38,7 +38,7 @@ First, activate the virtual environment you prepared.
 Example GRPO training usage:
 ```bash
-python3 -u train_grpo.py --config-path configs/external_generalisation --config-name math_curriculum_qwen_7b $@ 2>&1 | tee verl_output.log
+python3 -u train_grpo.py --config-path configs/external_generalisation --config-name math_qwen_3b $@ 2>&1 | tee verl_output.log
 ```
 Then, having saved this as a bash script such as `train.sh`, run it:

@@ -54,7 +54,7 @@ data:
 actor_rollout_ref:
   hybrid_engine: True
   model:
-    path: Qwen/Qwen2.5-7B-Instruct
+    path: Qwen/Qwen2.5-3B-Instruct
     external_lib: null
     override_config: { }
     enable_gradient_checkpointing: True