cfg

2026-04-19 12:58:07 +00:00 · 2025-04-29 19:15:33 +01:00 · 2025-04-29 19:15:33 +01:00 · f211b8544f
commit f211b8544f
parent 374760577e
2 changed files with 2 additions and 2 deletions
--- a/training/README.md
+++ b/training/README.md
@@ -38,7 +38,7 @@ First, activate the virtual environment you prepared.
 Example GRPO training usage:

 ```bash
-python3 -u train_grpo.py --config-path configs/external_generalisation --config-name math_curriculum_qwen_7b $@ 2>&1 | tee verl_output.log
+python3 -u train_grpo.py --config-path configs/external_generalisation --config-name math_qwen_3b $@ 2>&1 | tee verl_output.log
 ```

 Then, having saved this as a bash script such as `train.sh`, run it:
--- a/training/configs/external_generalisation/math_qwen_3b.yaml
+++ b/training/configs/external_generalisation/math_qwen_3b.yaml
@@ -54,7 +54,7 @@ data:
 actor_rollout_ref:
   hybrid_engine: True
   model:
-    path: Qwen/Qwen2.5-7B-Instruct
+    path: Qwen/Qwen2.5-3B-Instruct
     external_lib: null
     override_config: { }
     enable_gradient_checkpointing: True