mirror of
https://github.com/open-thought/reasoning-gym.git
synced 2026-04-19 12:58:07 +00:00
769 B
769 B
Setup
Prepare virtual environment, e.g.
python -m venv venv
source venv/bin/activate
Install dependencies
pip install -r requirements.txt
pip install flash-attn --no-build-isolation
Login to W&B and HuggingFace if desired
wandb login
huggingface-cli login
Training
Here we assume two GPUs, with one used for inference (vLLM) and the other for training (accelerate). You may need to adjust some settings for different GPU configs.
Run the vLLM server for inference:
CUDA_VISIBLE_DEVICES=0 vf-vllm --model Qwen/Qwen2.5-1.5B-Instruct --tensor-parallel-size 1
Run the training script using accelerate:
CUDA_VISIBLE_DEVICES=1 accelerate launch --config-file zero3.yaml --num-processes 1 vf_rg.py