Mirror of https://github.com/NousResearch/atropos.git (synced 2026-04-27 17:23:08 +00:00)
python versioning problems
parent bab3d85d85
commit d0b097974b
4 changed files with 5 additions and 105 deletions
@@ -1,19 +1,11 @@
#!/usr/bin/env python3
"""
GRPO Trainer - Multi-Mode Entry Point

Supports three training modes:
- none (legacy): Periodic checkpoint saves + vLLM restarts
- shared_vllm: Single-copy mode with CUDA IPC weight sharing
- lora_only: LoRA adapter training

For the unified single-command experience (shared_vllm + auto vLLM launch),
use run.py instead:
python example_trainer/run.py --model Qwen/Qwen3-4B --training-steps 20

This script requires vLLM to be running separately (except for legacy mode
which manages vLLM internally).

Usage:
# Legacy mode (manages vLLM internally)
python -m example_trainer.grpo --model-name Qwen/Qwen2.5-3B-Instruct
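
The three modes listed in the docstring above could be dispatched roughly as follows. This is a minimal sketch, not the repository's actual code: the mode names and `--model-name` come from the docstring, while the `--mode` flag, its default of `none`, and the helper functions `describe_mode`, `needs_external_vllm`, `build_parser`, and `main` are hypothetical names introduced only for illustration.

```python
import argparse

# Mode names are taken from the docstring; everything else is illustrative.
MODES = ("none", "shared_vllm", "lora_only")


def describe_mode(mode: str) -> str:
    """Return the one-line summary the docstring gives for each mode."""
    summaries = {
        "none": "legacy: periodic checkpoint saves + vLLM restarts",
        "shared_vllm": "single-copy mode with CUDA IPC weight sharing",
        "lora_only": "LoRA adapter training",
    }
    return summaries[mode]


def needs_external_vllm(mode: str) -> bool:
    """Per the docstring, only legacy mode manages vLLM internally."""
    return mode != "none"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="GRPO trainer mode sketch")
    parser.add_argument("--model-name", required=True)
    # Hypothetical flag name and default; the real script may differ.
    parser.add_argument("--mode", choices=MODES, default="none")
    return parser


def main(argv=None) -> str:
    """Parse arguments and report what the selected mode implies."""
    args = build_parser().parse_args(argv)
    external = needs_external_vllm(args.mode)
    return f"{args.mode}: {describe_mode(args.mode)} (external vLLM required: {external})"
```

Calling `main(["--model-name", "Qwen/Qwen2.5-3B-Instruct"])` in this sketch selects the legacy path, mirroring the usage line in the docstring where no mode is given. Using `choices=` lets argparse reject unknown mode names before any training setup begins.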