Mirror of https://github.com/NousResearch/atropos.git
Synced 2026-04-19 12:57:58 +00:00
Update README.md
parent 87c3e918d2
commit 4029acdfb1
1 changed file with 2 additions and 2 deletions
@@ -1,6 +1,6 @@
 # GRPO Example Trainer
 
-This directory contains an example script (`grpo.py`) demonstrating how to integrate a custom training loop with the Atropos API for reinforcement learning using the GRPO (Generalized Reinforcement Policy Optimization) algorithm.
+This directory contains an example script (`grpo.py`) demonstrating how to integrate a custom training loop with the Atropos API for reinforcement learning using the GRPO (Group Relative Policy Optimization) algorithm.
 
 This example uses `vLLM` for efficient inference during the (simulated) data generation phase and `transformers` for the training phase.
 
@@ -69,4 +69,4 @@ Once the prerequisites are met and configuration is set:
 pip install -r example_trainer/requirements.txt
 
 # Run the trainer directly (basic test)
-python example_trainer/grpo.py
+python example_trainer/grpo.py
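The corrected expansion, Group Relative Policy Optimization, refers to scoring each sampled completion relative to the other completions drawn for the same prompt. The sketch below illustrates that idea only; it is not taken from `grpo.py`, and the function name `group_relative_advantages` and the `eps` parameter are assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the mean and std of its own group.

    `rewards` holds the scalar rewards of several completions sampled
    for one prompt. `eps` (an assumed default) avoids division by zero
    when every completion in the group received the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four completions of a single prompt.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Completions that score above their group's mean receive a positive advantage and those below receive a negative one, so no separate value network is needed to form a baseline.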