Update README.md

dmahan93 2025-04-29 17:57:22 -05:00 committed by GitHub
parent 87c3e918d2
commit 4029acdfb1
GPG key ID: B5690EEEBB952194

@@ -1,6 +1,6 @@
# GRPO Example Trainer
-This directory contains an example script (`grpo.py`) demonstrating how to integrate a custom training loop with the Atropos API for reinforcement learning using the GRPO (Generalized Reinforcement Policy Optimization) algorithm.
+This directory contains an example script (`grpo.py`) demonstrating how to integrate a custom training loop with the Atropos API for reinforcement learning using the GRPO (Group Relative Policy Optimization) algorithm.
This example uses `vLLM` for efficient inference during the (simulated) data generation phase and `transformers` for the training phase.
@@ -69,4 +69,4 @@ Once the prerequisites are met and configuration is set:
pip install -r example_trainer/requirements.txt
# Run the trainer directly (basic test)
-python example_trainer/grpo.py
+python example_trainer/grpo.py
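
Note on the renamed acronym: "Group Relative" refers to scoring each sampled completion against the other completions drawn for the same prompt, rather than against a learned value baseline. The sketch below illustrates only that normalization step. It is not taken from `grpo.py`; the function name `group_relative_advantages`, the `eps` parameter, and the use of the population standard deviation are assumptions made for illustration, and the actual script may compute this differently.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and std of its own group.

    `rewards` holds the scalar rewards of all completions sampled for
    one prompt. `eps` (an assumed stabilizer) avoids division by zero
    when every completion in the group received the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption, not from grpo.py
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four completions sampled from a single prompt.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Completions that beat their group's average get a positive advantage and the rest get a negative one; a group with identical rewards yields all zeros and so contributes no learning signal.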