Update README.md

Teknium 2025-05-07 22:22:26 -07:00 committed by GitHub
parent a282604baa
commit 3863ece98b
No known key found for this signature in database
GPG key ID: B5690EEEBB952194


@@ -2,6 +2,8 @@
This directory contains an example script (`grpo.py`) demonstrating how to integrate a custom training loop with the Atropos API for reinforcement learning using the GRPO (Group Relative Policy Optimization) algorithm.
**Note: The example trainer does not support multimodal training out of the box. As other trainers add support for Atropos, we will list them in the main README; some of them may support multimodal RL, so please check the main repository README for updates.**
This example uses `vLLM` for efficient inference during the (simulated) data generation phase and `transformers` for the training phase.
**Note:** This script is intended as a *reference example* for API integration and basic training setup. It is not optimized for large-scale, efficient training.
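For readers unfamiliar with the "group relative" part of GRPO, the following is a minimal sketch of the idea, not the code in `grpo.py`: each completion sampled for the same prompt is scored, and its advantage is its reward normalized by the mean and standard deviation of that group. The function name `group_relative_advantages`, the `eps` parameter, and the sample rewards are assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and std of its own group.

    `rewards` holds one scalar score per completion sampled for a single
    prompt. `eps` (an assumed constant) avoids division by zero when all
    rewards in the group are identical.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four completions for one prompt, scored by a hypothetical reward function.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Completions that score above their group's mean receive a positive advantage and those below receive a negative one, so no separate value model is needed to form the baseline.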