Test training with trl (#70)

* first trl grpo implementation
* added config yaml file
* added README and dependencies
* updated reward format func
joesharratt1229 2025-02-07 06:42:32 +00:00 committed by GitHub
parent 3f6b2fc807
commit a8e11e71be
5 changed files with 287 additions and 0 deletions

@@ -0,0 +1,10 @@
torch --index-url https://download.pytorch.org/whl/cu124
torchvision --index-url https://download.pytorch.org/whl/cu124
torchaudio --index-url https://download.pytorch.org/whl/cu124
datasets
peft
transformers
trl
wandb
huggingface_hub
flash-attn --no-build-isolation
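
Several entries in the dependency list above carry pip command-line flags (`--index-url`, `--no-build-isolation`) next to the package name. pip's requirements-file format documents only a limited set of options, so depending on the pip version these lines may need to be installed as separate commands rather than with a single `pip install -r`. The following is a minimal sketch of that idea, not the author's install procedure: the helper name `to_pip_command` is hypothetical, and the script only builds and prints the commands without running them. It keeps the original order, which places `torch` before `flash-attn`, since building `flash-attn` without build isolation generally expects `torch` to be installed already.

```python
import shlex

# Dependency lines copied from the list above.
DEPENDENCY_LINES = [
    "torch --index-url https://download.pytorch.org/whl/cu124",
    "torchvision --index-url https://download.pytorch.org/whl/cu124",
    "torchaudio --index-url https://download.pytorch.org/whl/cu124",
    "datasets",
    "peft",
    "transformers",
    "trl",
    "wandb",
    "huggingface_hub",
    "flash-attn --no-build-isolation",
]


def to_pip_command(line):
    """Turn one 'package [flags...]' line into a pip install argument list.

    Hypothetical helper for illustration: the first token is treated as the
    package name and everything after it as flags passed through to pip.
    """
    tokens = shlex.split(line)
    package, flags = tokens[0], tokens[1:]
    return ["python", "-m", "pip", "install", package, *flags]


# Build one command per line, preserving the original order.
commands = [to_pip_command(line) for line in DEPENDENCY_LINES]

for command in commands:
    # Print only; pass each list to subprocess.run(command, check=True)
    # to actually install.
    print(shlex.join(command))
```

Installing one package per command is slower than a single resolver run, but it keeps each flag scoped to the package it was written next to.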