Test training with trl (#70)

* first trl grpo implementation
* added config yaml file
* added read me and dependencies
* updated reward format func
2026-04-22 16:49:06 +00:00 · 2025-02-07 06:42:32 +00:00 · d61db3772a
commit d61db3772a
parent a607db79f7
5 changed files with 287 additions and 0 deletions
--- a/examples/trl/requirements.txt
+++ b/examples/trl/requirements.txt
@@ -0,0 +1,10 @@
+torch --index-url https://download.pytorch.org/whl/cu124
+torchvision --index-url https://download.pytorch.org/whl/cu124
+torchaudio --index-url https://download.pytorch.org/whl/cu124
+datasets
+peft
+transformers
+trl
+wandb
+huggingface_hub
+flash-attn --no-build-isolation