Commit graph

2 commits

Author SHA1 Message Date
Andreas Köpf
7b72c3470b
docs: Update TRL README with GRPO example details and usage instructions (#76)
2025-02-07 07:56:22 +01:00
joesharratt1229
a8e11e71be
Test training with trl (#70)
* first trl grpo implementation
* added config yaml file
* added read me and dependencies
* updated reward format func
2025-02-07 07:42:32 +01:00