Mirror of https://github.com/open-thought/reasoning-gym.git, synced 2026-04-23 16:55:05 +00:00
* first trl grpo implementation
* added config yaml file
* added read me and dependencies
* updated reward format func
10 lines
267 B
Text
# pip applies index options to the whole requirements file, not to a single
# requirement, so the PyTorch CUDA 12.4 wheel index is declared once here.
--extra-index-url https://download.pytorch.org/whl/cu124
torch
torchvision
torchaudio
datasets
peft
transformers
trl
wandb
huggingface_hub
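# --no-build-isolation is a pip command-line flag, not a requirements-file option,
# and flash-attn needs torch available at build time. Install it separately,
# after the packages above:
#   pip install flash-attn --no-build-isolation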