reasoning-gym/examples/trl/grpo_config.py
joesharratt1229 a8e11e71be
Test training with trl (#70)
* first trl grpo implementation
* added config yaml file
* added read me and dependencies
* updated reward format func
2025-02-07 07:42:32 +01:00

from dataclasses import dataclass
from typing import Optional


@dataclass
class ScriptArguments:
    """
    Arguments for the training script.
    """

    dataset_name: str
    dataset_config: Optional[str] = None
    dataset_train_split: str = "train"
    dataset_test_split: str = "test"
    gradient_checkpointing_use_reentrant: bool = False
    ignore_bias_buffers: bool = False
    train_size: int = 1000
    eval_size: int = 100
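
The file above does not show how the dataclass is filled from the command line. As a minimal sketch, one way to do it with only the standard library is to derive an `argparse` parser from the dataclass fields. The helper name `parse_script_arguments` and the dataset name `my_dataset` are hypothetical, introduced here for illustration. The actual training script may well use a different parser (for example one provided by `transformers` or `trl`).

```python
import argparse
from dataclasses import MISSING, dataclass, fields
from typing import Optional


@dataclass
class ScriptArguments:
    """Copy of the dataclass above so this sketch is self-contained."""

    dataset_name: str
    dataset_config: Optional[str] = None
    dataset_train_split: str = "train"
    dataset_test_split: str = "test"
    gradient_checkpointing_use_reentrant: bool = False
    ignore_bias_buffers: bool = False
    train_size: int = 1000
    eval_size: int = 100


def parse_script_arguments(argv):
    """Hypothetical helper: build an argparse parser from the dataclass fields."""
    parser = argparse.ArgumentParser()
    for f in fields(ScriptArguments):
        flag = f"--{f.name}"
        if f.type is bool:
            # Every boolean field defaults to False, so a bare flag turns it on.
            parser.add_argument(flag, action="store_true")
        elif f.default is MISSING:
            # Fields without a default (dataset_name) must be supplied.
            parser.add_argument(flag, required=True)
        else:
            # int fields are parsed as int; everything else is kept as a string.
            arg_type = int if f.type is int else str
            parser.add_argument(flag, type=arg_type, default=f.default)
    return ScriptArguments(**vars(parser.parse_args(argv)))


# "my_dataset" is a placeholder, not a dataset named in the source.
args = parse_script_arguments(["--dataset_name", "my_dataset", "--train_size", "500"])
print(args)
```

Deriving the parser from `fields(ScriptArguments)` keeps the flags and defaults in one place, so adding a field to the dataclass adds the matching command-line option without further changes.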