* v0
* 2 gpu setup
* improve parsing from yaml
* update yaml dataset example
* remove restriction on flash attn
* more comments
* first version of the readme
* pin torch
* simplify requirements
* just flash attn
* use set env instead
* simpler set env
* readme
* add wandb project to setup
* update template
* update model id
* post init to capture the config and weight
* extract metadata
* update config
* update dataset config
* move env for wandb project
* pre-commit
* remove qwen-math from training
* more instructions
* unused import
* remove trl old
* warmup ratio
* warmup ratio
* change model id
* change model_id
* add info about CUDA_VISIBLE_DEVICES
2025-06-21 00:01:31 +02:00
Renamed from training/qwen-math/scripts/set/set_env.sh