Commit graph

43 commits

Author SHA1 Message Date
Jai Suphavadeeprasit
3de03d6db3 single copy 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
5ba06c7d4a threading 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
ca1ec60869 improve default 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
eed13670de better debugging 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
3ac4a64f6f patching problem 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
5af1a4a974 basic changes 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
007f4f275d changes 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
80f67f979a error handling 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
9e53076a82 param locations update 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
197fce640f daemon errors 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
3995e0af7d monkey patch fixes 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
7b975f3adc changes based on torchtitan 2 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
53b29472b4 changes based on torchtitan 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
078dd4a333 Cleanup 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
39e94c4278 weight updates async 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
b3874b658a vllm underlying weights 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
b0d35be8a4 IPC updates 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
e278978fa1 health changes 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
f51ae77f54 add missing parameter 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
d6f389f86f readme updates 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
88ccaa0ea5 standardize the training approach 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
ebdbc54842 tracking 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
9498d9576f training bug 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
d978eff127 smol changes 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
adc3ae712b design choice - LoRA and shared vLLM through the bridge 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
22648bd912 gradient checkpointing issue for LoRAs 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
1e7b7cf841 stuff 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
db7414329b generate endpoint with logprobs 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
e956af11a2 changes 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
426c0fac4c local version 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
7b143a7d68 correction 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
3ed23058c3 initial commit 2026-02-13 11:26:22 -05:00
Ridwannurudeen
5e2e84835b [docs] Clarify prerequisites, fix Python version inconsistency, and add troubleshooting section 2026-02-01 23:39:37 +01:00
Dakota
e6ac3abdcb add managed vllm server 2025-11-07 13:06:49 -06:00
ropresearch
6a20b90549 added gen params for latest examples endpoint 2025-10-01 13:05:37 -04:00
ropresearch
1a68f691f6 linting fixes 2025-09-25 17:17:44 -04:00
ropresearch
c3fc68879c group temps, sample temps, and logprob api params 2025-09-25 16:41:58 -04:00
Brawn
eb179e7fca Update grpo.py 2025-08-14 20:20:41 +03:00
Brawn
6dccdcc67e fix: division-by-zero in gradient calculation 2025-08-14 14:33:46 +03:00
dmahan93
40b12dae60 run pre-commit on all files 2025-05-09 09:54:20 -05:00
Teknium
3863ece98b Update README.md 2025-05-07 22:22:26 -07:00
dmahan93
4029acdfb1 Update README.md 2025-04-29 17:57:22 -05:00
Dakota Nous
621d00dd80 first commit 2025-04-29 12:10:10 -07:00