Commit graph

61 commits

Author SHA1 Message Date
Jai Suphavadeeprasit
a6138b3c65 lora restart saving gradient changes 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
9ba6c0e7bb ditching lora nccl 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
2501e33ae3 nccl loras 2026-02-13 11:26:25 -05:00
pre-commit-ci[bot]
11f495a381 [pre-commit.ci] auto fixes from pre-commit.com hooks (for more information, see https://pre-commit.ci) 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
e932369777 linting 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
d0b097974b python versioning problems 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
c8884348c7 cleanup 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
58403dd052 major refactor 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
806888d9d3 pipelineRL 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
e891a7f808 testing scripts 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
347f9ea363 hot swap adapter 2026-02-13 11:26:25 -05:00
pre-commit-ci[bot]
c86b36844b [pre-commit.ci] auto fixes from pre-commit.com hooks (for more information, see https://pre-commit.ci) 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
c70635ef71 LORA 1 2026-02-13 11:26:25 -05:00
pre-commit-ci[bot]
ea1a7e7482 [pre-commit.ci] auto fixes from pre-commit.com hooks (for more information, see https://pre-commit.ci) 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
8250ba97bb linter 2026-02-13 11:26:25 -05:00
pre-commit-ci[bot]
47d2bb0ecd [pre-commit.ci] auto fixes from pre-commit.com hooks (for more information, see https://pre-commit.ci) 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
989a3ea159 keep debugging flags for future use 2026-02-13 11:26:25 -05:00
pre-commit-ci[bot]
897ef4723c [pre-commit.ci] auto fixes from pre-commit.com hooks (for more information, see https://pre-commit.ci) 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
0bc4f1556f readme updates 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
a6faaee71d vllm weight bridge 2026-02-13 11:26:25 -05:00
pre-commit-ci[bot]
e1aca5ecf5 [pre-commit.ci] auto fixes from pre-commit.com hooks (for more information, see https://pre-commit.ci) 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
f0468e620e single copy now working as expected 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
dfa87df1f1 prints 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
0f7713a575 clean up 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
b1eccaa597 fused memory 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
ffcd9367f8 other changes 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
99488ab3fe adjusting buffers 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
bf50ed37d9 buffer efficiency 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
dff4065982 debugging 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
7eb5381262 pass all the informaiton 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
b12d0575e1 unsure 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
2225b4623f main changes 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
2dc1c2a981 pipeline changes 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
f3c6275263 streamline process 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
19b3116b84 serialization errors 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
3de03d6db3 single copy 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
3ac4a64f6f patching problem 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
5af1a4a974 basic changes 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
007f4f275d changes 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
80f67f979a error handling 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
9e53076a82 param locations update 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
53b29472b4 changes based on torchtitan 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
b0d35be8a4 IPC updates 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
e278978fa1 health changes 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
f51ae77f54 add missing parameter 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
88ccaa0ea5 standardize the training approach 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
ebdbc54842 tracking 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
9498d9576f training bug 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
d978eff127 smol changes 2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
adc3ae712b design choice - LoRA and shared vLLM through the bridge 2026-02-13 11:26:25 -05:00