Commit graph

61 commits

Author | SHA1 | Message | Date
Jai Suphavadeeprasit | ab8d2f2dac | Bloat reduction | 2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit | 01af0777bc | readme update | 2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit | 2384ab3dcd | clean up | 2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit | 906802299c | fused memory | 2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit | 9a95ec5aa1 | other changes | 2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit | 6efec3f1c5 | adjusting buffers | 2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit | a92c935fba | buffer efficiency | 2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit | 5bba112244 | debugging | 2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit | 1f79e86ba0 | pass all the informaiton | 2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit | 96871e0724 | unsure | 2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit | 09bbfac574 | pytorch underbelly | 2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit | f4e66705ea | patched | 2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit | 17e93cbda4 | main changes | 2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit | e2006b4015 | pipeline changes | 2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit | df3651990d | streamline process | 2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit | d3ef94ef11 | serialization errors | 2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit | c94a432341 | single copy 1 | 2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit | fcd426e934 | single copy | 2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit | 9f3ddc4d98 | threading | 2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit | 64cf8e3f82 | improve default | 2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit | 9df62a8f64 | better debugging | 2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit | fad8e77be2 | patching problem | 2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit | 80d2608c4e | basic changes | 2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit | 14ebf7a492 | changes | 2026-03-02 11:18:51 -05:00
Jai Suphavadeeprasit | 5640d7de25 | error handling | 2026-03-02 11:18:51 -05:00
Jai Suphavadeeprasit | ff8eaf9e3c | param locations update | 2026-03-02 11:18:51 -05:00
Jai Suphavadeeprasit | e2c99f7f97 | daemon errors | 2026-03-02 11:18:51 -05:00
Jai Suphavadeeprasit | 4348345dac | monkey patch fixes | 2026-03-02 11:18:51 -05:00
Jai Suphavadeeprasit | 0d71de18d8 | changes based on torchtitan 2 | 2026-03-02 11:18:51 -05:00
Jai Suphavadeeprasit | 27b122a415 | changes based on torchtitan | 2026-03-02 11:18:51 -05:00
Jai Suphavadeeprasit | 67e27def11 | Cleanup | 2026-03-02 11:18:51 -05:00
Jai Suphavadeeprasit | 9512177d0a | weight updates async | 2026-03-02 11:18:51 -05:00
Jai Suphavadeeprasit | e033e24c64 | vllm underlying weights | 2026-03-02 11:18:51 -05:00
Jai Suphavadeeprasit | 533f0bf286 | IPC updates | 2026-03-02 11:18:51 -05:00
Jai Suphavadeeprasit | 78ea8bc3e7 | health changes | 2026-03-02 11:18:51 -05:00
Jai Suphavadeeprasit | 3b469f2445 | add missing parameter | 2026-03-02 11:18:51 -05:00
Jai Suphavadeeprasit | 12c182f3d4 | readme updates | 2026-03-02 11:18:51 -05:00
Jai Suphavadeeprasit | 689055f0ec | standardize the training approach | 2026-03-02 11:18:51 -05:00
Jai Suphavadeeprasit | b1b9943473 | tracking | 2026-03-02 11:18:51 -05:00
Jai Suphavadeeprasit | e4fc514763 | training bug | 2026-03-02 11:18:51 -05:00
Jai Suphavadeeprasit | c336d981ce | smol changes | 2026-03-02 11:18:51 -05:00
Jai Suphavadeeprasit | a1725e4ae2 | design choice - LoRA and shared vLLM through the bridge | 2026-03-02 11:18:51 -05:00
Jai Suphavadeeprasit | e202e2c288 | gradient checkpointing issue for LoRAs | 2026-03-02 11:18:51 -05:00
Jai Suphavadeeprasit | a7bdc0270d | stuff | 2026-03-02 11:18:51 -05:00
Jai Suphavadeeprasit | f5c847d39d | generate endpoint with logprobs | 2026-03-02 11:18:51 -05:00
Jai Suphavadeeprasit | 2b240bbd2e | changes | 2026-03-02 11:18:51 -05:00
Jai Suphavadeeprasit | 79842edba7 | local version | 2026-03-02 11:18:51 -05:00
Jai Suphavadeeprasit | 2d3c07dcae | correction | 2026-03-02 11:18:51 -05:00
Jai Suphavadeeprasit | 61221dd1a2 | initial commit | 2026-03-02 11:18:49 -05:00
Jai Suphavadeeprasit | 836c346406 | narrow down scope further | 2026-02-27 13:15:23 -05:00