| Jai Suphavadeeprasit | 3de03d6db3 | single copy | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | 5ba06c7d4a | threading | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | ca1ec60869 | improve default | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | eed13670de | better debugging | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | 3ac4a64f6f | patching problem | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | 5af1a4a974 | basic changes | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | 007f4f275d | changes | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | 80f67f979a | error handling | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | 9e53076a82 | param locations update | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | 197fce640f | daemon errors | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | 3995e0af7d | monkey patch fixes | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | 7b975f3adc | changes based on torchtitan 2 | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | 53b29472b4 | changes based on torchtitan | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | 078dd4a333 | Cleanup | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | 39e94c4278 | weight updates async | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | b3874b658a | vllm underlying weights | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | b0d35be8a4 | IPC updates | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | e278978fa1 | health changes | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | f51ae77f54 | add missing parameter | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | d6f389f86f | readme updates | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | 88ccaa0ea5 | standardize the training approach | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | ebdbc54842 | tracking | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | 9498d9576f | training bug | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | d978eff127 | smol changes | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | adc3ae712b | design choice - LoRA and shared vLLM through the bridge | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | 22648bd912 | gradient checkpointing issue for LoRAs | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | 1e7b7cf841 | stuff | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | db7414329b | generate endpoint with logprobs | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | e956af11a2 | changes | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | 426c0fac4c | local version | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | 7b143a7d68 | correction | 2026-02-13 11:26:25 -05:00 |
| Jai Suphavadeeprasit | 3ed23058c3 | initial commit | 2026-02-13 11:26:22 -05:00 |
| Ridwannurudeen | 5e2e84835b | [docs] Clarify prerequisites, fix Python version inconsistency, and add troubleshooting section | 2026-02-01 23:39:37 +01:00 |
| Dakota | e6ac3abdcb | add managed vllm server | 2025-11-07 13:06:49 -06:00 |
| ropresearch | 6a20b90549 | added gen params for latest examples endpoint | 2025-10-01 13:05:37 -04:00 |
| ropresearch | 1a68f691f6 | linting fixes | 2025-09-25 17:17:44 -04:00 |
| ropresearch | c3fc68879c | group temps, sample temps, and logprob api params | 2025-09-25 16:41:58 -04:00 |
| Brawn | eb179e7fca | Update grpo.py | 2025-08-14 20:20:41 +03:00 |
| Brawn | 6dccdcc67e | fix: division-by-zero in gradient calculation | 2025-08-14 14:33:46 +03:00 |
| dmahan93 | 40b12dae60 | run pre-commit on all files | 2025-05-09 09:54:20 -05:00 |
| Teknium | 3863ece98b | Update README.md | 2025-05-07 22:22:26 -07:00 |
| dmahan93 | 4029acdfb1 | Update README.md | 2025-04-29 17:57:22 -05:00 |
| Dakota Nous | 621d00dd80 | first commit | 2025-04-29 12:10:10 -07:00 |