pre-commit-ci[bot]
60fb6cae11
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2026-02-20 04:58:47 +00:00
Jai Suphavadeeprasit
ccdd5a1ca6
linting
2026-02-19 23:57:47 -05:00
Jai Suphavadeeprasit
c1a80205cc
update readme
2026-02-19 23:50:05 -05:00
Jai Suphavadeeprasit
11fabaa7f7
gsm8k trial
2026-02-19 21:55:33 -05:00
Jai Suphavadeeprasit
809b88bf30
gsm8k trial
2026-02-19 21:32:40 -05:00
Jai Suphavadeeprasit
bbbfaf1680
gsm8k trial
2026-02-19 21:17:49 -05:00
Jai Suphavadeeprasit
0dcc9156d2
change OPD style
2026-02-19 19:19:23 -05:00
Jai Suphavadeeprasit
527433b5bc
change OPD style
2026-02-19 17:08:27 -05:00
Jai Suphavadeeprasit
33f5696171
Merge branch 'pipelineRL' into OnPolicyDistillation
2026-02-19 16:39:21 -05:00
Jai Suphavadeeprasit
01f090af0d
script test
2026-02-19 16:07:26 -05:00
Jai Suphavadeeprasit
0b266758db
script test
2026-02-19 16:00:50 -05:00
Jai Suphavadeeprasit
ef9f29dbde
readme fix
2026-02-19 15:49:29 -05:00
Jai Suphavadeeprasit
657945fa1d
sanity_check
2026-02-19 15:18:32 -05:00
Jai Suphavadeeprasit
00908ec366
packageification
2026-02-19 15:16:24 -05:00
Jai Suphavadeeprasit
d438702ee4
packageification
2026-02-19 13:37:33 -05:00
Jai Suphavadeeprasit
301d0b2699
model layer stuff
2026-02-18 10:52:20 -05:00
Jai Suphavadeeprasit
fae3f5b09e
readme fixes
2026-02-17 13:44:48 -05:00
Jai Suphavadeeprasit
366ea72384
readme fixes
2026-02-17 12:24:07 -05:00
Jai Suphavadeeprasit
bc0f9ee625
debug changes
2026-02-17 08:15:07 -05:00
Jai Suphavadeeprasit
f52de7441c
found bug
2026-02-16 21:26:44 -05:00
Jai Suphavadeeprasit
573221497d
base env debugging
2026-02-16 21:23:54 -05:00
Jai Suphavadeeprasit
7a90f34d85
base env debugging
2026-02-16 21:20:33 -05:00
Jai Suphavadeeprasit
b0658f6327
base env debugging
2026-02-16 21:05:57 -05:00
Jai Suphavadeeprasit
0e81c62e90
on policy changes
2026-02-16 17:39:37 -05:00
Jai Suphavadeeprasit
becadb54b0
Fix math_server_zero.py to support CLI OpenAI arguments
Change ServerBaseline to APIServerConfig in config_init() so that
--openai.base_url and other CLI arguments work for on-policy distillation.
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-16 17:18:01 -05:00
Jai Suphavadeeprasit
cc9b891eba
initial commit
2026-02-16 11:46:20 -05:00
Jai Suphavadeeprasit
43cc71e070
cleanup 3
2026-02-13 12:39:37 -05:00
Jai Suphavadeeprasit
39d307b440
cleanup 2
2026-02-13 11:56:55 -05:00
Jai Suphavadeeprasit
da0583f47d
cleanup
2026-02-13 11:26:26 -05:00
Jai Suphavadeeprasit
95de25aa37
restart issues 3
2026-02-13 11:26:26 -05:00
Jai Suphavadeeprasit
813ac83195
restart issues 3
2026-02-13 11:26:26 -05:00
Jai Suphavadeeprasit
8ee92651b4
restart issues 2
2026-02-13 11:26:26 -05:00
Jai Suphavadeeprasit
e0f77d43d0
restart issues 2
2026-02-13 11:26:26 -05:00
Jai Suphavadeeprasit
0fab717427
restart issues
2026-02-13 11:26:26 -05:00
Jai Suphavadeeprasit
2169943066
math zero 32k
2026-02-13 11:26:26 -05:00
Jai Suphavadeeprasit
6f8ebc99fe
math zero 32k
2026-02-13 11:26:26 -05:00
Jai Suphavadeeprasit
e0d5c989be
math zero
2026-02-13 11:26:26 -05:00
Jai Suphavadeeprasit
f88a00331e
kill old
2026-02-13 11:26:26 -05:00
Jai Suphavadeeprasit
33844c374b
gradient flow fix
2026-02-13 11:26:26 -05:00
Jai Suphavadeeprasit
1c8bb34bc1
wandb integration
2026-02-13 11:26:26 -05:00
Jai Suphavadeeprasit
36ecce6ccb
vllm restart 2
2026-02-13 11:26:26 -05:00
Jai Suphavadeeprasit
26cc509164
vllm restart 1
2026-02-13 11:26:26 -05:00
Jai Suphavadeeprasit
0f0a3f8afc
vllm restart
2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
6d4d705271
enforce eager check 32k context length
2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
f37de35195
enforce eager check
2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
6ce875afbe
visibility fix
2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
a6138b3c65
lora restart saving gradient changes
2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
aa0c5792df
ditching lora nccl 2
2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
9ba6c0e7bb
ditching lora nccl
2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
28bf3d9d60
testing lora
2026-02-13 11:26:25 -05:00