pre-commit-ci[bot]
60fb6cae11
[pre-commit.ci] auto fixes from pre-commit.com hooks
...
for more information, see https://pre-commit.ci
2026-02-20 04:58:47 +00:00
Jai Suphavadeeprasit
ccdd5a1ca6
linting
2026-02-19 23:57:47 -05:00
Jai Suphavadeeprasit
11fabaa7f7
gsm8k trial
2026-02-19 21:55:33 -05:00
Jai Suphavadeeprasit
0dcc9156d2
change OPD style
2026-02-19 19:19:23 -05:00
Jai Suphavadeeprasit
527433b5bc
change OPD style
2026-02-19 17:08:27 -05:00
Jai Suphavadeeprasit
33f5696171
Merge branch 'pipelineRL' into OnPolicyDistillation
2026-02-19 16:39:21 -05:00
Jai Suphavadeeprasit
00908ec366
packageification
2026-02-19 15:16:24 -05:00
Jai Suphavadeeprasit
bc0f9ee625
debug changes
2026-02-17 08:15:07 -05:00
Jai Suphavadeeprasit
0e81c62e90
on policy changes
2026-02-16 17:39:37 -05:00
Jai Suphavadeeprasit
becadb54b0
Fix math_server_zero.py to support CLI OpenAI arguments
...
Change ServerBaseline to APIServerConfig in config_init() so that
--openai.base_url and other CLI arguments work for on-policy distillation.
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-16 17:18:01 -05:00
Jai Suphavadeeprasit
2501e33ae3
nccl loras
2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
950be6f0d4
math server
2026-02-13 11:26:25 -05:00
pre-commit-ci[bot]
11f495a381
[pre-commit.ci] auto fixes from pre-commit.com hooks
...
for more information, see https://pre-commit.ci
2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
33505fe981
math zero work arounds
2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
4cf0416e78
evals errors
2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
abe4cec824
evals errors
2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
7326bec25c
evals errors
2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
5d4baf8c76
evals
2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
67b322353d
evals
2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
806888d9d3
pipelineRL
2026-02-13 11:26:25 -05:00
Jai Suphavadeeprasit
407a22ba12
Save the eval to the disk
2026-02-13 11:25:49 -05:00
dmahan93
1580ab5934
Merge pull request #365 from alireza78a/fix/replace-debug-prints-with-logger
...
fix: replace debug print statements with logger
2026-02-09 21:01:38 -08:00
Alireza
6b92ee16ec
fix duplicate code + add safety checks
2026-02-09 10:58:49 +03:30
alireza78a
1303cb59e8
fix: replace debug print statements with logger in dataset_env and infinimath_env
2026-02-07 14:51:33 +00:00
Teknium
462abbebf7
Merge pull request #339 from VolodymyrBg/bg
...
chore: fix typos
2026-01-31 09:03:17 -08:00
Teknium
efc85528bc
Merge pull request #338 from windlgrass/fix-init-current-item
...
fix: initialize current_item in __init__ to prevent AttributeError
2026-01-31 09:02:06 -08:00
Teknium
8b22416dd4
Merge branch 'main' into fix-duplicate-code
2026-01-31 08:52:43 -08:00
VolodymyrBg
f285bbd417
Update refusalbench_environment.py
2026-01-29 12:43:15 +02:00
VolodymyrBg
94f29eac18
Update simpleqa_eval.py
2026-01-29 12:42:28 +02:00
VolodymyrBg
347edc9188
Update instructions.py
2026-01-29 12:31:52 +02:00
VolodymyrBg
466fd96b41
Update patient.py
2026-01-29 12:16:31 +02:00
VolodymyrBg
39f3509965
Update instruction_following_algorithm_environment.py
2026-01-29 11:22:05 +02:00
Wind
eb5be87f81
Update dataset_env.py
2026-01-29 15:16:34 +07:00
Wind
6c2f1ac408
Update dataset_env.py
2026-01-29 15:16:05 +07:00
Wind
2607942ffa
Update dataset_env.py
2026-01-29 15:11:31 +07:00
dmahan93
e8fd85429f
Merge pull request #323 from NousResearch/pre-commit-ci-update-config
...
[pre-commit.ci] pre-commit autoupdate
2026-01-26 11:02:44 -08:00
dmahan93
b8ec055942
Merge pull request #324 from DeVikingMark/fix/gradient-quantile-prefix
...
fix: use correct prefix for gradient quantiles with NaN/Inf
2026-01-26 11:01:36 -08:00
dmahan93
cf2b280d52
Merge pull request #325 from crStiv/typo
...
fix: multiple typos of different importance
2026-01-26 11:00:44 -08:00
pre-commit-ci[bot]
2be7442dd5
[pre-commit.ci] auto fixes from pre-commit.com hooks
...
for more information, see https://pre-commit.ci
2026-01-26 16:41:26 +00:00
Wind
42601e2325
Update instructions_utils.py
2026-01-26 17:24:12 +07:00
Wind
7feb826fed
Update instructions_registry.py
2026-01-26 17:23:39 +07:00
Wind
883043de49
Update instructions.py
2026-01-26 17:14:57 +07:00
dmahan93
5af29933a7
Merge pull request #305 from alt-glitch/sid/verifiers
...
Verifiers Integration
2026-01-23 10:21:49 -08:00
balyan.sid@gmail.com
4ba69d3a80
revert to using evalbase
2026-01-23 23:41:32 +05:30
balyan.sid@gmail.com
5a20abdce7
switch eval to use managed server adapter impl. moved managed server adapter
2026-01-23 23:26:29 +05:30
Siddharth Balyan
32d12c05c3
Merge branch 'main' into sid/verifiers
2026-01-23 21:57:13 +05:30
Wind
4f24688d18
Update coding_server.py
2026-01-22 15:19:28 +07:00
Teknium
faf84d241c
Merge branch 'main' into patch-1
2026-01-21 05:55:56 -08:00
crStiv
b44eca5a5e
Fix typo in TODO comment in plot.py
2026-01-20 00:14:51 +02:00
crStiv
8edfbe1de4
Fix typo in error message for resume type
2026-01-20 00:12:12 +02:00