Commit graph

736 commits

Author SHA1 Message Date
Wind
eb5be87f81
Update dataset_env.py 2026-01-29 15:16:34 +07:00
Wind
6c2f1ac408
Update dataset_env.py 2026-01-29 15:16:05 +07:00
Wind
2607942ffa
Update dataset_env.py 2026-01-29 15:11:31 +07:00
dmahan93
e8fd85429f
Merge pull request #323 from NousResearch/pre-commit-ci-update-config
[pre-commit.ci] pre-commit autoupdate
2026-01-26 11:02:44 -08:00
dmahan93
b8ec055942
Merge pull request #324 from DeVikingMark/fix/gradient-quantile-prefix
fix: use correct prefix for gradient quantiles with NaN/Inf
2026-01-26 11:01:36 -08:00
dmahan93
cf2b280d52
Merge pull request #325 from crStiv/typo
fix: multiple typos of different importance
2026-01-26 11:00:44 -08:00
pre-commit-ci[bot]
2be7442dd5 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2026-01-26 16:41:26 +00:00
Wind
883043de49
Update instructions.py 2026-01-26 17:14:57 +07:00
dmahan93
5af29933a7
Merge pull request #305 from alt-glitch/sid/verifiers
Verifiers Integration
2026-01-23 10:21:49 -08:00
balyan.sid@gmail.com
4ba69d3a80 revert to using evalbase 2026-01-23 23:41:32 +05:30
balyan.sid@gmail.com
5a20abdce7 switch eval to use managed server adapter impl. moved managed server adapter
2026-01-23 23:26:29 +05:30
Siddharth Balyan
32d12c05c3
Merge branch 'main' into sid/verifiers 2026-01-23 21:57:13 +05:30
Wind
4f24688d18
Update coding_server.py 2026-01-22 15:19:28 +07:00
Teknium
faf84d241c
Merge branch 'main' into patch-1 2026-01-21 05:55:56 -08:00
crStiv
b44eca5a5e
Fix typo in TODO comment in plot.py 2026-01-20 00:14:51 +02:00
crStiv
8edfbe1de4
Fix typo in error message for resume type 2026-01-20 00:12:12 +02:00
crStiv
abc9ad3c73
Fix typos in comments for clarity 2026-01-20 00:07:50 +02:00
crStiv
ee97038408
Fix typos in instruction description methods
Corrected typos in the docstring for build_description and another function.
2026-01-19 23:58:55 +02:00
crStiv
3db6276299
Fix typo in README.md for GamePigeon 2026-01-19 23:50:20 +02:00
crStiv
31266ba5b9
Fix typo in fight commentator prompt 2026-01-19 23:45:52 +02:00
crStiv
d8f29a6026
Fix typo in fight commentator prompt 2026-01-19 23:45:39 +02:00
Ragnar
5c8ee88f0f
Update callbacks.py 2026-01-19 20:39:21 +02:00
Siddharth Balyan
7f28c52994
Merge branch 'main' into sid/verifiers 2026-01-16 11:50:27 +05:30
Teknium
9047f03109
Merge pull request #297 from NousResearch/add_reasoning_handling_draft
Add support for reasoning models and their variety of providers/endpo…
2026-01-15 19:43:17 -08:00
crStiv
7e12fa015c
Update README.md 2026-01-15 16:09:46 +02:00
crStiv
b624cbd246
Update plot.py 2026-01-15 16:09:00 +02:00
crStiv
14b82ae6cc
Update configs.py 2026-01-15 16:07:00 +02:00
crStiv
941fadd73c
Update run.py 2026-01-15 16:06:43 +02:00
crStiv
20992ed5d5
Update hpo.py 2026-01-15 16:05:27 +02:00
crStiv
d2fbe43e7e
Update lcb_modal_endpoint.py 2026-01-15 16:00:03 +02:00
balyan.sid@gmail.com
c56af35eaa switch to evalbase for verifiers_eval.py 2026-01-15 11:34:40 +05:30
teknium
00a0f5397a Merge branch 'add_reasoning_handling_draft' of https://github.com/NousResearch/atropos into add_reasoning_handling_draft 2026-01-14 13:38:08 +00:00
teknium
3a854cc3af fix linter 2026-01-14 13:38:04 +00:00
balyan.sid@gmail.com
6a27e88023 use managed server 2026-01-14 17:09:01 +05:30
balyan.sid@gmail.com
32320512e8 update verifiers_server to use tokenizer_for_trainer 2026-01-13 15:00:54 +05:30
pre-commit-ci[bot]
79a55ff186 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2026-01-13 07:30:33 +00:00
teknium
2a7dd49328 Merge branch 'add_reasoning_handling_draft' of https://github.com/NousResearch/atropos into add_reasoning_handling_draft 2026-01-13 07:29:48 +00:00
teknium
b33cb7f943 A bit more updates for robustness 2026-01-13 07:29:43 +00:00
Teknium
837fc237ee
Merge branch 'main' into add_reasoning_handling_draft 2026-01-12 09:45:38 -08:00
balyan.sid@gmail.com
a1d1e7d7fe fix env_args, dataset/prompt loading 2026-01-12 10:39:43 +05:30
pre-commit-ci[bot]
7907ffd0ad [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2026-01-12 05:05:11 +00:00
balyan.sid@gmail.com
9db6c0d1ed added better wandb logging 2026-01-12 10:34:05 +05:30
balyan.sid@gmail.com
dceb1d8fd8 parallelize verifiers_server: use generate() for SFT, parallel ManagedServer contexts for RL
2026-01-12 10:34:05 +05:30
balyan.sid@gmail.com
24b4488c60 clean up eval, pin verifiers version 2026-01-12 10:34:05 +05:30
pre-commit-ci[bot]
d98bc6d9fc [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2026-01-12 10:34:05 +05:30
balyan.sid@gmail.com
cf636595d2 rework server and eval for rl rollout. add in asyncmanagedserver for verifiers
2026-01-12 10:34:05 +05:30
pre-commit-ci[bot]
3449a4c23d [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2026-01-12 10:34:05 +05:30
balyan.sid@gmail.com
5b09ad86f4 update readme, add sft-datagen to verifiers_server 2026-01-09 19:20:41 +05:30
balyan.sid@gmail.com
636715bb08 add wandb to eval 2026-01-09 16:51:19 +05:30
balyan.sid@gmail.com
dda85430da fix docstrings 2026-01-09 16:25:44 +05:30