| Author | Commit | Message | Date |
|---|---|---|---|
| pre-commit-ci[bot] | cbec584202 | [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci | 2025-07-28 01:30:27 +00:00 |
| teknium | 1d523472cc | Merge branch 'add-arenahard-v1-environment' of https://github.com/NousResearch/atropos into add-arenahard-v1-environment | 2025-07-28 01:29:52 +00:00 |
| teknium | 20565d8abc | update judge confs so it can use any judge model | 2025-07-28 01:29:50 +00:00 |
| pre-commit-ci[bot] | 041a70d891 | [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci | 2025-07-27 03:39:42 +00:00 |
| teknium | e6de7bb432 | lint | 2025-07-27 03:39:05 +00:00 |
| pre-commit-ci[bot] | 52b505296c | [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci | 2025-07-27 02:52:39 +00:00 |
| teknium | a0979eb08e | add readme section | 2025-07-27 02:46:51 +00:00 |
| teknium | 31b0c6f66d | Add arena-hard v1 environment | 2025-07-26 21:17:00 +00:00 |
| pre-commit-ci[bot] | 65682d160a | [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci | 2025-07-26 21:13:05 +00:00 |
| teknium | aa66b09c13 | make linter happy | 2025-07-26 21:12:30 +00:00 |
| pre-commit-ci[bot] | a2e14cf50c | [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci | 2025-07-26 21:08:01 +00:00 |
| teknium | 1039c3d360 | improve dataloading, ctx len | 2025-07-26 21:06:45 +00:00 |
| dmahan93 | 6604a2255b | Merge pull request #195 from interstellarninja/feat/interleaved_tool_use: Interleaved Tool-Use Within Reasoning Blocks | 2025-07-24 08:58:00 -05:00 |
| pre-commit-ci[bot] | 97ac993d07 | [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci | 2025-07-24 12:40:01 +00:00 |
| interstellarninja | 973864a7e8 | resolving further | 2025-07-24 08:39:20 -04:00 |
| interstellarninja | 09a6f174a8 | resolve conflicts and apply hook auto-fixes | 2025-07-24 08:23:17 -04:00 |
| pre-commit-ci[bot] | 0d05750841 | [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci | 2025-07-24 10:58:41 +00:00 |
| interstellarninja | 12d77a1e44 | fixing precommit errors | 2025-07-24 06:53:41 -04:00 |
| Teknium | 62cee8ac66 | Merge pull request #209 from NousResearch/add-pairwise-judge-environment: Add LLM as a judge environment for eval and train based on RewardBench | 2025-07-16 13:37:09 -07:00 |
| pre-commit-ci[bot] | 6455c305e6 | [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci | 2025-07-16 17:51:15 +00:00 |
| teknium | 542185bbcc | Merge branch 'add-pairwise-judge-environment' of https://github.com/NousResearch/atropos into add-pairwise-judge-environment | 2025-07-16 17:48:44 +00:00 |
| teknium | a43520e619 | one last linter... | 2025-07-16 17:48:43 +00:00 |
| pre-commit-ci[bot] | eab2c938ea | [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci | 2025-07-16 16:58:42 +00:00 |
| teknium | 18f228615d | linter stuff | 2025-07-16 16:57:51 +00:00 |
| pre-commit-ci[bot] | ffc210e470 | [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci | 2025-07-16 16:51:19 +00:00 |
| teknium | 2f37714e84 | Merge branch 'add-pairwise-judge-environment' of https://github.com/NousResearch/atropos into add-pairwise-judge-environment | 2025-07-16 16:50:04 +00:00 |
| teknium | 0113dc906b | add a bunch of extra debugging traces - configurable | 2025-07-16 16:49:42 +00:00 |
| Skylar Ray | e889324171 | fix: correct quantum environment repository URL | 2025-07-16 11:00:45 +03:00 |
| pre-commit-ci[bot] | 1af508b27f | [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci | 2025-07-16 07:46:52 +00:00 |
| teknium | 10bb22f557 | adding debugging | 2025-07-16 07:46:17 +00:00 |
| pre-commit-ci[bot] | 7d980372d3 | [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci | 2025-07-15 18:40:26 +00:00 |
| teknium | 02ad3e8661 | Merge branch 'add-pairwise-judge-environment' of https://github.com/NousResearch/atropos into add-pairwise-judge-environment | 2025-07-15 18:39:52 +00:00 |
| teknium | 8aa540275b | add to the envs readme | 2025-07-15 18:39:50 +00:00 |
| pre-commit-ci[bot] | 9f3e2ee460 | [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci | 2025-07-15 18:24:13 +00:00 |
| teknium | 856a8455b1 | please the precommit gods | 2025-07-15 18:20:44 +00:00 |
| pre-commit-ci[bot] | c053a9f134 | [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci | 2025-07-15 11:40:22 +00:00 |
| teknium | ce1f72059c | Merge branch 'add-pairwise-judge-environment' of https://github.com/NousResearch/atropos into add-pairwise-judge-environment | 2025-07-15 11:39:46 +00:00 |
| teknium | 47c396c43f | switch to chat completions endpoint to eval closed lab stuff | 2025-07-15 11:39:29 +00:00 |
| pre-commit-ci[bot] | 818ec9d7c1 | [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci | 2025-07-15 11:18:03 +00:00 |
| teknium | 982645ce73 | Implement proper ties category scoring | 2025-07-15 11:16:15 +00:00 |
| pre-commit-ci[bot] | 41c847ddf4 | [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci | 2025-07-14 09:43:20 +00:00 |
| teknium | ef04098718 | glitch | 2025-07-14 09:42:44 +00:00 |
| teknium | 51d4d52765 | Merge branch 'add-pairwise-judge-environment' of https://github.com/NousResearch/atropos into add-pairwise-judge-environment | 2025-07-14 09:42:21 +00:00 |
| teknium | 9607880f3d | Lots of updates to the environment to cleanup, add more metrics, make more robust - ties has an issue though | 2025-07-14 09:39:00 +00:00 |
| pre-commit-ci[bot] | 107809260d | [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci | 2025-07-12 11:23:13 +00:00 |
| teknium | e83d796c74 | add pairwise judgement environment | 2025-07-12 11:15:56 +00:00 |
| hjc-puro | 75a4264f8d | Merge pull request #208 from NousResearch/2025-07-08-evals: Add `evaluate_log` method, gsm8k example | 2025-07-12 06:45:05 +08:00 |
| hjc-puro | 6e9baaf9d8 | table | 2025-07-11 09:52:19 +00:00 |
| hjc-puro | 352e1b8f88 | comments | 2025-07-11 03:55:16 +00:00 |
| hjc-puro | b06332623d | move time import | 2025-07-11 00:45:24 +00:00 |