Commit graph

26 commits

Author | SHA1 | Message | Date
teknium | aa66b09c13 | make linter happy | 2025-07-26 21:12:30 +00:00
pre-commit-ci[bot] | a2e14cf50c | [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci | 2025-07-26 21:08:01 +00:00
teknium | 1039c3d360 | improve dataloading, ctx len | 2025-07-26 21:06:45 +00:00
pre-commit-ci[bot] | 6455c305e6 | [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci | 2025-07-16 17:51:15 +00:00
teknium | 542185bbcc | Merge branch 'add-pairwise-judge-environment' of https://github.com/NousResearch/atropos into add-pairwise-judge-environment | 2025-07-16 17:48:44 +00:00
teknium | a43520e619 | one last linter... | 2025-07-16 17:48:43 +00:00
pre-commit-ci[bot] | eab2c938ea | [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci | 2025-07-16 16:58:42 +00:00
teknium | 18f228615d | linter stuff | 2025-07-16 16:57:51 +00:00
pre-commit-ci[bot] | ffc210e470 | [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci | 2025-07-16 16:51:19 +00:00
teknium | 2f37714e84 | Merge branch 'add-pairwise-judge-environment' of https://github.com/NousResearch/atropos into add-pairwise-judge-environment | 2025-07-16 16:50:04 +00:00
teknium | 0113dc906b | add a bunch of extra debugging traces - configurable | 2025-07-16 16:49:42 +00:00
pre-commit-ci[bot] | 1af508b27f | [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci | 2025-07-16 07:46:52 +00:00
teknium | 10bb22f557 | adding debugging | 2025-07-16 07:46:17 +00:00
pre-commit-ci[bot] | 9f3e2ee460 | [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci | 2025-07-15 18:24:13 +00:00
teknium | 856a8455b1 | please the precommit gods | 2025-07-15 18:20:44 +00:00
pre-commit-ci[bot] | c053a9f134 | [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci | 2025-07-15 11:40:22 +00:00
teknium | ce1f72059c | Merge branch 'add-pairwise-judge-environment' of https://github.com/NousResearch/atropos into add-pairwise-judge-environment | 2025-07-15 11:39:46 +00:00
teknium | 47c396c43f | switch to chat completions endpoint to eval closed lab stuff | 2025-07-15 11:39:29 +00:00
pre-commit-ci[bot] | 818ec9d7c1 | [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci | 2025-07-15 11:18:03 +00:00
teknium | 982645ce73 | Implement proper ties category scoring | 2025-07-15 11:16:15 +00:00
pre-commit-ci[bot] | 41c847ddf4 | [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci | 2025-07-14 09:43:20 +00:00
teknium | ef04098718 | glitch | 2025-07-14 09:42:44 +00:00
teknium | 51d4d52765 | Merge branch 'add-pairwise-judge-environment' of https://github.com/NousResearch/atropos into add-pairwise-judge-environment | 2025-07-14 09:42:21 +00:00
teknium | 9607880f3d | Lots of updates to the environment to cleanup, add more metrics, make more robust - ties has an issue though | 2025-07-14 09:39:00 +00:00
pre-commit-ci[bot] | 107809260d | [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci | 2025-07-12 11:23:13 +00:00
teknium | e83d796c74 | add pairwise judgement environment | 2025-07-12 11:15:56 +00:00