Jai Suphavadeeprasit
f76f9d1596
cleanup
2025-08-19 12:03:13 -04:00
pre-commit-ci[bot]
62b72589c6
[pre-commit.ci] auto fixes from pre-commit.com hooks
...
for more information, see https://pre-commit.ci
2025-08-19 12:03:13 -04:00
Jai Suphavadeeprasit
e55a7a0100
add_danger
2025-08-19 12:03:13 -04:00
teknium
bed7ddcb95
add more default categories
2025-08-19 12:03:13 -04:00
teknium
39f0103313
fix dataset
2025-08-19 12:03:13 -04:00
teknium
ff7a2569dc
update default max_toks
2025-08-19 12:03:13 -04:00
teknium
69135320b4
initial refusalbenchv2
2025-08-19 12:03:13 -04:00
hjc-puro
8c3ea257cd
Merge pull request #235 from NousResearch/bibtex
...
Update bibtex
2025-08-18 13:39:55 -04:00
dmahan93
83003d0988
Merge pull request #238 from NousResearch/pre-commit-ci-update-config
...
[pre-commit.ci] pre-commit autoupdate
2025-08-18 12:39:04 -05:00
pre-commit-ci[bot]
5b1fb70132
[pre-commit.ci] pre-commit autoupdate
...
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.12.8 → v0.12.9](https://github.com/astral-sh/ruff-pre-commit/compare/v0.12.8...v0.12.9)
2025-08-18 16:39:43 +00:00
dmahan93
4e3ad29fac
Merge pull request #237 from NousResearch/log-errors-from-collect-trajectories
...
add error logging to collect_trajectories so they don't fail silently
2025-08-15 16:56:37 -05:00
Dakota
11f1303da0
add error logging to collect_trajectories so they don't fail silently
2025-08-15 16:34:21 -05:00
dmahan93
628bd3d2ad
Merge pull request #236 from brawncode/patch-1
...
fix: division-by-zero in gradient calculation
2025-08-14 13:18:39 -05:00
Brawn
eb179e7fca
Update grpo.py
2025-08-14 20:20:41 +03:00
Brawn
6dccdcc67e
fix: division-by-zero in gradient calculation
2025-08-14 14:33:46 +03:00
hjc-puro
5faa2be188
update bibtex
2025-08-14 02:57:42 -04:00
dmahan93
9c472d8439
Merge pull request #231 from NousResearch/pre-commit-ci-update-config
...
[pre-commit.ci] pre-commit autoupdate
2025-08-12 10:59:00 -05:00
dmahan93
fa6ec96f48
Merge pull request #233 from rejected-l/main
...
build: update checkout action to v5
2025-08-12 10:58:37 -05:00
Rej Ect
6c34126cb9
build: update checkout action to v5
2025-08-12 15:20:27 +03:00
shannonsands
46f0602227
Diplomacy trainer env (#227)
...
* minimal implementation, simplified challenge registry
* need game save logic
* fixed challenge gen, works with local test
* updated challenge gen with wider ranges, working with local script
* runs working correctly, wandb stats look ok
* linting
* Add diplomacy environment with AI_Diplomacy submodule
- Add diplomacy_env_minimal.py for diplomacy game environment
- Add atropos_client_minimal.py for client interface
- Add diplomacy_local_server.py for local game server
- Add AI_Diplomacy submodule from GoodStartLabs/AI_Diplomacy
- Fix import ordering and remove unused imports
* test file working, moving to cluster to test training
* updated gitignore
* removed logs
* minor fixes, training running now
* readded proxy reg and queue system
* linting
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* queue gameid bug, refactored
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* cleaned up configs & allowed for openrouter models to be easily used
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* linting
* Remove duplicate dependencies from diplomacy requirements.txt
Only keep AI_Diplomacy-specific dependencies that aren't already in the main project
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-08-12 09:02:16 +10:00
pre-commit-ci[bot]
ed97dd112a
[pre-commit.ci] pre-commit autoupdate
...
updates:
- [github.com/pre-commit/pre-commit-hooks: v5.0.0 → v6.0.0](https://github.com/pre-commit/pre-commit-hooks/compare/v5.0.0...v6.0.0)
- [github.com/astral-sh/ruff-pre-commit: v0.12.7 → v0.12.8](https://github.com/astral-sh/ruff-pre-commit/compare/v0.12.7...v0.12.8)
2025-08-11 16:42:09 +00:00
dmahan93
4fe67e698d
Merge pull request #228 from NousResearch/pre-commit-ci-update-config
...
[pre-commit.ci] pre-commit autoupdate
2025-08-04 12:31:35 -05:00
pre-commit-ci[bot]
70043c4eb2
[pre-commit.ci] pre-commit autoupdate
...
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.12.5 → v0.12.7](https://github.com/astral-sh/ruff-pre-commit/compare/v0.12.5...v0.12.7)
2025-08-04 16:41:05 +00:00
shannonsands
47cb15745c
Textworld minimal (#225)
...
* minimal implementation, simplified challenge registry
* need game save logic
* fixed challenge gen, works with local test
* updated challenge gen with wider ranges, working with local script
* runs working correctly, wandb stats look ok
* linting
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* removed unused imports
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-08-01 10:16:35 +10:00
Teknium
1900a577d7
Update README.md
...
Remove hackathon event teaser
2025-07-30 13:56:11 -07:00
Teknium
be66e120d9
Merge pull request #219 from NousResearch/add-arenahard-v1-environment
...
Add arena-hard v1 environment
2025-07-30 09:35:14 -07:00
pre-commit-ci[bot]
65aea8bb21
[pre-commit.ci] auto fixes from pre-commit.com hooks
...
for more information, see https://pre-commit.ci
2025-07-30 15:10:36 +00:00
teknium
75f1cf6d2a
move eval envs to eval_environments and update readmes
2025-07-30 15:09:34 +00:00
shannonsands
9f23c732dd
qwen tokenizer wrapper & fixed jinja template for tool handling (#224)
...
* added qwen tokenizer wrapper & fixed jinja template for tool handling issues in the official HF one
* moved jinja template into its own file
2025-07-30 11:57:15 +10:00
dmahan93
56fb50a503
Merge pull request #222 from NousResearch/pre-commit-ci-update-config
...
[pre-commit.ci] pre-commit autoupdate
2025-07-29 15:52:20 -05:00
dmahan93
9734a10290
Merge pull request #220 from Aboozle1/add-my-environment
...
Add Word Hunt environment
2025-07-28 12:02:01 -05:00
Aboozle1
3ce68aed38
Merge branch 'main' into add-my-environment
2025-07-28 11:50:50 -05:00
Abhaykhanna3
9d7bcc523f
Fix(PR): Address reviewer feedback
...
- Remove redundant requirements.txt
- Fix leading newline in prompt templates
2025-07-28 11:48:02 -05:00
pre-commit-ci[bot]
d65f7f842d
[pre-commit.ci] pre-commit autoupdate
...
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.12.4 → v0.12.5](https://github.com/astral-sh/ruff-pre-commit/compare/v0.12.4...v0.12.5)
2025-07-28 16:38:41 +00:00
Abhaykhanna3
b5234d4214
Add Word Hunt environment for training models on 4x4 letter grids
...
- Trie-based solver, official scoring, normalized rewards
- Configurable token limit and detailed README with dictionary download link
- Removes large Dictionary.txt from tracking and adds ignore rules
- All tests pass and pre-commit hooks are clean
2025-07-28 00:37:36 -05:00
pre-commit-ci[bot]
4c88a4bbb9
[pre-commit.ci] auto fixes from pre-commit.com hooks
...
for more information, see https://pre-commit.ci
2025-07-28 01:37:03 +00:00
teknium
aaebb8d6bb
linter linter
2025-07-28 01:35:49 +00:00
pre-commit-ci[bot]
cbec584202
[pre-commit.ci] auto fixes from pre-commit.com hooks
...
for more information, see https://pre-commit.ci
2025-07-28 01:30:27 +00:00
teknium
1d523472cc
Merge branch 'add-arenahard-v1-environment' of https://github.com/NousResearch/atropos into add-arenahard-v1-environment
2025-07-28 01:29:52 +00:00
teknium
20565d8abc
update judge confs so it can use any judge model
2025-07-28 01:29:50 +00:00
pre-commit-ci[bot]
041a70d891
[pre-commit.ci] auto fixes from pre-commit.com hooks
...
for more information, see https://pre-commit.ci
2025-07-27 03:39:42 +00:00
teknium
e6de7bb432
lint
2025-07-27 03:39:05 +00:00
pre-commit-ci[bot]
52b505296c
[pre-commit.ci] auto fixes from pre-commit.com hooks
...
for more information, see https://pre-commit.ci
2025-07-27 02:52:39 +00:00
teknium
a0979eb08e
add readme section
2025-07-27 02:46:51 +00:00
teknium
31b0c6f66d
Add arena-hard v1 environment
2025-07-26 21:17:00 +00:00
Teknium
b272b7bae9
Merge pull request #218 from NousResearch/pairwise-judgement-env-updates
...
Pairwise Judgement Environment - improve dataloading, ctx len
2025-07-26 14:14:24 -07:00
pre-commit-ci[bot]
65682d160a
[pre-commit.ci] auto fixes from pre-commit.com hooks
...
for more information, see https://pre-commit.ci
2025-07-26 21:13:05 +00:00
teknium
aa66b09c13
make linter happy
2025-07-26 21:12:30 +00:00
pre-commit-ci[bot]
a2e14cf50c
[pre-commit.ci] auto fixes from pre-commit.com hooks
...
for more information, see https://pre-commit.ci
2025-07-26 21:08:01 +00:00
teknium
1039c3d360
improve dataloading, ctx len
2025-07-26 21:06:45 +00:00