Commit graph

19 commits

Author SHA1 Message Date
shannonsands
46f0602227 Diplomacy trainer env (#227)
* minimal implementation, simplified challenge registry

* need game save logic

* fixed challenge gen, works with local test

* updated challenge gen with wider ranges, working with local script

* runs working correctly, wandb stats look ok

* linting

* Add diplomacy environment with AI_Diplomacy submodule

- Add diplomacy_env_minimal.py for diplomacy game environment
- Add atropos_client_minimal.py for client interface
- Add diplomacy_local_server.py for local game server
- Add AI_Diplomacy submodule from GoodStartLabs/AI_Diplomacy
- Fix import ordering and remove unused imports

* test file working, moving to cluster to test training

* updated gitignore

* removed logs

* minor fixes, training running now

* readded proxy reg and queue system

* linting

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* queue gameid bug, refactored

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* cleaned up configs & allowed for openrouter models to be easily used

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* linting

* Remove duplicate dependencies from diplomacy requirements.txt

Only keep AI_Diplomacy-specific dependencies that aren't already in the main project

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-08-12 09:02:16 +10:00
shannonsands
47cb15745c Textworld minimal (#225)
* minimal implementation, simplified challenge registry

* need game save logic

* fixed challenge gen, works with local test

* updated challenge gen with wider ranges, working with local script

* runs working correctly, wandb stats look ok

* linting

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused imports

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-08-01 10:16:35 +10:00
Abhaykhanna3
b5234d4214 Add Word Hunt environment for training models on 4x4 letter grids
- Trie-based solver, official scoring, normalized rewards
- Configurable token limit and detailed README with dictionary download link
- Removes large Dictionary.txt from tracking and adds ignore rules
- All tests pass and pre-commit hooks are clean
2025-07-28 00:37:36 -05:00
Shannon Sands
a403b16ec4 commiting 2025-05-27 16:18:26 +10:00
Shannon Sands
8df34efc56 Resolve merge conflict in .gitignore 2025-05-27 15:56:22 +10:00
Shannon Sands
98d9ef87a2 Merge remote-tracking branch 'slyracoon23/main' into merge-slyracoon23-contributions
# Conflicts:
#	.gitignore
2025-05-26 16:23:41 +10:00
Shannon Sands
a70a8d7086 Merge remote-tracking branch 'tsadpbb/main' into merge-tsadpbb-contributions 2025-05-26 16:01:43 +10:00
Shannon Sands
441fd1036d Merge Karthik-Ragunath conversational style DPO environment contribution 2025-05-26 10:25:08 +10:00
Shannon Sands
a58562447f Merge branch 'joshuajerin-selcube' into merge-joshuajerin-contributions 2025-05-26 09:07:25 +10:00
hjc-puro
bef6a0b99a ignore uv.lock 2025-05-20 17:38:43 -04:00
Karthik-Ragunath
9125bd5f80 pushing file 2025-05-18 17:58:09 -07:00
Drew Sny
0d60e6c855 Add MeteorologyForecastRL environment for Atropos hackathon submission 2025-05-18 17:32:48 -07:00
Joshua Jerin
b49a441d46 32 2025-05-18 17:15:11 -07:00
Kirill Igumenshchev
45dc3f370d feat: Add .aider* to .gitignore and add environment file. 2025-05-18 17:05:50 -07:00
David van Vliet
f5787a6f1b Less verbose cards and choices for better UX 2025-05-18 17:03:08 -07:00
Alexander Speicher
02ff663ebe Patient Doctor Loop 2025-05-18 15:45:59 -07:00
Joshua Jerin
7e1de80695 add jsonl file 2025-05-18 15:27:37 -07:00
hjc-puro
9a8ae1630b import refactor 2025-05-02 01:00:04 -07:00
Dakota Nous
621d00dd80 first commit 2025-04-29 12:10:10 -07:00