Teknium
e75ce6ccce
Merge pull request #176 from emmanuel-ferdman/main
...
Display cat behaviors file path on error
2025-06-13 04:42:48 -07:00
Teknium
eeeb0f1cd2
Merge pull request #172 from NousResearch/improve-data-dumping-in-sweRL
...
add additional data dumping features
2025-06-13 04:40:11 -07:00
Teknium
1c98c2746b
Merge pull request #171 from NousResearch/add-more-sft-datagen-cli-args
...
add tasks_per_step arg to multiply by group_size for bs calculation
2025-06-13 04:39:45 -07:00
pre-commit-ci[bot]
dcb926b73f
[pre-commit.ci] auto fixes from pre-commit.com hooks
...
for more information, see https://pre-commit.ci
2025-06-13 11:39:36 +00:00
Teknium
32b739a757
Merge branch 'main' into add-format-following-environment
2025-06-13 04:39:06 -07:00
Teknium
1d7ccc80a5
Merge pull request #179 from NousResearch/letter-counting-environment
...
Letter counting environment - Update default config options
2025-06-13 04:38:34 -07:00
teknium1
ec6b9bb626
Merge branch 'letter-counting-environment' of https://github.com/NousResearch/atropos into letter-counting-environment
2025-06-13 04:27:32 -07:00
Teknium
e5e76d0dd0
Merge pull request #177 from NousResearch/letter-counting-environment
...
Letter counting environment
2025-06-12 11:20:13 -07:00
pre-commit-ci[bot]
2f9132ae63
[pre-commit.ci] auto fixes from pre-commit.com hooks
...
for more information, see https://pre-commit.ci
2025-06-12 15:20:13 +00:00
dmahan93
e43979cc3a
Merge pull request #178 from NousResearch/use-precommit-ci-not-action
...
switch to using precommit ci not action
2025-06-12 10:20:07 -05:00
Dakota
3bf55611f6
switch to using precommit ci not action
2025-06-12 10:17:58 -05:00
Dakota
d3e6ddddbc
fixed pre-commit :)
2025-06-12 10:12:49 -05:00
teknium1
81cb80982c
update some base config options
2025-06-12 00:41:55 -07:00
teknium1
7a89524345
add readme section for the environment
2025-06-12 00:36:03 -07:00
teknium1
4a7e5b2b7c
Many updates
2025-06-12 00:32:50 -07:00
Emmanuel Ferdman
7dd9bf9c5c
Display cat behaviors file path on error
...
Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
2025-06-11 16:09:20 -07:00
teknium1
199ae15d0b
initial letter counting environment
2025-06-11 15:27:21 -07:00
teknium1
54268a76ce
add additional data dumping features
2025-06-10 01:59:25 -07:00
teknium1
6d9523fe0b
add tasks_per_step arg to multiply by group_size for bs calculation
2025-06-10 01:54:52 -07:00
teknium1
71b1e7023b
Make default configs better
2025-06-10 01:30:40 -07:00
teknium1
7b91614d46
add more info on rejection sampling in readme
2025-06-10 01:25:39 -07:00
teknium1
8e1d160eef
add answer format environment for rejection sampling
2025-06-10 01:20:49 -07:00
dmahan93
a26794afd2
Merge pull request #168 from maximevtush/main
...
Minor Fixes: Typo Correction in README and Message Clarification in Tasks
2025-06-09 14:24:07 -05:00
dmahan93
84a1277abb
Merge pull request #169 from NousResearch/fix-messages-handling-api-sft
...
API Message + SFT fix
2025-06-09 14:05:26 -05:00
Dakota
e13526d308
Fix API to accept messages without reward field + comprehensive tests
...
- Made reward field truly optional in messages (no auto-addition)
- Accept custom roles (dog, cat, etc.) beyond standard ones
- Added 24 new tests for edge cases (tuples, unicode, large content)
- Reorganized test structure: moved from testing/ to atroposlib/tests/
- Fixed legacy API tests and removed tests requiring missing data files
All 43 tests pass! Fixes message handling for SFT use cases.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-09 14:03:08 -05:00
Maxim Evtush
16bd33284f
Update tasks.py
2025-06-09 15:39:58 +02:00
Maxim Evtush
d0913d187b
Update README.md
2025-06-09 15:39:04 +02:00
paulsengh
b38f014b9f
feat: add pay-to-play environment with mixture of judges and micropayments
2025-06-08 23:36:50 -07:00
Teknium
24dd0a71b4
Merge pull request #165 from NousResearch/improve-complexity-reasoning-gym
...
add reasoning gym randomization for complexity as well as curriculum support
2025-06-08 14:59:27 -07:00
Teknium
f34f6c09f3
Merge pull request #166 from cypherpepe/main
...
Fix broken README links and minor typo in docs
2025-06-08 14:59:01 -07:00
Cypher Pepe
24e963d393
fixed typo envs/README.md
2025-06-08 16:50:35 +03:00
Cypher Pepe
5f3deae8d4
fixed dead links README.md
2025-06-08 16:43:11 +03:00
teknium1
f999f90627
add support for composite task
2025-06-08 04:39:50 -07:00
teknium1
398e3ddeaa
add randomization for complexity as well as curriculum support
2025-06-08 03:07:07 -07:00
Teknium
7537a6ef7f
Merge pull request #163 from NousResearch/add-reasoning-gym-environment
...
Add reasoning gym env
2025-06-06 17:24:48 -07:00
teknium1
a4b22c38d7
make eval vars config options
2025-06-06 15:24:00 -07:00
teknium1
be94857084
add seed to default configs for clarity
2025-06-06 14:56:55 -07:00
interstellarninja
60be1bbbe8
BaseConfigEnv subclass for experimental variables
2025-06-06 04:46:53 -04:00
teknium1
79188d8d6a
Add reasoning gym env
2025-06-05 17:30:25 -07:00
interstellarninja
c5b161764c
Fix tool calling turn filtering in multiturn environment
...
- Change filtering from >= to == MAX_TOOL_CALL_TURNS to ensure exact match
- Add VALIDATE_THINK_BLOCKS flag for optional <think> block validation
- Refactor data structure from flat expected_calls to turn-based expected_calls_by_turn
- Extract helper methods from collect_trajectories for better code organization
- Fix Turn 3 issue where prompts ended with tool responses instead of generating tool calls
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-05 10:49:29 -04:00
interstellarninja
fdfe23ea39
creating multi-turn tool-use rl environment
2025-06-04 14:13:01 -04:00
dmahan93
26c0a39555
Merge pull request #159 from NousResearch/pytest-multiple-versions-actions
...
Add pytest workflow for Python 3.10 and 3.12
2025-06-04 11:31:14 -05:00
Dakota
61fdc37f61
Replace isort with ruff for import sorting
...
- Update pre-commit config to use ruff with --select=I for imports only
- Apply ruff import sorting to fix pre-commit issues
- Ruff and black work together without conflicts
2025-06-04 11:28:30 -05:00
Dakota
55cdb83cbf
Update pre-commit hooks to latest versions and fix issues
...
- Update pre-commit hooks: v5.0.0, black 25.1.0, isort 6.0.1, flake8 7.2.0
- Fix isort import ordering in lean_proof_env.py
- Fix flake8 F824 false positive in spatial_env.py with noqa comment
2025-06-04 10:58:37 -05:00
Dakota
f3bbc6a42d
Fix import ordering with isort
...
- Move typing_extensions import to proper location
- Satisfy pre-commit isort requirements
2025-06-04 10:40:41 -05:00
Dakota
0ff55bf2cf
Fix TypedDict import for Python 3.10 compatibility
...
- Use typing_extensions.TypedDict instead of typing.TypedDict
- Fixes Pydantic error on Python < 3.12
2025-06-04 10:37:51 -05:00
Dakota
02492c88b3
Fix pre-commit issues in check-no-torch.yml
...
- Remove trailing whitespace
- Add newline at end of file
2025-06-04 10:34:57 -05:00
Dakota
3af0f007e2
Remove torch from main deps after successful test
...
The torch detection workflow works perfectly! ✅
2025-06-04 10:32:25 -05:00
Dakota
02b94b2982
TEST: Add torch to main deps to test detection workflow
...
This commit should fail the torch check!
2025-06-04 10:29:42 -05:00
Dakota
f2d3060db6
Add workflow to prevent torch in main dependencies
...
- Checks pyproject.toml on every PR that modifies it
- Fails loudly if torch is in main dependencies
- Provides clear guidance on moving to optional deps
2025-06-04 10:26:05 -05:00