pre-commit-ci[bot]
0840c26e94
[pre-commit.ci] auto fixes from pre-commit.com hooks
...
for more information, see https://pre-commit.ci
2025-10-15 04:19:25 +00:00
ropresearch
e5b8fb8654
clean up
2025-10-10 11:50:39 -04:00
ropresearch
baf4b2d8a8
gzip compression for atropos api
2025-10-10 01:26:52 -04:00
dmahan93
36243bd3f4
Merge pull request #253 from NousResearch/rop/gen-params
...
group temps, sample temps, and logprob api params
2025-10-01 12:58:03 -05:00
ropresearch
6a20b90549
added gen params for latest examples endpoint
2025-10-01 13:05:37 -04:00
ropresearch
b9ecb0cc7f
docs update
2025-09-25 17:00:05 -04:00
ropresearch
c3fc68879c
group temps, sample temps, and logprob api params
2025-09-25 16:41:58 -04:00
pre-commit-ci[bot]
e02d2c373e
[pre-commit.ci] auto fixes from pre-commit.com hooks
...
for more information, see https://pre-commit.ci
2025-09-21 22:33:59 +00:00
Ragnar
60addb9a7d
Update server.py
2025-09-22 00:32:39 +02:00
shannonsands
1a808e2038
Revert "Fix multiple scored data groups (#223)"
...
This reverts commit 67b3144113.
2025-08-29 17:55:45 +10:00
shannonsands
67b3144113
Fix multiple scored data groups (#223)
...
* removed changes to other files
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fail on scores empty
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-08-29 15:47:32 +10:00
Dakota
11f1303da0
add error logging to collect_trajectories so they don't fail silently
2025-08-15 16:34:21 -05:00
shannonsands
9f23c732dd
qwen tokenizer wrapper & fixed jinja template for tool handling (#224)
...
* added qwen tokenizer wrapper & fixed jinja template for tool handling issues in the official HF one
* moved jinja template into its own file
2025-07-30 11:57:15 +10:00
Teknium
62cee8ac66
Merge pull request #209 from NousResearch/add-pairwise-judge-environment
...
Add LLM as a judge environment for eval and train based on RewardBench
2025-07-16 13:37:09 -07:00
pre-commit-ci[bot]
3d2d9e67fa
[pre-commit.ci] auto fixes from pre-commit.com hooks
...
for more information, see https://pre-commit.ci
2025-07-15 11:42:46 +00:00
Alexey Gorbatovski
53984580c8
Bug fix
2025-07-15 14:37:55 +03:00
hjc-puro
04e69d4a19
appease precommit
2025-07-12 22:51:39 +00:00
hjc-puro
a94e4c9bf0
autoscale metrics table
2025-07-12 22:41:14 +00:00
hjc-puro
6e9baaf9d8
table
2025-07-11 09:52:19 +00:00
hjc-puro
72210cf4ad
rename fn
2025-07-11 04:04:55 +00:00
hjc-puro
d133ba3867
comment
2025-07-11 03:54:03 +00:00
hjc-puro
ccb8eaf230
move table to util
2025-07-11 03:52:24 +00:00
hjc-puro
5e61331360
simplify schema
2025-07-11 03:49:49 +00:00
hjc-puro
0d4ce37b73
add eval types
2025-07-11 03:36:55 +00:00
hjc-puro
290e087fc5
remove some imports
2025-07-11 03:25:10 +00:00
hjc-puro
68da3809e2
move table to display util
2025-07-11 02:06:56 +00:00
hjc-puro
3e08c6d788
simplify schema
2025-07-11 00:52:09 +00:00
hjc-puro
6c64df0226
remove jsonlines dependency
2025-07-11 00:42:55 +00:00
hjc-puro
da0d64ae89
linting errors
2025-07-11 00:29:57 +00:00
hjc-puro
e601251893
gsm8k eval example
2025-07-11 00:22:36 +00:00
hjc-puro
eb926dc58b
working evals
2025-07-10 01:45:21 +00:00
hjc-puro
f4de3ad6f5
add printing
2025-07-09 23:35:26 +00:00
hjc-puro
a11af27298
add eval saving cli args
2025-07-09 03:12:13 +00:00
hjc-puro
5519f190d2
add evaluate subcommand to cli
2025-07-07 17:39:33 -04:00
dmahan93
58446dbcb1
Merge pull request #204 from NousResearch/multienv-enforce-mins
...
Multienv with enforced minimum samples in a batch
2025-07-07 08:53:43 -05:00
Dakota
08e14cc745
feat: add minimum batch allocation support for environments
...
- Add min_batch_allocation parameter to ensure environments contribute minimum proportion to each batch
- Implement grab_batch_with_minimum_allocations function with proper scaling when allocations exceed 100%
- Add mixed-size group buffering to handle variable-sized data submissions
- Update server to use minimum allocation logic when any env has min_batch_allocation set
- Add comprehensive tests for minimum allocation scenarios
- Update documentation in API README and CONFIG.md
- Update example environments to demonstrate the feature
This feature allows critical environments to guarantee they contribute at least a specified proportion (0.0-1.0) to each training batch, ensuring important data sources are always represented during training.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-07 08:50:28 -05:00
dmahan93
3b8d8a6f09
Merge pull request #202 from Myashka/main
...
Include run name in wandb initialization in BaseEnv
2025-07-07 08:05:47 -05:00
Alexey Gorbatovski
35c542328a
Fix infinite loop in wait_for_sem by updating semaphore values inside loop
2025-07-06 00:27:45 +03:00
pre-commit-ci[bot]
ee5257522a
[pre-commit.ci] auto fixes from pre-commit.com hooks
...
for more information, see https://pre-commit.ci
2025-07-04 14:34:37 +00:00
Alexey Gorbatovski
14c70c0e68
Include run name in wandb initialization in BaseEnv
2025-07-04 17:13:34 +03:00
Dakota
683559afd2
allow inf (<= 0 max_token_len) generations if trainer requests it, but raise a warning so that users can check their logs and get info if their trainers are doing something weird
2025-07-01 09:52:10 -05:00
Micke
af57208da2
fix error in function inference_node_wandb_watcher.py
2025-06-27 22:13:37 +02:00
crStiv
e9a547ce32
Update base.py
2025-06-19 22:52:26 +02:00
teknium1
6d9523fe0b
add tasks_per_step arg to multiply by group_size for bs calculation
2025-06-10 01:54:52 -07:00
Dakota
e13526d308
Fix API to accept messages without reward field + comprehensive tests
...
- Made reward field truly optional in messages (no auto-addition)
- Accept custom roles (dog, cat, etc.) beyond standard ones
- Added 24 new tests for edge cases (tuples, unicode, large content)
- Reorganized test structure: moved from testing/ to atroposlib/tests/
- Fixed legacy API tests and removed tests requiring missing data files
All 43 tests pass! Fixes message handling for SFT use cases.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-09 14:03:08 -05:00
Cypher Pepe
24e963d393
fixed typo envs/README.md
2025-06-08 16:50:35 +03:00
Dakota
f3bbc6a42d
Fix import ordering with isort
...
- Move typing_extensions import to proper location
- Satisfy pre-commit isort requirements
2025-06-04 10:40:41 -05:00
Dakota
0ff55bf2cf
Fix TypedDict import for Python 3.10 compatibility
...
- Use typing_extensions.TypedDict instead of typing.TypedDict
- Fixes Pydantic error on Python < 3.12
2025-06-04 10:37:51 -05:00
Dakota
522e049d27
Remove unused config_handler.py and its import
...
- Deleted config_handler.py which had unused torch import
- Cleaned up utils/__init__.py to remove ConfigHandler import
2025-06-04 10:21:46 -05:00
hjc-puro
b5e7746c99
remove process defaults, respect config init
2025-06-02 21:19:45 -04:00