teknium
4738fabd57
convert fundamentals prediction env to use managed server
2025-11-14 09:48:56 +00:00
teknium
ff46cfff44
convert letter_counting_environment to use managed server
2025-11-14 09:44:20 +00:00
pre-commit-ci[bot]
aae4432a58
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-11-14 06:55:56 +00:00
teknium
76fec8b919
convert rlaif_server to managedserver
2025-11-14 06:53:16 +00:00
teknium
d8c68a93e3
convert tool_calling_server to managedserver
2025-11-14 06:48:07 +00:00
pre-commit-ci[bot]
0a3c15c7ad
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-11-14 06:14:21 +00:00
teknium
be74c759e5
convert swe_rl to managedserver
2025-11-14 06:13:02 +00:00
pre-commit-ci[bot]
9d3dbd1a73
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-11-14 00:10:43 +00:00
teknium
e28297b625
support managedserver in mcqa thinking
2025-11-14 00:10:04 +00:00
teknium
f0fee7fba6
revert zip change
2025-11-14 00:03:06 +00:00
pre-commit-ci[bot]
d5e6793f02
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-11-13 22:41:42 +00:00
teknium
28468bcae5
fix sampling temps
2025-11-13 22:41:04 +00:00
teknium
73e8ee2475
make evals also use managed
2025-11-13 22:39:21 +00:00
teknium
1ccf3b54e3
remove unused import
2025-11-13 08:40:30 +00:00
teknium
db1d094386
fix some issues
2025-11-13 08:33:03 +00:00
pre-commit-ci[bot]
b03b8d3808
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-11-13 08:04:40 +00:00
teknium
3f6265563f
convert gsm8k server
2025-11-13 08:03:00 +00:00
dmahan93
b4080a4f37
Merge pull request #273 from NousResearch/add-vllm-manager-fn
add managed vllm server
2025-11-07 14:22:07 -08:00
Dakota
e6ac3abdcb
add managed vllm server
2025-11-07 13:06:49 -06:00
dmahan93
b1e164eef5
Merge pull request #264 from NousResearch/add-logprob-server-manager-fn
add sglang specific token level logprob handling and server manager/b…
2025-10-29 13:53:39 -07:00
Dakota
5d6d6bb0dc
add docs :)
2025-10-29 11:26:43 -05:00
dmahan93
c22f8ca81b
Merge remote-tracking branch 'origin/add-logprob-server-manager-fn' into add-logprob-server-manager-fn
2025-10-24 23:18:37 -07:00
dmahan93
5d662bf1aa
add chat example and fix bug in managed_server
2025-10-24 23:15:56 -07:00
pre-commit-ci[bot]
0d80da5146
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-10-24 20:10:29 +00:00
dmahan93
7bf4cfbf80
add managed server to make grabbing logprobs easier w/ tokenized items
2025-10-24 13:09:46 -07:00
pre-commit-ci[bot]
1e6a745491
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-10-16 17:39:04 +00:00
Dakota
c36ec29656
add sglang specific token level logprob handling and server manager/baseline logprob/token fn
2025-10-16 12:38:03 -05:00
andrewshab
7e918dfd18
Update lean_env.py
2025-10-14 12:28:13 +02:00
andrewshab
eeabf16ff7
Update README.md
2025-10-14 12:27:03 +02:00
andrewshab
7318c70e41
Rename readme.md to README.md
2025-10-14 11:58:28 +02:00
ropresearch
b10b77c0ec
smolagent linting fixes
2025-10-01 13:28:01 -04:00
hjc-puro
c48d963c34
Merge branch 'main' into smolagents-environment
2025-09-30 10:35:18 -04:00
pre-commit-ci[bot]
47c68f06f2
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-09-30 14:03:43 +00:00
hjc-puro
dddfb30c5b
Fix smolagents ChatMessage compatibility and improve documentation
This commit fixes compatibility issues with smolagents 1.22.0 ChatMessage
objects and improves the documentation for easier setup.
Changes:
- Fix smolagents_model.py to handle ChatMessage objects (not just dicts)
  in _extract_user_message() and _format_chat_messages()
- Fix smolagents_env.py to handle ChatMessage objects in trajectory
  scoring and data group creation
- Update README.md with clearer installation instructions, Quick Start
  section, and automatic GAIA dataset download documentation
- Add test_run.sh script for easy testing with OpenAI models
Tested with:
- smolagents 1.22.0
- gpt-4o-mini via OpenAI API
- Tavily web search tools
- Automatic GAIA dataset download
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-29 21:27:52 +00:00
viktorking7
fc0b3e9a1a
Update tool_use_multiturn_server.py
2025-09-27 13:47:07 +02:00
viktorking7
6a6a9f60ef
Update README.md
2025-09-27 13:46:29 +02:00
pre-commit-ci[bot]
34cabbb30f
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-09-15 16:41:26 +00:00
dmahan93
89b59d489f
Merge branch 'main' into environments/bleuberi
2025-09-12 12:06:18 -05:00
dmahan93
02e2dcd49a
Merge pull request #160 from interstellarninja/feat/multiturn_tool_use_env
Multi-Turn Tool-Use RL Environment
2025-09-10 19:43:42 -05:00
pre-commit-ci[bot]
9d7c2772af
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-09-08 19:45:00 +00:00
Allan Niemerg
0f6c06bb56
Move BLEUBERI environment to community folder
- Moved environments/bleuberi to environments/community/bleuberi
- Updated .gitmodules to reflect new submodule path
- Fixed pre-commit formatting issues
- Cleaned up test output files
2025-09-08 14:38:43 -05:00
Allan Niemerg
532024d01e
remove unnecessary code, change log level
2025-09-08 11:22:08 -05:00
Allan Niemerg
1a2551c812
fixed formatting for HTML inclusion
2025-09-08 11:22:08 -05:00
Allan Niemerg
265e4cd69f
working HTML writing
2025-09-08 11:22:08 -05:00
Allan Niemerg
8997a1d750
working environment
2025-09-08 11:22:08 -05:00
Allan Niemerg
374f63acc0
remove unneeded dataset utils
2025-09-08 11:22:08 -05:00
Allan Niemerg
86473f9551
currently making complete rollouts
2025-09-08 11:22:08 -05:00
Allan Niemerg
64a82c4b4f
Fix BLEUBERI environment server integration
2025-09-08 11:22:08 -05:00
Allan Niemerg
3109fe349b
Update BLEUBERI README with OpenAI API instructions and remove redundant reward functions
2025-09-08 11:22:08 -05:00
Allan Niemerg
a520f5f663
Integrate BLEUBERI as a submodule with direct import of reference-based reward functions.
2025-09-08 11:22:08 -05:00