770 commits

Author SHA1 Message Date
viktorking7
fc0b3e9a1a
Update tool_use_multiturn_server.py 2025-09-27 13:47:07 +02:00
viktorking7
6a6a9f60ef
Update README.md 2025-09-27 13:46:29 +02:00
pre-commit-ci[bot]
34cabbb30f [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-09-15 16:41:26 +00:00
dmahan93
89b59d489f
Merge branch 'main' into environments/bleuberi 2025-09-12 12:06:18 -05:00
dmahan93
02e2dcd49a
Merge pull request #160 from interstellarninja/feat/multiturn_tool_use_env
Multi-Turn Tool-Use RL Environment
2025-09-10 19:43:42 -05:00
pre-commit-ci[bot]
9d7c2772af [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-09-08 19:45:00 +00:00
Allan Niemerg
0f6c06bb56 Move BLEUBERI environment to community folder
- Moved environments/bleuberi to environments/community/bleuberi
- Updated .gitmodules to reflect new submodule path
- Fixed pre-commit formatting issues
- Cleaned up test output files
2025-09-08 14:38:43 -05:00
Allan Niemerg
532024d01e remove unnecessary code, change log level 2025-09-08 11:22:08 -05:00
Allan Niemerg
1a2551c812 fixed formatting for HTML inclusion 2025-09-08 11:22:08 -05:00
Allan Niemerg
265e4cd69f working HTML writing 2025-09-08 11:22:08 -05:00
Allan Niemerg
8997a1d750 working environment 2025-09-08 11:22:08 -05:00
Allan Niemerg
374f63acc0 remove unneeded dataset utils 2025-09-08 11:22:08 -05:00
Allan Niemerg
86473f9551 currently making complete rollouts 2025-09-08 11:22:08 -05:00
Allan Niemerg
64a82c4b4f Fix BLEUBERI environment server integration 2025-09-08 11:22:08 -05:00
Allan Niemerg
3109fe349b Update BLEUBERI README with OpenAI API instructions and remove redundant reward functions 2025-09-08 11:22:08 -05:00
Allan Niemerg
a520f5f663 Integrate BLEUBERI as a submodule with direct import of reference-based reward functions. 2025-09-08 11:22:08 -05:00
Allan Niemerg
5bb5bd2c3d Add BLEUBERI environment for reference-based RL 2025-09-08 11:21:27 -05:00
Alvarez
bad4fb84df
Update plot.py 2025-08-30 19:22:57 +02:00
pre-commit-ci[bot]
127b5736a5 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-08-28 18:08:26 +00:00
Jai Suphavadeeprasit
7462f45447 sampling params 2025-08-28 14:07:29 -04:00
Jai Suphavadeeprasit
3944e7ef9b linting 2025-08-28 12:54:08 -04:00
Jai Suphavadeeprasit
1bfe294414 Other major changes 2025-08-28 12:24:08 -04:00
Jai Suphavadeeprasit
ec09a1caee Other major changes 2025-08-28 12:04:42 -04:00
Jai Suphavadeeprasit
b56d03b25c changes linting 2025-08-28 03:53:12 -04:00
Jai Suphavadeeprasit
f6f3c04313 organized 2025-08-28 03:35:41 -04:00
Jai Suphavadeeprasit
0bcc406b02 race conditions 2025-08-28 03:35:41 -04:00
Jai Suphavadeeprasit
53710e95ec min@ 2025-08-28 03:35:41 -04:00
pre-commit-ci[bot]
8b0a70131b [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-08-20 07:28:38 +00:00
teknium
a8c3e67062 rebuild text reversal env 2025-08-20 07:27:58 +00:00
pre-commit-ci[bot]
dec92b2a6e [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-08-19 16:30:37 +00:00
Jai Suphavadeeprasit
6266748027 Other linting 2025-08-19 12:20:33 -04:00
Jai Suphavadeeprasit
4d404c0be6 os 2025-08-19 12:05:04 -04:00
Jai Suphavadeeprasit
aac9f5a926 linting 2025-08-19 12:03:13 -04:00
pre-commit-ci[bot]
c1d97b85a3 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-08-19 12:03:13 -04:00
Jai Suphavadeeprasit
8b55815e2f Linting fixes 2025-08-19 12:03:13 -04:00
pre-commit-ci[bot]
750489493f [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-08-19 12:03:13 -04:00
Jai Suphavadeeprasit
f76f9d1596 cleanup 2025-08-19 12:03:13 -04:00
pre-commit-ci[bot]
62b72589c6 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-08-19 12:03:13 -04:00
Jai Suphavadeeprasit
e55a7a0100 add_danger 2025-08-19 12:03:13 -04:00
teknium
bed7ddcb95 add more default categories 2025-08-19 12:03:13 -04:00
teknium
39f0103313 fix dataset 2025-08-19 12:03:13 -04:00
teknium
ff7a2569dc update default max_toks 2025-08-19 12:03:13 -04:00
teknium
69135320b4 initial refusalbenchv2 2025-08-19 12:03:13 -04:00
pre-commit-ci[bot]
8d81198b99 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-08-13 21:34:59 +00:00
teknium
65e5386de9 Merge branch 'reverse-text-env' of https://github.com/NousResearch/atropos into reverse-text-env 2025-08-13 21:33:57 +00:00
teknium
5d1854d330 add curriculum system 2025-08-13 21:33:52 +00:00
pre-commit-ci[bot]
781d4320e9 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-08-13 21:16:48 +00:00
teknium
37013e9ce4 Add length penalty 2025-08-13 21:16:09 +00:00
teknium
564cee80d9 Merge branch 'reverse-text-env' of https://github.com/NousResearch/atropos into reverse-text-env 2025-08-12 20:51:15 +00:00
teknium
64e2792ec9 add text reversal env section to readme 2025-08-12 20:51:09 +00:00