Commit graph

27 commits

Author SHA1 Message Date
Teknium
3ef206e013 Merge branch 'main' into reverse-text-env 2026-01-05 15:33:43 -08:00
andrewshab
eeabf16ff7 Update README.md 2025-10-14 12:27:03 +02:00
teknium
5d1854d330 add curriculum system 2025-08-13 21:33:52 +00:00
teknium
37013e9ce4 Add length penalty 2025-08-13 21:16:09 +00:00
teknium
64e2792ec9 add text reversal env section to readme 2025-08-12 20:51:09 +00:00
teknium
75f1cf6d2a move eval envs to eval_environments and update readmes 2025-07-30 15:09:34 +00:00
pre-commit-ci[bot]
52b505296c [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-07-27 02:52:39 +00:00
teknium
a0979eb08e add readme section 2025-07-27 02:46:51 +00:00
pre-commit-ci[bot]
7d980372d3 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-07-15 18:40:26 +00:00
teknium
8aa540275b add to the envs readme 2025-07-15 18:39:50 +00:00
teknium1
bf78ad44e3 Add optional solve flagging strategy 2025-06-14 12:32:27 -07:00
teknium1
ad1bdf7f80 Add cycling curriculum, difficulty threshold, update datadumps 2025-06-14 07:44:47 -07:00
teknium1
7a89524345 add readme section for the environment 2025-06-12 00:36:03 -07:00
shannonsands
ea304892ee Integrate chinguun101 goofy math (#145)
* Add GoofyMath environment for fun, engaging math learning

* linting, moved to community folder

* linting

---------

Co-authored-by: chinguun101 <chinguun@uni.minerva.edu>
2025-05-28 12:11:02 +10:00
shannonsands
1a79132809 Integrate michaelwaves options iv (#144)
* options iv agent

* bug fix

* outputs

* linted and moved to community folder

* linting

---------

Co-authored-by: michaelwaves <michaelyu713705@gmail.com>
2025-05-28 10:57:24 +10:00
shannonsands
f21154ff49 Integrate aniemerg wikipedia (#143)
* initial commit

* initial draft of wikipedia article creation environment

* add openai for rollouts, update requirements, create script to run, etc.

* add configuration, add debugging, fix tool calls, prevent wikipedia access

* now creates html file

* fix output for html page

* check in Claude plan

* fixed formatting and other issues

* add zip file

* update README

* linting, moved to community folder

* linting

* linting

* linting

* linting

---------

Co-authored-by: Allan Niemerg <niemerg@gmail.com>
2025-05-28 10:22:11 +10:00
shannonsands
b774e97215 Integrate subrahmanyam cybersecurity (#142)
* cybersecurity env for offline RL trajectories

* output file addition

* jsonl outputs

* code cleanup

* pulled out outputs and fixing .gitignore

* removed zip file

* gitignore typo fix

* Integrate cybersecurity Sigma rule generation environment

---------

Co-authored-by: Subrahmanyam Arunachalam <subrahmanyam.arunachalam@FVFGK0VTQ05P.local>
2025-05-28 08:41:51 +10:00
teknium1
46d33bf0b2 manually implement readme update due to 2025-05-21 23:09:45 -07:00
Shannon Sands
edf2beaa32 linting 2025-05-16 20:40:15 -07:00
Shannon Sands
41caa05a1a remvoed merge error 2025-05-16 19:49:37 -07:00
Shannon Sands
9753d5a122 resolved conflict 2025-05-16 19:48:15 -07:00
teknium1
20d263a495 add citation to allenai 2025-05-16 19:34:51 -07:00
teknium1
287bbcd356 some cleanup for final merge 2025-05-16 19:24:50 -07:00
Shannon Sands
fd63c76a5c Added new env info 2025-05-16 16:44:33 -07:00
teknium1
1a9fa016b5 add dependencies to the env readme 2025-05-14 19:44:13 -07:00
teknium1
90e235a3e9 update environments readme 2025-05-14 19:40:32 -07:00
Dakota Nous
621d00dd80 first commit 2025-04-29 12:10:10 -07:00