pre-commit-ci[bot]
b6c655668e
[pre-commit.ci] auto fixes from pre-commit.com hooks
...
for more information, see https://pre-commit.ci
2026-04-13 22:23:35 +00:00
skyc1e
388137c3f1
fix: rename stale OpenaiConfig references to APIServerConfig
...
Three environment files still imported the old `OpenaiConfig` name,
which was renamed to `APIServerConfig`. This causes an ImportError
at module load time, making these environments unusable:
- environments/sft_loader_server.py
- environments/community/ufc_prediction_env/ufc_server.py
- environments/community/ufc_prediction_env/ufc_image_env.py
Also adds a lightweight import regression test.
2026-04-14 00:20:27 +02:00
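The regression test itself is not shown in this log. As a rough sketch of one way to catch a stale `OpenaiConfig` import without loading the environments' heavy dependencies, a static check over the source text could look like the following. The helper name `find_stale_imports` and the module path `some_package.config` are hypothetical, chosen only for illustration; the real test may simply import the three modules.

```python
import ast

# Name that was renamed to APIServerConfig (taken from the commit message).
STALE_NAME = "OpenaiConfig"


def find_stale_imports(source: str, stale_name: str = STALE_NAME) -> list[int]:
    """Return the line numbers of `from X import <stale_name>` statements."""
    tree = ast.parse(source)
    hits = []
    for node in ast.walk(tree):
        if isinstance(node, ast.ImportFrom):
            if any(alias.name == stale_name for alias in node.names):
                hits.append(node.lineno)
    return sorted(hits)


# Hypothetical module path; only the imported names matter for the check.
sample_old = "from some_package.config import OpenaiConfig\n"
sample_new = "from some_package.config import APIServerConfig\n"

print(find_stale_imports(sample_old))
print(find_stale_imports(sample_new))
```

A static scan like this only flags `from ... import` statements; it would not notice attribute access such as `module.OpenaiConfig`, which an actual import of the module would surface.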
Jai Suphavadeeprasit
79ff1642f8
revert gsm8k
2026-03-23 11:18:14 -07:00
Jai Suphavadeeprasit
a171358f2e
structural changes
2026-03-13 18:49:30 -04:00
pre-commit-ci[bot]
6c564799bc
[pre-commit.ci] auto fixes from pre-commit.com hooks
...
for more information, see https://pre-commit.ci
2026-03-13 21:02:08 +00:00
Jai Suphavadeeprasit
697c594c72
changes
2026-03-13 16:58:37 -04:00
Jai Suphavadeeprasit
a8cdb53a4d
address problems
2026-03-13 16:12:05 -04:00
Jai Suphavadeeprasit
322e7e6623
remove comments
2026-03-13 13:30:04 -04:00
Jai Suphavadeeprasit
a1b545c734
remove cross tokenization and fix location of configs
2026-03-13 13:19:28 -04:00
pre-commit-ci[bot]
d1b0dee8f7
[pre-commit.ci] auto fixes from pre-commit.com hooks
...
for more information, see https://pre-commit.ci
2026-03-13 15:14:09 +00:00
Jai Suphavadeeprasit
e79af5ff69
testing config
2026-03-13 11:06:02 -04:00
Jai Suphavadeeprasit
64794e7c72
sneaky bug
2026-03-13 11:06:00 -04:00
Jai Suphavadeeprasit
4f33ab8bf4
apparently not so easy
2026-03-13 11:04:57 -04:00
Jai Suphavadeeprasit
530fed2877
testing set up
2026-03-13 11:04:57 -04:00
dmahan93
c421582b6f
Merge pull request #408 from daspartho/verl-integration-fixes
...
fix: re-append stop string in math training path
2026-03-10 23:08:58 -05:00
Partho Das
632ab0161c
Revert "rm hardcoded same score check"
...
This reverts commit f02c24204d.
2026-03-10 01:42:44 +05:30
Partho Das
cd3a9163c7
Revert "eval max_token_length consistent with training config"
...
This reverts commit 5f52befd38.
2026-03-08 04:42:02 +05:30
dmahan93
f4875c5dc6
make preserve thinking optional
2026-03-04 15:44:12 -06:00
dmahan93
12d61d197f
add env using the tool api stuff
2026-03-03 19:51:30 -06:00
Partho Das
5f52befd38
eval max_token_length consistent with training config
...
instead of hardcoding, follows other envs pattern
2026-03-03 18:03:04 +05:30
dmahan93
be73d92723
Merge branch 'main' into pipelineRL
2026-03-02 16:43:32 -06:00
Jai Suphavadeeprasit
585244559e
more readme changes
2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit
4a7da8049f
README changes
2026-03-02 11:18:52 -05:00
pre-commit-ci[bot]
91afc9e46e
[pre-commit.ci] auto fixes from pre-commit.com hooks
...
for more information, see https://pre-commit.ci
2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit
d2ea8cd612
remove KL
2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit
dbf6026165
remove reqs and update community readme
2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit
45708b4b25
packageification
2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit
c33f9170c3
nccl loras
2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit
2e5fe8bb44
math server
2026-03-02 11:18:52 -05:00
pre-commit-ci[bot]
5cfd1929f1
[pre-commit.ci] auto fixes from pre-commit.com hooks
...
for more information, see https://pre-commit.ci
2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit
d07ab3e3ce
math zero work arounds
2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit
119721ef3d
evals errors
2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit
fb1d983757
evals errors
2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit
00801646d7
evals errors
2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit
dedb399911
evals
2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit
f78c821b8b
evals
2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit
58a3fb8b14
pipelineRL
2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit
6e975dd951
Save the eval to the disk
2026-03-02 11:17:44 -05:00
dmahan93
b3065841c1
add code-spell and secrets precommit
2026-02-27 20:17:19 -06:00
Alvarez
d762c229e2
Update instructions.py
2026-02-27 10:23:47 +01:00
dmahan93
7ceed9b6d9
Merge pull request #388 from milord12345/fix/replace-print-with-logger-reasoning-gym
...
refactor: replace print statements with self.logger in reasoning_gym_environment.py
2026-02-24 14:24:12 -06:00
Partho Das
adf075112c
re-append stop in math training path
2026-02-24 12:29:57 +05:30
Partho Das
f02c24204d
rm hardcoded same score check
2026-02-24 12:29:52 +05:30
dmahan93
329a233bba
Merge pull request #389 from CreeptoGengar/fix/validate-without-train
...
fix: handle validation without training
2026-02-23 14:21:40 -06:00
pre-commit-ci[bot]
a930d3db12
[pre-commit.ci] auto fixes from pre-commit.com hooks
...
for more information, see https://pre-commit.ci
2026-02-21 19:25:14 +00:00
VolodymyrBg
7e5ddbce06
fix: add try/finally to guarantee gym environment cleanup
2026-02-21 21:23:46 +02:00
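The changed code is not included in this log. The general pattern the subject line names can be sketched as below, assuming an environment object with `reset`, `step`, and `close` methods; `DummyEnv` and `run_episode` are hypothetical stand-ins, not the project's classes.

```python
class DummyEnv:
    """Hypothetical stand-in for a gym-style environment."""

    def __init__(self):
        self.closed = False

    def reset(self):
        return 0

    def step(self, action):
        # Simulate a failure in the middle of an episode.
        raise RuntimeError("simulated failure mid-episode")

    def close(self):
        self.closed = True


def run_episode(env):
    """Run one episode; close the environment even if a step raises."""
    try:
        env.reset()
        env.step(0)
    finally:
        # Runs on both normal exit and exception, so resources are released.
        env.close()


env = DummyEnv()
try:
    run_episode(env)
except RuntimeError:
    pass  # the error still propagates to the caller; cleanup already ran

print(env.closed)
```

The `finally` clause does not swallow the exception, so callers still see the failure while the environment is released either way.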
pre-commit-ci[bot]
929980185d
[pre-commit.ci] auto fixes from pre-commit.com hooks
...
for more information, see https://pre-commit.ci
2026-02-21 13:54:38 +00:00
Gengar
34c8c87f0f
fix: handle validation without training
...
Added validation functionality to the training process and refactored validation method to use a dedicated validator instance.
2026-02-21 15:53:37 +02:00
pre-commit-ci[bot]
623dadc5cd
[pre-commit.ci] auto fixes from pre-commit.com hooks
...
for more information, see https://pre-commit.ci
2026-02-20 16:43:18 +00:00
milord1234
853703ffc5
refactor: replace print statements with self.logger in reasoning_gym_environment.py
...
Replace 20 print() calls with appropriate logging levels:
- Error messages -> self.logger.error()
- Warnings -> self.logger.warning()
- Info/status messages -> self.logger.info()
- Debug messages -> self.logger.debug()
Left 2 top-level print() calls untouched (no logger access).
2026-02-20 19:57:43 +03:30
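The mapping from message kind to log level listed above can be illustrated with a small sketch. `ExampleEnvironment`, `load_task`, and `ListHandler` are hypothetical names used only for this example; the only assumption carried over from the commit is that the environment exposes a `self.logger`.

```python
import logging


class ListHandler(logging.Handler):
    """Collects (level name, message) pairs so the output can be inspected."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append((record.levelname, record.getMessage()))


class ExampleEnvironment:
    """Hypothetical environment that logs instead of calling print()."""

    def __init__(self):
        self.logger = logging.getLogger("example_env")
        self.logger.setLevel(logging.DEBUG)

    def load_task(self, name, retries=0):
        self.logger.debug("load_task called with name=%r", name)  # debug detail
        if not name:
            self.logger.error("no task name given")  # error message
            return False
        if retries > 0:
            self.logger.warning("loaded %s after %d retries", name, retries)
        self.logger.info("task %s ready", name)  # info/status message
        return True


handler = ListHandler()
env = ExampleEnvironment()
env.logger.addHandler(handler)

env.load_task("")
env.load_task("puzzle", retries=2)
levels = [level for level, _ in handler.records]
print(levels)
```

Code at module top level has no `self.logger`, which is presumably why the two top-level `print()` calls were left alone; a module-level `logging.getLogger(__name__)` would be one way to cover them.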