alt-glitch
0ab46d65b0
Add streaming eval sample logging to BaseEnv
Introduces `log_eval_sample()` method for stream-writing individual
evaluation samples to `samples.jsonl` during evaluation, with lazy
writer initialization and automatic HTML generation on completion.
Updates GSM8k environment to use streaming approach instead of batching
samples.
2026-03-28 00:52:13 -07:00
pre-commit-ci[bot]
83a343d3a9
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2026-03-27 23:08:06 +00:00
Siddharth Balyan
63d717e4b4
Merge branch 'main' into sid/traj-saving-eval-mode
2026-03-28 04:36:31 +05:30
alt-glitch
7a4edb569c
add trajectory saving to eval mode.
2026-03-27 16:04:04 -07:00
Jai Suphavadeeprasit
75a032bf3e
revert openai server
2026-03-23 11:26:05 -07:00
Jai Suphavadeeprasit
295bb9c446
revert openai server
2026-03-23 11:25:28 -07:00
Jai Suphavadeeprasit
45f569f3af
clean
2026-03-18 09:20:08 -04:00
Jai Suphavadeeprasit
41947e98d6
clean
2026-03-17 12:25:38 -04:00
Jai Suphavadeeprasit
79baac1ea7
clean
2026-03-17 12:23:35 -04:00
Jai Suphavadeeprasit
7aba0d3fc8
fresh eyes check
2026-03-14 11:20:15 -04:00
Jai Suphavadeeprasit
805a0c0eac
revert to similar structure
2026-03-13 20:52:48 -04:00
Jai Suphavadeeprasit
9bd299b3ef
better logging for devex
2026-03-13 20:41:51 -04:00
pre-commit-ci[bot]
3a85ede8ba
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2026-03-13 22:51:58 +00:00
Jai Suphavadeeprasit
a171358f2e
structural changes
2026-03-13 18:49:30 -04:00
Jai Suphavadeeprasit
1b8ff075c4
adding tests
2026-03-13 17:23:59 -04:00
Jai Suphavadeeprasit
697c594c72
changes
2026-03-13 16:58:37 -04:00
pre-commit-ci[bot]
82964b6e48
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2026-03-13 20:13:35 +00:00
Jai Suphavadeeprasit
a8cdb53a4d
address problems
2026-03-13 16:12:05 -04:00
Jai Suphavadeeprasit
322e7e6623
remove comments
2026-03-13 13:30:04 -04:00
pre-commit-ci[bot]
994e9c287d
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2026-03-13 17:21:00 +00:00
Jai Suphavadeeprasit
a1b545c734
remove cross tokenization and fix location of configs
2026-03-13 13:19:28 -04:00
Jai Suphavadeeprasit
862cd3667d
clean logging
2026-03-13 12:38:52 -04:00
Jai Suphavadeeprasit
600c54f5f8
clean log
2026-03-13 12:12:33 -04:00
pre-commit-ci[bot]
d1b0dee8f7
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2026-03-13 15:14:09 +00:00
Jai Suphavadeeprasit
c26432b963
training kernel
2026-03-13 11:06:02 -04:00
Jai Suphavadeeprasit
2f371e03fc
tokenizer bug
2026-03-13 11:06:02 -04:00
Jai Suphavadeeprasit
78c0a6d082
tokenizer bug
2026-03-13 11:06:02 -04:00
Jai Suphavadeeprasit
09ad401995
sneaky bug logging
2026-03-13 11:06:02 -04:00
Jai Suphavadeeprasit
64794e7c72
sneaky bug
2026-03-13 11:06:00 -04:00
Jai Suphavadeeprasit
bb2736db4e
next
2026-03-13 11:05:40 -04:00
Jai Suphavadeeprasit
f44eb810bf
teacher env init
2026-03-13 11:04:57 -04:00
dmahan93
f198c1738e
Merge conflict commit
2026-03-09 23:13:43 -05:00
Jai Suphavadeeprasit
b91922082e
managed_Server pass through and centralize sem logic
2026-03-05 15:46:33 -05:00
dmahan93
f4875c5dc6
make preserve thinking optional
2026-03-04 15:44:12 -06:00
Jai Suphavadeeprasit
c85a3e5ee7
readme language
2026-03-03 23:44:29 -05:00
pre-commit-ci[bot]
efc90bfb1b
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2026-03-04 04:18:12 +00:00
Jai Suphavadeeprasit
1eeb31065f
fixing comments
2026-03-03 23:16:05 -05:00
pre-commit-ci[bot]
8f304d44fd
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2026-03-04 03:08:19 +00:00
Jai Suphavadeeprasit
5aaf7a346c
prompt logprobs simplicity
2026-03-03 22:06:49 -05:00
Jai Suphavadeeprasit
f1c20591b6
prompt logprobs
2026-03-03 21:58:05 -05:00
Jai Suphavadeeprasit
439b9b129b
prompt logprobs
2026-03-03 21:58:05 -05:00
dmahan93
12d61d197f
add env using the tool api stuff
2026-03-03 19:51:30 -06:00
dmahan93
c8eb63f33d
readme updates for tool calling
2026-03-03 12:22:10 -06:00
pre-commit-ci[bot]
e98100e5f6
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2026-03-03 17:21:00 +00:00
Jai Suphavadeeprasit
323a8a2601
readme updates
2026-03-03 12:19:55 -05:00
Jai Suphavadeeprasit
b9291aa29f
init commit
2026-03-03 11:32:09 -05:00
dmahan93
8f21bb57ed
add better warning message
2026-03-02 23:21:25 -06:00
dmahan93
add42a2afb
add tool call parsing based on vllm impl and an openai server endpoint
2026-03-02 23:17:13 -06:00
pre-commit-ci[bot]
216c1f5899
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2026-02-27 21:17:58 +00:00
Jai Suphavadeeprasit
35587cbdc0
logger changes
2026-02-27 16:17:03 -05:00