atropos

mirror of https://github.com/NousResearch/atropos.git synced 2026-04-24 17:04:55 +00:00

Author	SHA1	Message	Date
alt-glitch	0ab46d65b0	Add streaming eval sample logging to BaseEnv Introduces `log_eval_sample()` method for stream-writing individual evaluation samples to `samples.jsonl` during evaluation, with lazy writer initialization and automatic HTML generation on completion. Updates GSM8k environment to use streaming approach instead of batching samples.	2026-03-28 00:52:13 -07:00
pre-commit-ci[bot]	83a343d3a9	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2026-03-27 23:08:06 +00:00
alt-glitch	7a4edb569c	add trajectory saving to eval mode.	2026-03-27 16:04:04 -07:00
pre-commit-ci[bot]	216c1f5899	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2026-02-27 21:17:58 +00:00
Jai Suphavadeeprasit	35587cbdc0	logger changes	2026-02-27 16:17:03 -05:00
pre-commit-ci[bot]	64d3ee1bd6	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2026-02-27 18:16:06 +00:00
Jai Suphavadeeprasit	f343b24a6a	narrow down scope	2026-02-27 11:14:42 -05:00
Jai Suphavadeeprasit	55f7cbd091	dynamic system prompts	2026-02-20 03:14:05 -05:00
Jai Suphavadeeprasit	e615eb1f50	assertions	2026-02-20 02:16:49 -05:00
Jai Suphavadeeprasit	559d649a26	proper fallback	2026-02-20 01:45:41 -05:00
Jai Suphavadeeprasit	3910a58f9b	refactor base	2026-02-20 01:45:41 -05:00
Jai Suphavadeeprasit	1c90fc71b0	on policy clean up	2026-02-20 01:45:41 -05:00
Jai Suphavadeeprasit	79e392c446	post merge changes	2026-02-20 01:45:41 -05:00
Jai Suphavadeeprasit	c89854a350	debug changes	2026-02-20 01:45:41 -05:00
Jai Suphavadeeprasit	ea2b388435	base env debugging	2026-02-20 01:45:41 -05:00
Jai Suphavadeeprasit	e814007575	base env debugging	2026-02-20 01:45:41 -05:00
Jai Suphavadeeprasit	b492ac4fce	on policy changes	2026-02-20 01:45:41 -05:00
Jai Suphavadeeprasit	6bc962c746	initial commit	2026-02-20 01:45:41 -05:00
VolodymyrBg	dd02df0d76	Update base.py	2026-01-29 10:22:51 +02:00
teknium	62fa51240c	Add support for reasoning models and their variety of providers/endpoints	2025-12-30 00:23:00 +00:00
Dhyaneesh	39d5fb4452	feat: dump evaluate subcommand config to YAML in env save dir Automatically save the final merged evaluate configuration to evaluate_config.yaml in the data_dir_to_save_evals directory. This includes env config, OpenAI/server configs, and server manager settings, enabling reproducibility and easier debugging of evaluation runs. The config is saved after all merging (CLI args > YAML > defaults) to capture the exact configuration used for the evaluation.	2025-11-08 23:46:13 +05:30
bobtajson	6e2d36bd2a	Update base.py	2025-10-23 10:27:23 +02:00
ropresearch	e5b8fb8654	clean up	2025-10-10 11:50:39 -04:00
ropresearch	baf4b2d8a8	gzip compression for atropos api	2025-10-10 01:26:52 -04:00
ropresearch	c3fc68879c	group temps, sample temps, and logprob api params	2025-09-25 16:41:58 -04:00
shannonsands	1a808e2038	Revert "Fix multiple scored data groups (#223)" This reverts commit `67b3144113`.	2025-08-29 17:55:45 +10:00
shannonsands	67b3144113	Fix multiple scored data groups (#223) * removed changes to other files * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fail on scores empty --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2025-08-29 15:47:32 +10:00
Dakota	11f1303da0	add error logging to collect_trajectories so they don't fail silently	2025-08-15 16:34:21 -05:00
pre-commit-ci[bot]	3d2d9e67fa	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2025-07-15 11:42:46 +00:00
Alexey Gorbatovski	53984580c8	Bug fix	2025-07-15 14:37:55 +03:00
hjc-puro	72210cf4ad	rename fn	2025-07-11 04:04:55 +00:00
hjc-puro	d133ba3867	comment	2025-07-11 03:54:03 +00:00
hjc-puro	ccb8eaf230	move table to util	2025-07-11 03:52:24 +00:00
hjc-puro	5e61331360	simplify schema	2025-07-11 03:49:49 +00:00
hjc-puro	290e087fc5	remove some imports	2025-07-11 03:25:10 +00:00
hjc-puro	68da3809e2	move table to display util	2025-07-11 02:06:56 +00:00
hjc-puro	3e08c6d788	simplify schema	2025-07-11 00:52:09 +00:00
hjc-puro	6c64df0226	remove jsonlines dependency	2025-07-11 00:42:55 +00:00
hjc-puro	da0d64ae89	linting errors	2025-07-11 00:29:57 +00:00
hjc-puro	e601251893	gsm8k eval example	2025-07-11 00:22:36 +00:00
hjc-puro	eb926dc58b	working evals	2025-07-10 01:45:21 +00:00
hjc-puro	f4de3ad6f5	add printing	2025-07-09 23:35:26 +00:00
hjc-puro	a11af27298	add eval saving cli args	2025-07-09 03:12:13 +00:00
hjc-puro	5519f190d2	add evaluate subcommand to cli	2025-07-07 17:39:33 -04:00
dmahan93	58446dbcb1	Merge pull request #204 from NousResearch/multienv-enforce-mins Multienv with enforced minimum samples in a batch	2025-07-07 08:53:43 -05:00
Dakota	08e14cc745	feat: add minimum batch allocation support for environments - Add min_batch_allocation parameter to ensure environments contribute minimum proportion to each batch - Implement grab_batch_with_minimum_allocations function with proper scaling when allocations exceed 100% - Add mixed-size group buffering to handle variable-sized data submissions - Update server to use minimum allocation logic when any env has min_batch_allocation set - Add comprehensive tests for minimum allocation scenarios - Update documentation in API README and CONFIG.md - Update example environments to demonstrate the feature This feature allows critical environments to guarantee they contribute at least a specified proportion (0.0-1.0) to each training batch, ensuring important data sources are always represented during training. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-07-07 08:50:28 -05:00
pre-commit-ci[bot]	ee5257522a	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2025-07-04 14:34:37 +00:00
Alexey Gorbatovski	14c70c0e68	Include run name in wandb initialization in BaseEnv	2025-07-04 17:13:34 +03:00
Dakota	683559afd2	allow inf (<= 0 max_token_len) generations if trainer requests it, but raise a warning so that users can check their logs and get info if their trainers are doing something weird	2025-07-01 09:52:10 -05:00
crStiv	e9a547ce32	Update base.py	2025-06-19 22:52:26 +02:00
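
Commit 0ab46d65b0 above describes stream-writing individual evaluation samples to `samples.jsonl` with lazy writer initialization. A minimal sketch of that pattern follows. The class name `EvalSampleWriter`, its constructor, and the sample fields are hypothetical and not taken from the repository; only the `log_eval_sample` name and the `samples.jsonl` filename come from the commit message, and the HTML generation step is omitted.

```python
import json
import os
import tempfile


class EvalSampleWriter:
    """Hypothetical helper that streams evaluation samples to samples.jsonl."""

    def __init__(self, save_dir):
        self.path = os.path.join(save_dir, "samples.jsonl")
        self._fh = None  # lazy: the file is opened on the first sample only

    def log_eval_sample(self, sample):
        # Open the writer the first time a sample arrives.
        if self._fh is None:
            os.makedirs(os.path.dirname(self.path), exist_ok=True)
            self._fh = open(self.path, "w", encoding="utf-8")
        # One JSON object per line, flushed so partial runs keep their samples.
        self._fh.write(json.dumps(sample) + "\n")
        self._fh.flush()

    def close(self):
        if self._fh is not None:
            self._fh.close()
            self._fh = None


# Usage: samples are written one at a time instead of being batched in memory.
writer = EvalSampleWriter(tempfile.mkdtemp())
existed_before = os.path.exists(writer.path)  # no file until the first sample
writer.log_eval_sample({"question": "2 + 2", "answer": "4", "score": 1.0})
writer.log_eval_sample({"question": "3 * 3", "answer": "8", "score": 0.0})
writer.close()
```

Flushing after every write trades some throughput for the guarantee that samples already logged survive an interrupted evaluation, which appears to be the motivation for streaming rather than batching.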

96 commits
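
Commit 39d5fb4452 above states that the evaluate configuration is saved after merging with the precedence CLI args > YAML > defaults. A minimal sketch of that precedence rule, under assumptions: the function `merge_config` and the keys shown are illustrative and not the repository's actual code, unset CLI values are assumed to be `None`, and writing the result to `evaluate_config.yaml` is left out.

```python
def merge_config(defaults, yaml_cfg, cli_args):
    """Hypothetical merge: later sources override earlier ones."""
    merged = dict(defaults)   # lowest priority
    merged.update(yaml_cfg)   # YAML overrides defaults
    # CLI overrides everything, but only for flags the user actually set.
    merged.update({k: v for k, v in cli_args.items() if v is not None})
    return merged


# Illustrative values only.
defaults = {"temperature": 1.0, "max_token_len": 2048, "use_wandb": False}
yaml_cfg = {"temperature": 0.7, "use_wandb": True}
cli_args = {"temperature": 0.0, "max_token_len": None}

merged = merge_config(defaults, yaml_cfg, cli_args)
```

Saving the merged dictionary, rather than any single input, is what makes the dumped file reflect the exact settings a run used.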