atropos

mirror of https://github.com/NousResearch/atropos.git synced 2026-04-19 12:57:58 +00:00

Author	SHA1	Message	Date
Jai Suphavadeeprasit	a171358f2e	structural changes	2026-03-13 18:49:30 -04:00
pre-commit-ci[bot]	6c564799bc	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2026-03-13 21:02:08 +00:00
Jai Suphavadeeprasit	697c594c72	changes	2026-03-13 16:58:37 -04:00
Jai Suphavadeeprasit	a8cdb53a4d	address problems	2026-03-13 16:12:05 -04:00
Jai Suphavadeeprasit	322e7e6623	remove comments	2026-03-13 13:30:04 -04:00
Jai Suphavadeeprasit	a1b545c734	remove cross tokenization and fix location of configs	2026-03-13 13:19:28 -04:00
pre-commit-ci[bot]	d1b0dee8f7	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2026-03-13 15:14:09 +00:00
Jai Suphavadeeprasit	e79af5ff69	testing config	2026-03-13 11:06:02 -04:00
Jai Suphavadeeprasit	64794e7c72	sneaky bug	2026-03-13 11:06:00 -04:00
Jai Suphavadeeprasit	4f33ab8bf4	apparently not so easy	2026-03-13 11:04:57 -04:00
Jai Suphavadeeprasit	530fed2877	testing set up	2026-03-13 11:04:57 -04:00
dmahan93	c421582b6f	Merge pull request #408 from daspartho/verl-integration-fixes fix: re-append stop string in math training path	2026-03-10 23:08:58 -05:00
Partho Das	632ab0161c	Revert "rm hardcoded same score check" This reverts commit `f02c24204d`.	2026-03-10 01:42:44 +05:30
Partho Das	cd3a9163c7	Revert "eval max_token_length consistent with training config" This reverts commit `5f52befd38`.	2026-03-08 04:42:02 +05:30
dmahan93	f4875c5dc6	make preserve thinking optional	2026-03-04 15:44:12 -06:00
dmahan93	12d61d197f	add env using the tool api stuff	2026-03-03 19:51:30 -06:00
Partho Das	5f52befd38	eval max_token_length consistent with training config instead of hardcoding, follows other envs pattern	2026-03-03 18:03:04 +05:30
dmahan93	be73d92723	Merge branch 'main' into pipelineRL	2026-03-02 16:43:32 -06:00
Jai Suphavadeeprasit	585244559e	more readme changes	2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit	4a7da8049f	README changes	2026-03-02 11:18:52 -05:00
pre-commit-ci[bot]	91afc9e46e	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit	d2ea8cd612	remove KL	2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit	dbf6026165	remove reqs and update community readme	2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit	45708b4b25	packageification	2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit	c33f9170c3	nccl loras	2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit	2e5fe8bb44	math server	2026-03-02 11:18:52 -05:00
pre-commit-ci[bot]	5cfd1929f1	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit	d07ab3e3ce	math zero work arounds	2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit	119721ef3d	evals errors	2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit	fb1d983757	evals errors	2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit	00801646d7	evals erros	2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit	dedb399911	evals	2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit	f78c821b8b	evals	2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit	58a3fb8b14	pipelineRL	2026-03-02 11:18:52 -05:00
Jai Suphavadeeprasit	6e975dd951	Save the eval to the disk	2026-03-02 11:17:44 -05:00
dmahan93	b3065841c1	add code-spell and secrects precommit	2026-02-27 20:17:19 -06:00
Alvarez	d762c229e2	Update instructions.py	2026-02-27 10:23:47 +01:00
dmahan93	7ceed9b6d9	Merge pull request #388 from milord12345/fix/replace-print-with-logger-reasoning-gym refactor: replace print statements with self.logger in reasoning_gym_environment.py	2026-02-24 14:24:12 -06:00
Partho Das	adf075112c	re-append stop in math training path	2026-02-24 12:29:57 +05:30
Partho Das	f02c24204d	rm hardcoded same score check	2026-02-24 12:29:52 +05:30
dmahan93	329a233bba	Merge pull request #389 from CreeptoGengar/fix/validate-without-train fix: handle validation without training	2026-02-23 14:21:40 -06:00
pre-commit-ci[bot]	a930d3db12	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2026-02-21 19:25:14 +00:00
VolodymyrBg	7e5ddbce06	fix: add try/finally to guarantee gym environment cleanup	2026-02-21 21:23:46 +02:00
pre-commit-ci[bot]	929980185d	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2026-02-21 13:54:38 +00:00
Gengar	34c8c87f0f	fix: handle validation without training Added validation functionality to the training process and refactored validation method to use a dedicated validator instance.	2026-02-21 15:53:37 +02:00
pre-commit-ci[bot]	623dadc5cd	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2026-02-20 16:43:18 +00:00
milord1234	853703ffc5	refactor: replace print statements with self.logger in reasoning_gym_environment.py Replace 20 print() calls with appropriate logging levels: - Error messages -> self.logger.error() - Warnings -> self.logger.warning() - Info/status messages -> self.logger.info() - Debug messages -> self.logger.debug() Left 2 top-level print() calls untouched (no logger access).	2026-02-20 19:57:43 +03:30
dmahan93	708b42a00f	Merge pull request #378 from johnh4098/add-regex-generation-env Add regex generation environment for community	2026-02-18 12:37:32 -08:00
pre-commit-ci[bot]	53a69d30e1	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2026-02-11 19:47:28 +00:00
johnh4098	86d5163316	Add regex generation environment for community	2026-02-11 23:04:47 +03:30


801 commits