reasoning-gym

mirror of https://github.com/open-thought/reasoning-gym.git synced 2026-04-24 17:05:03 +00:00

Author	SHA1	Message	Date
joesharratt1229	c9f3c137ae	added self reference curr (#329)	2025-03-11 00:23:26 +01:00
joesharratt1229	ded5da3daa	Added zebra curriculum (#328) * added zebra curriculum * added metadata	2025-03-11 00:22:54 +01:00
Andreas Köpf	b2904ccab9	Minor question template & score_answer improvements (#261) * math prompt improvements * ignore brackets in complex_arithmetic results * improve additional instruction in prompt of polynomial_equations * more strict tests for score_answer in polynomial_equations * simplify special reward handling * fix test_intermediate_integration * fix sokoban dataset * add common dataset score_answer consistency test	2025-03-04 21:55:09 +01:00
Zafir Stojanovski	2f9d94c1e7	fix: Unify Prompts (#254) * remove cot * fix prompt template * fix pool matrix * spiral matrix fixed	2025-03-03 21:55:53 +01:00
vncntt	8992037ecc	fixed problems in knights_knaves (#251) * remove unnecessary variables * added depth logic * add depth tests	2025-03-02 08:47:54 +01:00
vncntt	465db5c5c7	Add KnightsKnavesDataset (knights_knaves) Adapted code from https://github.com/AlphaPav/mem-kk-logic/blob/main/data_prep/lib_kk.py --------- Co-authored-by: Andreas Koepf (aider) <andreas.koepf@provisio.com>	2025-02-25 20:15:38 +01:00
Andreas Koepf	222d5ebf94	reactivate default imports for PropositionalLogicDataset	2025-02-21 15:41:04 +01:00
Andreas Köpf	78b2b518d9	Merge branch 'main' into fix/prop_logix	2025-02-21 15:38:29 +01:00
Andreas Koepf	ff5b210106	use native types List->list, Dict->dict, Set->set, Tuple->tuple	2025-02-21 15:15:38 +01:00
joesharratt1229	16c69b3b7a	moved trivial check	2025-02-21 00:20:00 +00:00
joesharratt1229	f61a4569ff	reimplemented prop logic	2025-02-20 23:59:31 +00:00
Andreas Koepf	6103bbe1b4	exclude PropositionalLogicDataset from auto-import (needs to be improved)	2025-02-20 12:08:48 +01:00
joesharratt1229	ab0fb7a58c	added zebra	2025-02-19 18:13:03 +00:00
Andreas Koepf	99b49f868f	fix question templates	2025-02-16 23:04:24 +01:00
Andreas Köpf	79758a31de	Merge pull request #105 from open-thought/circuit_logic initial draft for circuit_logic dataset generator	2025-02-16 22:54:43 +01:00
Andreas Koepf	5348d9ed69	fix comment: legend no longer part of metadata	2025-02-16 22:53:18 +01:00
Andreas Koepf (aider)	0051c266d4	feat: Add scoring method & unit tests for circuit logic dataset	2025-02-16 22:48:51 +01:00
joesharratt1229	230fffe0a4	formatted answer as str	2025-02-16 15:56:58 +00:00
Dragan Jovanović	55b0226ccf	fix for isort	2025-02-11 00:20:46 +01:00
Dragan Jovanović	328d744780	initial draft for circuit_logic dataset generator	2025-02-11 00:09:00 +01:00
Andreas Koepf	7bc0d00aa9	lint	2025-02-08 17:22:55 +01:00
Andreas Koepf (aider)	38d5caa928	feat: Add inversion probability and logical equivalence to syllogisms	2025-02-08 17:14:35 +01:00
Andreas Köpf	8b906dfc96	Fix syllogisms (#82) * let o1 write a new is_valid_syllogism() check * extend unit test * update gallery	2025-02-07 21:47:59 +01:00
Andreas Koepf	2a363c8610	bump version to 0.1.14	2025-02-07 18:28:06 +01:00
Rich Jones	dd53681724	add self-reference puzzles	2025-02-07 15:09:42 +01:00
Andreas Köpf	a607db79f7	Add Coaching & ScoreBoard class (result tracking) (#72) * feat: Add Coach and ScoreBoard classes for performance tracking and difficulty adjustment * feat: Add GroupedScores class to wrap aggregated scores * refactor: Create ScoreStats class with tuple-based score statistics * feat: Add unit test for Coach with CompositeDataset and multiple datasets * fix: Add difficulty metadata to leg counting dataset * feat: Add clear() method to ScoreBoard to reset all stored data * feat: Add __len__ method to ScoreBoard to return number of scores * feat: Add update_dataset_config method to CompositeDataset * cleanup __init__ & imports	2025-02-06 23:15:28 +01:00
Andreas Koepf	0561844779	update notice of 3rd party code import	2025-02-04 13:47:57 +01:00
Andreas Koepf	b07f91277d	minimize changes	2025-02-04 11:46:19 +01:00
Andreas Koepf	1142d9e6be	use sorted() and OrderedDict to make zebra puzzle clue order deterministic	2025-02-04 11:24:04 +01:00
Andreas Koepf	7e5c427aea	minor logic puzzle changes	2025-02-04 00:18:21 +01:00
Andreas Koepf	d541643ce4	remove solver graph folder	2025-02-04 00:07:01 +01:00
Andreas Koepf	94f877d17a	use explicit rng for zebra generation (not yet fully deterministic)	2025-02-04 00:00:54 +01:00
Rich Jones	6fdcb21bd2	cleanup	2025-02-03 16:47:29 +01:00
Rich Jones	90078684e3	precommit hook linting	2025-02-03 14:40:58 +01:00
Rich Jones	1490abb573	adds zebrapuzzles	2025-02-03 14:34:57 +01:00
Andreas Koepf (aider)	2409b3cda2	feat: Improve syllogism sentence formatting for natural language	2025-02-02 17:23:02 +01:00
Andreas Koepf	ccf282cc90	post merge lint	2025-02-02 10:04:18 +01:00
rishabhranawat	01eb611d5d	[aiw] remove output format template	2025-02-01 16:33:08 -08:00
rishabhranawat	dd4772cd09	[aiw] remove output format enum	2025-02-01 16:31:45 -08:00
rishabhranawat	ad73861fac	[aiw] remove output_formats style and change return type to a standard format	2025-02-01 16:30:05 -08:00
rishabhranawat	5cf1526368	[aiw] add colleague variation	2025-02-01 12:04:44 -08:00
rishabhranawat	60a5df0b2f	[aiw] basic version of alice-in-wonderland procedural dataset	2025-02-01 11:37:50 -08:00
Andreas Koepf	c3b6af35f0	min python 3.11 to support StrEnum	2025-01-26 22:17:43 +01:00
Andreas Koepf	ad9f0d265c	fix unit tests, lower python dependency to 3.9	2025-01-26 16:55:17 +01:00
Andreas Koepf	519e411fa5	add reasoning_gym.create_dataset({name}, ...) global factory function	2025-01-25 00:58:34 +01:00
Andreas Koepf	0d2d8ba6a0	pass config to ProceduralDataset base	2025-01-25 00:23:05 +01:00
Andreas Koepf	ac7dd69586	simplify simple_equation generation	2025-01-24 19:41:51 +01:00
Andreas Koepf	a3688a911d	reduce TERMS in SyllogismDataset	2025-01-24 18:45:22 +01:00
Andreas Koepf (aider)	3856bc9fb9	feat: Add hint to syllogisms question to clarify Yes/No answer format	2025-01-24 18:41:04 +01:00
Andreas Koepf (aider)	7634198df7	refactor: Enhance syllogism validation with comprehensive classical logic rules	2025-01-24 18:35:11 +01:00

54 commits