reasoning-gym

mirror of https://github.com/open-thought/reasoning-gym.git synced 2026-04-19 12:58:07 +00:00

Author	SHA1	Message	Date
Oliver Stanley	7475a20700	include ranges rather than sampled values in difficulty metadata dicts (#387) * update difficulty metadata for logic datasets * update difficulty metadata for graph datasets * update difficulty metadata for geometry datasets * update difficulty metadata for games datasets * update difficulty metadata for cognition datasets * update difficulty metadata for arithmetic datasets * update difficulty metadata for arc datasets * update difficulty metadata for algorithmic datasets * update difficulty metadata for algebra datasets * use tuples * update tests * update tests	2025-03-20 10:27:03 +01:00
Andreas Köpf	d2c895f1d3	Refactor Curriculum Attributes (#335) * remove min_value from AttributeDefinition * remove type from AttributeDefinition * Add CurriculumContext * add ensure_interval option for RangeAttributes * docs: Add legend explaining curriculum indicators in dataset gallery * update GALLERY.md	2025-03-16 15:40:28 +01:00
Zafir Stojanovski	f72ecaf411	circuit logic curriculum (#368)	2025-03-14 16:17:00 +01:00
Zafir Stojanovski	d33c667c3d	feat(env): Propositional Logic Curriculum (#365) * propositional logic curriculum * lint * difficulty meta	2025-03-14 16:12:39 +01:00
joesharratt1229	d603d8b72b	added aiw curric (#356) added metadata	2025-03-13 21:10:52 +01:00
joesharratt1229	b497e35fb8	added self reference curr (#329)	2025-03-11 00:23:26 +01:00
joesharratt1229	54074b17ef	Added zebra curriculum (#328) * added zebra curriculum * added metadata	2025-03-11 00:22:54 +01:00
Andreas Köpf	5d7fbac0ad	Minor question template & score_answer improvements (#261) * math prompt improvements * ignore brackets in complex_arithmetic results * improve additional instruction in prompt of polynomial_equations * more strict tests for score_answer in polynomial_equations * simplify special reward handling * fix test_intermediate_integration * fix sokoban dataset * add common dataset score_answer consistency test	2025-03-04 21:55:09 +01:00
Zafir Stojanovski	01e1c8f9af	fix: Unify Prompts (#254) * remove cot * fix prompt template * fix pool matrix * spiral matrix fixed	2025-03-03 21:55:53 +01:00
vncntt	3149edf2c4	fixed problems in knights_knaves (#251) * remove unnecessary variables * added depth logic * add depth tests	2025-03-02 08:47:54 +01:00
vncntt	5f01049607	Add KnightsKnavesDataset (knights_knaves) Adapted code from https://github.com/AlphaPav/mem-kk-logic/blob/main/data_prep/lib_kk.py --------- Co-authored-by: Andreas Koepf (aider) <andreas.koepf@provisio.com>	2025-02-25 20:15:38 +01:00
Andreas Koepf	7f30e711e5	reactivate default imports for PropositionalLogicDataset	2025-02-21 15:41:04 +01:00
Andreas Köpf	802b8c4bed	Merge branch 'main' into fix/prop_logix	2025-02-21 15:38:29 +01:00
Andreas Koepf	3e7ff3b084	use native types List->list, Dict->dict, Set->set, Tuple->tuple	2025-02-21 15:15:38 +01:00
joesharratt1229	5fb655e390	moved trivial check	2025-02-21 00:20:00 +00:00
joesharratt1229	39ee099a86	reimplemented prop logic	2025-02-20 23:59:31 +00:00
Andreas Koepf	147088051d	exclude PropositionalLogicDataset from auto-import (needs to be improved)	2025-02-20 12:08:48 +01:00
joesharratt1229	831822e899	added zebra	2025-02-19 18:13:03 +00:00
Andreas Koepf	2cbaab2918	fix question templates	2025-02-16 23:04:24 +01:00
Andreas Köpf	72fcb36cb4	Merge pull request #105 from open-thought/circuit_logic initial draft for circuit_logic dataset generator	2025-02-16 22:54:43 +01:00
Andreas Koepf	3335df8ad4	fix comment: legend no longer part of metadata	2025-02-16 22:53:18 +01:00
Andreas Koepf (aider)	63bd662acf	feat: Add scoring method & unit tests for circuit logic dataset	2025-02-16 22:48:51 +01:00
joesharratt1229	d0200d1dbe	formatted answer as str	2025-02-16 15:56:58 +00:00
Dragan Jovanović	719369bce6	fix for isort	2025-02-11 00:20:46 +01:00
Dragan Jovanović	60d0785a91	initial draft for circuit_logic dataset generator	2025-02-11 00:09:00 +01:00
Andreas Koepf	0c3b2c4fef	lint	2025-02-08 17:22:55 +01:00
Andreas Koepf (aider)	ac27508d09	feat: Add inversion probability and logical equivalence to syllogisms	2025-02-08 17:14:35 +01:00
Andreas Köpf	0c8752c7b1	Fix syllogisms (#82) * let o1 write a new is_valid_syllogism() check * extend unit test * update gallery	2025-02-07 21:47:59 +01:00
Andreas Koepf	d3752a0d76	bump version to 0.1.14	2025-02-07 18:28:06 +01:00
Rich Jones	bd8fc9beeb	add self-reference puzzles	2025-02-07 15:09:42 +01:00
Andreas Köpf	3f6b2fc807	Add Coaching & ScoreBoard class (result tracking) (#72) * feat: Add Coach and ScoreBoard classes for performance tracking and difficulty adjustment * feat: Add GroupedScores class to wrap aggregated scores * refactor: Create ScoreStats class with tuple-based score statistics * feat: Add unit test for Coach with CompositeDataset and multiple datasets * fix: Add difficulty metadata to leg counting dataset * feat: Add clear() method to ScoreBoard to reset all stored data * feat: Add __len__ method to ScoreBoard to return number of scores * feat: Add update_dataset_config method to CompositeDataset * cleanup __init__ & imports	2025-02-06 23:15:28 +01:00
Andreas Koepf	cd3c95baf0	update notice of 3rd party code import	2025-02-04 13:47:57 +01:00
Andreas Koepf	9bc92952f8	minimize changes	2025-02-04 11:46:19 +01:00
Andreas Koepf	f2b4c3d078	use sorted() and OrderedDict to make zebra puzzle clue order deterministic	2025-02-04 11:24:04 +01:00
Andreas Koepf	f5128207a6	minor logic puzzle changes	2025-02-04 00:18:21 +01:00
Andreas Koepf	2b6315474a	remove solver graph folder	2025-02-04 00:07:01 +01:00
Andreas Koepf	04cd81dd76	use explicit rng for zebra generation (not yet fully deterministic)	2025-02-04 00:00:54 +01:00
Rich Jones	4d950e562a	cleanup	2025-02-03 16:47:29 +01:00
Rich Jones	7274f79c50	precommit hook linting	2025-02-03 14:40:58 +01:00
Rich Jones	0c9094e9f4	adds zebrapuzzles	2025-02-03 14:34:57 +01:00
Andreas Koepf (aider)	56ded2c299	feat: Improve syllogism sentence formatting for natural language	2025-02-02 17:23:02 +01:00
Andreas Koepf	f396d3df60	post merge lint	2025-02-02 10:04:18 +01:00
rishabhranawat	5f6d615369	[aiw] remove output format template	2025-02-01 16:33:08 -08:00
rishabhranawat	f8696d6d22	[aiw] remove output format enum	2025-02-01 16:31:45 -08:00
rishabhranawat	3d42e84807	[aiw] remove output_formats style and change return type to a standard format	2025-02-01 16:30:05 -08:00
rishabhranawat	57a1b5c353	[aiw] add colleague variation	2025-02-01 12:04:44 -08:00
rishabhranawat	86525f6401	[aiw] basic version of alice-in-wonderland procedural dataset	2025-02-01 11:37:50 -08:00
Andreas Koepf	cae7f0f98b	min python 3.11 to support StrEnum	2025-01-26 22:17:43 +01:00
Andreas Koepf	ecbb155184	fix unit tests, lower python dependency to 3.9	2025-01-26 16:55:17 +01:00
Andreas Koepf	0dcff77b37	add reasoning_gym.create_dataset({name}, ...) global factory function	2025-01-25 00:58:34 +01:00

59 commits