reasoning-gym

mirror of https://github.com/open-thought/reasoning-gym.git synced 2026-04-26 17:13:17 +00:00

Author	SHA1	Message	Date
Andreas Köpf	850c1cf6f4	Eval script consolidation (#238). The script now supports: YAML and JSON configurations; dataset-specific parameters; overriding configuration via command line; detailed logging and error handling	2025-02-27 17:39:14 +01:00
AhmedSaif2	5d3bfda677	fix parameter name in compute_decimal_reward docstring	2025-02-21 17:01:59 +02:00
Andreas Koepf	acde58a200	use Decimal class for numeric comparison e.g. +0123.100 == 123.1	2025-02-21 15:36:06 +01:00
AhmedSaif2	5d02064b5a	add a helper function to handle redundant code	2025-02-21 15:54:00 +02:00
Zafir Stojanovski	427e3b4ff7	update system prompt	2025-02-15 17:41:05 +01:00
Andreas Koepf	fe231b4b31	incorporate prompt changes suggested by Miserlou	2025-02-14 15:44:00 +01:00
Andreas Koepf	0a660a3409	ignore single whitespace at beginning and end of answer, use reward = len(oracle_answer) / len(answer)	2025-02-14 15:40:12 +01:00
Zafir Stojanovski	3d84816f95	system prompt for structured output, and parse such outputs	2025-02-12 10:44:42 +01:00
Andreas Koepf	afb95508ef	gsm_symbolic generator changes	2025-02-05 20:58:01 +01:00
Andreas Koepf	c196d622e0	extract answer from last answer tag	2025-01-28 16:37:19 +00:00
Andreas Koepf	cc0312e446	add first example with OpenRLHF	2025-01-28 14:40:06 +00:00
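
Several of the commit messages above describe answer-scoring behavior: taking the answer from the last answer tag, comparing numbers with the Decimal class, and ignoring a single whitespace character at each end of an answer. The following is a minimal sketch of those ideas, not the repository's actual code. All function names are hypothetical, the literal `<answer>` tag format is an assumption, and applying `len(oracle_answer) / len(answer)` only when the oracle answer is contained in a longer answer is also an assumption, since the log does not say when that formula is used.

```python
import re
from decimal import Decimal, InvalidOperation
from typing import Optional


def extract_last_answer(completion: str) -> Optional[str]:
    """Return the content of the last <answer>...</answer> pair, or None.

    Hypothetical helper; the tag format is assumed, not taken from the repo.
    """
    matches = re.findall(r"<answer>(.*?)</answer>", completion, flags=re.DOTALL)
    return matches[-1] if matches else None


def decimals_equal(answer: str, oracle_answer: str) -> bool:
    """Compare two numeric strings by value, so "+0123.100" equals "123.1"."""
    try:
        return Decimal(answer.strip()) == Decimal(oracle_answer.strip())
    except InvalidOperation:
        # At least one side is not a valid decimal literal.
        return False


def strip_single_whitespace(text: str) -> str:
    """Drop at most one whitespace character from each end of the text."""
    if text[:1].isspace():
        text = text[1:]
    if text[-1:].isspace():
        text = text[:-1]
    return text


def length_ratio_reward(answer: str, oracle_answer: str) -> float:
    """Score an answer against the oracle answer.

    Assumption: exact match scores 1.0, an answer that merely contains the
    oracle answer scores len(oracle_answer) / len(answer), anything else 0.0.
    """
    answer = strip_single_whitespace(answer)
    if answer == oracle_answer:
        return 1.0
    if oracle_answer and oracle_answer in answer:
        return len(oracle_answer) / len(answer)
    return 0.0
```

Under these assumptions, a longer answer that wraps the correct value in extra text earns partial credit proportional to how much of it is the oracle answer, while a numerically equal answer written differently can still be recognized through the Decimal comparison.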