Commit graph

5 commits

Author SHA1 Message Date
Andreas Köpf
5d7fbac0ad
Minor question template & score_answer improvements (#261)
* math prompt improvements
* ignore brackets in complex_arithmetic results
* improve additional instruction in prompt of polynomial_equations
* more strict tests for score_answer in polynomial_equations
* simplify special reward handling
* fix test_intermediate_integration
* fix sokoban dataset
* add common dataset score_answer consistency test
2025-03-04 21:55:09 +01:00
Andreas Köpf
24828e1889
Remove strip from ProceduralDataset::core score_answer() (#250)
* remove strip from ProceduralDataset::core score_answer(), strip in extract answer (optional, default=True)
* test: Move test_extract_answer() from test_dataset.py to test_utils.py
* refactor: Improve decimal reward computation with more flexible comparison
* fix: Implement rounding for format_number when round_if_needed is True
* test: Add test case for compute_decimal_reward with sign and zeros
2025-03-02 08:46:36 +01:00
Andreas Koepf
0a660a3409
ignore single whitespace at beginning and end of answer, use reward = len(oracle_answer) / len(answer)
2025-02-14 15:40:12 +01:00
Andreas Koepf
5a88cf2529
add simple dataset gallery generation script
2025-01-30 22:30:26 +01:00
Andreas Koepf (aider)
2ec8a1bcbb
test: Add unit test for ReseedingDataset class
2025-01-30 22:05:47 +01:00