Remove strip from ProceduralDataset::core score_answer() (#250)

* remove strip from ProceduralDataset::core score_answer(); strip in extract_answer() instead (optional, default=True)
* test: Move test_extract_answer() from test_dataset.py to test_utils.py
* refactor: Improve decimal reward computation with more flexible comparison
* fix: Implement rounding for format_number when round_if_needed is True
* test: Add test case for compute_decimal_reward with sign and zeros
Andreas Köpf 2025-03-02 08:46:36 +01:00 committed by GitHub
parent a66a7e7965
commit 24828e1889
GPG key ID: B5690EEEBB952194
6 changed files with 80 additions and 26 deletions

@@ -111,7 +111,8 @@ class ChainSumDataset(ProceduralDataset):
         return expression, result
 
     def score_answer(self, answer: Optional[str], entry: dict[str, Any]) -> float:
-        return utils.compute_decimal_reward(answer, oracle_answer=entry["answer"])
+        # tolerate sign, leading zeros and trailing decimals, strip commas "+01,000.00" == "1000"
+        return utils.compute_decimal_reward(answer, oracle_answer=entry["answer"], strip_commas=True)
 
 
 class ChainSumCurriculum(BaseCurriculum):
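
The comment in the hunk above says the comparison should tolerate a sign, leading zeros, trailing decimals, and commas, so that "+01,000.00" matches "1000". The body of utils.compute_decimal_reward is not part of this hunk, so the following is only a minimal sketch of that kind of tolerant comparison. The helper name normalized_decimal_equal and its boolean return value are assumptions made for illustration, not the project's API.

```python
from decimal import Decimal, InvalidOperation
from typing import Optional


def normalized_decimal_equal(answer: Optional[str], oracle: str, strip_commas: bool = True) -> bool:
    """Hypothetical helper: compare two numeric strings by value, not by spelling."""
    if answer is None:
        return False
    if strip_commas:
        # Drop thousands separators so "1,000" parses like "1000".
        answer = answer.replace(",", "")
        oracle = oracle.replace(",", "")
    try:
        # Decimal equality is numeric, so a leading "+", leading zeros,
        # and trailing zeros after the decimal point do not matter.
        return Decimal(answer) == Decimal(oracle)
    except InvalidOperation:
        # Non-numeric input (including an un-stripped comma) is a mismatch.
        return False
```

With this sketch, normalized_decimal_equal("+01,000.00", "1000") is true, while the same call with strip_commas=False is false because "+01,000.00" no longer parses as a number.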