Commit graph

7 commits

Author | SHA1 | Message | Date
Zafir Stojanovski | b5f5733052 | update system prompt | 2025-02-15 17:41:05 +01:00
Andreas Koepf | 4ddd04a825 | incorporate prompt changes suggested by Miserlou | 2025-02-14 15:44:00 +01:00
Andreas Koepf | 2726caf2fe | ignore single whitespace at beginning and end of answer, use reward = len(oracle_answer) / len(answer) | 2025-02-14 15:40:12 +01:00
Zafir Stojanovski | 52a56cbc4f | system prompt for structured output, and parse such outputs | 2025-02-12 10:44:42 +01:00
Andreas Koepf | 3ca9a709e8 | gsm_symbolic generator changes | 2025-02-05 20:58:01 +01:00
Andreas Koepf | 1bc56b8559 | extract answer from last answer tag | 2025-01-28 16:37:19 +00:00
Andreas Koepf | 655de7a7f3 | add first example with OpenRLHF | 2025-01-28 14:40:06 +00:00