Zafir Stojanovski
a969d8ef05
feat(curriculum): Knights and Knaves configs (#488)
* configs
* reduce complexity of curriculum
* update lower bound
* add failure threshold
* update last_k
* update thresholds for success and failure
* update curriculum file as well
* update run name for noncurriculum
* lint
* dtype model eval
* return binary scoring
* set eval repeats to 3
* fix tests
2025-07-31 10:18:05 +02:00
vncntt
cd85c2d632
add knights knaves curriculum (#401)
* add knights knaves curriculum
* add metadata + width constraints
2025-04-01 12:20:58 +02:00
Andreas Köpf
5d7fbac0ad
Minor question template & score_answer improvements (#261)
* math prompt improvements
* ignore brackets in complex_arithmetic results
* improve additional instruction in prompt of polynomial_equations
* more strict tests for score_answer in polynomial_equations
* simplify special reward handling
* fix test_intermediate_integration
* fix sokoban dataset
* add common dataset score_answer consistency test
2025-03-04 21:55:09 +01:00
vncntt
3149edf2c4
fixed problems in knights_knaves (#251)
* remove unnecessary variables
* added depth logic
* add depth tests
2025-03-02 08:47:54 +01:00
vncntt
5f01049607
Add KnightsKnavesDataset (knights_knaves)
Adapted code from https://github.com/AlphaPav/mem-kk-logic/blob/main/data_prep/lib_kk.py
---------
Co-authored-by: Andreas Koepf (aider) <andreas.koepf@provisio.com>
2025-02-25 20:15:38 +01:00