Andreas Koepf | 9b7eec2d64 | add llama-3.3-70b-instruct algebra, algorithmic eval configs | 2025-02-25 23:43:29 +01:00
joesharratt1229 | e0e8bab09c | Merge remote-tracking branch 'origin/consolidate_eval_script' into fix/eval | 2025-02-25 18:10:07 +00:00
joesharratt1229 | 93b95d748b | updated read me | 2025-02-25 15:46:43 +00:00
Andreas Koepf | 11fb7e0edf | move r1 configs into r1 yaml/r1 subfolder | 2025-02-25 16:24:30 +01:00
Andreas Koepf | 7f0047667f | consolidate eval scripts to have single eval.py | 2025-02-25 16:13:22 +01:00
Andreas Köpf | de362fb76f | Merge pull request #182 from zafstojano/env/binary-alternation (feat(env): Binary Alternation) | 2025-02-21 17:27:16 +01:00
Andreas Koepf | ff5b210106 | use native types List->list, Dict->dict, Set->set, Tuple->tuple | 2025-02-21 15:15:38 +01:00
Zafir Stojanovski | 0391a99446 | include pre-parsed responses in json | 2025-02-21 13:50:48 +01:00
Zafir Stojanovski | 52a56cbc4f | system prompt for structured output, and parse such outputs | 2025-02-12 10:44:42 +01:00
rishabhranawat | 615c63d2f9 | [eval-v1] pre commit formatting | 2025-02-10 21:50:22 -08:00
rishabhranawat | 88c875c00f | [eval-v1] add timer | 2025-02-10 21:48:44 -08:00
rishabhranawat | be3d04e7cb | [eval-v1] async to speed up inference/evaluation | 2025-02-10 21:35:46 -08:00
rishabhranawat | 03f87dbc07 | [eval-basic] remove large results files, add gitignore, only leave summary | 2025-02-09 22:52:10 -08:00