Mirror of https://github.com/lilakk/BLEUBERI.git (synced 2026-04-19 12:58:12 +00:00)
- arena-hard
- arena-hard-v2.0
- FastChat
- WildBench
- README.md
- run_all_evals.sh
- show_eval_results.sh

To display benchmark results for the models reported in the paper, run `show_eval_results.sh`.
To run a model on all benchmarks, see `run_all_evals.sh`.