yc-bench/scripts
2026-03-12 12:12:47 -07:00
bot_runner.py improved system design, more intuitive hparams, updated configs, greedy bot updates 2026-03-12 12:12:47 -07:00
notepad_gif.py Fix horizon bug, multi-provider support, add Sonnet vs Gemini benchmark results 2026-02-26 00:31:00 -08:00
plot_comparison.py Rename Greedy Bot to Human Devised Rule, remove other bot baselines from plots 2026-02-27 14:03:04 -08:00
plot_multi_model.py Fixed task difficulty with base reward & deadline change 2026-03-06 18:08:11 -08:00
plot_prestige_radar.py Updated backend to calculate employee tier with spiky skill distribution; simplified domain count to 4 2026-03-05 18:12:48 -08:00
plot_results.py Add multi-strategy client trust system with tiers, specialties, and idle-turn fix 2026-03-09 17:37:49 -07:00
plot_run.py Updated backend to calculate employee tier with spiky skill distribution; simplified domain count to 4 2026-03-05 18:12:48 -08:00
plot_single_run.py init 2026-03-08 17:40:10 -07:00
plot_sonnet_results.py Fix horizon bug, multi-provider support, add Sonnet vs Gemini benchmark results 2026-02-26 00:31:00 -08:00
run_benchmark.sh Calibrated domain prestige bump 2026-03-06 14:40:45 -08:00
watch_dashboard.py Add multi-strategy client trust system with tiers, specialties, and idle-turn fix 2026-03-09 17:37:49 -07:00
watch_run.py Add multi-strategy client trust system with tiers, specialties, and idle-turn fix 2026-03-09 17:37:49 -07:00