mirror of https://github.com/collinear-ai/yc-bench.git (synced 2026-04-19 12:58:03 +00:00)
Delete Sonnet results section from README
Removed Sonnet-only results section and associated image.
parent 91455bbca2
commit 89065f3487
1 changed file with 0 additions and 4 deletions
@@ -432,10 +432,6 @@ Common failure patterns across all bankrupt runs:
3. **Late adaptation.** Sonnet correctly identifies problems in its scratchpad ("PRESTIGE CRISIS — MARKET LOCK") but only after payroll has consumed the runway. By turn 137 of hard seed 2, all tasks require prestige ≥ 2 but the company is stuck at 1.0 in 6 of 7 domains.
4. **Inconsistent ETA reasoning.** Sonnet's medium seed 2 has a 49% win rate — essentially a coin flip. It understands throughput math in its scratchpad but doesn't consistently apply it when selecting tasks.
### Sonnet-only results by config

---
## Simulation rules