yc-bench/system_design/02_data_models.md
AnandK27 ecd3d9e415 Add system design documentation for yc-bench
Comprehensive documentation covering all major subsystems:
simulation engine, data models, task system, prestige, finances,
employees, agent layer, CLI interface, configuration, and runner.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 13:42:41 -07:00

# Data Models & Database Design
**Location**: `src/yc_bench/db/`
## Design Choice: SQLAlchemy ORM with SQLite
The benchmark uses SQLAlchemy's declarative ORM over SQLite for several reasons:
1. **Single-file persistence**: SQLite stores the entire game state in one file, making runs portable and inspectable
2. **Transactional safety**: ACID guarantees prevent partial state updates
3. **Query flexibility**: SQL allows complex queries for financial reports, task filtering, etc.
4. **Dual-backend support**: The same ORM models work with PostgreSQL when the `DATABASE_URL` environment variable is set, for production/scaling scenarios
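
The backend selection described in point 4 can be sketched as a small helper. This is a minimal illustration, not the project's actual code: the function name `resolve_database_url`, the default SQLite path, and the example PostgreSQL URL are all assumptions.

```python
import os

# Hypothetical default; the real project may use a different file path.
DEFAULT_SQLITE_URL = "sqlite:///yc_bench.db"


def resolve_database_url(env=None):
    """Return DATABASE_URL if it is set and non-empty, else the SQLite default."""
    env = os.environ if env is None else env
    return env.get("DATABASE_URL") or DEFAULT_SQLITE_URL


# No override: fall back to the single-file SQLite database.
print(resolve_database_url({}))
# Override present: use the (illustrative) PostgreSQL URL instead.
print(resolve_database_url({"DATABASE_URL": "postgresql://user:pw@localhost/ycbench"}))
```

The resulting URL string is the kind of value that `sqlalchemy.create_engine` accepts, so the ORM models stay the same whichever backend is chosen.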
## Schema Overview
```
┌──────────────┐     ┌───────────────────┐
│   Company    │────<│  CompanyPrestige  │  (1 per domain × company)
└──────┬───────┘     └───────────────────┘
       ├────<┌──────────────┐     ┌───────────────────┐
       │     │   Employee   │────<│ EmployeeSkillRate │  (1 per domain × employee)
       │     └──────┬───────┘     └───────────────────┘
       │            │
       │            │    ┌────────────────┐
       │            └───<│ TaskAssignment │  (employee ↔ task junction)
       │                 └────────┬───────┘
       │                          │
       ├────<┌──────────┐─────────┘
       │     │   Task   │────<┌─────────────────┐
       │     └──────────┘     │ TaskRequirement │  (1 per domain × task)
       │                      └─────────────────┘
       ├────<┌──────────────┐
       │     │   SimEvent   │  (discrete events queue)
       │     └──────────────┘
       ├────<┌──────────────┐
       │     │ LedgerEntry  │  (financial transactions)
       │     └──────────────┘
       ├────<┌──────────────┐
       │     │   SimState   │  (simulation clock & counters)
       │     └──────────────┘
       └────<┌──────────────┐
             │  Scratchpad  │  (agent persistent memory)
             └──────────────┘
```
## Model Details
### Company (`models/company.py`)
| Column | Type | Notes |
|--------|------|-------|
| `id` | UUID (PK) | Auto-generated |
| `name` | String | Company name |
| `funds_cents` | BigInteger | Financial balance in cents |
**Design choice**: Funds are stored as integer cents to avoid floating-point rounding errors in financial calculations. `BigInteger` leaves ample headroom for large balances, and because the value is signed, the balance can go negative.
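
A short illustration of why integer cents are safer than float dollars. The helper `format_cents` is a hypothetical display function, not part of the documented schema.

```python
# Summing float dollars accumulates binary rounding error.
total_dollars = sum([0.10, 0.10, 0.10])
print(total_dollars == 0.30)  # False

# Summing integer cents is exact.
total_cents = sum([10, 10, 10])
print(total_cents == 30)  # True


def format_cents(cents: int) -> str:
    """Render a signed cent amount as a dollar string for display only."""
    sign = "-" if cents < 0 else ""
    dollars, remainder = divmod(abs(cents), 100)
    return f"{sign}${dollars:,}.{remainder:02d}"


print(format_cents(-123456))  # -$1,234.56
```

Conversion to a decimal representation happens only at the display boundary; all arithmetic stays in integers.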
### CompanyPrestige (`models/company.py`)
| Column | Type | Notes |
|--------|------|-------|
| `company_id` | UUID (FK) | References Company |
| `domain` | String | research / inference / data_environment / training |
| `prestige_level` | Float | Range [1.0, 10.0] |
**Design choice**: Prestige is tracked per-domain rather than as a single score. This forces specialization trade-offs and creates a 4-dimensional progression space.
### Employee (`models/employee.py`)
| Column | Type | Notes |
|--------|------|-------|
| `id` | UUID (PK) | Auto-generated |
| `company_id` | UUID (FK) | References Company |
| `name` | String | Employee name |
| `tier` | String | junior / mid / senior |
| `work_hours_per_day` | Float | Hours available per business day |
| `salary_cents` | BigInteger | Monthly salary in cents |
### EmployeeSkillRate (`models/employee.py`)
| Column | Type | Notes |
|--------|------|-------|
| `employee_id` | UUID (FK) | References Employee |
| `domain` | String | One of 4 domains |
| `rate_domain_per_hour` | Float | Work units produced per hour |
**Design choice**: Skill rates are **hidden from the agent**. The agent sees tier and salary but not per-domain effectiveness. This creates an information asymmetry puzzle -- the agent must infer employee strengths from task outcomes.
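
One way to picture the information asymmetry is an agent-facing view that omits the skill rates. The dataclass and the `agent_view` function below are hypothetical stand-ins for illustration; the real ORM models and CLI output may be shaped differently.

```python
from dataclasses import dataclass, field


@dataclass
class Employee:
    name: str
    tier: str
    salary_cents: int
    # domain -> work units per hour; kept internal to the simulation
    skill_rates: dict = field(default_factory=dict)


def agent_view(employee: Employee) -> dict:
    """Return only the fields the agent is allowed to see."""
    return {
        "name": employee.name,
        "tier": employee.tier,
        "salary_cents": employee.salary_cents,
    }


alice = Employee("Alice", "senior", 1_200_000, {"research": 7.5, "training": 3.0})
print(agent_view(alice))
```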
### Task (`models/task.py`)
| Column | Type | Notes |
|--------|------|-------|
| `id` | UUID (PK) | Auto-generated |
| `company_id` | UUID (FK, nullable) | NULL = market task, set on acceptance |
| `status` | Enum | market → planned → active → completed_success / completed_fail / cancelled |
| `title` | String | Task description |
| `required_prestige` | Float | Minimum prestige needed in ALL task domains |
| `reward_funds_cents` | BigInteger | Payment on successful completion |
| `reward_prestige_delta` | Float | Prestige gained per domain on success |
| `skill_boost_pct` | Float | Employee skill rate increase on success |
| `accepted_at` | DateTime (nullable) | When task was accepted from market |
| `deadline` | DateTime (nullable) | Calculated at acceptance |
| `completed_at` | DateTime (nullable) | When task finished |
| `success` | Boolean (nullable) | True = on-time, False = late |
| `progress_milestone_pct` | Float | Tracks progress milestones (e.g., 50%) |
**Design choice**: `company_id` being nullable elegantly distinguishes market tasks (available for browsing) from accepted tasks (owned by the company).
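
The nullable `company_id` convention can be demonstrated with a plain SQL query. This sketch uses the standard-library `sqlite3` module and a heavily simplified `task` table to stay dependency-free; the column subset, IDs, and titles are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified table: only the columns needed to show the NULL convention.
conn.execute(
    "CREATE TABLE task (id TEXT PRIMARY KEY, company_id TEXT, status TEXT, title TEXT)"
)
conn.executemany(
    "INSERT INTO task VALUES (?, ?, ?, ?)",
    [
        ("t1", None, "market", "Fine-tune a model"),
        ("t2", "c1", "active", "Build an eval harness"),
        ("t3", None, "market", "Curate a dataset"),
    ],
)

# Market tasks: no owner yet.
market = [row[0] for row in conn.execute(
    "SELECT id FROM task WHERE company_id IS NULL ORDER BY id")]
# Accepted tasks: owned by a specific company.
owned = [row[0] for row in conn.execute(
    "SELECT id FROM task WHERE company_id = ? ORDER BY id", ("c1",))]

print(market)  # ['t1', 't3']
print(owned)   # ['t2']
```

With the SQLAlchemy ORM, the `IS NULL` filter would typically be written as `Task.company_id.is_(None)`.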
### TaskRequirement (`models/task.py`)
| Column | Type | Notes |
|--------|------|-------|
| `task_id` | UUID (FK) | References Task |
| `domain` | String | Which domain this requirement covers |
| `required_qty` | Float | Total work units needed |
| `completed_qty` | Float | Work units completed so far |
**Design choice**: Multi-domain requirements make tasks a multi-dimensional optimization problem. A task might need work in 2-4 domains simultaneously.
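
The prestige gate (`required_prestige` must be met in all task domains) and per-domain progress can be sketched as follows. The function names are hypothetical, and treating a missing prestige entry as the minimum level 1.0 is an assumption.

```python
MIN_PRESTIGE = 1.0  # lower bound of the documented [1.0, 10.0] range


def meets_prestige(company_prestige, task_domains, required_prestige):
    """True only if prestige in every task domain reaches the requirement."""
    return all(
        company_prestige.get(domain, MIN_PRESTIGE) >= required_prestige
        for domain in task_domains
    )


def remaining_work(requirements):
    """Map each domain to the work units still outstanding (never negative)."""
    return {
        domain: max(required - completed, 0.0)
        for domain, (required, completed) in requirements.items()
    }


prestige = {"research": 3.5, "training": 2.0}
print(meets_prestige(prestige, ["research", "training"], 2.0))  # True
print(meets_prestige(prestige, ["research", "training"], 3.0))  # False
# requirements: domain -> (required_qty, completed_qty)
print(remaining_work({"research": (100.0, 40.0), "training": (50.0, 50.0)}))
# {'research': 60.0, 'training': 0.0}
```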
### TaskAssignment (`models/task.py`)
| Column | Type | Notes |
|--------|------|-------|
| `task_id` | UUID (FK) | References Task |
| `employee_id` | UUID (FK) | References Employee |
| `assigned_at` | DateTime | When assigned |
**Design choice**: Many-to-many junction table. An employee can work on multiple tasks (throughput splits), and a task can have multiple employees (parallel progress).
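
A rough sketch of how throughput could split across assignments. The source does not state the splitting rule, so the equal split below is an assumption, as are the function name and the sample data.

```python
from collections import Counter, defaultdict


def combined_rates(assignments, rate_per_hour):
    """Sum each task's hourly rate, dividing an employee's rate evenly
    across all tasks that employee is assigned to (assumed rule)."""
    load = Counter(employee for _, employee in assignments)
    totals = defaultdict(float)
    for task, employee in assignments:
        totals[task] += rate_per_hour[employee] / load[employee]
    return dict(totals)


# (task_id, employee_id) pairs, mirroring rows of the junction table.
assignments = [("t1", "alice"), ("t1", "bob"), ("t2", "bob")]
rates = {"alice": 4.0, "bob": 6.0}  # work units per hour in one domain

print(combined_rates(assignments, rates))  # {'t1': 7.0, 't2': 3.0}
```

Here Bob's 6.0 units/hour are halved between two tasks, while Alice contributes her full rate to `t1`.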
### SimEvent (`models/event.py`)
| Column | Type | Notes |
|--------|------|-------|
| `id` | UUID (PK) | Deterministic (uuid5) |
| `company_id` | UUID (FK) | References Company |
| `event_type` | String | task_completed / bankruptcy / task_half / horizon_end |
| `scheduled_at` | DateTime | When event triggers |
| `payload` | JSON | Event-specific data |
| `dedupe_key` | String | Prevents duplicate events |
| `consumed` | Boolean | True after processing |
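
Deterministic `uuid5` IDs pair naturally with a dedupe key: the same logical event always hashes to the same ID. The namespace and the key format below are assumptions made for illustration; only the use of `uuid5` comes from the table above.

```python
import uuid

# Hypothetical namespace; any fixed UUID works as long as it never changes.
EVENT_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "yc-bench.example")


def event_id(company_id: str, event_type: str, dedupe_key: str) -> uuid.UUID:
    """Derive a stable ID from the fields that identify a logical event."""
    return uuid.uuid5(EVENT_NAMESPACE, f"{company_id}:{event_type}:{dedupe_key}")


first = event_id("c1", "task_completed", "task:t1")
again = event_id("c1", "task_completed", "task:t1")
other = event_id("c1", "task_half", "task:t1")

print(first == again)  # True: scheduling the same event twice yields one ID
print(first == other)  # False: a different event type yields a different ID
```

Because the ID is a pure function of its inputs, replaying a run with the same inputs produces identical event IDs.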
### LedgerEntry (`models/ledger.py`)
| Column | Type | Notes |
|--------|------|-------|
| `id` | UUID (PK) | Auto-generated |
| `company_id` | UUID (FK) | References Company |
| `occurred_at` | DateTime | Transaction timestamp |
| `category` | Enum | MONTHLY_PAYROLL / TASK_REWARD / TASK_FAIL_PENALTY / TASK_CANCEL_PENALTY |
| `amount_cents` | BigInteger | Signed amount (negative = cost) |
| `ref_type` | String (nullable) | Reference entity type |
| `ref_id` | UUID (nullable) | Reference entity ID |
**Design choice**: Immutable append-only ledger provides a complete financial audit trail. No entries are ever deleted or modified.
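
The append-only idea can be sketched with a frozen dataclass and a list that is only ever appended to. The amounts are made up, and whether `funds_cents` is derived from the ledger or maintained alongside it is not stated in the source.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class LedgerEntry:
    category: str
    amount_cents: int  # signed: negative = cost


ledger = []


def record(category: str, amount_cents: int) -> LedgerEntry:
    """Append a new entry; existing entries are never edited or removed."""
    entry = LedgerEntry(category, amount_cents)
    ledger.append(entry)
    return entry


def totals_by_category(entries):
    """Aggregate signed amounts per category for a simple financial report."""
    totals = defaultdict(int)
    for entry in entries:
        totals[entry.category] += entry.amount_cents
    return dict(totals)


record("TASK_REWARD", 500_000)
record("MONTHLY_PAYROLL", -320_000)
record("TASK_FAIL_PENALTY", -50_000)

net_change = sum(entry.amount_cents for entry in ledger)
print(net_change)  # 130000
print(totals_by_category(ledger))
```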
### SimState (`models/sim_state.py`)
| Column | Type | Notes |
|--------|------|-------|
| `company_id` | UUID (FK, PK) | References Company |
| `sim_time` | DateTime | Current simulation clock |
| `run_seed` | Integer | RNG seed for reproducibility |
| `horizon_end` | DateTime | When simulation ends |
| `replenish_counter` | Integer | Tracks market task replenishment |
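
Storing `run_seed` makes random draws repeatable. The sketch below shows the general pattern with Python's `random.Random`; the function `draw_rewards` and the reward range are hypothetical and do not describe how yc-bench actually generates tasks.

```python
import random


def draw_rewards(run_seed: int, n: int = 3):
    """Draw n reward amounts (in cents) from a generator seeded by run_seed."""
    rng = random.Random(run_seed)  # isolated generator, not the global one
    return [rng.randint(100_000, 5_000_000) for _ in range(n)]


# The same seed reproduces the same sequence.
print(draw_rewards(42) == draw_rewards(42))  # True
```

Using a dedicated `random.Random` instance keeps the simulation's draws independent of any other code that touches the global random state.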
### Scratchpad (`models/scratchpad.py`)
| Column | Type | Notes |
|--------|------|-------|
| `company_id` | UUID (FK) | References Company |
| `content` | Text | Free-form agent notes |
**Design choice**: Scratchpad survives LLM context truncation, giving the agent persistent memory across the full simulation.
## Session Management (`session.py`)
```python
# session_scope(factory): context manager that yields a transactional session
```
- Creates a scoped session with automatic commit/rollback
- Supports both SQLite (default) and PostgreSQL (via `DATABASE_URL`)
- `init_db()` creates all tables from ORM metadata
**Design choice**: The context-manager pattern ensures every database operation runs inside a transaction, so a failure rolls back the whole unit of work instead of leaving partial state updates that would corrupt the simulation.
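
The commit/rollback behavior described above follows a common pattern, sketched here under assumptions: the body of `session_scope` is a guess at the shape rather than the project's source, and `RecordingSession` is a stand-in used only to make the control flow visible without a database.

```python
from contextlib import contextmanager


@contextmanager
def session_scope(factory):
    """Yield a session from `factory`; commit on success, roll back on error."""
    session = factory()
    try:
        yield session
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        session.close()


class RecordingSession:
    """Stand-in for a real session that records which methods were called."""

    def __init__(self):
        self.calls = []

    def commit(self):
        self.calls.append("commit")

    def rollback(self):
        self.calls.append("rollback")

    def close(self):
        self.calls.append("close")


ok = RecordingSession()
with session_scope(lambda: ok):
    pass  # work succeeds
print(ok.calls)  # ['commit', 'close']

failed = RecordingSession()
try:
    with session_scope(lambda: failed):
        raise ValueError("simulated failure mid-transaction")
except ValueError:
    pass
print(failed.calls)  # ['rollback', 'close']
```

With SQLAlchemy, `factory` would typically be a `sessionmaker` bound to the engine, so each `with` block gets its own session and transaction.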