mirror of https://github.com/GoodStartLabs/AI_Diplomacy.git synced 2026-04-19 12:58:09 +00:00

Tyler Marques 287d845d4c Merge pull request #61 from peregrinates/ordersdata patched for compatability with new data logging structure		2025-09-05 14:24:48 -07:00
__init__.py	some maintenance, documentation, and validation improvements	2025-08-30 20:18:15 -04:00
analysis_helpers.py	some maintenance, documentation, and validation improvements	2025-08-30 20:18:15 -04:00
make_all_analysis_data.py	some maintenance, documentation, and validation improvements	2025-08-30 20:18:15 -04:00
p1_make_longform_orders_data.py	some maintenance, documentation, and validation improvements	2025-08-30 20:18:15 -04:00
p2_make_convo_data.py	patched for compatability with new data logging structure	2025-08-15 12:02:13 -04:00
p3_make_phase_data.py	some maintenance, documentation, and validation improvements	2025-08-30 20:18:15 -04:00
readme.md	fixed typo	2025-08-30 20:19:06 -04:00
requirements-analysis.txt	some maintenance, documentation, and validation improvements	2025-08-30 20:18:15 -04:00
schemas.py	some maintenance, documentation, and validation improvements	2025-08-30 20:18:15 -04:00
statistical_game_analysis.py	proper fix for game score	2025-07-14 19:55:02 +10:00
validation.py	some maintenance, documentation, and validation improvements	2025-08-30 20:18:15 -04:00

readme.md

Analysis Pipeline

This folder contains the data processing pipeline for converting raw diplomacy game logs into structured analysis datasets.

Overview

This module contains pipelines that transform raw game log data (stored as json/csv files) into four analytical datasets:

Orders Data - one row per order given by each power in each phase
Conversations Data - one row per conversation between two powers in each phase
Phase Data - one row per power per phase with aggregated state and action summaries
Game Data - Summary of overall game features

Main entry point

`make_all_analysis_data.py` - Primary orchestrator

main use case: process all games in a data folder, create corresponding orders, conversations, and phase datasets. Supports batch and individual processing.

# process all games in a folder
python analysis/make_all_analysis_data.py \
  --game_data_folder "/path/to/Game Data" \
  --output_folder "/path/to/Game Data - Analysis"

# process specific games
python analysis/make_all_analysis_data.py \
  --selected_game game1 game2 \
  --game_data_folder "/path/to/Game Data" \
  --output_folder "/path/to/Game Data - Analysis"

This script runs the p1, p2, and p3 analysis scripts in sequence and saves their outputs to organized subfolders.
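As an illustration of the "run p1, p2, p3 in sequence" idea, here is a minimal sketch that builds one command per script and runs them in order. It only uses the script names and the `--game_data_folder` / `--analysis_folder` flags shown in this readme; the helper names `build_commands` and `run_pipeline` and the `script_dir` parameter are hypothetical, and `make_all_analysis_data.py` itself may well be implemented differently (for example by importing the scripts rather than launching subprocesses).

```python
import subprocess
import sys
from pathlib import Path

# Script names as listed in this folder, in the order they are meant to run.
PIPELINE_SCRIPTS = [
    "p1_make_longform_orders_data.py",
    "p2_make_convo_data.py",
    "p3_make_phase_data.py",
]


def build_commands(game_data_folder, analysis_folder, script_dir="analysis"):
    """Return one argv list per pipeline script, in run order (hypothetical helper)."""
    return [
        [
            sys.executable,
            str(Path(script_dir) / name),
            "--game_data_folder", str(game_data_folder),
            "--analysis_folder", str(analysis_folder),
        ]
        for name in PIPELINE_SCRIPTS
    ]


def run_pipeline(game_data_folder, analysis_folder, script_dir="analysis"):
    """Run each script in sequence and stop at the first failure."""
    for cmd in build_commands(game_data_folder, analysis_folder, script_dir):
        subprocess.run(cmd, check=True)
```

Using `check=True` means a failure in p1 prevents p2 and p3 from running on incomplete inputs.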

Individual analysis scripts

`p1_make_longform_orders_data.py`

what it does: creates detailed order-level data with one row per order given

key outputs:

order classification (move, support, hold, etc.)
unit locations and destinations
support relationships and outcomes
relationship matrices between powers
llm reasoning for order generation

python analysis/p1_make_longform_orders_data.py \
  --game_data_folder "/path/to/Game Data" \
  --analysis_folder "/path/to/output"
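To make "order classification" concrete, here is a rough sketch that labels an order string by its action token. It assumes orders are written in the common Diplomacy notation such as `A PAR - BUR` or `F ENG C A LON - BRE`; the function name `classify_order` and the label names are hypothetical, and the script's real parsing (based on the regexes in `schemas.py`) may differ.

```python
# Assumed mapping from the third token of an order to a label.
# Example: "A PAR - BUR" -> unit type "A", location "PAR", action "-".
ACTION_LABELS = {
    "-": "move",
    "H": "hold",
    "S": "support",
    "C": "convoy",
    "R": "retreat",
    "B": "build",
    "D": "disband",
}


def classify_order(order: str) -> str:
    """Return a coarse label for an order string, or "unknown" if it does not fit."""
    tokens = order.split()
    if len(tokens) < 3:
        return "unknown"
    return ACTION_LABELS.get(tokens[2], "unknown")


for example in ["A PAR - BUR", "A PAR H", "A PAR S A MAR - BUR", "F ENG C A LON - BRE"]:
    print(example, "->", classify_order(example))
```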

`p2_make_convo_data.py`

what it does: extracts conversation data between all pairs of powers

key outputs:

message counts and streaks per party
conversation transcripts
relationship context for each conversation

python analysis/p2_make_convo_data.py \
  --game_data_folder "/path/to/Game Data" \
  --analysis_folder "/path/to/output"
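One way to read "message counts and streaks per party" is sketched below: count the messages each power sent in a conversation, and find each power's longest run of consecutive messages. Interpreting a "streak" as consecutive messages from the same sender is an assumption, and `message_stats` is a hypothetical helper rather than a function from `p2_make_convo_data.py`.

```python
from collections import Counter
from itertools import groupby


def message_stats(senders):
    """Given senders in chronological order, return (counts, longest_streaks)."""
    counts = Counter(senders)
    longest = {}
    # groupby yields one group per run of identical consecutive senders.
    for sender, run in groupby(senders):
        run_length = sum(1 for _ in run)
        longest[sender] = max(longest.get(sender, 0), run_length)
    return dict(counts), longest


counts, streaks = message_stats(["FRANCE", "FRANCE", "ENGLAND", "FRANCE"])
print(counts)
print(streaks)
```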

`p3_make_phase_data.py`

what it does: creates power-phase level summaries combining state, actions, and conversations

key outputs:

current state (units, centers, influence counts)
action summaries (command counts, outcomes)
conversation transcripts with each power
change metrics between phases
llm reasoning and diary entries

python analysis/p3_make_phase_data.py \
  --game_data_folder "/path/to/Game Data" \
  --analysis_folder "/path/to/output"
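As a small illustration of a "change metric between phases", the sketch below computes the phase-over-phase change in a power's supply center count. The input shape (a chronological list of `(phase_name, n_centers)` pairs) and the name `center_changes` are assumptions for illustration, not the columns or functions used by `p3_make_phase_data.py`.

```python
def center_changes(center_counts):
    """Return (phase_name, change) pairs; the first phase has no previous value."""
    changes = []
    previous = None
    for phase, n_centers in center_counts:
        delta = None if previous is None else n_centers - previous
        changes.append((phase, delta))
        previous = n_centers
    return changes


history = [("S1901M", 3), ("F1901M", 3), ("S1902M", 5)]
print(center_changes(history))
```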

`statistical_game_analysis.py`

what it does: comprehensive statistical analysis of game results and llm performance

key outputs:

game-level aggregated metrics and features
response success/failure rates by type
relationship dynamics and negotiation patterns
phase-level analysis with response-type granularity
comprehensive failure tracking and validation

# analyze single game folder
python analysis/statistical_game_analysis.py /path/to/game_folder

# batch analyze multiple games
python analysis/statistical_game_analysis.py /path/to/parent_folder --multiple

# specify output directory
python analysis/statistical_game_analysis.py /path/to/game_folder --output /path/to/output

note: this is a separate analysis tool that operates independently of the main pipeline

Supporting modules

`analysis_helpers.py`

utility functions for:

loading game data from folders or zip files
mapping countries to their llm models
standardizing data loading across scripts
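The "folders or zip files" loading idea can be sketched as a single reader that accepts either kind of source. This is an illustrative stand-in, not the code in `analysis_helpers.py`; the function name `read_game_file` and its signature are hypothetical.

```python
import zipfile
from pathlib import Path


def read_game_file(game_source, filename):
    """Return the text of `filename` from a game folder or from a .zip archive."""
    source = Path(game_source)
    if source.is_dir():
        return (source / filename).read_text(encoding="utf-8")
    if zipfile.is_zipfile(source):
        with zipfile.ZipFile(source) as archive:
            # Match on the base name so archives with a top-level folder also work.
            for member in archive.namelist():
                if Path(member).name == filename:
                    return archive.read(member).decode("utf-8")
        raise FileNotFoundError(f"{filename} not found in {source}")
    raise ValueError(f"{source} is neither a folder nor a zip file")
```

Hiding the folder/zip distinction behind one function lets each analysis script request a file by name without caring how the game was stored.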

`schemas.py`

constants and regex patterns:

supply center lists and coastal variants
country names
order parsing regexes
phase naming patterns
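For a sense of what a phase naming pattern can look like, the sketch below parses short phase names of the form season letter, four-digit year, phase-type letter (for example `S1901M`). That format is an assumption based on the usual Diplomacy engine convention; `PHASE_RE` and `parse_phase` are hypothetical names, and the actual patterns in `schemas.py` may differ.

```python
import re

# Assumed format: S/F/W (spring, fall, winter), a year, then M/R/A
# (movement, retreats, adjustments).
PHASE_RE = re.compile(r"^(?P<season>[SFW])(?P<year>\d{4})(?P<type>[MRA])$")


def parse_phase(name):
    """Return the parts of a phase name as a dict, or None if it does not match."""
    match = PHASE_RE.match(name)
    if match is None:
        return None
    return {
        "season": match.group("season"),
        "year": int(match.group("year")),
        "type": match.group("type"),
    }


print(parse_phase("S1901M"))
print(parse_phase("COMPLETED"))
```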

Expected input data structure

each game folder should contain:

overview.jsonl - maps countries to llm models
lmvsgame.json - full turn-by-turn game state and actions
llm_responses.csv - all llm prompts and responses
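Before running the pipeline, it can help to confirm that a game folder has the three files listed above. A minimal check might look like this; the file names come from this readme, while `REQUIRED_FILES` and `missing_input_files` are hypothetical names, not part of the module.

```python
from pathlib import Path

# File names as listed in this readme.
REQUIRED_FILES = ("overview.jsonl", "lmvsgame.json", "llm_responses.csv")


def missing_input_files(game_folder):
    """Return the required file names that are absent from `game_folder`."""
    folder = Path(game_folder)
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]
```

An empty list means the folder has everything the pipeline expects; otherwise the list names what to look for.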

Output structure

the pipeline creates organized subfolders:

output_folder/
├── orders_data/
│   └── {game_name}orders_data.csv
├── conversations_data/
│   └── {game_name}conversations_data.csv
└── phase_data/
    └── {game_name}phase_data.csv
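Given that layout, a short sketch for discovering which csv files a run produced is shown below. The subfolder names come from the tree above; `OUTPUT_SUBFOLDERS` and `list_outputs` are hypothetical names, and the sketch globs for `*.csv` rather than assuming an exact file name pattern.

```python
from pathlib import Path

# Subfolder names as shown in the output tree.
OUTPUT_SUBFOLDERS = ("orders_data", "conversations_data", "phase_data")


def list_outputs(output_folder):
    """Map each output subfolder to the sorted csv file names found in it."""
    root = Path(output_folder)
    return {
        subfolder: sorted(path.name for path in (root / subfolder).glob("*.csv"))
        for subfolder in OUTPUT_SUBFOLDERS
    }
```

A subfolder that does not exist yet simply maps to an empty list.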

Use cases

game analysis: examine specific games in detail
model comparison: compare llm performance across games
relationship analysis: study diplomatic dynamics
order validation: check llm order generation success rates
conversation analysis: study negotiation patterns
phase progression: track game state evolution
