Mirror of https://github.com/GoodStartLabs/AI_Diplomacy.git, synced 2026-04-19 12:58:09 +00:00
Adding run with uv instructions
Signed-off-by: Tyler Marques <me@tylermarques.com>
parent 46525efe6b
commit 0bd909b30b
1 changed file with 47 additions and 12 deletions
README.md | 59
@@ -1,4 +1,5 @@
# AI Diplomacy: LLM-Powered Strategic Gameplay
+
Created by Alex Duffy @Alx-Ai & Tyler Marques @Tylermarques

## Overview
@@ -8,31 +9,37 @@ This repository extends the original [Diplomacy](https://github.com/diplomacy/di
## Key Features

### 🤖 Stateful AI Agents
+
Each power is represented by a `DiplomacyAgent` with:
+
- **Dynamic Goals**: Strategic objectives that evolve based on game events
- **Relationship Tracking**: Maintains relationships (Enemy/Unfriendly/Neutral/Friendly/Ally) with other powers
- **Memory System**: Dual-layer memory with structured diary entries and consolidation
- **Personality**: Power-specific system prompts shape each agent's diplomatic style

### 💬 Rich Negotiations
+
- Multi-round message exchanges (private and global)
- Relationship-aware communication strategies
- Message history tracking and analysis
- Detection of ignored messages and non-responsive powers

### 🎯 Strategic Order Generation
+
- BFS pathfinding for movement analysis
- Context-aware order selection with nearest threats/opportunities
- Fallback logic for robustness
- Support for multiple LLM providers (OpenAI, Claude, Gemini, DeepSeek, OpenRouter)

### 📊 Advanced Game Analysis
+
- Custom phase summaries with success/failure categorization
- Betrayal detection through order/negotiation comparison
- Strategic planning phases for high-level directives
- Comprehensive logging of all LLM interactions

### 🧠 Memory Management
+
- **Private Diary**: Structured, phase-prefixed entries for LLM context
- Negotiation summaries with relationship updates
- Order reasoning and strategic justifications
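The five relationship levels listed under Stateful AI Agents (Enemy/Unfriendly/Neutral/Friendly/Ally) can be pictured as an ordered scale. The sketch below is only an illustration: the `Relationship` enum, its numeric values, and the `shift` helper are hypothetical names and are not taken from the `DiplomacyAgent` code.

```python
from enum import IntEnum


class Relationship(IntEnum):
    """Hypothetical ordering of the five labels named in the README."""
    ENEMY = -2
    UNFRIENDLY = -1
    NEUTRAL = 0
    FRIENDLY = 1
    ALLY = 2


def shift(current: Relationship, delta: int) -> Relationship:
    """Move a relationship by `delta` steps, clamped to the Enemy..Ally range."""
    value = max(Relationship.ENEMY, min(Relationship.ALLY, current + delta))
    return Relationship(value)
```

Clamping keeps repeated positive or negative events from pushing a relationship outside the five named levels.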
@@ -219,6 +226,7 @@ graph TB
#### Prompt Templates

The `ai_diplomacy/prompts/` directory contains customizable templates:
+
- Power-specific system prompts (e.g., `france_system_prompt.txt`)
- Task-specific instructions (`order_instructions.txt`, `conversation_instructions.txt`)
- Diary generation prompts for different game events
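As a rough illustration of how a power-specific template might be located, the sketch below builds a path from a power name. It assumes every power follows the `<power>_system_prompt.txt` pattern shown for France; the helper names `system_prompt_path` and `load_system_prompt` are hypothetical and not part of the repository.

```python
from pathlib import Path

# Directory named in the README; pass a different one if your layout differs.
PROMPTS_DIR = Path("ai_diplomacy/prompts")


def system_prompt_path(power: str, prompts_dir: Path = PROMPTS_DIR) -> Path:
    """Build `<power>_system_prompt.txt` (assumed pattern, from the france example)."""
    return prompts_dir / f"{power.strip().lower()}_system_prompt.txt"


def load_system_prompt(power: str, prompts_dir: Path = PROMPTS_DIR) -> str:
    """Read the template text; raises FileNotFoundError if the file is absent."""
    return system_prompt_path(power, prompts_dir).read_text(encoding="utf-8")
```

For example, `system_prompt_path("FRANCE")` points at `ai_diplomacy/prompts/france_system_prompt.txt`.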
@@ -321,10 +329,10 @@ python3 experiment_runner.py \

*(All other command-line flags belong to `lm_game.py` and are forwarded unchanged.)*

-
### Environment Setup

Create a `.env` file with your API keys:
+
```
OPENAI_API_KEY=your_key_here
ANTHROPIC_API_KEY=your_key_here
@@ -336,6 +344,7 @@ OPENROUTER_API_KEY=your_key_here
### Model Configuration

Models can be assigned to powers in `ai_diplomacy/utils.py`:
+
```python
def assign_models_to_powers() -> Dict[str, str]:
    return {
@@ -350,6 +359,7 @@ def assign_models_to_powers() -> Dict[str, str]:
```

Supported models include:
+
- OpenAI: `gpt-4o`, `gpt-4.1`, `o3`, `o4-mini`
- Anthropic: `claude-3-5-sonnet-20241022`, `claude-opus-4-20250514`
- Google: `gemini-2.0-flash`, `gemini-2.5-pro-preview`
@@ -358,6 +368,7 @@ Supported models include:
### Game Output and Analysis

Games are saved to the `results/` directory with timestamps. Each game folder contains:
+
- `lmvsgame.json` - Complete game data including phase summaries and agent relationships
- `overview.jsonl` - Error statistics and model assignments
- `game_manifesto.txt` - Strategic directives from planning phases
@@ -365,6 +376,7 @@ Games are saved to the `results/` directory with timestamps. Each game folder co
- `llm_responses.csv` - Complete log of all LLM interactions

The game JSON includes special fields for AI analysis:
+
- `phase_summaries` - Categorized move results for each phase
- `agent_relationships` - Diplomatic standings at each phase
- `final_agent_states` - End-game goals and relationships
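A quick way to check whether a saved game carries these analysis fields is sketched below. It assumes the three fields are top-level keys of `lmvsgame.json`, which the README does not state explicitly; `load_game` and `missing_analysis_fields` are hypothetical helper names.

```python
import json
from pathlib import Path
from typing import List

# Field names as listed in the README.
ANALYSIS_FIELDS = ("phase_summaries", "agent_relationships", "final_agent_states")


def load_game(game_dir: str) -> dict:
    """Load `lmvsgame.json` from a game folder under `results/`."""
    return json.loads((Path(game_dir) / "lmvsgame.json").read_text(encoding="utf-8"))


def missing_analysis_fields(game: dict) -> List[str]:
    """Return the analysis fields absent from the top level (assumed location)."""
    return [name for name in ANALYSIS_FIELDS if name not in game]
```

An empty list from `missing_analysis_fields(load_game(...))` would suggest the game file is ready for the analysis tools, under the top-level assumption above.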
@@ -373,29 +385,32 @@ The game JSON includes special fields for AI analysis:

For detailed analysis of LLM interactions and order success rates, a two-step pipeline is used:

-1. **Convert CSV to RL JSON**:
+1. **Convert CSV to RL JSON**:
The `csv_to_rl_json.py` script processes `llm_responses.csv` files, typically found in game-specific subdirectories ending with "FULL_GAME" (e.g., `results/20250524_..._FULL_GAME/`). It converts this raw interaction data into a JSON format suitable for Reinforcement Learning (RL) analysis.

To process all relevant CSVs in batch:
+
```bash
python csv_to_rl_json.py --scan_dir results/
```
+
This command scans the `results/` directory for "FULL_GAME" subfolders, converts their `llm_responses.csv` files, and outputs all generated `*_rl.json` files into the `results/json/` directory.

-2. **Analyze RL JSON Files**:
+2. **Analyze RL JSON Files**:
The `analyze_rl_json.py` script then analyzes the JSON files generated in the previous step. It aggregates statistics on successful and failed convoy and support orders, categorized by model.

To run the analysis:
+
```bash
python analyze_rl_json.py results/json/
```
+
This command processes all `*_rl.json` files in the `results/json/` directory and generates two reports in the project's root directory:
- `analysis_summary.txt`: A clean summary of order statistics.
- `analysis_summary_debug.txt`: A detailed report including unique 'success' field values and other debug information.

This pipeline allows for a comprehensive understanding of LLM performance in generating valid and successful game orders.
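To preview which folders the batch conversion step of this pipeline would be expected to pick up, a directory scan along the following lines can help. This is a sketch under assumptions: it only looks at direct children of the scan directory and is not the logic of `csv_to_rl_json.py`; `find_full_game_csvs` is a hypothetical helper.

```python
from pathlib import Path
from typing import List


def find_full_game_csvs(scan_dir: str) -> List[Path]:
    """List `llm_responses.csv` files in subfolders whose names end with FULL_GAME."""
    root = Path(scan_dir)
    return sorted(
        folder / "llm_responses.csv"
        for folder in root.iterdir()
        if folder.is_dir()
        and folder.name.endswith("FULL_GAME")
        and (folder / "llm_responses.csv").is_file()
    )
```

Calling `find_full_game_csvs("results/")` before running the conversion gives a rough count of the games that should appear in `results/json/` afterwards.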

-
### Post-Game Analysis Tools

#### Strategic Moment Analysis
@@ -414,6 +429,7 @@ python analyze_game_moments.py results/game_folder --model claude-3-5-sonnet-202
```

The analysis identifies:
+
- **Betrayals**: When powers explicitly promise one action but take contradictory action
- **Collaborations**: Successfully coordinated actions between powers
- **Playing Both Sides**: Powers making conflicting promises to different parties
@@ -421,6 +437,7 @@ The analysis identifies:
- **Strategic Blunders**: Major mistakes that significantly weaken a position

Analysis outputs include:
+
- **Markdown Report** (`game_moments/[game]_report_[timestamp].md`)
  - AI-generated narrative of the entire game
  - Summary statistics (betrayals, collaborations, etc.)
@@ -433,6 +450,7 @@ Analysis outputs include:
- Raw lie detection data for further analysis

Example output snippet:
+
```markdown
## Power Models
- **TURKEY**: o3
@@ -452,11 +470,13 @@ Example output snippet:
#### Diplomatic Lie Detection

The analysis system can detect lies by comparing:
+
1. **Messages**: What powers promise to each other
2. **Private Diaries**: What powers privately plan (from negotiation_diary entries)
3. **Actual Orders**: What they actually do

Lies are classified as:
+
- **Intentional**: Diary shows planned deception (e.g., "mislead them", "while actually...")
- **Unintentional**: No evidence of planned deception (likely misunderstandings)

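The intentional/unintentional split can be illustrated with a deliberately simple keyword check on a diary entry. This is a stand-in for explanation only and is not how the analysis scripts classify lies; the marker phrases are the two examples quoted above, and `classify_lie` is a hypothetical function.

```python
from typing import Sequence

# The two example phrases from the README, used here as toy markers.
DECEPTION_MARKERS = ("mislead them", "while actually")


def classify_lie(diary_entry: str, markers: Sequence[str] = DECEPTION_MARKERS) -> str:
    """Label a broken promise by whether the diary text hints at planned deception."""
    text = diary_entry.lower()
    if any(marker in text for marker in markers):
        return "intentional"
    return "unintentional"
```

A real classifier would also need the message and the actual orders to establish that a promise was broken in the first place; this sketch covers only the diary comparison.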
@@ -475,6 +495,7 @@ npm run dev
```

Features:
+
- 3D map with unit movements and battles
- Phase-by-phase playback controls
- Chat window showing diplomatic messages
@@ -486,11 +507,13 @@ Features:
Analysis of hundreds of AI games reveals interesting patterns:

#### Model Performance Characteristics
+
- **Invalid Move Rates**: Some models (e.g., o3) generate more invalid moves but play aggressively
- **Deception Patterns**: Models vary dramatically in honesty (0-100% intentional lie rates)
- **Strategic Styles**: From defensive/honest to aggressive/deceptive playstyles

#### Common Strategic Patterns
+
- **Opening Gambits**: RT Juggernaut (Russia-Turkey), Western Triple, Lepanto
- **Mid-game Dynamics**: Stab timing, alliance shifts, convoy operations
- **Endgame Challenges**: Stalemate lines, forced draws, kingmaking
@@ -508,7 +531,6 @@ Analysis of hundreds of AI games reveals interesting patterns:

---

-
<p align="center">
<img width="500" src="docs/images/map_overview.png" alt="Diplomacy Map Overview">
</p>
@@ -522,6 +544,7 @@ The complete documentation is available at [diplomacy.readthedocs.io](https://di
### 1. Strategic Moment Analysis (`analyze_game_moments.py`)

Comprehensive analysis of game dynamics:
+
```bash
python analyze_game_moments.py results/game_folder [options]

@@ -536,6 +559,7 @@ Options:
### 2. Focused Lie Detection (`analyze_lies_focused.py`)

Detailed analysis of diplomatic deception:
+
```bash
python analyze_lies_focused.py results/game_folder [--output report.md]
```
@@ -543,6 +567,7 @@ python analyze_lies_focused.py results/game_folder [--output report.md]
### 3. Game Results Statistics (`analyze_game_results.py`)

Aggregates win/loss statistics across all completed games:
+
```bash
python analyze_game_results.py
# Creates model_power_statistics.csv
@@ -553,6 +578,7 @@ Analyzes all `*_FULL_GAME` folders to show how many times each model played as e
### 4. Game Visualization (`ai_animation/`)

Interactive 3D visualization of games:
+
```bash
cd ai_animation
npm install
@@ -564,14 +590,24 @@ npm run dev

### Installation

-The latest version of the package can be installed with:
+This project uses [uv](https://github.com/astral-sh/uv) for Python dependency management.

-```python3
-pip install diplomacy
+#### Setup Project Dependencies
+
+```bash
+# Clone the repository
+git clone https://github.com/your-repo/AI_Diplomacy.git
+cd AI_Diplomacy
+
+# Install dependencies and create virtual environment
+uv sync
+
+# Activate the virtual environment
+source .venv/bin/activate # On Unix/macOS
+# or
+.venv\Scripts\activate # On Windows
```

-The package is compatible with Python 3.5, 3.6, and 3.7.
-
### Running a game

The following script plays a game locally by submitting random valid orders until the game is completed.
@@ -640,7 +676,7 @@ npm start
python -m diplomacy.server.run
```

-The web interface will be accessible at http://localhost:3000.
+The web interface will be accessible at <http://localhost:3000>.

To login, users can use admin/password or username/password. Additional users can be created by logging in with a username that does not exist in the database.

@@ -652,7 +688,6 @@ It is possible to visualize a game by using the "Load a game from disk" menu on

-
## Network Game

It is possible to join a game remotely over a network using websockets. The script below plays a game over a network.