mirror of
https://github.com/GoodStartLabs/AI_Diplomacy.git
synced 2026-04-19 12:58:09 +00:00
iterating
parent 6e5079fa02
commit 65f287df84
4 changed files with 135 additions and 44 deletions
14
README.md
@@ -7,6 +7,7 @@ This repository is an extension of the original [Diplomacy](https://github.com/d
- **Conversation & Negotiation**: Powers can have multi-turn negotiations with each other via `lm_game.py`. They can exchange private or global messages, allowing for more interactive diplomacy.
|
||||
- **Order Generation**: Each power can choose its orders (moves, holds, supports, etc.) using LLMs via `lm_service_versus.py`. Currently supports OpenAI, Claude, Gemini, DeepSeek
|
||||
- **Phase Summaries**: Modifications in the `game.py` engine allow the generation of "phase summaries," providing a succinct recap of each turn's events. This could help both human spectators and the LLMs themselves to understand the game state more easily.
|
||||
- **Agent State Architecture**: Powers are represented by DiplomacyAgent instances that maintain goals, relationships, and a journal tracking thoughts and decisions. This stateful design allows for more consistent and strategic play.
|
||||
- **Prompt Templates**: Prompts used by the LLMs are stored in `/prompts/`. You can edit these to customize how models are instructed for both orders and conversations.
|
||||
- **Experimental & WIP**: Ongoing development includes adding strategic goals for each power, more flexible conversation lengths, and a readiness check to advance the phase if all powers are done negotiating.
|
||||
|
||||
@@ -17,21 +18,28 @@ This repository is an extension of the original [Diplomacy](https://github.com/d
- Manages conversation rounds (currently up to 3 by default) and calls `get_conversation_reply()` for each power.
|
||||
- After negotiations, each power's orders are gathered concurrently (via threads), using `get_orders()` from the respective LLM client.
|
||||
- Calls `game.process()` to move to the next phase, optionally collecting phase summaries along the way.
|
||||
- Updates agent state after each phase to maintain continuity and strategic direction.
|
||||
|
||||
2. **`lm_service_versus.py`**
|
||||
- Defines a base class (`BaseModelClient`) for hitting any LLM endpoint.
|
||||
- Subclasses (`OpenAIClient`, `ClaudeClient`, etc.) implement `generate_response()` and `get_conversation_reply()` with the specifics of each LLM's API.
|
||||
- Handles prompt construction for orders and conversation, JSON extraction to parse moves or messages, and fallback logic for invalid LLM responses.
|
||||
|
||||
3. **Modifications in `game.py` (Engine)**
|
||||
3. **`agent.py`**
|
||||
- Implements the DiplomacyAgent class that maintains state for each power.
|
||||
- Tracks goals, relationships with other powers, and a private journal of thoughts.
|
||||
- Provides robust JSON parsing for LLM responses with case-insensitive validation.
|
||||
- Updates goals and relationships based on game events to maintain coherent strategies.
|
||||
|
||||
4. **Modifications in `game.py` (Engine)**
|
||||
- Added a `_generate_phase_summary()` method and `phase_summaries` dict to store short textual recaps of each phase.
|
||||
- Summaries can be viewed or repurposed for real-time commentary or as additional context fed back into the LLM.
|
||||
|
||||
### Future Explorations
|
||||
|
||||
- **Longer Conversation Phases**: Support for more than 3 message rounds, or an adaptive approach that ends negotiation early if all powers signal "ready."
|
||||
- **Strategic Goals**: Let each power maintain high-level goals (e.g., "ally with France," "defend Munich") that the LLM takes into account for orders and conversations.
|
||||
- **Enhanced Summaries**: Summaries could incorporate conversation logs or trending alliances, giving the LLM even richer context each turn.
|
||||
- **Enhanced Agent Memory**: Further develop agent memory and learning from past interactions to influence future decisions.
|
||||
- **Strategic Map Analysis**: Leverage the map graph structure to help agents make better tactical decisions.
|
||||
- **Live Front-End Integration**: Display phase summaries, conversation logs, and highlights of completed orders in a real-time UI. (an attempt to display phase summaries currently in progress)
|
||||
|
||||
---
|
||||
@@ -29,6 +29,35 @@
- When animations complete, show phase summary (if available)
|
||||
- Advance to next phase and repeat
|
||||
|
||||
## Agent State Display
|
||||
The game now includes agent state data that can be visualized:
|
||||
|
||||
1. **Goals and Relationships**: Each power has strategic goals and relationships with other powers
|
||||
2. **Journal Entries**: Internal thoughts that help explain decision making
|
||||
|
||||
### JSON Format Expectations:
|
||||
- Agent state is stored in the game JSON with the following structure:
|
||||
```json
{
  "powers": {
    "FRANCE": {
      "goals": ["Secure Belgium", "Form alliance with Italy"],
      "relationships": {
        "GERMANY": "Enemy",
        "ITALY": "Ally",
        "ENGLAND": "Neutral",
        "AUSTRIA": "Neutral",
        "RUSSIA": "Unfriendly",
        "TURKEY": "Neutral"
      },
      "journal": ["Suspicious of England's fleet movements"]
    }
  }
}
```
|
||||
- Relationship status must be one of: "Enemy", "Unfriendly", "Neutral", "Friendly", "Ally"
|
||||
- The code handles case variations but the display should normalize to title case
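
The normalization rule above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the names `ALLOWED_STATUSES` and `normalize_status` are hypothetical, and falling back to "Neutral" for unrecognized values is an assumption.

```python
# Hypothetical helper: the allowed values come from the list above,
# the function name and the "Neutral" default are assumptions.
ALLOWED_STATUSES = ("Enemy", "Unfriendly", "Neutral", "Friendly", "Ally")


def normalize_status(raw, default="Neutral"):
    """Return the title-case status matching raw, or default if unrecognized."""
    if not isinstance(raw, str):
        return default
    candidate = raw.strip().title()  # "enemy" and "ENEMY" both become "Enemy"
    return candidate if candidate in ALLOWED_STATUSES else default
```

A display layer could call `normalize_status(value)` on every entry of `relationships` before rendering, so that "ally", "ALLY", and "Ally" all show the same label.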
|
||||
|
||||
## Known Issues
|
||||
- Text-to-speech requires an ElevenLabs API key in `.env` file
|
||||
- Unit animations sometimes don't fire properly after messages
|
||||
@@ -33,11 +33,9 @@ This document provides an analysis of key Python modules within the `ai_diplomac
* Unsuccessful moves by power with failure reasons
|
||||
* Optional sections for other move types
|
||||
|
||||
### PARTIALLY IMPLEMENTED MODULES:
|
||||
|
||||
#### 1.4. `agent.py` (PARTIAL)
|
||||
#### 1.4. `agent.py` (COMPLETE)
|
||||
**Goal:** To maintain stateful agent representation with personality, goals, and relationships.
|
||||
**Status:** Base class implemented but not fully integrated into planning/negotiation workflows.
|
||||
**Status:** Fully implemented and integrated with planning/negotiation workflows.
|
||||
|
||||
**Key Components:**
|
||||
* `DiplomacyAgent` class with:
|
||||
@@ -46,26 +44,40 @@ This document provides an analysis of key Python modules within the `ai_diplomac
* `goals`: List of strategic goals
|
||||
* `relationships`: Dict of relationships with other powers
|
||||
* `private_journal`: List of internal thoughts/reflections
|
||||
* `_extract_json_from_text`: Robust JSON extraction from LLM responses
|
||||
* `initialize_agent_state`: Sets initial goals and relationships
|
||||
* `analyze_phase_and_update_state`: Updates goals and relationships based on game events
|
||||
* Methods for plan generation, updating goals, and updating relationships
|
||||
|
||||
**Integration Points Needed:**
|
||||
* Connect agent state to context generation in `clients.py`
|
||||
* Define how personality affects planning and negotiations
|
||||
* Remove redundant order generation logic
|
||||
**Integration Points:**
|
||||
* Connected to context generation in `clients.py`
|
||||
* Influences planning and negotiations through goals and relationships
|
||||
* Case-insensitive validation of LLM-provided power names and relationship statuses
|
||||
* Robust error recovery with fallback defaults when LLM responses fail to parse
|
||||
|
||||
#### 1.5. `negotiations.py` (PARTIAL)
|
||||
#### 1.5. `negotiations.py` (COMPLETE)
|
||||
**Goal:** To orchestrate the communication phase among active AI powers.
|
||||
**Status:** Works but needs integration with DiplomacyAgent state.
|
||||
**Status:** Fully implemented and integrated with DiplomacyAgent state.
|
||||
|
||||
#### 1.6. `planning.py` (PARTIAL)
|
||||
#### 1.6. `planning.py` (COMPLETE)
|
||||
**Goal:** To allow each AI power to generate a high-level strategic directive or plan.
|
||||
**Status:** Works but needs integration with DiplomacyAgent state and map analysis.
|
||||
**Status:** Fully implemented and integrated with DiplomacyAgent state.
|
||||
|
||||
#### 1.7. `utils.py` (COMPLETE)
|
||||
**Goal:** To provide common utility functions used across other AI diplomacy modules.
|
||||
**Status:** Fully implemented.
|
||||
|
||||
#### 1.8. `clients.py` (COMPLETE BUT NEEDS EXTENSION)
|
||||
#### 1.8. `clients.py` (COMPLETE)
|
||||
**Goal:** To abstract and manage interactions with various LLM APIs.
|
||||
**Status:** Fully implemented with agent state integration.
|
||||
|
||||
### PARTIALLY IMPLEMENTED MODULES:
|
||||
|
||||
#### 1.9. `utils.py` (COMPLETE)
|
||||
**Goal:** To provide common utility functions used across other AI diplomacy modules.
|
||||
**Status:** Fully implemented.
|
||||
|
||||
#### 1.10. `clients.py` (COMPLETE BUT NEEDS EXTENSION)
|
||||
**Goal:** To abstract and manage interactions with various LLM APIs.
|
||||
**Status:** Works, but needs extension to incorporate agent state into context.
|
||||
|
||||
@@ -73,41 +85,44 @@ This document provides an analysis of key Python modules within the `ai_diplomac
|
||||
## 2. Integration Points
|
||||
|
||||
The following connections need to be established:
|
||||
The following connections have been established:
|
||||
|
||||
1. **Agent State → Context Building**
|
||||
* `BaseModelClient.build_context_prompt` needs to incorporate agent's personality, goals, and relationships
|
||||
* Modify `context_prompt.txt` to include sections for agent state
|
||||
* `BaseModelClient.build_context_prompt` incorporates agent's personality, goals, and relationships
|
||||
* Modified prompt templates include sections for agent state
|
||||
|
||||
2. **Map Analysis → Planning**
|
||||
* Use `DiplomacyGraph` and BFS search in `planning_phase` to identify strategic options
|
||||
* Incorporate territory accessibility analysis into strategic planning
|
||||
2. **Agent State → Negotiations**
|
||||
* Agent's personality, goals, and relationships influence message generation
|
||||
* Relationships are updated based on negotiation context and results
|
||||
|
||||
3. **Agent State → Negotiations**
|
||||
* Have agent's personality, goals, and relationships influence message generation
|
||||
* Update relationships based on negotiation context and results
|
||||
3. **Robust LLM Interaction**
|
||||
* Implemented multi-strategy JSON extraction to handle various LLM response formats
|
||||
* Added case-insensitive validation for power names and relationship statuses
|
||||
* Created fallback mechanisms for all LLM interactions
|
||||
|
||||
4. **Map Analysis → Order Generation**
|
||||
* Incorporate BFS search to help identify optimal movements and support actions
|
||||
* Analyze territory adjacency for attack planning
|
||||
4. **Error Recovery**
|
||||
* Added defensive programming throughout agent state updates
|
||||
* Implemented progressive fallback strategies for parsing LLM outputs
|
||||
* Used intelligent defaults to maintain consistent agent state
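
One way to picture the multi-strategy extraction and progressive fallback described above is the sketch below. It is an assumption-laden illustration rather than the code in `agent.py`: the function name `extract_json`, the order of strategies, and the empty-dict default are all hypothetical.

```python
import json
import re

# Build the Markdown fence marker at runtime so this example can itself
# live inside a fenced block.
FENCE = "`" * 3
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def extract_json(text, default=None):
    """Try progressively looser strategies; return default if all fail."""
    default = {} if default is None else default
    candidates = [text]  # strategy 1: the whole response is JSON

    fenced = FENCED_BLOCK.search(text)
    if fenced:  # strategy 2: JSON wrapped in a Markdown code fence
        candidates.append(fenced.group(1))

    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:  # strategy 3: outermost brace span
        candidates.append(text[start:end + 1])

    for candidate in candidates:
        try:
            parsed = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed
    return default  # final fallback keeps the caller's state well-formed
```

Returning a default instead of raising means a malformed reply from one power does not interrupt the phase for the others.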
|
||||
|
||||
---
|
||||
|
||||
## 3. Next Steps (Implementation Plan)
|
||||
## 3. Future Work
|
||||
|
||||
1. **Phase 1: Agent Integration**
|
||||
* Enhance `BaseModelClient.build_context_prompt` to include agent state
|
||||
* Update prompt templates to utilize agent information
|
||||
* Add a regularly updating 'relationships' section to the context prompt
|
||||
* Add a regularly updating 'goals' section to the context prompt
|
||||
* Add a regularly updating 'journal' section to the context prompt
|
||||
|
||||
2. **Phase 2: Map Analysis Integration**
|
||||
1. **Map Analysis Integration**
|
||||
* Create utility functions to leverage BFS search for common strategic questions
|
||||
* Integrate these into planning phase
|
||||
* Add territory analysis to order generation context
|
||||
|
||||
3. **Phase 3: Test!
|
||||
2. **Enhanced Agent Adaptation**
|
||||
* Develop more sophisticated goal updating strategies based on game events
|
||||
* Implement memory of betrayals/alliances across multiple phases
|
||||
* Create feedback loops between relationship states and planning priorities
|
||||
|
||||
3. **UI Integration**
|
||||
* Expose agent states (goals, relationships) in the game visualization
|
||||
* Show evolving relationships between powers graphically
|
||||
* Integrate agent journal entries as commentary
|
||||
|
||||
---
|
||||
|
||||
@@ -131,6 +146,7 @@ The following connections need to be established:
```
|
||||
|
||||
**Current Integration Status:**
|
||||
* `agent.py` and `map_utils.py` are implemented but not fully integrated with other modules
|
||||
* `phase_summary_callback` works in `lm_game.py` but is not integrated with agent state
|
||||
* Base message passing and planning works, but doesn't leverage agent personality or map analysis
|
||||
* `agent.py` is fully implemented and integrated with other modules
|
||||
* State updates work reliably between phases
|
||||
* Robust JSON parsing and case-insensitive validation ensure smooth operation
|
||||
* `map_utils.py` is implemented but not yet fully leveraged for strategic planning
|
||||
@@ -42,6 +42,7 @@
| 5 | `TypeError` in `add_journal_entry` (wrong args); `JSONDecodeError` (LLM added extra text/markdown fences) | Fix args; Robust JSON parse | Partial Success* | -$100 |
| 6 | `TypeError: wrong number of args` for state update call. | Helper fn; Sync loop; Fix | Failure | -$100 |
| 7 | `AttributeError: 'Game' has no attribute 'get_board_state_str'/'current_year'` and JSON key mismatch | Create board_state_str from board_state; Extract year from phase name; Fix JSON key mismatches | Partial Success** | -$100 |
| 8 | Case-sensitivity issues - power names in relationships not matching ALL_POWERS | Made relationship validation case-insensitive; Reduced log verbosity | Success | +$500 |
|
||||
|
||||
*Partial Success: Game ran 1 year, but failed during state update phase.
|
||||
**Partial Success: Game runs without crashing, but LLM responses still don't match expected JSON format.
|
||||
@@ -59,7 +60,44 @@
|
||||
**Observation:** Game now runs without crashing through basic state updates, but LLM responses don't use the expected JSON keys (they use "relationships"/"goals" while code expects "updated_relationships"/"updated_goals").
|
||||
|
||||
**Next Steps:** Fix the JSON key mismatch by either:
|
||||
1. Updating the state_update_prompt.txt to use "updated_goals" and "updated_relationships", or
|
||||
2. Modifying the agent.py code to look for "goals" and "relationships" keys and map them to the expected variables.
|
||||
## Experiment 8: Case-Insensitivity Fix
|
||||
|
||||
**Date:** 2025-04-08
|
||||
**Goal:** Fix case-sensitivity issues in relationship validation and key name mismatches.
|
||||
**Changes:**
|
||||
1. Added case-insensitive validation for power names (e.g., "Austria" → "AUSTRIA")
|
||||
2. Added case-insensitive validation for relationship statuses (e.g., "enemy" → "Enemy")
|
||||
3. Made the code look for alternative JSON key names ("goals"/"relationships" vs "updated_goals"/"updated_relationships")
|
||||
4. Reduced log noise by only showing first few validation warnings and a summary count for the rest
|
||||
5. Added fallback defaults in all error cases to ensure agent state is never empty
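
Changes 1, 3, and 5 above might look roughly like the sketch below. The helper names `first_present` and `read_state_update` are hypothetical, and the real logic in `agent.py` may differ. `ALL_POWERS` is written out here as the seven standard Diplomacy powers, and relationship statuses are passed through unchanged to keep the example short.

```python
# Assumed contents of ALL_POWERS: the seven standard Diplomacy powers.
ALL_POWERS = ("AUSTRIA", "ENGLAND", "FRANCE", "GERMANY", "ITALY", "RUSSIA", "TURKEY")


def first_present(data, keys, default):
    """Return the value of the first key found in data, else default."""
    for key in keys:
        if key in data:
            return data[key]
    return default


def read_state_update(response, current_goals, current_relationships):
    """Read goals and relationships from a parsed LLM response.

    Accepts either key spelling and keeps the current state as the
    fallback, so the agent is never left with empty goals or relationships.
    """
    goals = first_present(response, ("updated_goals", "goals"), current_goals)
    if not isinstance(goals, list):
        goals = current_goals

    raw = first_present(response, ("updated_relationships", "relationships"), {})
    relationships = dict(current_relationships)
    if isinstance(raw, dict):
        for power, status in raw.items():
            name = str(power).strip().upper()  # "Austria" becomes "AUSTRIA"
            if name in ALL_POWERS:  # unknown powers are ignored
                relationships[name] = status
    return goals, relationships
```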
|
||||
|
||||
**Observation:** Game now runs successfully through multiple phases. The agent state is properly updated and maintained between phases. Logs are cleaner and more informative.
|
||||
|
||||
**Result:** Success (+$500, successfully running through all phases)
|
||||
|
||||
---
|
||||
|
||||
## Key Learnings & Best Practices
|
||||
|
||||
1. **Strong Defensive Programming**
|
||||
- Always implement fallback values when parsing LLM outputs
|
||||
- Use robust JSON extraction with multiple strategies (regex patterns, string cleaning)
|
||||
- Never assume case-sensitivity in LLM outputs - normalize all strings
|
||||
|
||||
2. **Adaptable Input Parsing**
|
||||
- Accept multiple key names for the same concept ("goals" vs "updated_goals")
|
||||
- Adopt flexible parsing approaches that can handle structural variations
|
||||
- Have clear default behaviors defined when expected data is missing
|
||||
|
||||
3. **Effective Logging**
|
||||
- Use debug logs liberally during development phases
|
||||
- Keep production logs high-signal and low-noise by limiting repeat warnings
|
||||
- Include contextual information in logs (power name, phase name) for easier debugging
|
||||
|
||||
4. **Robust Error Recovery**
|
||||
- Implement progressive fallback strategies: try parsing → try alternate formats → use defaults
|
||||
- Maintain coherent state even when errors occur - never leave agent in partial/invalid state
|
||||
- When unexpected errors occur, recover gracefully rather than crashing
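
The logging guidance in item 3 (limit repeat warnings, include power and phase) could be implemented along these lines. This is only a sketch: the logger name, the function `log_validation_warnings`, and the cap of three messages are assumptions, not values taken from the repository.

```python
import logging

logger = logging.getLogger("agent_state")  # hypothetical logger name


def log_validation_warnings(power, phase, problems, max_shown=3):
    """Log the first few problems in full, then one summary line for the rest.

    Returns (shown, suppressed) so callers can inspect what was logged.
    """
    for problem in problems[:max_shown]:
        logger.warning("[%s %s] %s", power, phase, problem)
    suppressed = max(len(problems) - max_shown, 0)
    if suppressed:
        logger.warning(
            "[%s %s] %d more validation warnings suppressed",
            power, phase, suppressed,
        )
    return min(len(problems), max_shown), suppressed
```

Prefixing every line with the power and phase keeps a long multi-power log searchable, while the cap prevents one noisy response from burying the rest.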
|
||||
|
||||
These learnings have significantly improved the Agent architecture's reliability and are applicable to other LLM-integration contexts.
|
||||
|
||||