AlxAI
4ee7c2f68a
adding prompts benchmark
2025-08-04 16:47:25 -04:00
AlxAI
3b5f3015c1
Added country specific prompts and more async to speed up
2025-08-02 14:48:03 -04:00
AlxAI
9fc25f2fec
Add comprehensive Diplomacy analysis with visualizations
- Added diplomacy_unified_analysis_final.py: Complete analysis script with CSV-only approach
- Added DIPLOMACY_ANALYSIS_DOCUMENTATION.md: Comprehensive project documentation
- Added visualization_experiments_log.md: Detailed development history
- Added visualization_results/: AAAI-quality visualizations showing model evolution
- Fixed old format success calculation bug (results keyed by unit location)
- Demonstrated AI evolution from passive to active play across 61 models
- Updated .gitignore to exclude results_alpha
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-27 13:29:29 -04:00
sam-paech
b4a56126ec
state update fixes & streamline prompts
2025-07-12 10:17:17 +10:00
sam-paech
1f154a7073
fixes for state updates
2025-07-10 21:52:22 +10:00
sam-paech
70a876bcee
add relationship plots
2025-07-10 10:35:17 +10:00
sam-paech
e351aa3841
savegame fix + chart updates
2025-07-05 00:27:02 +10:00
sam-paech
7edc7c465f
fix prompt errors & add per-power prompt dir functionality
2025-07-04 11:31:57 +10:00
sam-paech
a241e34496
fix to respect model ids passed as args when resuming
2025-07-03 09:59:47 +10:00
sam-paech
1cb24f1884
set minimalist prompts as harness defaults
2025-07-03 07:27:48 +10:00
AlxAI
7b633a0ec8
making phase summaries and unformatted prompts params in lm_game as well as making the models changeable in utils
2025-06-30 12:03:37 +02:00
AlxAI
861a5a222f
fixed un-needed changes
2025-06-30 09:32:26 +02:00
sam-paech
b5a84867a1
add order history
2025-06-29 01:53:03 +10:00
sam-paech
ebf26cf8a6
add simplified prompts
2025-06-27 14:42:05 +10:00
sam-paech
840c6b0ad9
add experiment runner
2025-06-22 18:05:07 +10:00
sam-paech
7d50b31e34
add resuming support + critical state analysis mode
2025-06-22 14:41:29 +10:00
sam-paech
a4855caaae
add new args to main script: max_tokens & max_tokens_per_model
2025-06-20 05:58:47 +10:00
sam-paech
b405cf30c7
revamp diary consolidation; simplify negotiations prompt
2025-06-19 15:39:20 +10:00
AlxAI
9b24abef53
webhooks
2025-06-02 20:33:36 -04:00
AlxAI
4b92dd5af0
updating analysis with lie detection (it's not great yet)
2025-05-24 20:44:23 -04:00
AlxAI
742e260464
fixing eliminated powers
2025-05-21 21:27:49 -07:00
AlxAI
9322ada62b
analyze moments, run big models well
2025-05-20 20:04:19 -07:00
AlxAI
f36d5672ea
consolidation of diary!
2025-05-18 19:51:34 -04:00
AlxAI
db827de273
first diary
2025-05-18 17:23:47 -04:00
AlxAI
c50ac85758
analyze game moments attempt 1
2025-05-17 22:28:22 -04:00
AlxAI
bfcb9ce401
XML didn't work
2025-05-14 22:03:13 -04:00
AlxAI
b4bc48337e
fix relationships, good to go backend
2025-05-12 13:27:57 -04:00
AlxAI
3a935c0491
fixed diary
2025-05-12 10:37:34 -04:00
AlxAI
94313c16d9
fix relationships
2025-05-11 22:19:20 -04:00
AlxAI
0bd6428729
BIG UPDATES logging everything, better structure of moves, everything runs fast af
2025-05-11 19:10:18 -04:00
sam-paech
0c7b0157b5
add private diary summaries
2025-05-11 18:38:37 +10:00
AlxAI
53e6a8fd6a
saving logs
2025-05-05 10:46:34 -04:00
AlxAI
1dc25702b6
relationships work! Everything is ready for big runs
2025-05-04 11:21:51 -04:00
AlxAI
e6ba8bfbf1
summaries working not statistical summary though
2025-04-30 23:16:57 -04:00
AlxAI
02118dc98b
async!!
2025-04-29 22:28:53 -04:00
AlxAI
eeddac0ef9
it's working!
2025-04-29 20:51:46 -04:00
AlxAI
6e5079fa02
working with agent, relationships, and goals (seemingly)
2025-04-09 22:24:10 -07:00
AlxAI
70f4438b2e
state!
2025-04-07 17:25:12 -07:00
AlxAI
0242d7446b
Revert "Merge branch 'main' into animation"
This reverts commit d7f93f587a, reversing
changes made to d505c7ea6c.
2025-03-04 20:31:14 -08:00
AlxAI
d7f93f587a
Merge branch 'main' into animation
2025-03-04 20:26:35 -08:00
Oam Patel
1f8ac5ae20
add optional planning phase
2025-02-27 02:10:48 +00:00
AlxAI
cb04ad0be5
attempt at fixing recursive summarization
2025-02-25 06:55:21 -08:00
AlxAI
eb3de01956
dramatically improving logging thanks to new 3.7sonnet cursor agent mode
2025-02-24 15:49:37 -08:00
AlxAI
b54a8252d6
fix convoy first attempt at summaries
2025-02-23 18:18:47 -08:00
AlxAI
2693b01014
Lots of improvements to prompting putting the right information in for negotiation and phase summaries - CONVOYS BROKEN RN
2025-02-23 11:18:37 -08:00
AlxAI
6b0863cb5b
dramatically simplify phase summary
2025-02-20 22:24:24 -08:00
AlxAI
8f61ba06b3
fixed system prompt for summary, made improvements and debugging for summaries too. Much can be optimized still
2025-02-20 18:22:53 -08:00
AlxAI
72327cfb22
Randomization for powers and models + enhance order instructions
Also improved plotting to show model + power
2025-02-20 15:58:41 -08:00
AlxAI
b886fd7bfc
with phase summaries and country prompts
2025-02-19 21:26:54 -08:00
Oam Patel
ed10ae4b15
multithreaded negotiation + orders in context + fix GameHistory class
2025-02-18 23:10:23 +00:00