Mirror of https://github.com/NousResearch/atropos.git (synced 2026-04-19 12:57:58 +00:00)
Fix final line length violations in Pokemon Showdown environment
Parent: 0038a710d0
Commit: c2c4928882
8 changed files with 141 additions and 69 deletions
@@ -386,6 +386,60 @@ A comprehensive environment for training LLMs on the challenging task of Rubik's
**Requirements**: scipy, matplotlib, torch, transformers, wandb, plotly, flask, pydantic (see requirements.txt)

### 12. Pokemon Showdown Environment (`pokemon-showdown/`)

**Author**: [iyaja](https://github.com/iyaja)

**Purpose**: Train LLMs to play Pokemon battles through strategic decision-making in competitive play

A game environment that teaches LLMs strategic thinking and decision-making through Pokemon battles, using the Pokemon Showdown battle simulator. Models learn to analyze battle states, evaluate team compositions, predict opponent moves, and execute optimal strategies in turn-based combat.

**Features**:

- **Pokemon Showdown Integration**: Uses the official Pokemon Showdown battle simulator
- **Strategic Decision Making**: Models must choose between attacking, switching, and using items
- **Battle State Analysis**: Complete game state information including HP, status effects, and move sets
- **Self-Play Training**: GPT player vs. Max Damage baseline for RL training
- **Random Battle Format**: Uses gen9randombattle for diverse team compositions
- **Real-time Battle Simulation**: Asynchronous battle management with the poke-env library
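
The GPT player acts on a textual rendering of the battle state. A minimal sketch of how such a prompt could be assembled (the field names and the `format_battle_state` helper are hypothetical illustrations, not the environment's actual schema, which is derived from poke-env battle objects):

```python
def format_battle_state(state: dict) -> str:
    """Render a battle-state dict as a text prompt for the LLM player.

    The keys used here ('active', 'hp_pct', 'opponent', 'moves') are
    hypothetical; the real environment builds its state from poke-env.
    """
    lines = [
        f"Your active Pokemon: {state['active']} ({state['hp_pct']}% HP)",
        f"Opponent's active Pokemon: {state['opponent']}",
        "Available moves: " + ", ".join(state["moves"]),
        "Choose one move or switch to a benched Pokemon.",
    ]
    return "\n".join(lines)
```

A flat, line-per-fact layout like this keeps the prompt short while exposing everything the model needs for a legal action choice.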

**Training Components**:

- **GPT Player**: LLM-controlled player that receives the battle state and must choose actions
- **Max Damage Player**: Baseline opponent that always selects the highest-damage move
- **Battle History**: Complete move sequences and outcomes for learning from experience
- **Win/Loss Rewards**: Binary reward signal based on battle outcomes
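
The baseline policy and the reward signal are simple enough to sketch directly. This standalone version (the `Move` class and function names are illustrative, not the repo's actual classes) mirrors the max-base-power idea behind the baseline opponent:

```python
from dataclasses import dataclass

@dataclass
class Move:
    name: str
    base_power: int

def choose_max_damage(moves: list[Move]) -> Move:
    """Baseline policy: always pick the available move with the highest base power."""
    return max(moves, key=lambda m: m.base_power)

def battle_reward(won: bool) -> float:
    """Binary reward signal: 1.0 for a win, 0.0 for a loss."""
    return 1.0 if won else 0.0
```

A greedy baseline like this ignores type matchups and status moves entirely, which is exactly what makes it a useful fixed opponent: any learned strategic play shows up directly in the win rate.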

**Strategic Elements**:

- **Type Effectiveness**: Understanding Pokemon type matchups and damage calculations
- **Status Effects**: Managing poison, burn, paralysis, sleep, and other conditions
- **Team Management**: Switching Pokemon strategically based on matchups
- **Move Selection**: Choosing between different moves based on the situation
- **HP Management**: Risk assessment and resource management throughout battles
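
Type effectiveness is the core of these matchup decisions. A tiny excerpt of the mechanic (the full chart covers 18 types; only a few well-known pairs are shown here, and unlisted pairs default to neutral):

```python
# Partial type chart: (attacking type, defending type) -> damage multiplier.
# Pairs not listed default to neutral (1.0x).
TYPE_CHART = {
    ("water", "fire"): 2.0,
    ("fire", "water"): 0.5,
    ("fire", "grass"): 2.0,
    ("electric", "ground"): 0.0,  # Ground is immune to Electric
}

def effectiveness(move_type: str, defender_types: list[str]) -> float:
    """Per-type modifiers stack multiplicatively for dual-typed defenders."""
    mult = 1.0
    for t in defender_types:
        mult *= TYPE_CHART.get((move_type, t), 1.0)
    return mult
```

For a dual-typed defender such as Water/Grass, a Fire move multiplies 0.5x and 2.0x into a neutral 1.0x, which is the kind of calculation models must internalize to pick moves well.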

**Technical Implementation**:

- **Async Battle Management**: Non-blocking battle execution for training efficiency
- **poke-env Integration**: Robust Pokemon battle simulation and state management
- **Atropos RL Framework**: Standard reward signals and trajectory collection
- **Battle Format Support**: Configurable battle formats and rule sets
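
Non-blocking battle execution can be sketched with plain asyncio. Here a simulated battle stands in for a real poke-env battle (which would additionally require a running Pokemon Showdown server); the point is the concurrency pattern, not the battle logic:

```python
import asyncio
import random

async def run_battle(battle_id: int) -> dict:
    """Stand-in for one battle; a real one awaits Showdown server messages each turn."""
    await asyncio.sleep(random.uniform(0, 0.01))  # simulate per-turn network waits
    return {"battle_id": battle_id, "won": random.random() < 0.5}

async def collect_trajectories(n_battles: int) -> list:
    """Run many battles concurrently and gather their outcomes for RL training."""
    return await asyncio.gather(*(run_battle(i) for i in range(n_battles)))

results = asyncio.run(collect_trajectories(8))
```

Because each battle spends most of its wall-clock time waiting on the simulator, gathering them on one event loop keeps trajectory collection close to the speed of the slowest single battle rather than the sum of all of them.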

**Applications**:

- Strategic game AI development
- Turn-based decision making under uncertainty
- Complex state space navigation
- Competitive multi-agent training
- Game theory and opponent modeling

**Demo Resources**:

- **W&B Dashboard**: [Training Results](https://wandb.ai/ajayuppili/atropos-environments_game_environments_pokemon-showdown)
- **Overview Video**: TBD

**Setup Requirements**:

1. Pokemon Showdown server (local installation)
2. poke-env Python library
3. Node.js for the Pokemon Showdown simulator
4. OpenAI API access for the GPT player

**Battle Format**: gen9randombattle (Generation 9 Random Battles)

**Requirements**: poke-env, nodejs, pokemon-showdown simulator, OpenAI API

---

## Support