linting

2026-04-19 12:57:58 +00:00 · 2025-05-27 12:15:15 +10:00 · 2025-05-27 12:15:15 +10:00 · 54967ecae9
commit 54967ecae9
parent 13a70e09ab
19 changed files with 1337 additions and 531 deletions
--- a/environments/hack0/protein_design_env/README.md
+++ b/environments/hack0/protein_design_env/README.md
@ -1,64 +0,0 @@
-# 🧬 LLM-Guided De Novo Protein Design Environment
-
-**De novo protein binder design** is one of the hardest problems in bioengineering: you're tasked with inventing an amino acid sequence that folds into a 3D structure that binds to a given target protein. This environment lets **Large Language Models (LLMs)** tackle that problem using reinforcement learning (RL) — not by predicting sequences blindly, but by *learning to use the right tools in the right order* to produce functioning binders.
-
---
-
-## 🤖 Why LLM-based RL Instead of Classic RL?
-
-Classic RL works well for Atari, but it could never work for de novo protein binder design. Why?
-
- **Simulation is slow.** Each step—AlphaFold, RFdiffusion, ProteinMPNN—can take minutes. You don’t get to run millions of episodes like in classic RL.
- **State/action spaces are vast and weird.** Proteins are not 2D boards or pixel arrays. Designing them involves sequences, structures, config files, hotspots, and domain hacks.
- **Heuristics and intuition matter.** LLMs are pretrained on a *world model*—language, code, protein sequences, scientific papers. They come in with baked-in priors that help them reason, even under sparse rewards.
-
-**Classic RL policy networks?** They’d need to learn everything from scratch, which is impossible!
-
---
-
-## 🧪 The Protein Design Pipeline
-
-Each episode consists of an LLM navigating a 4-step design pipeline, using state-of-the-art tools as function calls:
-
-### Step 1: Target Sequence → Structure (`AlphaFold`)
- **Input:** Target protein sequence
- **Output:** 3D `.pdb` file (structure)
- **Reward:** Format validity
-
-### Step 2: Target Structure → Binder Backbone (`RFdiffusion`)
- **Input:** `.pdb` file of target
- **Output:** `.pdb` backbone of potential binder
- **Reward:** Format validity
-
-### Step 3: Backbone → Full Binder Sequence (`ProteinMPNN`)
- **Input:** Binder backbone
- **Output:** `.fasta` with side chains
- **Reward:** Format validity
-
-### Step 4: Evaluate Binding (`AlphaFold-Multimer`)
- **Input:** Target + binder sequences
- **Output:** Complex structure prediction
- **Reward:** 
-  - Format OK
-  - No steric clashes
-  - **Bonus:** Contact interface, binding affinity metrics (Not yet implemented)
-
---
-
-## 🏆 Reward Function
-
-The reward is cumulative:
- **+0.2**: Successfully generate output in correct format at each step  
- **+0.0 to +1.0:** Structural reward based on complex validity smoothly interpolated on AlphaFold2 multimere confidence
- **+1**: High predicted binding affinity (Not yet implemented)  
-
-Sparse, but real. LLMs must *plan* tool use, not just spam actions.
-
---
-
-## 🔧 Setup
-
-Access to hosted NVIDIA APIs:
-```env
-NVIDIA_NIM_API_KEY="YOUR_API_KEY"
-```