Integrate RoshanSanjeev's ExamCraft environment

- Merged ExamCraft environment from RoshanSanjeev PR #95
- Moved from environments/hack0/ to environments/community/
- Removed demo_artifacts.tar.gz file to avoid repo clutter
- Updated community README with comprehensive ExamCraft description
- Fixed all linting issues (flake8, black, isort)
- Credited author @RoshanSanjeev with GitHub link

2026-04-25 17:10:42 +00:00 · 2025-05-24 13:48:52 +10:00
commit 95bec5e7a8
parent 455fbd053c
7 changed files with 532 additions and 324 deletions
--- a/environments/hack0/examcraft/README.md
+++ b/environments/hack0/examcraft/README.md
@@ -1,116 +0,0 @@
-# 🎓 ExamCraft: Adaptive LLM Teacher Training Environment
-
-**Hackathon Submission**: Train language models to become adaptive teachers through reinforcement learning.
-
-## 🌟 Overview
-
-ExamCraft trains LLMs to be better teachers by generating adaptive questions, providing explanations, and creating personalized lesson plans. The environment rewards effective teaching strategies and penalizes poor ones.
-
-### Key Features
-- **Adaptive Question Generation**: Targets student weak areas automatically
-- **Real-time Difficulty Adjustment**: Matches challenge level to student ability
-- **Comprehensive Teaching Actions**: Questions, explanations, and lesson plans
-- **Sophisticated Reward System**: Multi-factor scoring for teaching effectiveness
-- **Student Learning Simulation**: Realistic proficiency progression
-
-## 🚀 Quick Start
-
-### Prerequisites
-```bash
-pip install -r requirements.txt
-```
-
-### Running the Environment
-
-1. **Start Atropos trajectory API**:
-```bash
-run-api
-```
-
-2. **Run environment in serve mode**:
-```bash
-python examcraft_server.py serve --slurm false
-```
-
-3. **Generate inference-only rollouts**:
-```bash
-python examcraft_server.py process --env.data_path_to_save_groups demo_output.jsonl
-```
-
-### Training with Atropos
-```bash
-# Generate SFT data
-atropos-sft-gen examcraft_sft.jsonl --tokenizer NousResearch/Hermes-3-Llama-3.1-8B
-
-# Generate DPO data  
-atropos-dpo-gen examcraft_dpo.jsonl --tokenizer NousResearch/Hermes-3-Llama-3.1-8B
-```
-
-## 🎯 Environment Design
-
-### Teaching Actions
-- **QUESTION**: Generate adaptive multiple-choice questions
-- **EXPLANATION**: Provide detailed concept explanations
-- **LESSON_PLAN**: Create personalized study plans
-
-### Reward Components
-1. **Correctness Reward**: Base reward for student getting questions right
-2. **Targeting Bonus**: Extra points for focusing on weak topics  
-3. **Difficulty Appropriateness**: Rewards for matching difficulty to ability
-4. **Quality Bonus**: Higher scores for detailed, well-structured content
-5. **Learning Impact**: Bonuses for explanations that boost understanding
-
-### Student Simulation
-- Probabilistic responses based on topic proficiency
-- Dynamic learning from good teaching
-- Realistic difficulty sensitivity
-- Session momentum effects
-
-## 📊 Example Metrics
-
-The environment tracks:
-- Student accuracy improvement across topics
-- Teaching effectiveness scores
-- Adaptive difficulty selection
-- Content quality metrics
-- Learning progression over time
-
-## 🏆 Why This Matters
-
-Adaptive AI tutoring can revolutionize education by:
-- Personalizing learning experiences at scale
-- Identifying knowledge gaps automatically
-- Providing instant, detailed feedback
-- Making quality education globally accessible
-
-## 🔧 Configuration
-
-### Student Profile Format
-```json
-{
-  "student_id": "student001",
-  "target_grade": "11th grade",
-  "learning_goal": "Master linear algebra basics",
-  "current_avg_score": 73,
-  "topics": [
-    {"name": "vectors", "proficiency": 0.65},
-    {"name": "matrices", "proficiency": 0.50}
-  ],
-  "preferred_learning_style": "visual"
-}
-```
-
-### Environment Parameters
-- `max_questions_per_episode`: 8
-- `student_learning_rate`: 0.03
-- `enable_lesson_plans`: true
-
-## 📈 Results Preview
-
-After training, teachers learn to:
-- Prioritize topics where students struggle most
-- Adapt question difficulty based on recent performance  
-- Generate detailed explanations that boost understanding
-- Create comprehensive lesson plans targeting weak areas
-
-Built for the **Nous Research RL Environments Hackathon** 🚀
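
Note on the removed README: its "Student Simulation" and "Reward Components" sections describe behavior without showing it. The following is a minimal Python sketch of one way those ideas could fit together, not the code in `examcraft_server.py`. Every name, formula, and bonus weight here (`Topic`, `p_correct`, `ask`, `teaching_reward`, the 0.5 weights) is a hypothetical chosen for illustration. Only the 0.03 learning rate and the example proficiencies come from the README.

```python
import random
from dataclasses import dataclass


@dataclass
class Topic:
    name: str
    proficiency: float  # 0.0 to 1.0, as in the README's student profile format


def p_correct(proficiency: float, difficulty: float) -> float:
    """Hypothetical response model: 50% when difficulty matches proficiency,
    rising for easier questions and falling for harder ones."""
    return min(1.0, max(0.0, 0.5 + (proficiency - difficulty)))


def ask(topic: Topic, difficulty: float, rng: random.Random,
        learning_rate: float = 0.03) -> bool:
    """Simulate one question. Assumption: the student only learns from
    questions answered correctly, by `learning_rate` per question."""
    correct = rng.random() < p_correct(topic.proficiency, difficulty)
    if correct:
        topic.proficiency = min(1.0, topic.proficiency + learning_rate)
    return correct


def teaching_reward(correct: bool, topic: Topic, weakest: Topic,
                    difficulty: float, proficiency_before: float) -> float:
    """Combine three of the README's reward components with made-up weights."""
    reward = 1.0 if correct else 0.0          # correctness reward
    if topic.name == weakest.name:
        reward += 0.5                         # targeting bonus
    # difficulty appropriateness: largest when difficulty equals proficiency
    reward += 0.5 * (1.0 - abs(difficulty - proficiency_before))
    return reward


if __name__ == "__main__":
    rng = random.Random(0)
    topics = [Topic("vectors", 0.65), Topic("matrices", 0.50)]
    weakest = min(topics, key=lambda t: t.proficiency)
    before = weakest.proficiency
    correct = ask(weakest, difficulty=before, rng=rng)
    print(correct, teaching_reward(correct, weakest, weakest, before, before))
```

The quality and learning-impact components are left out because the README does not say how content quality is measured.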