mirror of
https://github.com/NousResearch/atropos.git
synced 2026-04-19 12:57:58 +00:00
🎓 ExamCraft: Adaptive LLM Teacher Training Environment
Hackathon Submission: Train language models to become adaptive teachers through reinforcement learning.
🌟 Overview
ExamCraft trains LLMs to be better teachers by generating adaptive questions, providing explanations, and creating personalized lesson plans. The environment rewards effective teaching strategies and penalizes poor ones.
Key Features
- Adaptive Question Generation: Targets student weak areas automatically
- Real-time Difficulty Adjustment: Matches challenge level to student ability
- Comprehensive Teaching Actions: Questions, explanations, and lesson plans
- Sophisticated Reward System: Multi-factor scoring for teaching effectiveness
- Student Learning Simulation: Realistic proficiency progression
🚀 Quick Start
Prerequisites
pip install -r requirements.txt
Running the Environment
- Start the Atropos trajectory API:
run-api
- Run the environment in serve mode:
python examcraft_server.py serve --slurm false
- Generate inference-only rollouts:
python examcraft_server.py process --env.data_path_to_save_groups demo_output.jsonl
Training with Atropos
# Generate SFT data
atropos-sft-gen examcraft_sft.jsonl --tokenizer NousResearch/Hermes-3-Llama-3.1-8B
# Generate DPO data
atropos-dpo-gen examcraft_dpo.jsonl --tokenizer NousResearch/Hermes-3-Llama-3.1-8B
🎯 Environment Design
Teaching Actions
- QUESTION: Generate adaptive multiple-choice questions
- EXPLANATION: Provide detailed concept explanations
- LESSON_PLAN: Create personalized study plans
Reward Components
- Correctness Reward: Base reward when the student answers a question correctly
- Targeting Bonus: Extra points for focusing on weak topics
- Difficulty Appropriateness: Rewards for matching difficulty to ability
- Quality Bonus: Higher scores for detailed, well-structured content
- Learning Impact: Bonuses for explanations that boost understanding
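To make the multi-factor scoring concrete, here is a minimal sketch of how the components for a QUESTION action could be combined. The `QuestionOutcome` record, the weights, and the length-based quality proxy are all assumptions for illustration and are not taken from `examcraft_server.py`. The Learning Impact component applies to explanations and is left out.

```python
from dataclasses import dataclass


@dataclass
class QuestionOutcome:
    """Hypothetical record of one QUESTION action and its result."""
    student_correct: bool
    topic_proficiency: float    # 0.0-1.0, proficiency on the chosen topic
    weakest_proficiency: float  # 0.0-1.0, lowest proficiency across all topics
    difficulty: float           # 0.0-1.0, difficulty the teacher selected
    content_length: int         # characters in the generated question


def question_reward(o: QuestionOutcome) -> float:
    """Sum the reward components using illustrative (assumed) weights."""
    # Correctness reward: base reward when the student answers correctly.
    correctness = 1.0 if o.student_correct else 0.0
    # Targeting bonus: granted when the chosen topic is the weakest one.
    targeting = 0.5 if o.topic_proficiency <= o.weakest_proficiency else 0.0
    # Difficulty appropriateness: largest when difficulty matches proficiency.
    appropriateness = 0.5 * (1.0 - abs(o.difficulty - o.topic_proficiency))
    # Quality bonus: crude length proxy, capped at 0.25.
    quality = 0.25 * min(o.content_length / 400.0, 1.0)
    return correctness + targeting + appropriateness + quality
```

Under these assumed weights, a correct answer on the weakest topic at a matched difficulty scores highest, while an off-target question still earns partial credit for choosing an appropriate difficulty.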
Student Simulation
- Probabilistic responses based on topic proficiency
- Dynamic learning from good teaching
- Realistic difficulty sensitivity
- Session momentum effects
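The simulation described above can be sketched roughly as follows. The `SimulatedStudent` class, the linear difficulty adjustment, and the clamping bounds are hypothetical choices, not the environment's actual model. Session momentum is omitted to keep the example short.

```python
import random
from typing import Dict, Optional


class SimulatedStudent:
    """Hypothetical student whose answers depend on per-topic proficiency."""

    def __init__(self, proficiency: Dict[str, float],
                 learning_rate: float = 0.03,
                 seed: Optional[int] = None) -> None:
        self.proficiency = dict(proficiency)
        self.learning_rate = learning_rate
        self.rng = random.Random(seed)

    def p_correct(self, topic: str, difficulty: float) -> float:
        """Chance of a correct answer; harder questions lower it (assumed form)."""
        p = self.proficiency[topic] + 0.5 * (0.5 - difficulty)
        return min(max(p, 0.05), 0.95)

    def answer(self, topic: str, difficulty: float) -> bool:
        """Sample a correct/incorrect response from p_correct."""
        return self.rng.random() < self.p_correct(topic, difficulty)

    def learn(self, topic: str) -> None:
        """Raise proficiency by one learning-rate step, capped at 1.0."""
        self.proficiency[topic] = min(
            1.0, self.proficiency[topic] + self.learning_rate
        )


student = SimulatedStudent({"vectors": 0.65, "matrices": 0.50}, seed=0)
was_correct = student.answer("matrices", difficulty=0.5)
student.learn("matrices")  # e.g. after a good explanation
```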
📊 Example Metrics
The environment tracks:
- Student accuracy improvement across topics
- Teaching effectiveness scores
- Adaptive difficulty selection
- Content quality metrics
- Learning progression over time
🏆 Why This Matters
Adaptive AI tutoring can revolutionize education by:
- Personalizing learning experiences at scale
- Identifying knowledge gaps automatically
- Providing instant, detailed feedback
- Making quality education globally accessible
🔧 Configuration
Student Profile Format
{
"student_id": "student001",
"target_grade": "11th grade",
"learning_goal": "Master linear algebra basics",
"current_avg_score": 73,
"topics": [
{"name": "vectors", "proficiency": 0.65},
{"name": "matrices", "proficiency": 0.50}
],
"preferred_learning_style": "visual"
}
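As a quick illustration of how a profile in this format might be consumed, the sketch below parses the example profile and picks the topic with the lowest proficiency. The helper name `weakest_topic` is an assumption made for this example and is not part of the environment's API.

```python
import json

PROFILE_JSON = """
{
  "student_id": "student001",
  "target_grade": "11th grade",
  "learning_goal": "Master linear algebra basics",
  "current_avg_score": 73,
  "topics": [
    {"name": "vectors", "proficiency": 0.65},
    {"name": "matrices", "proficiency": 0.50}
  ],
  "preferred_learning_style": "visual"
}
"""


def weakest_topic(profile: dict) -> str:
    """Return the name of the topic with the lowest proficiency."""
    return min(profile["topics"], key=lambda t: t["proficiency"])["name"]


profile = json.loads(PROFILE_JSON)
print(weakest_topic(profile))  # matrices
```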
Environment Parameters
max_questions_per_episode: 8
student_learning_rate: 0.03
enable_lesson_plans: true
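One way to picture these parameters is as a small typed config object. The sketch below uses a plain dataclass; the class name `ExamCraftParams` and the `allowed_actions` helper are hypothetical, and the real environment may define and consume its configuration differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ExamCraftParams:
    """Hypothetical container mirroring the parameters listed above."""
    max_questions_per_episode: int = 8
    student_learning_rate: float = 0.03
    enable_lesson_plans: bool = True


def allowed_actions(params: ExamCraftParams) -> List[str]:
    """List the teaching actions available under the given parameters."""
    actions = ["QUESTION", "EXPLANATION"]
    if params.enable_lesson_plans:
        actions.append("LESSON_PLAN")
    return actions


params = ExamCraftParams(enable_lesson_plans=False)
print(allowed_actions(params))  # ['QUESTION', 'EXPLANATION']
```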
📈 Results Preview
After training, teachers learn to:
- Prioritize topics where students struggle most
- Adapt question difficulty based on recent performance
- Generate detailed explanations that boost understanding
- Create comprehensive lesson plans targeting weak areas
Built for the Nous Research RL Environments Hackathon 🚀