mirror of https://github.com/NousResearch/atropos.git synced 2026-04-19 12:57:58 +00:00

examcraft_server.py	[pre-commit.ci] auto fixes from pre-commit.com hooks	2025-09-15 16:41:26 +00:00
README.md, requirements.txt, visual_question_demo.py	Integrate RoshanSanjeev's ExamCraft environment - Merged ExamCraft environment from RoshanSanjeev PR #95 - Moved from environments/hack0/ to environments/community/ - Removed demo_artifacts.tar.gz file to avoid repo clutter - Updated community README with comprehensive ExamCraft description - Fixed all linting issues (flake8, black, isort) - Credited author @RoshanSanjeev with GitHub link	2025-05-24 13:48:52 +10:00

README.md

🎓 ExamCraft: Adaptive LLM Teacher Training Environment

Hackathon Submission: Train language models to become adaptive teachers through reinforcement learning.

🌟 Overview

ExamCraft trains LLMs to be better teachers by generating adaptive questions, providing explanations, and creating personalized lesson plans. The environment rewards effective teaching strategies and penalizes poor ones.

Key Features

Adaptive Question Generation: Targets student weak areas automatically
Real-time Difficulty Adjustment: Matches challenge level to student ability
Comprehensive Teaching Actions: Questions, explanations, and lesson plans
Sophisticated Reward System: Multi-factor scoring for teaching effectiveness
Student Learning Simulation: Realistic proficiency progression

🚀 Quick Start

Prerequisites

pip install -r requirements.txt

Running the Environment

Start Atropos trajectory API:

run-api

Run environment in serve mode:

python examcraft_server.py serve --slurm false

Generate inference-only rollouts:

python examcraft_server.py process --env.data_path_to_save_groups demo_output.jsonl

Training with Atropos

# Generate SFT data
atropos-sft-gen examcraft_sft.jsonl --tokenizer NousResearch/Hermes-3-Llama-3.1-8B

# Generate DPO data
atropos-dpo-gen examcraft_dpo.jsonl --tokenizer NousResearch/Hermes-3-Llama-3.1-8B

🎯 Environment Design

Teaching Actions

QUESTION: Generate adaptive multiple-choice questions
EXPLANATION: Provide detailed concept explanations
LESSON_PLAN: Create personalized study plans
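
The three action names above can be represented as a small enumeration. The sketch below is illustrative only: `TeachingAction` and `parse_action` are hypothetical names and are not taken from `examcraft_server.py`, which may represent actions differently.

```python
from enum import Enum


class TeachingAction(Enum):
    """Hypothetical enumeration of the three teaching actions named in this README."""

    QUESTION = "QUESTION"
    EXPLANATION = "EXPLANATION"
    LESSON_PLAN = "LESSON_PLAN"


def parse_action(name: str) -> TeachingAction:
    """Map a raw action string to a TeachingAction, ignoring case and spaces."""
    key = name.strip().upper().replace(" ", "_")
    try:
        return TeachingAction[key]
    except KeyError:
        raise ValueError(f"unknown teaching action: {name!r}") from None
```

Normalizing the string before lookup lets a model's free-form output such as `lesson plan` resolve to `TeachingAction.LESSON_PLAN`, while anything unrecognized fails loudly instead of being scored.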

Reward Components

Correctness Reward: Base reward when the student answers a question correctly
Targeting Bonus: Extra points for focusing on weak topics
Difficulty Appropriateness: Rewards for matching difficulty to ability
Quality Bonus: Higher scores for detailed, well-structured content
Learning Impact: Bonuses for explanations that boost understanding
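
One way to combine these five components is a weighted sum. The following is a minimal sketch under stated assumptions: the weights, the `QuestionOutcome` fields, and the formulas for each term are hypothetical and chosen for illustration; the actual scoring in `examcraft_server.py` may differ.

```python
from dataclasses import dataclass


@dataclass
class QuestionOutcome:
    """Hypothetical record of one teaching step; all fractions are in [0, 1]."""

    correct: bool             # did the simulated student answer correctly?
    topic_proficiency: float  # student's proficiency on the chosen topic
    difficulty: float         # difficulty of the generated question
    quality: float            # content quality score of the generated text
    proficiency_gain: float   # change in proficiency caused by this step


def teaching_reward(
    o: QuestionOutcome,
    w_correct: float = 1.0,
    w_target: float = 0.5,
    w_difficulty: float = 0.5,
    w_quality: float = 0.3,
    w_impact: float = 2.0,
) -> float:
    """Weighted sum of the five reward components (assumed weights)."""
    correctness = w_correct if o.correct else 0.0
    # Targeting bonus grows as the chosen topic gets weaker.
    targeting = w_target * (1.0 - o.topic_proficiency)
    # Difficulty fit is highest when difficulty equals proficiency.
    difficulty_fit = w_difficulty * (1.0 - abs(o.difficulty - o.topic_proficiency))
    quality = w_quality * o.quality
    # Only positive learning gains are rewarded.
    impact = w_impact * max(0.0, o.proficiency_gain)
    return correctness + targeting + difficulty_fit + quality + impact
```

With these assumed weights, two otherwise identical questions score differently depending on the topic: the one aimed at the weaker topic earns the larger targeting bonus.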

Student Simulation

Probabilistic responses based on topic proficiency
Dynamic learning from good teaching
Realistic difficulty sensitivity
Session momentum effects
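
The four behaviors listed above could be modeled roughly as follows. This is a sketch, not the simulator shipped with the environment: the class name `SimulatedStudent`, the difficulty penalty of 0.5, the momentum step of 0.02, and the clamping bounds are all assumptions made for illustration. Only the default learning rate of 0.03 comes from this README.

```python
import random


class SimulatedStudent:
    """Hypothetical student model with proficiency, learning, and momentum."""

    def __init__(self, proficiency, learning_rate=0.03, seed=None):
        self.proficiency = dict(proficiency)  # topic name -> value in [0, 1]
        self.learning_rate = learning_rate
        self.momentum = 0.0                   # recent-streak adjustment
        self.rng = random.Random(seed)

    def p_correct(self, topic, difficulty):
        """Probability of a correct answer; harder questions lower it."""
        p = self.proficiency[topic] - 0.5 * (difficulty - 0.5) + self.momentum
        return min(0.95, max(0.05, p))

    def answer(self, topic, difficulty):
        """Sample an answer and update session momentum from the result."""
        correct = self.rng.random() < self.p_correct(topic, difficulty)
        step = 0.02 if correct else -0.02
        self.momentum = max(-0.1, min(0.1, self.momentum + step))
        return correct

    def learn(self, topic, quality):
        """Raise proficiency after teaching; gains shrink near mastery."""
        gain = self.learning_rate * quality * (1.0 - self.proficiency[topic])
        self.proficiency[topic] = min(1.0, self.proficiency[topic] + gain)
        return gain
```

Scaling the gain by `1.0 - proficiency` gives diminishing returns, so a student who is already strong on a topic benefits less from another explanation of it.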

📊 Example Metrics

The environment tracks:

Student accuracy improvement across topics
Teaching effectiveness scores
Adaptive difficulty selection
Content quality metrics
Learning progression over time
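
As an illustration of the first metric, per-topic accuracy improvement can be computed by comparing the first and second halves of a session. The function names and the `(topic, correct)` record format below are hypothetical; the environment's own logging format is not described in this README.

```python
from collections import defaultdict


def accuracy_by_topic(records):
    """Return {topic: fraction correct} for (topic, correct) records."""
    totals = defaultdict(lambda: [0, 0])  # topic -> [num_correct, num_asked]
    for topic, correct in records:
        totals[topic][0] += int(correct)
        totals[topic][1] += 1
    return {topic: c / n for topic, (c, n) in totals.items()}


def accuracy_improvement(records):
    """Late-half accuracy minus early-half accuracy, per topic seen in both."""
    mid = len(records) // 2
    early = accuracy_by_topic(records[:mid])
    late = accuracy_by_topic(records[mid:])
    return {topic: late[topic] - early[topic] for topic in early if topic in late}
```

Topics that appear in only one half of the session are omitted, since no before/after comparison is possible for them.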

🏆 Why This Matters

Adaptive AI tutoring can revolutionize education by:

Personalizing learning experiences at scale
Identifying knowledge gaps automatically
Providing instant, detailed feedback
Making quality education globally accessible

🔧 Configuration

Student Profile Format

{
  "student_id": "student001",
  "target_grade": "11th grade",
  "learning_goal": "Master linear algebra basics",
  "current_avg_score": 73,
  "topics": [
    {"name": "vectors", "proficiency": 0.65},
    {"name": "matrices", "proficiency": 0.50}
  ],
  "preferred_learning_style": "visual"
}
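
A profile in this format can be loaded with the standard `json` module and used to pick the topics to target first. The helper `weakest_topics` below is a hypothetical name used only for this sketch.

```python
import json

# The example profile from this README, embedded as a string.
PROFILE_JSON = """
{
  "student_id": "student001",
  "target_grade": "11th grade",
  "learning_goal": "Master linear algebra basics",
  "current_avg_score": 73,
  "topics": [
    {"name": "vectors", "proficiency": 0.65},
    {"name": "matrices", "proficiency": 0.50}
  ],
  "preferred_learning_style": "visual"
}
"""


def weakest_topics(profile, n=1):
    """Return the names of the n topics with the lowest proficiency."""
    ranked = sorted(profile["topics"], key=lambda t: t["proficiency"])
    return [t["name"] for t in ranked[:n]]


profile = json.loads(PROFILE_JSON)
print(weakest_topics(profile))  # ['matrices']
```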

Environment Parameters

max_questions_per_episode: 8
student_learning_rate: 0.03
enable_lesson_plans: true
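
For illustration, these three parameters could be held in a small validated container. This sketch uses a plain dataclass named `ExamCraftParams`, which is a hypothetical name; the real environment defines its configuration in `examcraft_server.py`, and that definition may use a different class and additional fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExamCraftParams:
    """Hypothetical container for the three parameters listed in this README."""

    max_questions_per_episode: int = 8
    student_learning_rate: float = 0.03
    enable_lesson_plans: bool = True

    def __post_init__(self):
        # Reject values that would make an episode meaningless.
        if self.max_questions_per_episode < 1:
            raise ValueError("max_questions_per_episode must be at least 1")
        if not 0.0 <= self.student_learning_rate <= 1.0:
            raise ValueError("student_learning_rate must be in [0, 1]")


params = ExamCraftParams()
short_run = ExamCraftParams(max_questions_per_episode=3, enable_lesson_plans=False)
```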

📈 Results Preview

After training, teacher models are expected to learn to:

Prioritize topics where students struggle most
Adapt question difficulty based on recent performance
Generate detailed explanations that boost understanding
Create comprehensive lesson plans targeting weak areas

Built for the Nous Research RL Environments Hackathon 🚀