mirror of
https://github.com/NousResearch/atropos.git
synced 2026-04-25 17:10:42 +00:00
Integrate RoshanSanjeev's ExamCraft environment

- Merged ExamCraft environment from RoshanSanjeev PR #95
- Moved from environments/hack0/ to environments/community/
- Removed demo_artifacts.tar.gz file to avoid repo clutter
- Updated community README with comprehensive ExamCraft description
- Fixed all linting issues (flake8, black, isort)
- Credited author @RoshanSanjeev with GitHub link
parent 455fbd053c
commit 95bec5e7a8
7 changed files with 532 additions and 324 deletions

@@ -1,116 +0,0 @@
# 🎓 ExamCraft: Adaptive LLM Teacher Training Environment

**Hackathon Submission**: Train language models to become adaptive teachers through reinforcement learning.

## 🌟 Overview

ExamCraft trains LLMs to be better teachers by generating adaptive questions, providing explanations, and creating personalized lesson plans. The environment rewards effective teaching strategies and penalizes poor ones.

### Key Features
- **Adaptive Question Generation**: Targets student weak areas automatically
- **Real-time Difficulty Adjustment**: Matches challenge level to student ability
- **Comprehensive Teaching Actions**: Questions, explanations, and lesson plans
- **Sophisticated Reward System**: Multi-factor scoring for teaching effectiveness
- **Student Learning Simulation**: Realistic proficiency progression

## 🚀 Quick Start

### Prerequisites
```bash
pip install -r requirements.txt
```

### Running the Environment

1. **Start Atropos trajectory API**:
   ```bash
   run-api
   ```

2. **Run environment in serve mode**:
   ```bash
   python examcraft_server.py serve --slurm false
   ```

3. **Generate inference-only rollouts**:
   ```bash
   python examcraft_server.py process --env.data_path_to_save_groups demo_output.jsonl
   ```
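
The rollout file written in step 3 has a `.jsonl` extension, which suggests one JSON object per line. The following is a minimal sketch for inspecting such a file, assuming that convention holds. The record schema is not described in this README, so the key in the sample data is a placeholder and `read_jsonl` is a hypothetical helper, not part of Atropos.

```python
import io
import json
from typing import Iterable, Iterator


def read_jsonl(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines between records
            yield json.loads(line)


# In-memory stand-in for demo_output.jsonl; "example_key" is a placeholder,
# not a field the environment is known to write.
sample = io.StringIO('{"example_key": 1}\n\n{"example_key": 2}\n')
records = list(read_jsonl(sample))
print(len(records), sorted(records[0].keys()))
```

To inspect a real run, pass an open file handle instead of the in-memory sample, for example `read_jsonl(open("demo_output.jsonl"))`.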

### Training with Atropos
```bash
# Generate SFT data
atropos-sft-gen examcraft_sft.jsonl --tokenizer NousResearch/Hermes-3-Llama-3.1-8B

# Generate DPO data
atropos-dpo-gen examcraft_dpo.jsonl --tokenizer NousResearch/Hermes-3-Llama-3.1-8B
```

## 🎯 Environment Design

### Teaching Actions
- **QUESTION**: Generate adaptive multiple-choice questions
- **EXPLANATION**: Provide detailed concept explanations
- **LESSON_PLAN**: Create personalized study plans
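
One way to represent the three action types in code is an enum plus a small record for each teacher turn. This is only an illustrative sketch: `TeachingAction`, `TeacherTurn`, and `parse_action` are hypothetical names, and the actual types in `examcraft_server.py` may look different.

```python
from dataclasses import dataclass
from enum import Enum


class TeachingAction(Enum):
    """The three action types listed above."""
    QUESTION = "QUESTION"
    EXPLANATION = "EXPLANATION"
    LESSON_PLAN = "LESSON_PLAN"


@dataclass
class TeacherTurn:
    """One teaching step: what kind of action, on which topic, with what text."""
    action: TeachingAction
    topic: str
    content: str


def parse_action(name: str) -> TeachingAction:
    """Map a model-produced action name to the enum, ignoring case and padding."""
    return TeachingAction[name.strip().upper()]


turn = TeacherTurn(parse_action("question"), "matrices", "What is a 2x2 identity matrix?")
print(turn.action.value, turn.topic)
```

An unknown action name raises `KeyError`, which makes malformed model output easy to detect and penalize.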

### Reward Components
1. **Correctness Reward**: Base reward when the student answers a question correctly
2. **Targeting Bonus**: Extra points for focusing on weak topics
3. **Difficulty Appropriateness**: Rewards for matching difficulty to ability
4. **Quality Bonus**: Higher scores for detailed, well-structured content
5. **Learning Impact**: Bonuses for explanations that boost understanding
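
The five components above could be combined into a single scalar along the following lines. This is a sketch under stated assumptions, not the environment's actual scoring: the weights, the length-based quality proxy, and the names `QuestionOutcome` and `teaching_reward` are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class QuestionOutcome:
    correct: bool             # did the simulated student answer correctly?
    topic_proficiency: float  # student's proficiency on the topic, assumed in [0, 1]
    difficulty: float         # question difficulty, assumed in [0, 1]
    content_length: int       # crude, hypothetical proxy for content quality
    proficiency_gain: float   # change in proficiency after the turn


def teaching_reward(o: QuestionOutcome) -> float:
    """Sum the five components; every weight here is a made-up example value."""
    correctness = 1.0 if o.correct else 0.0
    targeting = 0.5 * (1.0 - o.topic_proficiency)           # weaker topic, larger bonus
    difficulty_fit = 0.5 * (1.0 - abs(o.difficulty - o.topic_proficiency))
    quality = 0.25 * min(o.content_length / 400.0, 1.0)     # capped length bonus
    impact = 5.0 * max(o.proficiency_gain, 0.0)             # only positive gains count
    return correctness + targeting + difficulty_fit + quality + impact


weak = teaching_reward(QuestionOutcome(True, 0.2, 0.2, 400, 0.03))
strong = teaching_reward(QuestionOutcome(True, 0.9, 0.9, 400, 0.03))
print(weak > strong)
```

With difficulty matched to ability in both cases, the only difference is the targeting term, so the question aimed at the weaker topic scores higher.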

### Student Simulation
- Probabilistic responses based on topic proficiency
- Dynamic learning from good teaching
- Realistic difficulty sensitivity
- Session momentum effects
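
A toy version of such a student might look like the sketch below. It is an assumption-laden illustration rather than the simulator shipped with the environment: the class name `SimulatedStudent`, the linear difficulty penalty, the momentum step size, and the clamping bounds are all hypothetical. Only the default learning rate of 0.03 comes from this README.

```python
import random


class SimulatedStudent:
    """Toy student whose answers depend on proficiency, difficulty, and momentum."""

    def __init__(self, proficiency: dict, learning_rate: float = 0.03, seed: int = 0):
        self.proficiency = dict(proficiency)  # topic name -> value assumed in [0, 1]
        self.learning_rate = learning_rate
        self.momentum = 0.0                   # hypothetical range: [-0.1, 0.1]
        self.rng = random.Random(seed)        # seeded for reproducible rollouts

    def p_correct(self, topic: str, difficulty: float) -> float:
        """Harder questions lower the success chance; momentum shifts it slightly."""
        p = self.proficiency[topic] - 0.5 * (difficulty - 0.5) + self.momentum
        return min(max(p, 0.05), 0.95)

    def answer(self, topic: str, difficulty: float) -> bool:
        correct = self.rng.random() < self.p_correct(topic, difficulty)
        step = 0.02 if correct else -0.02     # streaks build or drain momentum
        self.momentum = min(max(self.momentum + step, -0.1), 0.1)
        return correct

    def learn(self, topic: str, quality: float) -> None:
        """Good teaching (quality in [0, 1]) raises proficiency, capped at 1.0."""
        gain = self.learning_rate * quality
        self.proficiency[topic] = min(self.proficiency[topic] + gain, 1.0)


student = SimulatedStudent({"vectors": 0.65, "matrices": 0.50})
student.learn("matrices", quality=1.0)
print(student.proficiency["matrices"], student.answer("vectors", 0.5))
```

Seeding the random generator keeps episodes reproducible, which helps when comparing teaching policies against the same simulated student.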

## 📊 Example Metrics

The environment tracks:
- Student accuracy improvement across topics
- Teaching effectiveness scores
- Adaptive difficulty selection
- Content quality metrics
- Learning progression over time
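
As an illustration of the first metric, per-topic accuracy improvement can be estimated by comparing the first and second halves of an episode. The sketch below assumes the episode log is a list of `(topic, correct)` pairs. That format and both function names are hypothetical, and the environment may compute this metric differently.

```python
from collections import defaultdict


def accuracy_by_topic(results):
    """results: iterable of (topic, correct) pairs -> {topic: fraction correct}."""
    totals = defaultdict(lambda: [0, 0])  # topic -> [num_correct, num_asked]
    for topic, correct in results:
        totals[topic][0] += int(correct)
        totals[topic][1] += 1
    return {topic: c / n for topic, (c, n) in totals.items()}


def accuracy_improvement(results):
    """Late-half accuracy minus early-half accuracy, for topics seen in both halves."""
    results = list(results)
    mid = len(results) // 2
    early = accuracy_by_topic(results[:mid])
    late = accuracy_by_topic(results[mid:])
    return {topic: late[topic] - early[topic] for topic in early if topic in late}


log = [("vectors", False), ("matrices", False), ("vectors", True), ("matrices", True)]
print(accuracy_improvement(log))
```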

## 🏆 Why This Matters

Adaptive AI tutoring can revolutionize education by:
- Personalizing learning experiences at scale
- Identifying knowledge gaps automatically
- Providing instant, detailed feedback
- Making quality education globally accessible

## 🔧 Configuration

### Student Profile Format
```json
{
  "student_id": "student001",
  "target_grade": "11th grade",
  "learning_goal": "Master linear algebra basics",
  "current_avg_score": 73,
  "topics": [
    {"name": "vectors", "proficiency": 0.65},
    {"name": "matrices", "proficiency": 0.50}
  ],
  "preferred_learning_style": "visual"
}
```
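
A profile in this format can be loaded with the standard `json` module. The sketch below parses the example profile and picks the topic with the lowest proficiency, which is the kind of lookup a targeting bonus would rely on. The helper names and the assumption that proficiency lies in the range 0 to 1 are illustrative and not confirmed by this README.

```python
import json

PROFILE_JSON = """
{
  "student_id": "student001",
  "target_grade": "11th grade",
  "learning_goal": "Master linear algebra basics",
  "current_avg_score": 73,
  "topics": [
    {"name": "vectors", "proficiency": 0.65},
    {"name": "matrices", "proficiency": 0.50}
  ],
  "preferred_learning_style": "visual"
}
"""


def validate_profile(profile: dict) -> None:
    """Check the assumed invariant that every proficiency lies in [0, 1]."""
    for topic in profile["topics"]:
        if not 0.0 <= topic["proficiency"] <= 1.0:
            raise ValueError(f"proficiency out of range for {topic['name']}")


def weakest_topic(profile: dict) -> str:
    """Return the name of the topic with the lowest proficiency."""
    return min(profile["topics"], key=lambda t: t["proficiency"])["name"]


profile = json.loads(PROFILE_JSON)
validate_profile(profile)
print(weakest_topic(profile))
```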

### Environment Parameters
- `max_questions_per_episode`: 8
- `student_learning_rate`: 0.03
- `enable_lesson_plans`: true
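
These parameters and their values could be grouped in a small settings object, as in the sketch below. The class name `ExamCraftSettings` and the validation rules are assumptions for illustration. The real configuration class in `examcraft_server.py` is not shown here and may be structured differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExamCraftSettings:
    """Hypothetical container; defaults mirror the values listed above."""
    max_questions_per_episode: int = 8
    student_learning_rate: float = 0.03
    enable_lesson_plans: bool = True

    def __post_init__(self):
        # Sanity checks chosen for this sketch, not taken from the environment.
        if self.max_questions_per_episode < 1:
            raise ValueError("max_questions_per_episode must be at least 1")
        if not 0.0 <= self.student_learning_rate <= 1.0:
            raise ValueError("student_learning_rate must be in [0, 1]")


settings = ExamCraftSettings()
short_run = ExamCraftSettings(max_questions_per_episode=3, enable_lesson_plans=False)
print(settings, short_run)
```

Freezing the dataclass prevents settings from changing mid-episode, which keeps rollouts comparable.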

## 📈 Results Preview

After training, teachers learn to:
- Prioritize topics where students struggle most
- Adapt question difficulty based on recent performance
- Generate detailed explanations that boost understanding
- Create comprehensive lesson plans targeting weak areas

Built for the **Nous Research RL Environments Hackathon** 🚀