Integrate chinguun101 goofy math (#145)

* Add GoofyMath environment for fun, engaging math learning
* linting, moved to community folder
* linting

---------

Co-authored-by: chinguun101 <chinguun@uni.minerva.edu>
2026-04-19 12:57:58 +00:00 · 2025-05-28 12:11:02 +10:00 · 2025-05-28 12:11:02 +10:00 · ea304892ee
commit ea304892ee
parent 1a79132809
4 changed files with 649 additions and 5 deletions
--- a/environments/README.md
+++ b/environments/README.md
@@ -338,12 +338,8 @@ Every *SEARCH/REPLACE* edit must use this format:
 Here is an example:
 ```python
 ### mathweb/flask/app.py
-<<<<<<< SEARCH
-from flask import Flask
-=======
 import math
 from flask import Flask
->>>>>>> REPLACE
 ```
 Please note that the *SEARCH/REPLACE* edit REQUIRES PROPER INDENTATION. If you would like to add the line ’ print(x)’, you must fully write that out, with all those spaces before the code!
 Wrap each *SEARCH/REPLACE* edit in a code block as shown in the example above. If you have multiple *SEARCH/REPLACE* edits, use a separate code block for each one.
@@ -507,7 +503,62 @@ python -m atroposlib.cli.dpo \
 - **Combined Scoring**: Overall article score in [-1, 1] range balancing quality and accuracy
 - **W&B Integration**: Complete research session tracking with tool usage analytics

-## 33. Options Implied Volatility Prediction Environment
+## 33. Goofy Math Environment
+
+**Location:** `environments/community/goofy_math/`
+**Contributor:** [chinguun101](https://github.com/chinguun101)
+**PR:** [#91](https://github.com/NousResearch/atropos/pull/91)
+
+### Core Features
+- **Dual Reward System**: Mathematical correctness verification + goofiness scoring
+- **RLAIF-Based Judging**: AI feedback system for ranking entertaining vs. standard solutions
+- **GSM8K Integration**: Uses standard math dataset with humor enhancement overlay
+- **Position Bias Elimination**: Forward/reverse judgment pairs to ensure fair evaluation
+
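The forward/reverse judgment pairs listed under Core Features can be illustrated with a minimal sketch. It assumes a judge that returns "A" or "B" for whichever displayed solution it finds goofier. The `Judge` type, `debiased_preference`, and both toy judges are hypothetical names introduced here for illustration; how the environment actually aggregates the two judgments is not stated in the README.

```python
from typing import Callable

# Hypothetical judge signature (an assumption, not the environment's API):
# given two solutions in display order, return "A" if the first one shown
# is preferred and "B" otherwise.
Judge = Callable[[str, str], str]


def debiased_preference(judge: Judge, first: str, second: str) -> float:
    """Return the share of judgments (0.0, 0.5, or 1.0) that favor `first`.

    The pair is judged once in each display order, so a judge that simply
    favors one position contributes one vote to each side and nets 0.5.
    """
    forward = judge(first, second)   # `first` is shown in position A
    reverse = judge(second, first)   # `first` is shown in position B
    wins_for_first = int(forward == "A") + int(reverse == "B")
    return wins_for_first / 2


def always_first(a: str, b: str) -> str:
    """Toy judge with pure position bias: always picks position A."""
    return "A"


def more_exclamations(a: str, b: str) -> str:
    """Toy content-based judge: prefers the text with more '!' characters."""
    return "A" if a.count("!") >= b.count("!") else "B"
```

With these toy judges, a purely position-biased judge yields a neutral 0.5 for any pair, while a content-based judge yields the same winner in both display orders.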
+### Technical Implementation
+- **Environment Name**: `goofy_math`
+- **Correctness Verification**: Uses `math_verify` and `latex2sympy2_extended` for objective scoring
+- **Goofiness Assessment**: LLM judge evaluates entertainment value of mathematically correct solutions
+- **Reward Formula**: `score = correctness_score + (goofiness_bonus * 0.5)`
+- **Output Format**: `<think>...</think>` reasoning + `\boxed{answer}` format
+
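The reward formula above can be read as the following minimal sketch. `combined_score` is a hypothetical helper, not a function from the environment. Returning 0.0 for an incorrect solution and clamping goofiness to [0, 1] are assumptions drawn from the README's statements that solutions must pass verification to receive any reward and that goofiness is scored in a 0-1 range.

```python
def combined_score(
    is_correct: bool,
    goofiness: float,
    bonus_weight: float = 0.5,
) -> float:
    """Combine correctness and goofiness as the README's formula describes.

    score = correctness_score + (goofiness_bonus * 0.5)
    """
    if not is_correct:
        # Assumption: an incorrect answer earns nothing, and goofiness is
        # never consulted, so humor cannot compensate for a wrong result.
        return 0.0
    # Assumption: clamp the judge's goofiness score into the 0-1 range.
    goofiness = max(0.0, min(1.0, goofiness))
    correctness_score = 1.0
    return correctness_score + goofiness * bonus_weight
```

Under these assumptions a correct but dry solution scores 1.0, a correct and maximally goofy one scores 1.5, and an incorrect one scores 0.0 regardless of how entertaining it is.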
+### Research Applications
+- **Educational AI**: Training math tutors that are both accurate and engaging
+- **Personality Injection**: Adding entertainment value while maintaining technical correctness
+- **Multi-Objective Optimization**: Balancing objective accuracy with subjective entertainment
+- **Humor in AI**: Systematic approach to training models for appropriate comedic timing
+
+### Setup and Usage
+```bash
+# Install requirements
+pip install -r environments/community/goofy_math/requirements.txt
+
+# Environment variables
+export OPENAI_API_KEY="your-key"
+
+# Process mode for examples
+python environments/community/goofy_math/goofy_math_server.py process \
+  --env.data_path_to_save_groups goofy_math_demo.jsonl \
+  --env.total_steps 3
+
+# Training mode
+python -m atroposlib.cli.dpo \
+    --env-module "environments.community.goofy_math.goofy_math_server"
+```
+
+### Performance Characteristics
+- **Correctness Requirement**: Solutions must pass mathematical verification to receive any reward
+- **Goofiness Scoring**: 0-1 range based on humor, sound effects, and creative explanations
+- **Reward Distribution**: Base 1.0 for correctness + up to 0.5 bonus for entertainment value
+- **Anti-Reward Hacking**: Goofiness only evaluated after correctness verification
+- **W&B Integration**: Tracks goofiness histograms, judgment tables, and accuracy metrics
+
+### Demo and Results
+- **Video Demo**: [1-minute demonstration](https://www.loom.com/share/8704f63e2d2e4b4db23eab673d7990a2)
+- **WandB Run**: [Experiment tracking](https://wandb.ai/goofymath/goofy_math/runs/z92gd2j4)
+- **Unique Metrics**: `train/avg_goofiness_score`, `train/goofiness_histogram`, `train/judgement_table`
+
+## 34. Options Implied Volatility Prediction Environment

 **Location:** `environments/community/options_iv_prediction/`
 **Contributor:** [michaelwaves](https://github.com/michaelwaves)