mirror of https://github.com/NousResearch/atropos.git synced 2026-04-19 12:57:58 +00:00

History

Shannon Sands d768ad68aa merged latest		2025-05-15 11:54:33 -07:00
..
__init__.py	copied from trajectory handler branch	2025-05-12 07:26:10 +10:00
curriculum.py	linting and local testing tidy up	2025-05-12 08:07:39 +10:00
infinimath.py	Updated comments	2025-05-12 08:01:46 +10:00
infinimath_env.py	merged latest	2025-05-15 11:54:33 -07:00
infinimath_local_server.py	merged latest	2025-05-15 11:54:33 -07:00
README.md	copied from trajectory handler branch	2025-05-12 07:26:10 +10:00

README.md

InfiniteMath Environment

Environment Overview

This environment provides procedurally generated math problems with curriculum-based advancement. It allows an agent to solve increasingly difficult math problems, with the difficulty level adapting based on performance.

Demonstrates:

Procedural content generation (math problems).
Curriculum learning: The environment automatically adjusts the difficulty (levels 1-7) based on the LLM's success rate.
Step-by-step reasoning evaluation: Rewards correctness, the presence of reasoning steps (within <think> tags), and the final answer format (\boxed{}).
Handling LaTeX formatting for problems and answers.

Training Goal:

To train LLMs to solve mathematical problems accurately.
To encourage explicit step-by-step reasoning before providing an answer.
To improve the LLM's ability to follow specific formatting instructions (using <think> tags and \boxed{}).
To teach the model to handle progressively more complex problems through the curriculum.

Features

Progressive difficulty scaling across 7 levels of math problems
Built-in curriculum system that adapts to agent performance
Automatic problem generation with solutions
Reward functions for accuracy, formatting, and boxed answer checking

Usage

Running with Default Configuration

To run the InfiniteMath environment with the default configuration:

python environments/infinite_math/infinimath_local_server.py

This will use the default configuration from configs/envs/infinimath.yaml.

Custom Configuration

You can specify a custom configuration file:

python environments/infinite_math/infinimath_local_server.py --config my_custom_config

The --config parameter can be:

A name (without .yaml extension) which will be looked up in configs/envs/
A relative or absolute path to a YAML file

For example:

# Using a config in configs/envs/
python environments/infinite_math/infinimath_local_server.py --config infinimath_hard

# Using a config with full path
python environments/infinite_math/infinimath_local_server.py --config /path/to/my/config.yaml

Configuration Structure

The configuration file follows this structure:

# Base environment parameters
tokenizer_name: "NousResearch/DeepHermes-3-Llama-3-8B-Preview"
group_size: 1
use_wandb: false
# ... other base parameters

# InfiniteMath specific configuration
infinimath:
  # Curriculum parameters
  starting_level: 1
  progress_threshold: 0.7
  # ... other InfiniteMath specific parameters

# Server configuration
server_configs:
  - model_name: "gpt-4.1-nano"
    api_key: ${OPENAI_API_KEY}
    num_requests_for_eval: 70

Important Configuration Parameters

Base Parameters

tokenizer_name: The tokenizer to use for encoding/decoding text
group_size: Number of responses to collect per prompt
max_token_length: Maximum token length for generation
steps_per_eval: How often to run evaluations

InfiniteMath Specific Parameters

starting_level: Initial difficulty level (1-7)
progress_threshold: Success rate needed to advance levels
min_evaluations: Minimum number of evaluations before level advancement
reward_functions: List of reward functions to apply

Server Configuration

model_name: LLM model to use
api_key: API key for the model (can use environment variables with ${VAR_NAME} syntax)
num_requests_for_eval: Number of evaluation requests to allocate