Lean Theorem Proving Environment for Atropos
This environment allows testing large language models (LLMs) on Lean theorem proving tasks using atroposlib.
Setup
- Atroposlib: Ensure atroposlib is installed. If you are working within the main atropos repository, this is typically done by navigating to the atropos root directory (i.e., cd LeanCopilot/atropos) and running:
    pip install -e .
- Python Dependencies: Navigate to this environment's directory (LeanCopilot/atropos/environments/lean_proof_env/) and install the required Python packages using the provided requirements.txt:
    pip install -r requirements.txt
  This will install datasets, wandb, tqdm, and python-dotenv.
- OpenAI API Key: This environment requires an OpenAI API key. You can set it in one of two ways:
  - As an environment variable:
      export OPENAI_API_KEY="your_actual_openai_api_key"
  - Create a .env file within this directory (LeanCopilot/atropos/environments/lean_proof_env/.env) with the following content:
      OPENAI_API_KEY=your_actual_openai_api_key
    The script will automatically load this if the environment variable is not set.
- Lean Installation: Ensure Lean 4 is installed and the lean executable is in your system's PATH. You can find installation instructions on the official Lean website.
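The prerequisites above can be checked before launching a run. The following is a minimal sketch using only the standard library. The helper names (read_dotenv_key, resolve_api_key, lean_on_path) are hypothetical and not part of lean_proof_env.py, and the small .env parser is a simplified stand-in for python-dotenv, assuming a plain KEY=VALUE file.

```python
import os
import shutil
from pathlib import Path


def read_dotenv_key(path, name):
    """Return the value of `name` from a simple KEY=VALUE .env file, or None.

    Simplified stand-in for python-dotenv (assumption: no multi-line values).
    """
    p = Path(path)
    if not p.is_file():
        return None
    for raw in p.read_text().splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == name:
            return value.strip().strip('"').strip("'")
    return None


def resolve_api_key(env=None, dotenv_path=".env"):
    """Prefer the environment variable; fall back to the .env file."""
    env = os.environ if env is None else env
    return env.get("OPENAI_API_KEY") or read_dotenv_key(dotenv_path, "OPENAI_API_KEY")


def lean_on_path():
    """Return True if a `lean` executable can be found on PATH."""
    return shutil.which("lean") is not None


if __name__ == "__main__":
    print("OPENAI_API_KEY available:", resolve_api_key() is not None)
    print("lean on PATH:", lean_on_path())
```

Run from the environment's directory so that the relative .env path resolves to the file described above.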
Running the Environment
Make sure your current working directory is this environment's directory (LeanCopilot/atropos/environments/lean_proof_env/).
Run the environment script using the process command provided by Atropos BaseEnv:
python lean_proof_env.py process \
--env.lean_problem_dataset_name="internal_simple_test" \
--env.total_steps=5 \
--env.group_size=1 \
--env.wandb_name="lean_simple_test_run" \
--openai.model_name="gpt-4o"
Command-Line Arguments:
- --env.lean_problem_dataset_name:
  - "internal_simple_test": Uses a small set of hardcoded simple theorems (defined within the script).
  - "Tonic/MiniF2F": Attempts to load the MiniF2F benchmark from Hugging Face datasets. You might need to specify --env.lean_problem_dataset_split (e.g., "test" or "train").
- --env.total_steps: Number of problems (or items) to process. The internal_simple_test set has 8 problems; if total_steps is less, a random subset will be chosen for that many steps.
- --env.group_size: Number of LLM attempts per problem.
- --env.wandb_name: Name for the WandB run if use_wandb is enabled (e.g., by passing --env.use_wandb=True).
- --openai.model_name: The OpenAI model to use (e.g., "gpt-4o", "gpt-4-turbo", "gpt-3.5-turbo"). This overrides the default in config_init.
- --openai.api_key: Can be passed explicitly, but using the environment variable or .env file is recommended and more secure.
- --openai.base_url: If using a custom OpenAI-compatible API endpoint.
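When launching several runs (for example, switching between the internal test set and MiniF2F), it can help to assemble the command programmatically. The sketch below uses only the flags listed above. The function build_process_command is a hypothetical helper, not part of the environment, and it prints the command rather than executing it.

```python
import shlex
import sys


def build_process_command(dataset, total_steps, group_size, model_name,
                          split=None, wandb_name=None):
    """Assemble the argv list for `lean_proof_env.py process`."""
    argv = [
        sys.executable, "lean_proof_env.py", "process",
        f"--env.lean_problem_dataset_name={dataset}",
        f"--env.total_steps={total_steps}",
        f"--env.group_size={group_size}",
        f"--openai.model_name={model_name}",
    ]
    # The split flag is only needed for datasets that have splits.
    if split is not None:
        argv.append(f"--env.lean_problem_dataset_split={split}")
    # Naming a WandB run only matters when WandB logging is enabled.
    if wandb_name is not None:
        argv += ["--env.use_wandb=True", f"--env.wandb_name={wandb_name}"]
    return argv


if __name__ == "__main__":
    cmd = build_process_command("Tonic/MiniF2F", total_steps=5, group_size=1,
                                model_name="gpt-4o", split="test")
    # Print for inspection; pass `cmd` to subprocess.run to actually launch it.
    print(shlex.join(cmd))
```

Building the arguments as a list avoids shell-quoting problems with dataset names that contain slashes or spaces.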
The script will generate a .jsonl file with the trajectories and an HTML visualization of the rollouts. By default, these are saved in a data/ subdirectory created within the current working directory (e.g., LeanCopilot/atropos/environments/lean_proof_env/data/).
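To take a first look at the generated trajectories, something like the following sketch may be useful. It assumes only that the output is standard JSON Lines (one JSON value per line) in the data/ subdirectory. The record schema is not documented here, so the code lists the top-level keys instead of assuming field names. The helper names are hypothetical.

```python
import json
from pathlib import Path


def latest_jsonl(data_dir="data"):
    """Return the most recently modified .jsonl file in data_dir, or None."""
    files = sorted(Path(data_dir).glob("*.jsonl"), key=lambda p: p.stat().st_mtime)
    return files[-1] if files else None


def load_records(path):
    """Parse one JSON value per non-empty line."""
    records = []
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


if __name__ == "__main__":
    path = latest_jsonl()
    if path is None:
        print("No .jsonl files found in data/")
    else:
        records = load_records(path)
        print(f"{path}: {len(records)} records")
        if records and isinstance(records[0], dict):
            # Field names depend on the environment; list them rather than assume.
            print("Top-level keys:", sorted(records[0].keys()))
```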
Customization
- Problem Sets: Modify the setup() method in lean_proof_env.py to load different Lean problems or datasets.
- LLM Prompts: Adjust the get_next_item() method in lean_proof_env.py to change the prompts sent to the LLM.
- Configuration: Default configurations are defined in the LeanProofEnvConfig class within lean_proof_env.py. Most of these can be overridden by command-line arguments (e.g., --env.max_proof_generation_tokens=256).
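For orientation when adding problems of your own, the block below shows the general shape of a simple Lean 4 theorem. These two statements are hypothetical illustrations and are not taken from the internal_simple_test set. How the script stores each problem (statement only, or statement plus a reference proof) is not specified here, so treat the layout as an assumption.

```lean
-- Hypothetical example: a statement about natural numbers, closed by `rfl`
-- because `n + 0` reduces to `n` by definition.
theorem add_zero_example (n : Nat) : n + 0 = n := by
  rfl

-- Hypothetical example: swapping the components of a conjunction.
theorem and_swap_example (p q : Prop) (h : p ∧ q) : q ∧ p := by
  exact ⟨h.2, h.1⟩
```

Both proofs use only core Lean 4 and need no additional libraries, which keeps checking with the lean executable fast.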