docs: Update installation instructions in eval README

Andreas Koepf (aider) 2025-02-25 15:28:12 +01:00 committed by Andreas Koepf
parent a1b0a0414e
commit e48c1f82cd
2 changed files with 12 additions and 5 deletions

@@ -18,17 +18,22 @@ This framework provides tools to evaluate language models on the reasoning_gym d
 ## Setup
-1. Install the required dependencies:
+1. Install reasoning-gym in development mode:
 ```bash
-pip install -r requirements.txt
+pip install -e ..
 ```
-2. Set your OpenRouter API key as an environment variable:
+2. Install the additional dependencies required for evaluation:
+```bash
+pip install -r requirements-eval.txt
+```
+3. Set your OpenRouter API key as an environment variable:
 ```bash
 export OPENROUTER_API_KEY=your-api-key
 ```
-3. Prepare your dataset configuration in JSON format (e.g., `eval_basic.json`):
+4. Prepare your dataset configuration in JSON format (e.g., `eval_basic.json`):
 ```json
 [
 {
@@ -47,9 +52,11 @@ You can run evaluations in two ways:
 1. Using the provided bash script:
 ```bash
-./run_eval.sh
+./eval.sh
 ```
+Before running, you may want to edit the `eval.sh` script to configure which models to evaluate by modifying the `MODELS` array.
 2. Running the Python script directly:
 ```bash
 python eval.py --model "model-name" --config "eval_basic.json" --output-dir "results"