Update BLEUBERI README with OpenAI API instructions and remove redundant reward functions

This commit is contained in:
Allan Niemerg 2025-06-09 07:07:28 -05:00
parent a520f5f663
commit 3109fe349b
2 changed files with 51 additions and 198 deletions


@@ -20,13 +20,60 @@ BLEUBERI uses BLEU scores (a simple n-gram matching metric) directly as rewards
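For intuition, the BLEU-as-reward idea can be sketched as a toy, smoothed sentence-level BLEU in pure Python. This is an illustration only, not the environment's actual scoring code, which may differ in tokenization and smoothing:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Toy sentence-level BLEU: smoothed n-gram precision with a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_prec = 0.0
    for n in range(1, max_n + 1):
        c_grams, r_grams = ngrams(cand, n), ngrams(ref, n)
        overlap = sum((c_grams & r_grams).values())  # clipped n-gram matches
        total = max(sum(c_grams.values()), 1)
        # Add-one smoothing so one empty n-gram order does not zero the score.
        log_prec += math.log((overlap + 1) / (total + 1))
    # Brevity penalty discourages overly short candidates.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(log_prec / max_n)
```

A higher score means more n-gram overlap with the reference, which is exactly the signal used as the reward.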
## Usage
```bash
# Run the BLEUBERI environment as a service
python -m environments.bleuberi.bleuberi_env serve --config environments/bleuberi/configs/default.yaml
# Generate data with pre-collected references (for testing and debugging)
python -m environments.bleuberi.bleuberi_env process --config environments/bleuberi/configs/default.yaml --env.data_path_to_save_groups bleuberi_rollouts.jsonl
```
## Testing with OpenAI API
The BLEUBERI environment can be tested with the OpenAI API or any OpenAI-compatible API server. The API key is loaded securely from environment variables:
1. Set your OpenAI API key as an environment variable:
```bash
export OPENAI_API_KEY=your-api-key
```
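The key is read from the process environment at startup. A minimal sketch of that pattern (the helper name `load_api_key` is illustrative, not the environment's actual function):

```python
import os

def load_api_key(var: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment, failing loudly if it is unset."""
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(f"{var} is not set; export it before starting the environment")
    return key
```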
2. Create or modify a configuration file for OpenAI (e.g., `environments/bleuberi/configs/openai.yaml`):
```yaml
env:
  # Standard environment configuration
  wandb_name: bleuberi
  dataset_name: "allenai/tulu-3-sft-mixture"
  reward_funcs:
    - "bleu"
  ref_models:
    - "gold"

openai:
  base_url: "https://api.openai.com/v1"  # Or your custom server URL
  model: "gpt-4o"  # Or your preferred model
  temperature: 0.7
  max_tokens: 1024
  top_p: 0.95
```
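To sanity-check a config before launching, it can be parsed with PyYAML (assuming PyYAML is installed; the keys below mirror the example above):

```python
import yaml  # PyYAML

CONFIG = """
env:
  wandb_name: bleuberi
  reward_funcs:
    - "bleu"
openai:
  base_url: "https://api.openai.com/v1"
  model: "gpt-4o"
"""

# Parse the YAML and confirm the sections the environment expects are present.
cfg = yaml.safe_load(CONFIG)
print(cfg["openai"]["model"], cfg["env"]["reward_funcs"])
```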
3. Run the environment in process mode to test with OpenAI:
```bash
python -m environments.bleuberi.bleuberi_env process \
--config environments/bleuberi/configs/openai.yaml \
--env.data_path_to_save_groups bleuberi_openai_test.jsonl
```
This will create two files:
- `bleuberi_openai_test.jsonl`: Raw data containing prompts, responses, and scores
- `bleuberi_openai_test.html`: A visual representation of the interactions for easy review
4. For local inference server testing:
- Set `base_url` to your local server (e.g., "http://localhost:8000/v1")
- Specify the model name as expected by your server
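For instance, the `openai` block for a local server might look like this (the model name shown is a hypothetical placeholder; use whatever name your server exposes):

```yaml
openai:
  base_url: "http://localhost:8000/v1"
  model: "my-local-model"  # hypothetical; match your server's registered name
```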
5. For custom reference models:
- Configure `ref_models` in the YAML to use specific models
- Available options include: gold (default), claude-3-7-sonnet@20250219, deepseek-chat-v3, gemini-2.5-pro-exp-03-25, o4-mini-2025-04-16, Llama-3.1-8B-Instruct
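For example, to score against both the gold references and one of the listed models, the `env` block can be extended like so (same schema as the config above):

```yaml
env:
  ref_models:
    - "gold"
    - "deepseek-chat-v3"
```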
## Configuration
See the `configs/` directory for example configurations. The environment supports: