mirror of https://github.com/NousResearch/atropos.git
synced 2026-04-28 17:29:30 +00:00
Update BLEUBERI README with OpenAI API instructions and remove redundant reward functions
parent a520f5f663
commit 3109fe349b
2 changed files with 51 additions and 198 deletions
@@ -20,13 +20,60 @@ BLEUBERI uses BLEU scores (a simple n-gram matching metric) directly as rewards
 ## Usage
 
 ```bash
-# Run the BLEUBERI environment
-python -m atroposlib.cli.dpo --env-module environments.bleuberi.bleuberi_env
+# Run the BLEUBERI environment as a service
+python -m environments.bleuberi.bleuberi_env serve --config environments/bleuberi/configs/default.yaml
 
-# Generate data with pre-collected references
-python -m environments.bleuberi.bleuberi_env process --config environments/bleuberi/configs/default.yaml
+# Generate data with pre-collected references (for testing and debugging)
+python -m environments.bleuberi.bleuberi_env process --config environments/bleuberi/configs/default.yaml --env.data_path_to_save_groups bleuberi_rollouts.jsonl
 ```
 
+## Testing with OpenAI API
+
+The BLEUBERI environment can be tested with OpenAI API or any compatible API server. The API key is loaded securely from environment variables:
+
+1. Set your OpenAI API key as an environment variable:
+```bash
+export OPENAI_API_KEY=your-api-key
+```
+
+2. Create or modify a configuration file for OpenAI (e.g., `environments/bleuberi/configs/openai.yaml`):
+```yaml
+env:
+  # Standard environment configuration
+  wandb_name: bleuberi
+  dataset_name: "allenai/tulu-3-sft-mixture"
+  reward_funcs:
+    - "bleu"
+  ref_models:
+    - "gold"
+
+openai:
+  base_url: "https://api.openai.com/v1" # Or your custom server URL
+  model: "gpt-4o" # Or your preferred model
+  temperature: 0.7
+  max_tokens: 1024
+  top_p: 0.95
+```
+
+3. Run the environment in process mode to test with OpenAI:
+```bash
+python -m environments.bleuberi.bleuberi_env process \
+--config environments/bleuberi/configs/openai.yaml \
+--env.data_path_to_save_groups bleuberi_openai_test.jsonl
+```
+
+This will create two files:
+- `bleuberi_openai_test.jsonl`: Raw data containing prompts, responses, and scores
+- `bleuberi_openai_test.html`: A visual representation of the interactions for easy review
+
+4. For local inference server testing:
+- Set `base_url` to your local server (e.g., "http://localhost:8000/v1")
+- Specify the model name as expected by your server
+
+5. For custom reference models:
+- Configure `ref_models` in the YAML to use specific models
+- Available options include: gold (default), claude-3-7-sonnet@20250219, deepseek-chat-v3, gemini-2.5-pro-exp-03-25, o4-mini-2025-04-16, Llama-3.1-8B-Instruct
+
 ## Configuration
 
 See the `configs/` directory for example configurations. The environment supports:
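
Note on the metric named in the hunk header: the following is a minimal sketch of sentence-level BLEU (clipped n-gram precision combined with a brevity penalty) against a single reference. It is illustrative only. `simple_bleu` and `ngrams` are hypothetical names, tokenization is plain whitespace splitting, and no smoothing is applied; the environment's actual reward code may rely on a library implementation with different tokenization and smoothing.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate, reference, max_n=4):
    """Unsmoothed sentence-level BLEU against one reference, in [0, 1]."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or matched == 0:
            return 0.0  # no smoothing: one empty precision zeroes the score
        log_precisions.append(math.log(matched / total))
    # Brevity penalty discourages candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return bp * math.exp(sum(log_precisions) / max_n)
```

Because the score is unsmoothed, any response shorter than four tokens, or one sharing no 4-gram with the reference, scores zero. Library implementations often smooth for this reason.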
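
Note on step 3 of the added instructions: the `.jsonl` output can be inspected with a few lines of Python. This sketch assumes each line is a JSON object holding a list of numeric rewards under a `scores` key. That key name and the `summarize_scores` helper are assumptions, not confirmed by the README, so check one record before relying on it.

```python
import json
from statistics import mean


def summarize_scores(path, score_key="scores"):
    """Return (number of records, mean score) for a JSONL rollout file."""
    all_scores = []
    groups = 0
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            record = json.loads(line)
            groups += 1
            # Assumed layout: each record stores a list of numeric scores.
            all_scores.extend(record.get(score_key) or [])
    return groups, (mean(all_scores) if all_scores else None)
```

Calling `summarize_scores("bleuberi_openai_test.jsonl")` after the process run would return the record count and the average reward, or `None` for the average if no scores were found under the assumed key.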