readme updates for tool calling

dmahan93 2026-03-03 12:22:10 -06:00
parent 8f21bb57ed
commit c8eb63f33d


@ -8,6 +8,91 @@ For automatic token and logprob tracking, see the [ManagedServer Guide](MANAGED_
> **Note:** OpenAI endpoints do not support token IDs/logprobs required for ManagedServer. Set `ATROPOS_ALLOW_DUMMY_MANAGED_SERVER=1` to use a placeholder implementation for testing/evaluation. See [OpenAI Endpoint Limitations](MANAGED_SERVER.md#openai-endpoint-limitations) for details.
## Tool Call Support
ManagedServer supports OpenAI-style tool calling via vLLM's tool parsers. Pass `tool_parser` at init:
```python
server_manager = ServerManager(
    configs=[APIServerConfig(...)],
    tool_parser="hermes",  # or llama3_json, mistral, deepseek_v3, qwen3_coder, etc.
)

async with server_manager.managed_server(tokenizer=tokenizer) as managed:
    result = await managed.chat_completion(
        messages=[{"role": "user", "content": "What's the weather?"}],
        tools=[{
            "type": "function",
            "function": {
                "name": "get_weather",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                },
            },
        }],
        tool_choice="auto",  # "auto", "none", or "required"
    )
    # Structured tool_calls in the response
    if result.choices[0].message.tool_calls:
        print(result.choices[0].message.tool_calls)
    # Nodes still carry the raw text with <tool_call> tags for training
    nodes = managed.get_state()["nodes"]
```
Requires `vllm` to be installed. Without it, tool parsing is disabled with a warning; everything else still works.
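A typical follow-up to a tool-call response is to run the requested tool locally and append the result as a `tool` message before the next `chat_completion` call. A minimal sketch (the `TOOL_HANDLERS` table and `tool_result_message` helper are illustrative, not part of ManagedServer, and the sketch assumes OpenAI-style dict-shaped tool calls):

```python
import json

# Illustrative local tool implementations; these names are examples only.
TOOL_HANDLERS = {
    "get_weather": lambda args: {"city": args["city"], "forecast": "sunny"},
}

def tool_result_message(tool_call: dict) -> dict:
    """Run one OpenAI-style tool call locally and build the `tool`-role
    message to append to the conversation before the next turn."""
    name = tool_call["function"]["name"]
    args = json.loads(tool_call["function"]["arguments"])
    result = TOOL_HANDLERS[name](args)
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": json.dumps(result),
    }
```

The returned message follows the OpenAI convention of echoing the call's `id` as `tool_call_id`, which is how the model matches results to requests on the next turn.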
## OpenAI Proxy
Exposes ManagedServer as an OpenAI-compatible HTTP API for external tools (CLIs, GUIs, microservices).
### Standalone
```bash
python -m atroposlib.envs.server_handling.managed_server_proxy \
    --config servers.json --port 9100
```
`servers.json`:
```json
{
  "model_name": "Qwen/Qwen3-4B",
  "servers": [
    {"base_url": "http://gpu1:8000/v1", "server_type": "vllm"},
    {"base_url": "http://gpu2:8000/v1", "server_type": "vllm"}
  ]
}
```
### Endpoints
| Method | Path | Description |
|--------|------|-------------|
| POST | `/sessions/create` | Create session. Optional `base_url` to pin to a server, `tool_parser` name. |
| POST | `/{uuid}/v1/chat/completions` | OpenAI chat completions (with tools support). |
| POST | `/{uuid}/v1/chat/completions/render` | Preview rendered prompt without generating. |
| GET | `/{uuid}/nodes` | Get tracked tokens/logprobs/masks for training. |
| DELETE | `/{uuid}` | Cleanup session. |
| GET | `/sessions` | List active sessions. |
| GET | `/servers` | List backend servers. |
| POST | `/setup` | Push server config (used by ServerManager). |
| GET | `/v1/models` | List models. |
| GET | `/health` | Health check. |
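External tools can drive these endpoints with a plain HTTP client. A minimal standard-library sketch (the proxy address, the request bodies, and the `uuid` field in the create response are assumptions based on the table above, not confirmed API details):

```python
import json
import urllib.request

PROXY = "http://localhost:9100"  # assumed proxy address

def post_json(url: str, payload: dict) -> dict:
    """POST a JSON body and decode the JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

def chat_url(session_id: str) -> str:
    """Per-session chat completions path, following the table above."""
    return f"{PROXY}/{session_id}/v1/chat/completions"

# Lifecycle sketch (requires a running proxy, so not executed here):
# session = post_json(f"{PROXY}/sessions/create", {"tool_parser": "hermes"})
# reply = post_json(chat_url(session["uuid"]),
#                   {"messages": [{"role": "user", "content": "hi"}]})
```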
### Via ServerManager
```python
server_manager = ServerManager(
    configs=[APIServerConfig(...)],
    proxy_url="http://localhost:9100",  # auto-enables proxy mode
    tool_parser="hermes",
)

# managed_server() now routes through the proxy
async with server_manager.managed_server(tokenizer=tokenizer) as managed:
    result = await managed.chat_completion(messages=[...], tools=[...])
    url = managed.get_url()  # "http://localhost:9100/{uuid}/v1" — hand to external apps
    nodes = await managed.fetch_state()  # get tokens/logprobs
```
Alternatively, set the `ATROPOS_PROXY_URL=http://localhost:9100` environment variable instead of passing `proxy_url`.
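The environment-variable form amounts to setting the variable before constructing ServerManager. A sketch, assuming the variable is read at construction time:

```python
import os

# Same effect as passing proxy_url="http://localhost:9100" explicitly,
# assuming ServerManager reads ATROPOS_PROXY_URL when it is constructed.
os.environ["ATROPOS_PROXY_URL"] = "http://localhost:9100"
```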
## Reasoning Model Support
The `ReasoningConfig` class enables support for reasoning/thinking models across different providers.