mirror of
https://github.com/NousResearch/atropos.git
synced 2026-04-19 12:57:58 +00:00
readme updates for tool calling
This commit is contained in:
parent
8f21bb57ed
commit
c8eb63f33d
1 changed file with 85 additions and 0 deletions
@@ -8,6 +8,91 @@ For automatic token and logprob tracking, see the [ManagedServer Guide](MANAGED_
> **Note:** OpenAI endpoints do not support token IDs/logprobs required for ManagedServer. Set `ATROPOS_ALLOW_DUMMY_MANAGED_SERVER=1` to use a placeholder implementation for testing/evaluation. See [OpenAI Endpoint Limitations](MANAGED_SERVER.md#openai-endpoint-limitations) for details.
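For quick testing against an OpenAI endpoint, the flag can be set in-process before the server manager is constructed; a minimal sketch:

```python
import os

# Opt in to the placeholder ManagedServer implementation for
# testing/evaluation against OpenAI endpoints, which do not
# expose the token IDs/logprobs ManagedServer normally requires.
os.environ["ATROPOS_ALLOW_DUMMY_MANAGED_SERVER"] = "1"
```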
## Tool Call Support
ManagedServer supports OpenAI-style tool calling via vLLM's tool parsers. Pass `tool_parser` at init:
```python
server_manager = ServerManager(
    configs=[APIServerConfig(...)],
    tool_parser="hermes",  # or llama3_json, mistral, deepseek_v3, qwen3_coder, etc.
)

async with server_manager.managed_server(tokenizer=tokenizer) as managed:
    result = await managed.chat_completion(
        messages=[{"role": "user", "content": "What's the weather?"}],
        tools=[{
            "type": "function",
            "function": {
                "name": "get_weather",
                "parameters": {"type": "object", "properties": {"city": {"type": "string"}}},
            },
        }],
        tool_choice="auto",  # "auto", "none", "required"
    )

    # Structured tool_calls in response
    if result.choices[0].message.tool_calls:
        print(result.choices[0].message.tool_calls)

    # Nodes still have raw text with <tool_call> tags for training
    nodes = managed.get_state()["nodes"]
```
Requires `vllm` to be installed. Without it, tool parsing is disabled with a warning; everything else still works.
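For intuition about the raw text that nodes retain, here is a rough sketch of pulling hermes-style `<tool_call>{...}</tool_call>` payloads out of a completion string. The example string is hypothetical, and in practice this extraction is done by vLLM's tool parsers, not hand-rolled regexes:

```python
import json
import re

# Hypothetical raw completion text as it might appear in a node,
# assuming the hermes-style <tool_call> wire format.
raw = (
    "Let me check that for you.\n"
    "<tool_call>\n"
    '{"name": "get_weather", "arguments": {"city": "Paris"}}\n'
    "</tool_call>"
)

# Naive extraction of the JSON payloads between <tool_call> tags.
calls = [
    json.loads(m.group(1))
    for m in re.finditer(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", raw, re.DOTALL)
]
print(calls[0]["name"])  # get_weather
```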
## OpenAI Proxy
Exposes ManagedServer as an OpenAI-compatible HTTP API for external tools (CLIs, GUIs, microservices).
### Standalone
```bash
python -m atroposlib.envs.server_handling.managed_server_proxy \
    --config servers.json --port 9100
```
`servers.json`:
```json
{
  "model_name": "Qwen/Qwen3-4B",
  "servers": [
    {"base_url": "http://gpu1:8000/v1", "server_type": "vllm"},
    {"base_url": "http://gpu2:8000/v1", "server_type": "vllm"}
  ]
}
```
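A small sketch of loading and sanity-checking such a config before starting the proxy; the field names are taken from the example above, and the validation checks themselves are illustrative, not part of the proxy:

```python
import json

# Same shape as the servers.json example above.
config_text = """
{
  "model_name": "Qwen/Qwen3-4B",
  "servers": [
    {"base_url": "http://gpu1:8000/v1", "server_type": "vllm"},
    {"base_url": "http://gpu2:8000/v1", "server_type": "vllm"}
  ]
}
"""

config = json.loads(config_text)
assert config["model_name"], "model_name must be set"
assert config["servers"], "at least one backend server is required"
for server in config["servers"]:
    assert server["base_url"].startswith("http"), "base_url must be an HTTP(S) URL"
print(f"{len(config['servers'])} backend server(s) for {config['model_name']}")
```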
### Endpoints
| Method | Path | Description |
|--------|------|-------------|
| POST | `/sessions/create` | Create session. Optional `base_url` to pin to a server, `tool_parser` name. |
| POST | `/{uuid}/v1/chat/completions` | OpenAI chat completions (with tools support). |
| POST | `/{uuid}/v1/chat/completions/render` | Preview rendered prompt without generating. |
| GET | `/{uuid}/nodes` | Get tracked tokens/logprobs/masks for training. |
| DELETE | `/{uuid}` | Cleanup session. |
| GET | `/sessions` | List active sessions. |
| GET | `/servers` | List backend servers. |
| POST | `/setup` | Push server config (used by ServerManager). |
| GET | `/v1/models` | List models. |
| GET | `/health` | Health check. |
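The session lifecycle implied by the table can be sketched as a sequence of plain HTTP requests. The snippet below only assembles the requests (methods, URLs, JSON bodies) without sending them; the body fields beyond the documented paths are assumptions modeled on the OpenAI chat-completions schema:

```python
import json

PROXY = "http://localhost:9100"


def build_session_requests(session_uuid: str):
    """Assemble one proxy session's request sequence as (method, url, body) tuples.

    Bodies are JSON strings or None; the chat payload shape is an assumption
    based on the OpenAI chat-completions API.
    """
    chat_body = {
        "messages": [{"role": "user", "content": "What's the weather?"}],
        "tool_choice": "auto",
    }
    return [
        ("POST", f"{PROXY}/sessions/create", json.dumps({"tool_parser": "hermes"})),
        ("POST", f"{PROXY}/{session_uuid}/v1/chat/completions", json.dumps(chat_body)),
        ("GET", f"{PROXY}/{session_uuid}/nodes", None),
        ("DELETE", f"{PROXY}/{session_uuid}", None),
    ]


for method, url, body in build_session_requests("example-uuid"):
    print(method, url)
```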
### Via ServerManager
```python
server_manager = ServerManager(
    configs=[APIServerConfig(...)],
    proxy_url="http://localhost:9100",  # auto-enables proxy mode
    tool_parser="hermes",
)

# managed_server() now routes through the proxy
async with server_manager.managed_server(tokenizer=tokenizer) as managed:
    result = await managed.chat_completion(messages=[...], tools=[...])
    url = managed.get_url()  # "http://localhost:9100/{uuid}/v1" — hand to external apps
    nodes = await managed.fetch_state()  # get tokens/logprobs
```
Alternatively, set the `ATROPOS_PROXY_URL=http://localhost:9100` environment variable instead of passing `proxy_url`.
## Reasoning Model Support
The `ReasoningConfig` class enables support for reasoning/thinking models across different providers.