dmahan93
f198c1738e
Merge conflict commit
2026-03-09 23:13:43 -05:00
dmahan93
f4875c5dc6
make preserve thinking optional
2026-03-04 15:44:12 -06:00
Jai Suphavadeeprasit
1eeb31065f
fixing comments
2026-03-03 23:16:05 -05:00
Jai Suphavadeeprasit
439b9b129b
prompt logprobs
2026-03-03 21:58:05 -05:00
Jai Suphavadeeprasit
b9291aa29f
init commit
2026-03-03 11:32:09 -05:00
dmahan93
add42a2afb
add tool call parsing based on vllm impl and an openai server endpoint
2026-03-02 23:17:13 -06:00
Dakota
7d6aeb9bbf
add tokenizer name config to set the vllm/sglang tokenizer to something different if needed
2026-02-09 15:26:29 -06:00
Dakota
10f651289c
Add dummy openai managed server
2026-02-04 15:16:36 -06:00
Teknium
837fc237ee
Merge branch 'main' into add_reasoning_handling_draft
2026-01-12 09:45:38 -08:00
teknium
21504537fc
revive _get_server_base_url
2026-01-12 16:49:38 +00:00
teknium
e1ece3e64e
Add reasoning configuration support across server implementations
...
- Updated server classes (OpenAIServer, SGLangServer, TrlVllmServer, VLLMServer) to accept a ReasoningConfig parameter during initialization.
- Enhanced ReasoningConfig to allow flexible max_tokens without strict validation, accommodating varying provider limits.
- Implemented reasoning configuration injection in APIServer methods for chat and completion handling.
- Updated tests to reflect changes in max_tokens validation logic.
This commit integrates reasoning capabilities into the server handling architecture, improving compatibility with diverse reasoning models.
2026-01-05 23:20:01 +00:00
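The reasoning-config wiring this commit describes can be sketched roughly as below. Only `ReasoningConfig` and the server class names come from the commit message; the field names (`effort`, `max_tokens`), the constructor signature, and the injected request keys are assumptions for illustration, not the actual API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReasoningConfig:
    # Per the commit, max_tokens is deliberately not strictly validated,
    # since reasoning-token limits vary across providers.
    effort: Optional[str] = None
    max_tokens: Optional[int] = None


class OpenAIServer:
    """Hypothetical shape of the change: each server class accepts an
    optional ReasoningConfig at initialization."""

    def __init__(self, reasoning_config: Optional[ReasoningConfig] = None):
        self.reasoning_config = reasoning_config

    def _inject_reasoning(self, request: dict) -> dict:
        # Injection step for chat/completion handling: copy reasoning
        # fields into the outgoing request when a config is present.
        if self.reasoning_config is not None:
            if self.reasoning_config.max_tokens is not None:
                request["max_reasoning_tokens"] = self.reasoning_config.max_tokens
            if self.reasoning_config.effort is not None:
                request["reasoning_effort"] = self.reasoning_config.effort
        return request
```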
pre-commit-ci[bot]
97047eee7b
[pre-commit.ci] auto fixes from pre-commit.com hooks
...
for more information, see https://pre-commit.ci
2025-12-30 00:26:33 +00:00
teknium
62fa51240c
Add support for reasoning models and their variety of providers/endpoints
2025-12-30 00:23:00 +00:00
Dakota
8ec5066998
add eval runner
2025-12-19 19:56:59 -06:00
Dakota
e6ac3abdcb
add managed vllm server
2025-11-07 13:06:49 -06:00
dmahan93
7bf4cfbf80
add managed server to make grabbing logprobs easier w/ tokenized items
2025-10-24 13:09:46 -07:00
pre-commit-ci[bot]
1e6a745491
[pre-commit.ci] auto fixes from pre-commit.com hooks
...
for more information, see https://pre-commit.ci
2025-10-16 17:39:04 +00:00
Dakota
c36ec29656
add sglang specific token level logprob handling and server manager/baseline logprob/token fn
2025-10-16 12:38:03 -05:00
Alexey Gorbatovski
35c542328a
Fix infinite loop in wait_for_sem by updating semaphore values inside loop
2025-07-06 00:27:45 +03:00
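The bug this commit fixes is a classic stale-read loop: if the semaphore values are read once before the loop, the exit condition can never change. A minimal sketch of the corrected pattern, with `get_sem_values` and `target` as hypothetical names (the real `wait_for_sem` signature is not shown in the log):

```python
import time


def wait_for_sem(get_sem_values, target: int, poll_interval: float = 0.1):
    """Block until the combined semaphore count reaches `target`.

    The fix noted in the commit: the semaphore values must be re-read on
    every iteration. Hoisting the read above the loop freezes the
    condition and produces an infinite loop.
    """
    while True:
        values = get_sem_values()  # refreshed inside the loop (the fix)
        if sum(values) >= target:
            return values
        time.sleep(poll_interval)
```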
dmahan93
44b96c7b6c
Add max_n_completions parameter to ServerManager for handling multiple completions
...
- Introduced max_n_completions configuration to limit the number of completions requested per server call.
- Updated chat_completion and completion methods to split requests exceeding max_n_completions into multiple calls, merging results accordingly.
- Enhanced documentation for max_n_completions in ServerManagerConfig.
2025-06-02 11:11:55 -05:00
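The splitting strategy in the commit above can be sketched as follows. `max_n_completions` is the parameter named in the commit; the helper function itself is hypothetical, showing only the chunking arithmetic (each oversized request becomes several calls whose results are merged in order):

```python
def split_completion_request(n: int, max_n_completions: int) -> list[int]:
    """Split a request for n completions into per-call chunk sizes,
    none exceeding max_n_completions."""
    full, rem = divmod(n, max_n_completions)
    chunks = [max_n_completions] * full
    if rem:
        chunks.append(rem)
    return chunks


# A request for 10 completions with a per-call cap of 4 becomes
# three server calls: 4 + 4 + 2.
print(split_completion_request(10, 4))  # [4, 4, 2]
```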
dmahan93
f7552c9c6f
make default not slurm
2025-05-13 13:11:04 -05:00
dmahan93
8b864e9786
move server_type to serverbaseline instead so it can be used as well for server instantiation
2025-05-13 10:21:58 -05:00
dmahan93
6fc356e76e
fix type checking to instantiate an appropriate class instead of the abstract server class
2025-05-13 10:09:36 -05:00
dmahan93
df62979b90
refactor to not mess up process...
2025-05-13 09:22:07 -05:00
dmahan93
6e9405ba95
Fix bad merge
2025-05-12 20:02:54 -05:00
dmahan93
0aaf59fc9a
add trl server
...
add gsm8k example for axolotl checking
2025-05-12 19:04:46 -05:00
dmahan93
96be544228
Merge commit '71e7a5ca27' into add-support-for-custom-api-servers
2025-05-12 18:40:35 -05:00
dmahan93
8ff48065a3
Update server_manager.py to not continue to API config stuff if serverbaseline is set
2025-05-08 20:18:15 -05:00
dmahan93
70cf61c210
add custom server support
2025-05-08 12:01:49 -05:00
Dakota Nous
621d00dd80
first commit
2025-04-29 12:10:10 -07:00