server: split HTTP into its own interface (#17216)

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-19 11:57:07 +00:00

* server: split HTTP into its own interface

* move server-http and httplib to its own file

* add the remaining endpoints

* fix exception/error handling

* renaming

* missing header

* fix missing windows header

* fix error responses from http layer

* fix slot save/restore handler

* fix case where only one stream chunk is returned

* add NOMINMAX

* do not call sink.write on empty data

* use safe_json_to_str for SSE

* clean up

* add some comments

* improve usage of next()

* bring back the "server is listening on" message

* more generic handler

* add req.headers

* move the chat template print to init()

* add req.path

* cont : minor

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
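
For illustration only, here is a minimal sketch of what splitting HTTP into its own interface can look like. It is not the code from this commit. Every name in it (`http_request`, `http_response`, `handler_t`, `send_response`, `demo_handler`) is hypothetical and loosely inspired by the bullet points above (`req.path`, `req.headers`, `next()`, skipping writes of empty data). Handlers see only plain structs, and the transport drains a streamed response through a callback.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical transport-neutral request: only what a handler needs to see.
struct http_request {
    std::string path;
    std::map<std::string, std::string> headers;
    std::string body;
};

// Hypothetical response. If `next` is set, the response is streamed:
// each call fills `out` with one chunk and returns true while more chunks follow.
struct http_response {
    int status = 200;
    std::string data; // initial (or only) payload
    std::function<bool(std::string & out)> next;
};

// A handler knows nothing about the HTTP library that carries its output.
using handler_t = std::function<std::unique_ptr<http_response>(const http_request &)>;

// Transport side: forward the response through `write`, never writing an
// empty chunk. Returns the number of writes actually performed.
std::size_t send_response(http_response & res,
                          const std::function<void(const std::string &)> & write) {
    assert(write && "write callback must be set");
    std::size_t n_writes = 0;
    auto emit = [&](const std::string & chunk) {
        if (!chunk.empty()) {
            write(chunk);
            ++n_writes;
        }
    };
    emit(res.data);
    if (res.next) {
        std::string chunk;
        bool more = true;
        while (more) {
            chunk.clear();
            more = res.next(chunk); // the chunk from the final call is still emitted
            emit(chunk);
        }
    }
    return n_writes;
}

// Example handler: echoes the request path, then streams "a", "" and "b".
std::unique_ptr<http_response> demo_handler(const http_request & req) {
    auto res = std::make_unique<http_response>();
    res->data = "path=" + req.path + "\n";
    res->next = [chunks = std::vector<std::string>{"a", "", "b"},
                 i = std::size_t{0}](std::string & out) mutable {
        out = chunks[i++];
        return i < chunks.size();
    };
    return res;
}
```

In this sketch, a handler such as `demo_handler` can be exercised without any socket by passing `send_response` a callback that appends to a string. Making that kind of testing possible is one common reason to separate the transport layer from the request handlers.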

Xuan-Son Nguyen, 2025-11-17 22:05:44 +01:00, committed by GitHub

parent 38e2c1b412

commit 0de8878c96

5 changed files with 1245 additions and 930 deletions

1677 tools/server/server.cpp

File diff suppressed because it is too large
