llama.cpp/examples/server/server.cpp at 04ce3a8b19256a155aea4d14eaa87edf274c93c3

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-01 09:01:57 +00:00

Bjarke Viksøe cb4d86c4d7 server: Retrieve prompt template in /props (#8337)

* server: Retrieve prompt template in /props

This PR adds the following:
- Expose the model's Jinja2 prompt template in the /props endpoint.
- Lower the log level of the template-mismatch message from Error to Warning.

The front end stands a better chance of executing the Jinja template correctly; the server currently just guesses the template format.

Ideally, this would be part of a JSON block that exposes the same key/value pairs as those listed during startup by the "llm_load_print_meta" function.

* Make string buffer dynamic

* Add doc and better string handling

* Using chat_template naming convention

* Use intermediate vector for string assignment
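
The "dynamic string buffer" and "intermediate vector" steps above can be illustrated with a minimal sketch. It is not the code from this commit: meta_val_str is a hypothetical stand-in for a C-style metadata getter with snprintf-like semantics, the std::map stands in for the loaded model's metadata, and get_chat_template is an illustrative helper name. The key "tokenizer.chat_template" is the GGUF metadata key for the chat template; that the commit reads exactly this key in this way is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a C-style metadata getter (not the llama.cpp API).
// Semantics mirror snprintf: returns the full length of the value (excluding
// the terminator), or -1 if the key is missing. With buf == nullptr and
// buf_size == 0 it only reports the required length.
static int32_t meta_val_str(const std::map<std::string, std::string> & meta,
                            const char * key, char * buf, size_t buf_size) {
    const auto it = meta.find(key);
    if (it == meta.end()) {
        return -1;
    }
    return static_cast<int32_t>(std::snprintf(buf, buf_size, "%s", it->second.c_str()));
}

// Illustrative helper: fetch the chat template without a fixed-size buffer.
static std::string get_chat_template(const std::map<std::string, std::string> & meta) {
    const char * key = "tokenizer.chat_template";

    // First call: ask only for the required length.
    const int32_t len = meta_val_str(meta, key, nullptr, 0);
    if (len <= 0) {
        return "";  // missing or empty template
    }

    // Second call: write into an intermediate vector sized for value + '\0'.
    std::vector<char> buf(static_cast<size_t>(len) + 1, 0);
    const int32_t written = meta_val_str(meta, key, buf.data(), buf.size());
    assert(written == len);
    (void) written;  // silence unused warning when NDEBUG is defined

    // Assign from the vector, excluding the terminator.
    return std::string(buf.data(), static_cast<size_t>(len));
}
```

Sizing the vector from the first call avoids truncating long templates, which a fixed-size stack buffer would do silently. Building the std::string from the vector keeps the write target a plain, mutable char array.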

2024-07-07 11:10:38 +02:00

135 KiB
