mirror of https://github.com/ggml-org/llama.cpp.git

commit b8ad1b66b2

* server: allow json array in prompt or content

  We accept an array of strings and numbers representing tokens, in addition to the current string-valued prompt or content. This allows direct token input: special tokens can be processed and inserted on the frontend while the JSON data is constructed, before it is sent to the server, so the server no longer needs to know about or parse special tokens in textual input. With this, we can use the EOS and BOS tokens that llama-2-chat models expect (see the request sketch after this list).

* server: use tokenizePrompt(json) and default "" if empty prompt
* server: fix prompt check
* server: tokenize endpoint no longer adds BOS
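As a rough illustration of the new request shape, here is a minimal sketch of a completion request whose prompt mixes raw token IDs with plain text. It assumes a llama.cpp server listening on localhost:8080; the token ID 1 (BOS) is the llama-2 value and is only a placeholder, since the actual IDs depend on the loaded model.

```python
# Minimal sketch: POST a /completion request whose "prompt" is a JSON array
# mixing token IDs (numbers) and text (strings). Assumes a llama.cpp server
# on localhost:8080; token ID 1 (BOS) is a llama-2 placeholder and depends
# on the loaded model.
import json
import urllib.request

payload = {
    # BOS token ID first, then plain text; the server tokenizes the string
    # parts and splices the numeric token IDs in directly.
    "prompt": [1, "[INST] Write a haiku about llamas. [/INST]"],
    "n_predict": 64,
}

req = urllib.request.Request(
    "http://localhost:8080/completion",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["content"])
```

Since the tokenize endpoint no longer adds BOS, a client doing its own tokenization has to prepend the BOS ID itself, which the mixed array makes straightforward.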