server : fix context shift (#5195)

* server : fix context shift + simplify self-extend
* server : take system_tokens into account
* server : more n_past fixes
* server : revert n_past_se changes
2025-10-28 08:31:25 +00:00 · 2024-01-30 20:17:30 +02:00
parent 4003be0e5f
commit e6f291d158
2 changed files with 60 additions and 50 deletions
--- a/examples/server/chat.sh
+++ b/examples/server/chat.sh
@@ -48,6 +48,7 @@ chat_completion() {
        top_p: 0.9,
        n_keep: $n_keep,
        n_predict: 256,
+        cache_prompt: true,
        stop: ["\n### Human:"],
        stream: true
    }')"