mirror of https://github.com/ggml-org/llama.cpp.git (synced 2025-11-04 09:32:00 +00:00)
memory : remove KV cache size padding

* cont : restore padding for n_kv tensor shape
* server : use slot context size instead of training context size
* server : simplify context limit logic
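A minimal sketch of the reworked server-side limit check, assuming illustrative names (`server_slot`, `n_ctx`, `prompt_fits`) rather than the actual llama.cpp server internals: the incoming prompt is validated against the slot's own context size, not the model's training context.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical slimmed-down slot state; field names are illustrative
// and do not mirror the real llama.cpp server structs.
struct server_slot {
    uint32_t n_ctx;           // context size allotted to this slot
    uint32_t n_prompt_tokens; // tokens in the incoming prompt
};

// Before this change, the check compared against the model's training
// context (n_ctx_train); now it uses the slot's own context size,
// which is what actually bounds the slot's KV cache.
static bool prompt_fits(const server_slot & slot) {
    // keep at least one free position so generation can start
    return slot.n_prompt_tokens < slot.n_ctx;
}

int main() {
    server_slot slot = { /*n_ctx=*/4096, /*n_prompt_tokens=*/5000 };
    if (!prompt_fits(slot)) {
        fprintf(stderr, "prompt too long: %u tokens, slot context is %u\n",
                slot.n_prompt_tokens, slot.n_ctx);
    }
    return 0;
}
```

The design rationale, as the commit message suggests, is that the slot context is the real capacity limit: when the server is launched with a context size different from the model's training context, checking against the training context would accept or reject prompts incorrectly.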
		
			
				
	
	
	
		