llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-02 09:12:03 +00:00

Files

Georgi Gerganov 6b64f74b55 batched-bench : fix unified KV cache handling + pp timing (#15562)

* batched-bench : fix unified KV cache handling + pp timing

* cont : run dummy token only with split KV cache

2025-08-25 13:56:43 +03:00

…

CMakeLists.txt

…