Mirror of https://github.com/ggml-org/llama.cpp.git (synced 2025-11-07 09:57:00 +00:00)
* batched-bench : fix unified KV cache handling + prompt-processing (pp) timing
* cont : run the dummy warm-up token only with a split KV cache
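A minimal sketch of what the second bullet describes, assuming llama.cpp's current C API (`llama_decode`, `llama_batch_get_one`, `llama_memory_clear`); the `kv_unified` flag passed here is an assumption based on the field of that name in `llama_context_params`, and the helper itself is illustrative, not the repository's actual code. The idea: with a unified KV cache all sequences share one buffer, so the extra dummy-token decode is skipped and no longer distorts the pp timing; with a split per-sequence cache the warm-up still runs.

```cpp
// Illustrative sketch only -- gates the warm-up ("dummy token") decode on
// whether the KV cache is split per sequence, as the commit message describes.
#include "llama.h"

static void warmup_decode(llama_context * ctx, bool kv_unified) {
    if (kv_unified) {
        // Unified KV cache: all sequences share one buffer, so the extra
        // dummy-token decode is unnecessary and would skew the pp timing.
        return;
    }

    // Split KV cache: decode a single dummy token (BOS here) to warm up
    // the backend before timing the real prompt-processing pass.
    llama_token dummy = llama_vocab_bos(llama_model_get_vocab(llama_get_model(ctx)));
    llama_batch  batch = llama_batch_get_one(&dummy, 1);

    llama_decode(ctx, batch);

    // Discard the warm-up state so it does not affect the benchmark proper.
    llama_memory_clear(llama_get_memory(ctx), true);
}
```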