mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-10-30 08:42:00 +00:00

Kerfuffle 5df7d06c42 llama : allow exporting a view of the KV cache (#4180)

* Allow exporting a view of the KV cache

* Allow dumping the sequences per cell in common

* Track max contiguous cells value and position as well

* Fix max contiguous empty cells index calculation

Make dump functions handle lengths or sequence counts > 10 better

* Fix off-by-one error in dump_kv_cache_view

* Add doc comments for KV cache view functions

Eliminate cell sequence struct; use llama_seq_id directly

Minor cleanups

2023-11-23 18:31:20 +02:00

CMakeLists.txt

build : link against build info instead of compiling against it (#3879)

2023-11-02 08:50:16 +02:00

parallel.cpp

llama : allow exporting a view of the KV cache (#4180)

2023-11-23 18:31:20 +02:00

README.md

Fix some documentation typos/grammar mistakes (#4032)

2023-11-11 23:04:58 -07:00

README.md

llama.cpp/example/parallel

Simplified simulation of serving incoming requests in parallel
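
To make the idea concrete, here is a minimal toy sketch of what serving queued requests across a fixed number of parallel slots can look like. It is not the code in parallel.cpp and does not call any llama.cpp API: the `Request` and `Slot` types, the `simulate` function, and the assumption that each request needs a known, positive number of decode steps are all hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Hypothetical toy model: a request is done after `steps_needed` decode steps.
struct Request {
    int id;
    int steps_needed; // assumed to be positive
};

// A slot is idle when request_id is negative.
struct Slot {
    int request_id = -1;
    int steps_left = 0;
};

// Serves all queued requests using n_slots parallel slots.
// Returns the number of batched steps needed to finish every request.
int simulate(std::deque<Request> queue, int n_slots) {
    assert(n_slots > 0); // with no slots, nothing could ever be served

    std::vector<Slot> slots(n_slots);
    int n_steps = 0;

    while (true) {
        // Hand waiting requests to idle slots, in arrival order.
        for (auto & s : slots) {
            if (s.request_id < 0 && !queue.empty()) {
                s.request_id = queue.front().id;
                s.steps_left = queue.front().steps_needed;
                queue.pop_front();
            }
        }

        // Advance every active slot by one step; this models one shared batch.
        bool any_active = false;
        for (auto & s : slots) {
            if (s.request_id < 0) {
                continue;
            }
            any_active = true;
            if (--s.steps_left <= 0) {
                s.request_id = -1; // request finished, slot becomes idle
            }
        }

        if (!any_active) {
            break; // queue is empty and all slots are idle
        }
        ++n_steps;
    }

    return n_steps;
}
```

In this sketch a slot freed during one step is refilled at the start of the next, so short requests do not hold up the queue behind long ones. Whether the real example schedules work the same way is not stated here and should be checked against parallel.cpp.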