llama.cpp/tools
Xuan-Son Nguyen 61bdfd5298 server : implement prompt processing progress report in stream mode (#15827)
* server : implement `return_progress`

* add timings.cache_n

* add progress.time_ms

* add test

* fix test for chat/completions

* readme: add docs on timings

* use ggml_time_us

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-09-06 13:35:04 +02:00
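Based on the commit message above, a minimal sketch of how a client might consume the new progress report in stream mode. The exact response shape is an assumption here: the field names `prompt_progress`, `total`, `cache`, `processed`, and `time_ms` are inferred from the bullet list (`return_progress`, `timings.cache_n`, `progress.time_ms`) and are not confirmed against the server's actual API.

```python
import json

# Hypothetical streamed JSON chunk the server might emit when the request
# sets "return_progress": true (field names assumed, not confirmed).
chunk = '{"prompt_progress": {"total": 128, "cache": 32, "processed": 64, "time_ms": 210}}'

data = json.loads(chunk)
progress = data.get("prompt_progress")
if progress is not None:
    # Tokens already in the KV cache plus tokens processed so far
    done = progress["cache"] + progress["processed"]
    pct = 100.0 * done / progress["total"]
    print(f"prompt processing: {pct:.1f}% ({progress['time_ms']} ms)")
```

With the sample chunk above this prints `prompt processing: 75.0% (210 ms)`; a real client would run this per chunk inside its SSE read loop.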