llama.cpp/examples/server/tests/features/parallel.feature at 8db003a19d7055b5bd248ce2afff9324e5b8da95

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-10-31 08:51:55 +00:00

Xuan Son Nguyen 9b2c24c099 server : simplify state machine for slot (#9283)

* server : simplify state machine for slot

* add SLOT_STATE_DONE_PROMPT

* pop_deferred_task

* add missing notify_one

* fix passkey test

* metrics : add n_busy_slots_per_decode

* fix test step

* add test

* maybe fix AddressSanitizer?

* fix deque ?

* missing lock

* pop_deferred_task: also notify

* Update examples/server/server.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

2024-09-06 23:21:29 +02:00

3.5 KiB
