Mirror of https://github.com/ggml-org/llama.cpp.git, synced 2025-10-30 08:42:00 +00:00

Commit c3ebcfa148
			
		
	
* make sure batches are all embed or all non-embed
* non-embedding batch for sampled tokens; fix unused params warning
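For context, here is a minimal sketch of the batching constraint this commit describes: a batch must contain either only items that request embeddings or only items that do not, so mixed pending work is split into two homogeneous batches before being submitted. The names below (`PendingItem`, `split_by_embed`) are hypothetical and do not reflect llama.cpp's actual internal API.

```cpp
// Minimal sketch, assuming a simplified batch model where each pending
// item carries a flag indicating whether the caller wants an embedding.
#include <utility>
#include <vector>

struct PendingItem {
    int  seq_id;      // which sequence this item belongs to (hypothetical)
    bool want_embed;  // true if the caller asked for an embedding
};

// Partition pending items into (embed-only, non-embed-only) groups so that
// each submitted batch is homogeneous: all embed or all non-embed.
static std::pair<std::vector<PendingItem>, std::vector<PendingItem>>
split_by_embed(const std::vector<PendingItem> & pending) {
    std::vector<PendingItem> embed_batch;
    std::vector<PendingItem> text_batch;
    for (const auto & item : pending) {
        (item.want_embed ? embed_batch : text_batch).push_back(item);
    }
    return {std::move(embed_batch), std::move(text_batch)};
}
```

Sampled tokens would then be routed through the non-embedding batch, matching the second bullet of the commit message; again, this is an illustrative sketch rather than the actual implementation.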
		
			
				
	
	
	
		