llama.cpp/examples/server/tests/unit/test_speculative.py at fb18934a97425c42c8b32a1baaa94f0080eb051d

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-10-31 08:51:55 +00:00


Georgi Gerganov 1da7b76569 server : fix speculative decoding with context shift (#10641)

* server : fix speculative decoding with context shift

ggml-ci

* server : take into account speculative limits

ggml-ci

* server : add tests

2024-12-04 22:38:20 +02:00
