llama.cpp/examples/server/tests/unit/test_speculative.py at 6f0c9e034bb398915a6617ee4acc62adb87d387d

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-10-30 08:42:00 +00:00


Georgi Gerganov 1da7b76569 server : fix speculative decoding with context shift (#10641)

* server : fix speculative decoding with context shift

ggml-ci

* server : take into account speculative limits

ggml-ci

* server : add tests

2024-12-04 22:38:20 +02:00
