llama.cpp/common/common.cpp at 5b8530d88c489f9d0c0ef3d0886b369f655b792e

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-02 09:12:03 +00:00

Georgi Gerganov 47068e5170 speculative : PoC for speeding-up inference via speculative sampling (#2926)

* speculative : initial example

* speculative : print encoding speed

* speculative : add --draft CLI arg

2023-09-03 15:12:08 +03:00