llama.cpp/common/common.h at ebe41d49a69a46735d65e0e7fc8ae6e4b0a0fbda

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-03 09:22:01 +00:00

Georgi Gerganov 47068e5170 speculative : PoC for speeding-up inference via speculative sampling (#2926)

* speculative : initial example

* speculative : print encoding speed

* speculative : add --draft CLI arg

2023-09-03 15:12:08 +03:00