llama.cpp/common/common.h at cf9b08485c4c2d4d945c6e74fe20f273a38b6104

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-10-31 08:51:55 +00:00


Georgi Gerganov 47068e5170 speculative : PoC for speeding-up inference via speculative sampling (#2926)

* speculative : initial example

* speculative : print encoding speed

* speculative : add --draft CLI arg

2023-09-03 15:12:08 +03:00
