mirror of
https://github.com/ggml-org/llama.cpp.git
synced 2025-10-27 08:21:30 +00:00
lookup : add prompt lookup decoding example (#4484)
* initial commit, going through initializations * main loop finished, starting to debug * BUG: generates gibberish/repeating tokens after a while * kv_cache management * Added colors to distinguish drafted tokens (--color). Updated README * lookup : fix token positions in the draft batch * lookup : use n_draft from CLI params * lookup : final touches --------- Co-authored-by: Leon Ericsson <leon.ericsson@icloud.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
This commit is contained in:
1
.gitignore
vendored
1
.gitignore
vendored
@@ -48,6 +48,7 @@ models-mnt
|
||||
/llama-bench
|
||||
/llava-cli
|
||||
/lookahead
|
||||
/lookup
|
||||
/main
|
||||
/metal
|
||||
/perplexity
|
||||
|
||||
Reference in New Issue
Block a user