Commit Graph

5 Commits

Author SHA1 Message Date
Clarissa Miranda
901a3479b1 move cache stack to advance stack 2024-10-14 17:13:40 +11:00
Clarissa Miranda
cb1632b593 llama : adds llama-grammar memorization stacks (#4218) 2024-10-11 12:20:48 +11:00
Georgi Gerganov
df270ef745 llama : refactor sampling v2 (#9294)
- Add `struct llama_sampler` and `struct llama_sampler_i`
- Add `llama_sampler_` API
- Add `llama_sampler_chain_` API for chaining multiple samplers
- Remove `LLAMA_API_INTERNAL`
- Add `llama_perf_` API and remove old `llama_print_timings` and `llama_reset_timings`
2024-09-07 15:16:19 +03:00
slaren
2b1f616b20 ggml : reduce hash table reset cost (#8698)
* ggml : reduce hash table reset cost

* fix unreachable code warnings after GGML_ASSERT(false)

* GGML_ASSERT(false) -> GGML_ABORT("fatal error")

* GGML_ABORT use format string
2024-07-27 04:41:55 +02:00
Georgi Gerganov
938943cdbf llama : move vocab, grammar and sampling into separate files (#8508)
* llama : move sampling code into llama-sampling

ggml-ci

* llama : move grammar code into llama-grammar

ggml-ci

* cont

ggml-ci

* cont : pre-fetch rules

* cont

ggml-ci

* llama : deprecate llama_sample_grammar

* llama : move tokenizers into llama-vocab

ggml-ci

* make : update llama.cpp deps [no ci]

* llama : redirect external API to internal APIs

ggml-ci

* llama : suffix the internal APIs with "_impl"

ggml-ci

* llama : clean-up
2024-07-23 13:10:17 +03:00