Commit Graph

17 commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Georgi Gerganov | 38db8a5861 | llama : introduce concept of llama_memory | 2025-02-28 10:51:17 +02:00 |
| Georgi Gerganov | 828effd9d7 | kv-cache : basic abstraction | 2025-02-27 16:00:29 +02:00 |
| Georgi Gerganov | 08011c2ca1 | context : add llama_kv_cache_recurrent prototype | 2025-02-20 20:55:13 +02:00 |
| Georgi Gerganov | 5f11a5502a | kv-cache : remove llama_kv_cache_i | 2025-02-19 14:36:27 +02:00 |
| Georgi Gerganov | f5cedbcaaa | kv-cache : prepare for abstraction | 2025-02-18 21:28:58 +02:00 |
| Georgi Gerganov | f0d3ff2388 | Merge branch 'master' into gg/llama-kv-cache | 2025-02-18 10:14:37 +02:00 |
| Georgi Gerganov | d5e8e1a2ba | context : remove batch_manager | 2025-02-14 16:10:55 +02:00 |
| Georgi Gerganov | 3a504d9a0b | llama : introduce llama_io interfaces | 2025-02-13 12:25:54 +02:00 |
| Daniel Bevenius | 3e69319772 | llama : update llama_decode_internal ref [no ci] (#11840) | 2025-02-13 08:07:51 +02:00 |
| Georgi Gerganov | a19f671fe0 | context : minor | 2025-01-26 20:16:21 +02:00 |
| Georgi Gerganov | 17b363afd3 | llama : update llama_kv_self API | 2025-01-26 20:16:20 +02:00 |
| Georgi Gerganov | fd05ab87aa | kv_cache : move state read/write to llama_kv_cache | 2025-01-26 20:14:36 +02:00 |
| Georgi Gerganov | 4cd1b6fa4c | context : prepare kv_cache_read/write to be moved to kv_cache | 2025-01-26 20:14:36 +02:00 |
| Georgi Gerganov | 73a14eccc9 | kv_cache : minor | 2025-01-26 20:14:36 +02:00 |
| Georgi Gerganov | 4d7bd03e65 | kv_cache : functions -> members | 2025-01-26 20:14:36 +02:00 |
| Georgi Gerganov | f78b396ee7 | llama : add struct llama_kv_cache (wip) [no ci] | 2025-01-26 20:12:06 +02:00 |
| Georgi Gerganov | f66f582927 | llama : refactor src/llama.cpp (#10902) | 2025-01-03 10:18:53 +02:00 |

Most of the commits above carry a `ggml-ci` trailer in their message body to trigger CI.

Commit message body of 3e69319772 (llama : update llama_decode_internal ref):

This commit updates the comment in llama_kv_cache.h to reflect the change of the function name from llama_decode_internal to llama_decode_impl.

Commit message body of f66f582927 (llama : refactor src/llama.cpp):

* llama : scatter llama.cpp into multiple modules (wip)
* llama : control-vector -> adapter
* llama : arch
* llama : mmap
* ci : remove BUILD_SHARED_LIBS=OFF
* llama : arch (cont)
* llama : chat
* llama : model
* llama : hparams
* llama : adapter
* examples : fix
* rebase
* minor
* llama : kv cache
* llama : impl
* llama : batch
* cont
* llama : context
* minor
* llama : context (cont)
* llama : model loader
* common : update lora
* llama : quant
* llama : quant (cont)
* minor [no ci]