38db8a5861 | Georgi Gerganov | 2025-02-28 10:51:17 +02:00
  llama : introduce concept of llama_memory
  ggml-ci

828effd9d7 | Georgi Gerganov | 2025-02-27 16:00:29 +02:00
  kv-cache : basic abstraction
  ggml-ci

08011c2ca1 | Georgi Gerganov | 2025-02-20 20:55:13 +02:00
  context : add llama_kv_cache_recurrent prototype
  ggml-ci

5f11a5502a | Georgi Gerganov | 2025-02-19 14:36:27 +02:00
  kv-cache : remove llama_kv_cache_i

f5cedbcaaa | Georgi Gerganov | 2025-02-18 21:28:58 +02:00
  kv-cache : prepare for abstraction
  ggml-ci

f0d3ff2388 | Georgi Gerganov | 2025-02-18 10:14:37 +02:00
  Merge branch 'master' into gg/llama-kv-cache
  ggml-ci

d5e8e1a2ba | Georgi Gerganov | 2025-02-14 16:10:55 +02:00
  context : remove batch_manager
  ggml-ci

3a504d9a0b | Georgi Gerganov | 2025-02-13 12:25:54 +02:00
  llama : introduce llama_io interfaces
  ggml-ci

3e69319772 | Daniel Bevenius | 2025-02-13 08:07:51 +02:00
  llama : update llama_decode_internal ref [no ci] (#11840)
  This commit updates the comment in llama_kv_cache.h to reflect the change
  of the function name from llama_decode_internal to llama_decode_impl.

a19f671fe0 | Georgi Gerganov | 2025-01-26 20:16:21 +02:00
  context : minor
  ggml-ci

17b363afd3 | Georgi Gerganov | 2025-01-26 20:16:20 +02:00
  llama : update llama_kv_self API
  ggml-ci

fd05ab87aa | Georgi Gerganov | 2025-01-26 20:14:36 +02:00
  kv_cache : move state read/write to llama_kv_cache
  ggml-ci

4cd1b6fa4c | Georgi Gerganov | 2025-01-26 20:14:36 +02:00
  context : prepare kv_cache_read/write to be moved to kv_cache
  ggml-ci

73a14eccc9 | Georgi Gerganov | 2025-01-26 20:14:36 +02:00
  kv_cache : minor

4d7bd03e65 | Georgi Gerganov | 2025-01-26 20:14:36 +02:00
  kv_cache : functions -> members
  ggml-ci

f78b396ee7 | Georgi Gerganov | 2025-01-26 20:12:06 +02:00
  llama : add struct llama_kv_cache (wip) [no ci]

f66f582927 | Georgi Gerganov | 2025-01-03 10:18:53 +02:00
  llama : refactor src/llama.cpp (#10902)
  * llama : scatter llama.cpp into multiple modules (wip)
  * llama : control-vector -> adapter
  * llama : arch
  * llama : mmap
  * ci : remove BUILD_SHARED_LIBS=OFF
  * llama : arch (cont)
  * llama : chat
  * llama : model
  * llama : hparams
  * llama : adapter
  * examples : fix
  * rebase
  * minor
  * llama : kv cache
  * llama : impl
  * llama : batch
  * cont
  * llama : context
  * minor
  * llama : context (cont)
  * llama : model loader
  * common : update lora
  * llama : quant
  * llama : quant (cont)
  * minor [no ci]
  ggml-ci