Mirror of https://github.com/ggml-org/llama.cpp.git (synced 2025-10-30 08:42:00 +00:00)

Commit 29ae62d2ae
* llama : fix embeddings (ggml-ci)
* llama : do not use KV cache for non-causal models (ggml-ci)
* embeddings : fix llama_batch_init arg
* llama : add pooling switch
* llama : distinguish token vs sequence embeddings (ggml-ci)
* llama : assert pooling tensor
* llama : simplify causal mask condition (ggml-ci)
* llama : assert input batch with pooling enabled
* readme : update API changes list
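The changes above touch the embeddings path of the C API, in particular the distinction between per-token and per-sequence (pooled) embeddings. Below is a minimal sketch of how that distinction surfaces to a caller, assuming the llama.cpp C API roughly as of this commit (`llama_load_model_from_file`, `llama_new_context_with_model`, the `embeddings`/`pooling_type` context parameters, and `llama_get_embeddings_seq`); exact names and signatures vary between versions, so treat this as illustrative rather than definitive.

```c
#include <stdio.h>
#include <string.h>

#include "llama.h"

int main(int argc, char ** argv) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s model.gguf\n", argv[0]);
        return 1;
    }

    llama_backend_init();

    struct llama_model_params mparams = llama_model_default_params();
    struct llama_model * model = llama_load_model_from_file(argv[1], mparams);
    if (model == NULL) {
        return 1;
    }

    struct llama_context_params cparams = llama_context_default_params();
    cparams.embeddings   = true;                    // enable embedding output
    cparams.pooling_type = LLAMA_POOLING_TYPE_MEAN; // pool token embeddings into one vector per sequence

    struct llama_context * ctx = llama_new_context_with_model(model, cparams);
    if (ctx == NULL) {
        return 1;
    }

    const char * prompt = "Hello, world!";

    llama_token tokens[64];
    const int n_tokens = llama_tokenize(model, prompt, (int32_t) strlen(prompt),
                                        tokens, 64, /*add_special*/ true, /*parse_special*/ false);
    if (n_tokens < 0) {
        return 1;
    }

    // llama_batch_init takes (n_tokens, embd, n_seq_max);
    // embd = 0 means the batch carries token ids, not precomputed embeddings
    struct llama_batch batch = llama_batch_init(n_tokens, 0, 1);
    batch.n_tokens = n_tokens;
    for (int i = 0; i < n_tokens; i++) {
        batch.token[i]     = tokens[i];
        batch.pos[i]       = i;
        batch.n_seq_id[i]  = 1;
        batch.seq_id[i][0] = 0;
        batch.logits[i]    = 1; // request an output for every token; embedding models pool over the whole sequence
    }

    if (llama_decode(ctx, batch) != 0) {
        return 1;
    }

    // sequence embedding: one pooled vector per sequence id
    const int     n_embd = llama_n_embd(model);
    const float * embd   = llama_get_embeddings_seq(ctx, 0);
    if (embd != NULL) {
        printf("sequence embedding (%d dims), first value: %f\n", n_embd, embd[0]);
    }

    // token embeddings: with cparams.pooling_type = LLAMA_POOLING_TYPE_NONE,
    // read the i-th token's vector with llama_get_embeddings_ith(ctx, i) instead

    llama_batch_free(batch);
    llama_free(ctx);
    llama_free_model(model);
    llama_backend_free();

    return 0;
}
```

With `LLAMA_POOLING_TYPE_NONE` there is no pooled tensor, so a sequence-level query such as `llama_get_embeddings_seq` has nothing to return (it yields NULL), which is presumably what the "assert pooling tensor" and "assert input batch with pooling enabled" items above guard against.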