model: EmbeddingGemma Adding Support for SentenceTransformers Dense Modules (#16367)

* model: EmbeddingGemma sentence-transformers dense linear projections support

* model: add support for EmbeddingGemma SentenceTransformers dense linear projections

Add support for the Dense modules used in EmbeddingGemma models.
EmbeddingGemma is a SentenceTransformers model that applies additional modules, including dense linear projections, on top of the base Transformer backbone.

See: https://developers.googleblog.com/en/gemma-explained-embeddinggemma-architecture-and-recipe/

* model: add support for EmbeddingGemma SentenceTransformers dense linear projections

- converting a model with dense layers is optional
- introduced dense config params

* Update convert_hf_to_gguf.py

Co-authored-by: Daniel Bevenius <daniel.bevenius@gmail.com>

* fixed formatting issues

* Update src/llama-graph.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* - removed pooling_type_opt, always allow overriding pooling_type
- asserts checking dense features dims

* fix python lint

* fix ubuntu gcc build warning

* - fixed thread-safety test
- moved asserts to load_hparams

* - tidying up code
- simplifying graph-context expecting both dense weights

* minor : add TODO

---------

Co-authored-by: Daniel Bevenius <daniel.bevenius@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
This commit is contained in:
Saba Fallah
2025-10-09 08:39:18 +02:00
committed by GitHub
parent 12bbc3fa50
commit e08db42595
12 changed files with 170 additions and 7 deletions
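
The commit message describes the Dense modules as linear projections applied on top of the Transformer backbone. As a rough illustration only, the following standalone C++ sketch chains two such projections over a pooled embedding. It assumes bias-free layers with no nonlinearity between them. The names `dense_layer`, `dense_forward`, and `project_pooled` are hypothetical and are not taken from the llama.cpp sources, which build this step as graph operations rather than plain loops.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type: a bias-free linear layer y = W * x.
// W is stored row-major as n_out rows of n_in values each.
struct dense_layer {
    std::size_t n_in;
    std::size_t n_out;
    std::vector<float> w; // size must be n_out * n_in
};

// Apply one linear projection to a single embedding vector.
std::vector<float> dense_forward(const dense_layer & layer, const std::vector<float> & x) {
    assert(x.size() == layer.n_in);
    assert(layer.w.size() == layer.n_in * layer.n_out);

    std::vector<float> y(layer.n_out, 0.0f);
    for (std::size_t o = 0; o < layer.n_out; ++o) {
        for (std::size_t i = 0; i < layer.n_in; ++i) {
            y[o] += layer.w[o * layer.n_in + i] * x[i];
        }
    }
    return y;
}

// Assumed order of operations: pooled embedding -> dense_2 -> dense_3.
std::vector<float> project_pooled(const dense_layer & dense_2,
                                  const dense_layer & dense_3,
                                  const std::vector<float> & pooled) {
    return dense_forward(dense_3, dense_forward(dense_2, pooled));
}
```

For the chain to be valid, the output width of the first layer has to match the input width of the second, which is the kind of constraint the "asserts checking dense features dims" item in the message refers to.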


@@ -271,6 +271,12 @@ enum llm_kv {
     LLM_KV_TOKENIZER_PREFIX_ID,
     LLM_KV_TOKENIZER_SUFFIX_ID,
     LLM_KV_TOKENIZER_MIDDLE_ID,
+    // sentence-transformers dense layers in and out features
+    LLM_KV_DENSE_2_FEAT_IN,
+    LLM_KV_DENSE_2_FEAT_OUT,
+    LLM_KV_DENSE_3_FEAT_IN,
+    LLM_KV_DENSE_3_FEAT_OUT,
 };
 enum llm_tensor {
@@ -278,6 +284,8 @@ enum llm_tensor {
     LLM_TENSOR_TOKEN_EMBD_NORM,
     LLM_TENSOR_TOKEN_TYPES,
     LLM_TENSOR_POS_EMBD,
+    LLM_TENSOR_DENSE_2_OUT,
+    LLM_TENSOR_DENSE_3_OUT,
     LLM_TENSOR_OUTPUT,
     LLM_TENSOR_OUTPUT_NORM,
     LLM_TENSOR_ROPE_FREQS,
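
The four new `LLM_KV_DENSE_*` keys carry the input and output feature counts of the two dense layers, and the commit message mentions asserts on those dimensions. The hunks above do not show the actual checks, so the sketch below is only a guess at what a consistency check could look like. The struct `dense_dims`, the function `check_dense_dims`, the specific conditions, and the convention that 0 means "key absent" are all assumptions made for illustration.

```cpp
#include <cstdint>
#include <string>

// Hypothetical container for the values stored under
// LLM_KV_DENSE_2_FEAT_IN/OUT and LLM_KV_DENSE_3_FEAT_IN/OUT.
// Assumption: 0 means the key was not present in the model file.
struct dense_dims {
    uint32_t dense_2_feat_in  = 0;
    uint32_t dense_2_feat_out = 0;
    uint32_t dense_3_feat_in  = 0;
    uint32_t dense_3_feat_out = 0;
};

// Returns an empty string when the dimensions are consistent,
// otherwise a short description of the first problem found.
std::string check_dense_dims(const dense_dims & d, uint32_t n_embd) {
    const bool any_set = d.dense_2_feat_in  != 0 || d.dense_2_feat_out != 0 ||
                         d.dense_3_feat_in  != 0 || d.dense_3_feat_out != 0;
    if (!any_set) {
        return ""; // dense layers are optional, nothing to check
    }
    if (d.dense_2_feat_in != n_embd) {
        return "dense_2_feat_in must equal n_embd";
    }
    if (d.dense_2_feat_out != d.dense_3_feat_in) {
        return "dense_2_feat_out must equal dense_3_feat_in";
    }
    if (d.dense_3_feat_out != n_embd) {
        return "dense_3_feat_out must equal n_embd";
    }
    return "";
}
```

Returning a message instead of aborting keeps the sketch easy to test in isolation. A loader would more likely fail hard on a mismatch.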