llama.cpp/src/llama-chat.cpp at 59fee24c7236857e981a0579e095265357d0ee5c

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-10-31 08:51:55 +00:00

Mikko Juola 9ae4143bc6 model : add dots.llm1 architecture support (#14044) (#14118)

Adds:

* Dots1Model to convert_hf_to_gguf.py

* Computation graph code to llama-model.cpp

* Chat template to llama-chat.cpp to detect this model's template.

---

The architecture is called "dots.llm1" (I decided to shorten it to dots1
or DOTS1 in the code generally).

The only models that follow this architecture as of this commit are
"dots.llm1.inst" and "dots.llm1.base", available here:

* https://huggingface.co/rednote-hilab/dots.llm1.inst

* https://huggingface.co/rednote-hilab/dots.llm1.base

The model architecture is a combination of Qwen and Deepseek parts, as
seen here:

ffe12627b4/src/transformers/models/dots1/modular_dots1.py

2025-06-15 09:52:06 +02:00

29 KiB
