llama.cpp/gguf-py/gguf/tensor_mapping.py at 53ff6b9b9fb25ed0ec0a213e05534fe7c3d0040f

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-10-30 08:42:00 +00:00

fairydreaming 9394bbd484 llama : Add support for DeepSeek V3 (#11049)

* convert : extend DEEPSEEK2 model architecture to support DeepseekV3ForCausalLM by adding EXPERT_WEIGHTS_NORM and EXPERT_GATING_FUNC model parameters and FFN_EXP_PROBS_B tensor type

* vocab : add DeepSeek V3 pre-tokenizer regexes

* unicode : handle ACCENT_MARK and SYMBOL categories in regex

* llama : add DeepSeek V3 chat template, handle new model parameters and tensor types

---------

Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com>

2025-01-04 21:06:11 +01:00

36 KiB
