llama.cpp/gguf-py/gguf/tensor_mapping.py at 3b8f1ec4b18770531d0b1d792f3edf08254e4f0c

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-04 09:32:00 +00:00

Shijie f4dea7da18 llama : add qwen2moe (#6074)

* support qwen2moe

* fix-review

* metal : support unary ops for nelements % 4 != 0

* metal : require contiguousness for float4 unary kernels

* metal : require contiguousness for float4 unary kernels (cont)

* fix-review

* names : for brevity "SHARED_EXP" -> "SHEXP"

* llama : reuse build_moe_ffn()

* llama : add model type name

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

2024-04-16 18:40:48 +03:00

21 KiB
