llama.cpp/gguf-py/gguf/tensor_mapping.py at dbceec87c0221ec952e69448df6a71f1372a7487

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-10-31 08:51:55 +00:00

Shijie f4dea7da18 llama : add qwen2moe (#6074)

* support qwen2moe

* fix-review

* metal : support unary ops for nelements % 4 != 0

* metal : require contiguousness for float4 unary kernels

* metal : require contiguousness for float4 unary kernels (cont)

* fix-review

* names : for brevity "SHARED_EXP" -> "SHEXP"

* llama : reuse build_moe_ffn()

* llama : add model type name

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

2024-04-16 18:40:48 +03:00

21 KiB
