llama.cpp/llama.cpp at 5326bcceeb7dd34f16d0fe61b134d1e074a8e65d

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-10-31 08:51:55 +00:00

Ștefan-Gabriel Muscalu a94e6ff877 update: support Qwen2-57B-A14B (#7835)

* update: convert-hf-to-gguf.py to support Qwen2-57B-A14B

* fix: QWEN2MOE support for expert_feed_forward_length

Previously, the expert feed-forward length was taken from n_ff (the intermediate size); it is now read from LLM_KV_EXPERT_FEED_FORWARD_LENGTH.

n_ff_exp and n_ff_shexp are now calculated correctly.
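
The idea behind this fix can be illustrated with a minimal, self-contained sketch. It is not the llama.cpp loader code: the `Metadata` map, the key strings, and the helpers `find_key` and `expert_ff_length` are hypothetical names invented for illustration, and the fallback to a caller-supplied value when the dedicated key is absent is an assumption, not something the commit message states.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for a model file's key-value metadata.
using Metadata = std::map<std::string, std::int64_t>;

// Hypothetical key names, chosen for illustration only.
const std::string KEY_FEED_FORWARD_LENGTH        = "feed_forward_length";
const std::string KEY_EXPERT_FEED_FORWARD_LENGTH = "expert_feed_forward_length";

// Look up an integer key; returns std::nullopt when the key is missing.
std::optional<std::int64_t> find_key(const Metadata & md, const std::string & key) {
    const auto it = md.find(key);
    if (it == md.end()) {
        return std::nullopt;
    }
    return it->second;
}

// Per-expert feed-forward width: read from the dedicated expert key rather
// than reusing the dense intermediate size. The fallback argument is an
// assumption of this sketch, used only when the dedicated key is absent.
std::int64_t expert_ff_length(const Metadata & md, std::int64_t fallback) {
    return find_key(md, KEY_EXPERT_FEED_FORWARD_LENGTH).value_or(fallback);
}
```

With a metadata map that holds both keys, `expert_ff_length` returns the value stored under the expert key. With a map that holds only the dense feed-forward length, it returns whatever fallback the caller passes in.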

* update: convert-hf-to-gguf.py cleanup for Qwen2MoeForCausalLM

2024-06-17 21:08:46 +02:00

764 KiB
