* 9B - query_pre_attn_scalar = 256 not 224
See 03e657582d
Gemma-2 9B should use query_pre_attn_scalar = 256, not 224: the 224 comes from self.config.hidden_size // self.config.num_attention_heads (3584 // 16), which does not match this model's head_dim of 256.
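For context, a minimal arithmetic sketch of why the two values differ for the 9B model (illustrative only, not the actual patch; the field names follow the Hugging Face Gemma-2 config):

    import math

    # Gemma-2 9B config values
    hidden_size = 3584
    num_attention_heads = 16
    head_dim = 256  # 16 * 256 = 4096 != 3584, so head_dim is set explicitly

    # Wrong: deriving the scalar from the hidden size yields 224
    wrong_scalar = hidden_size // num_attention_heads  # 224

    # Right: query_pre_attn_scalar must equal head_dim for this model
    query_pre_attn_scalar = head_dim  # 256

    # Queries are multiplied by 1/sqrt(query_pre_attn_scalar) before attention
    scale = 1.0 / math.sqrt(query_pre_attn_scalar)
    print(wrong_scalar, query_pre_attn_scalar, round(scale, 6))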
* llama : fix Gemma-2 Query scaling factor
ggml-ci
---------
Co-authored-by: Daniel Han <danielhanchen@gmail.com>