Mirror of https://github.com/ggml-org/llama.cpp.git (synced 2025-11-10 10:27:03 +00:00)
* gemma : fix attn scale for 27B
* cont : apply scale before attn
* cont : consistent attention scaling
594 KiB