llama.cpp/llama.cpp at b2248

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-22 12:27:26 +00:00

Georgi Gerganov 96633eeca1 gemma : use more bits for the token_embd.weight tensor (#5650)

* gemma : use Q8_0 for the token_embd.weight tensor

* llama : quantize token_embd.weight using output type

2024-02-22 23:23:46 +02:00
