llama.cpp/ggml-quants.c at 057400a3fd457f4f214684eeb171444663b47a23

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-10-30 08:42:00 +00:00

Kawrakow cbc8343619 Make IQ1_M work for QK_K = 64 (#6327)

* iq1_m: make it work for QK_K = 64 (WIP)

* iq1_m: make it work for QK_K = 64 (scalar and AVX2)

* iq1_m: QK_K = 64 seems to work on Metal and ARM_NEON

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>

2024-03-27 08:44:27 +01:00

513 KiB
