llama.cpp/utils.cpp at 2af23d30434a677c6416812eea52ccc0af65119c

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-10 10:27:03 +00:00

Matvey Soloviev 904d2a8d6a Q4_1 quantization (#193)

* Add AVX2 version of ggml_vec_dot_q4_1

* Small optimisations to q4_1 dot product (@Const-me)

* Rearrange Q4_1 quantization to work for multipart models. (Fix #152)

* Fix ggml_vec_mad_q4_1 too

* Fix non-vectorised q4_1 vec mul

2023-03-17 06:48:39 +02:00

18 KiB
