llama.cpp/ggml-metal.m at 7dabc66f3c63f8ea0f61bac346fa138e01df675f

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-20 12:07:33 +00:00

Kawrakow 27ad57a69b Metal: faster Q4_0 and Q4_1 matrix x vector kernels (#2212)

* 3-5% faster Q4_0 on Metal

* 7-25% faster Q4_1 on Metal

* Oops, forgot to delete the original Q4_1 kernel

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>

2023-07-14 11:46:21 +02:00

49 KiB
