llama.cpp/ggml-quants.c at 7733f0c76081b2a69b5f8b192db2db7c43629d58

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-03 09:22:01 +00:00

Justine Tunney 7733f0c760 ggml : support AVX512VNNI (#6280)

This change causes some quants (e.g. Q4_0, Q8_0) to go faster on some
architectures (e.g. AMD Zen 4).

2024-03-25 07:39:56 +02:00