llama.cpp/ggml.c at 8f644a0a859938c787d329d27f98e03c58d7df27

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-02 09:12:03 +00:00


Casey Primozic 2e664f1ff4 Add initial AVX512 support for dot product on Linux (#320)

* Update the Makefile to detect AVX512 support and add the compiler flags if it's available
* Based on the existing AVX2 implementation, compute the dot product on one 32-value block of 4-bit quantized ints at a time
* Perform 8-bit -> 16-bit sign extension and multiply+add on 32 values at a time instead of 16
* Use the built-in AVX512 horizontal reduce add to get the sum at the end
* Manually unroll the inner dot product loop to reduce loop counter overhead
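
The steps listed above can be illustrated with a minimal sketch. It is not the code from this commit: the struct `block_q4_sketch`, the helpers `unpack_block`, `dot32_i8`, and `dot_q4_sketch`, and the nibble layout (two 4-bit values per byte, stored with an offset of 8) are all assumptions made for illustration. The AVX512 path is compiled only when the compiler defines `__AVX512F__` and `__AVX512BW__`; otherwise a scalar loop computes the same result.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__AVX512F__) && defined(__AVX512BW__)
#include <immintrin.h>
#endif

#define QK 32 /* values per block, matching the 32-value block in the commit message */

/* Hypothetical block layout for this sketch: one scale plus 32 packed 4-bit values. */
typedef struct {
    float   d;          /* per-block scale */
    uint8_t qs[QK / 2]; /* two 4-bit values per byte, each stored with an offset of 8 */
} block_q4_sketch;

/* Unpack the 32 nibbles of one block into signed bytes in the range [-8, 7]. */
static void unpack_block(const block_q4_sketch *b, int8_t out[QK]) {
    for (int i = 0; i < QK / 2; i++) {
        out[2 * i + 0] = (int8_t)((b->qs[i] & 0x0F) - 8); /* low nibble  */
        out[2 * i + 1] = (int8_t)((b->qs[i] >> 4) - 8);   /* high nibble */
    }
}

/* Integer dot product of 32 signed bytes. */
static int32_t dot32_i8(const int8_t *a, const int8_t *b) {
#if defined(__AVX512F__) && defined(__AVX512BW__)
    /* Sign-extend 32 x 8-bit values to 32 x 16-bit values in one 512-bit register. */
    __m512i va = _mm512_cvtepi8_epi16(_mm256_loadu_si256((const __m256i *)a));
    __m512i vb = _mm512_cvtepi8_epi16(_mm256_loadu_si256((const __m256i *)b));
    /* Multiply 16-bit lanes and add adjacent pairs, giving 16 x 32-bit partial sums. */
    __m512i prod = _mm512_madd_epi16(va, vb);
    /* Horizontal reduce add over the 16 partial sums. */
    return _mm512_reduce_add_epi32(prod);
#else
    /* Scalar fallback with the same result. */
    int32_t sum = 0;
    for (int i = 0; i < QK; i++) {
        sum += (int32_t)a[i] * (int32_t)b[i];
    }
    return sum;
#endif
}

/* Dot product of two rows of nb blocks: each block contributes d_x * d_y * sum(x_i * y_i). */
static float dot_q4_sketch(const block_q4_sketch *x, const block_q4_sketch *y, int nb) {
    assert(nb >= 0);
    float acc = 0.0f;
    for (int i = 0; i < nb; i++) {
        int8_t xa[QK], ya[QK];
        unpack_block(&x[i], xa);
        unpack_block(&y[i], ya);
        acc += x[i].d * y[i].d * (float)dot32_i8(xa, ya);
    }
    return acc;
}
```

This sketch reduces to a scalar once per block for clarity. Keeping the running sum in a vector register across blocks and reducing only once at the end, as the commit message describes, would avoid that per-block cost; the manual unrolling mentioned above is likewise left out.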

2023-03-21 15:35:42 +01:00

327 KiB
