llama.cpp/ggml/src/ggml-quants.c at 48b73b849880ff43f0dd818252cf00cea5a83061

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-09 10:17:06 +00:00

Francis Couture-Harpin 48b73b8498 ggml-quants : substract 1 when back in epi8

This makes the 1.625 bpw type go faster than q4_0. Still not the fastest.

2024-06-27 02:06:28 -04:00
