llama.cpp/ggml-quants.c at 78b00dda6c0d62c34f5371d47718defff6ed2b22

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-01 09:01:57 +00:00

Kawrakow 6fdfa2ecc6 iq2_xxs: tune quantization (#5320)

We get slightly better perplexity (PPL), and we cut quantization time
nearly in half.

The trick is to first quantize without forcing points onto the E8 lattice.
We can then use a narrower search range around the block scale obtained
that way.
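
The two-stage idea described above can be illustrated with a minimal C sketch. This is not the code from ggml-quants.c: the function names (`nearest_level`, `err_free`, `err_constrained`, `two_stage_scale`), the block size, the three-level code, and the sweep widths are all hypothetical, and an "even sum of levels" parity rule stands in for the real E8-lattice grid. Inputs are assumed to be non-negative magnitudes.

```c
#include <assert.h>
#include <float.h>

#define BLOCK 8      /* values per block (toy size, an assumption) */
#define MAX_LEVEL 2  /* levels 0..MAX_LEVEL (toy 3-level code) */

/* Nearest level for a non-negative x at a given scale, clamped to range. */
static int nearest_level(float x, float scale) {
    assert(scale > 0.0f);
    int l = (int)(x / scale + 0.5f);
    if (l < 0) l = 0;
    if (l > MAX_LEVEL) l = MAX_LEVEL;
    return l;
}

/* Stage-1 cost: squared error with plain rounding, no constraint. */
static float err_free(const float *x, float scale) {
    float e = 0.0f;
    for (int i = 0; i < BLOCK; ++i) {
        float d = x[i] - scale * (float)nearest_level(x[i], scale);
        e += d * d;
    }
    return e;
}

/* Stage-2 cost: the levels must have an even sum (stand-in constraint).
 * If plain rounding gives an odd sum, move the single level whose
 * change by +1 or -1 adds the least squared error. */
static float err_constrained(const float *x, float scale, int *levels) {
    int sum = 0;
    for (int i = 0; i < BLOCK; ++i) {
        levels[i] = nearest_level(x[i], scale);
        sum += levels[i];
    }
    if (sum & 1) {
        int best_i = 0, best_alt = levels[0];
        float best_cost = FLT_MAX;
        for (int i = 0; i < BLOCK; ++i) {
            for (int step = -1; step <= 1; step += 2) {
                int alt = levels[i] + step;
                if (alt < 0 || alt > MAX_LEVEL) continue;
                float d0 = x[i] - scale * (float)levels[i];
                float d1 = x[i] - scale * (float)alt;
                float cost = d1 * d1 - d0 * d0;
                if (cost < best_cost) { best_cost = cost; best_i = i; best_alt = alt; }
            }
        }
        levels[best_i] = best_alt;
    }
    float e = 0.0f;
    for (int i = 0; i < BLOCK; ++i) {
        float d = x[i] - scale * (float)levels[i];
        e += d * d;
    }
    return e;
}

/* Returns the chosen block scale and writes the constrained levels. */
static float two_stage_scale(const float *x, int *levels) {
    float amax = 0.0f;
    for (int i = 0; i < BLOCK; ++i) if (x[i] > amax) amax = x[i];
    if (amax <= 0.0f) {               /* all-zero block: nothing to search */
        for (int i = 0; i < BLOCK; ++i) levels[i] = 0;
        return 0.0f;
    }
    const float base = amax / (float)MAX_LEVEL;

    /* Stage 1: wide sweep using only the cheap unconstrained error. */
    float s1 = base, e1 = FLT_MAX;
    for (int k = -8; k <= 8; ++k) {
        float s = base * (1.0f + 0.05f * (float)k);
        float e = err_free(x, s);
        if (e < e1) { e1 = e; s1 = s; }
    }

    /* Stage 2: narrow sweep around s1 with the costlier constrained error. */
    float s2 = s1, e2 = FLT_MAX;
    int tmp[BLOCK];
    for (int k = -3; k <= 3; ++k) {
        float s = s1 * (1.0f + 0.02f * (float)k);
        float e = err_constrained(x, s, tmp);
        if (e < e2) {
            e2 = e; s2 = s;
            for (int i = 0; i < BLOCK; ++i) levels[i] = tmp[i];
        }
    }
    return s2;
}
```

In this sketch the expensive constrained projection runs for only 7 candidate scales instead of the 17 tried in the wide sweep, which is the kind of saving the commit message attributes to the narrower search range. The actual ranges and constraint in the commit may differ.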

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>

2024-02-05 10:46:06 +02:00

390 KiB
