llama.cpp/ggml-quants.c at 7e1ae372f36d98fa66b1d778c5862904b4d80c88

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-03 09:22:01 +00:00

Kawrakow 6fdfa2ecc6 iq2_xxs: tune quantization (#5320)

We get slightly better perplexity (PPL), and we cut quantization time
nearly in half.

The trick is to first quantize without forcing points onto the E8 lattice.
We can then use a narrower search range around the block scale obtained
that way.

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>

2024-02-05 10:46:06 +02:00

390 KiB
