llama.cpp/ggml.c at e435bfd93cbc03970450486c4ea526a0fa5aa7f6

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-01 09:01:57 +00:00


Iwan Kawrakow e435bfd93c RMSE-optimized quants for all quantization types

By default this new option is ON. One can turn it off
by setting LLAMA_NO_RMSE.

With this option enabled, Q4_3 quantization yields a
perplexity of 6.0344, which is 0.0273 lower than with
simple Q4_3 quantization.

2023-04-22 17:06:39 +03:00

388 KiB
