Files
llama.cpp/ggml.c
Iwan Kawrakow e435bfd93c RMSE-optimized quants for all quantization types
By default, this new option is ON. It can be turned off
by setting LLAMA_NO_RMSE.

With this option enabled, Q4_3 quantization results
in a perplexity of 6.0344, which is 0.0273 lower than simple
Q4_3 quantization.
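
The general idea behind RMSE-optimized quantization is to choose each block's
scale by minimizing the squared reconstruction error rather than simply dividing
by the block's absolute maximum. The following C sketch illustrates that idea
only; it is not the ggml.c implementation, and the block size QK, the search
range, and the candidate-scale grid are all illustrative assumptions.

// Illustrative sketch (not the ggml.c code): pick a 4-bit quantization
// scale for one block by scanning candidates and keeping the one with
// the lowest squared error.
#include <math.h>
#include <stdint.h>

#define QK 32  // hypothetical block size for this sketch

// Quantizes x[0..QK-1] to q[] in [-8, 7]; returns the chosen scale.
static float quantize_block_rmse(const float *x, int8_t *q) {
    float amax = 0.0f;
    for (int i = 0; i < QK; ++i) {
        float ax = fabsf(x[i]);
        if (ax > amax) amax = ax;
    }
    if (amax == 0.0f) {
        for (int i = 0; i < QK; ++i) q[i] = 0;
        return 0.0f;
    }

    float best_scale = amax / 7.0f;  // naive amax-based starting point
    float best_err   = INFINITY;

    // Scan candidate scales around the naive choice.
    for (int step = -4; step <= 4; ++step) {
        float scale = amax / (7.0f + 0.1f * (float)step);
        float err = 0.0f;
        for (int i = 0; i < QK; ++i) {
            int v = (int)roundf(x[i] / scale);
            if (v < -8) v = -8;
            if (v >  7) v =  7;
            float d = x[i] - scale * (float)v;
            err += d * d;
        }
        if (err < best_err) {
            best_err   = err;
            best_scale = scale;
        }
    }

    // Quantize with the best scale found.
    for (int i = 0; i < QK; ++i) {
        int v = (int)roundf(x[i] / best_scale);
        if (v < -8) v = -8;
        if (v >  7) v =  7;
        q[i] = (int8_t)v;
    }
    return best_scale;
}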
2023-04-22 17:06:39 +03:00

388 KiB