llama.cpp/ggml-quants.c at 586e7bc561be88e929a9afca7e67d8ead00c53bd

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-10-31 08:51:55 +00:00

Kawrakow cfd3be76e3 ggml : same IQ4_NL quantization for CPU/CUDA/Metal (#6196)

* Make quantize_row_iq4_nl do the same thing as quantization on CUDA

* Make quantize_row_iq4_nl do the same thing as quantization on CUDA

This time for real. backend-ops tests pass.

* Now fix test-quantize-fns

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>

2024-03-21 14:59:38 +02:00

488 KiB
