llama.cpp/examples/llava/clip.cpp at f238461236f4e0e18cac1a554af23c7deadc9b01

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-03 09:22:01 +00:00

John d34633d8db clip : support more quantization types (#4846)

Uses ggml functions instead of hardcoded names and adds support to quantize into the modern Q-K variants.
This is just the bare minimum to get k-types working - a more refined choice of types would be needed to get best quality on low quantizations.

I ran a few tests: it doesn't break anything I could notice, and a Q6_K ViT works almost as well as Q8_0 but at 3 times the inference speed.

2024-01-10 15:37:09 +02:00

40 KiB
