vulkan: add specific MMV kernels for IQ2 and IQ3 quants + optimizations (#11595)

* vulkan: implement specialized MMV kernels for IQ2 quantizations
* vulkan: add MMV kernels for IQ3 quants
* vulkan: Increase MMV batch size and unroll IQ LUT setup
* vulkan: fix init_iq_shmem for WG sizes larger than tables
* vulkan: common batch size for all I-quants
2025-11-13 10:57:15 +00:00 · 2025-02-28 09:42:52 +01:00
parent 9c42b1718c
commit 438a83926a
9 changed files with 509 additions and 42 deletions
--- a/ggml/src/ggml-vulkan/vulkan-shaders/get_rows_quant.comp
+++ b/ggml/src/ggml-vulkan/vulkan-shaders/get_rows_quant.comp
@@ -1,5 +1,7 @@
 #version 450

+#extension GL_EXT_control_flow_attributes : enable
+
 #include "types.comp"
 #include "generic_binary_head.comp"
 #include "dequant_funcs.comp"