vulkan: add specific MMV kernels for IQ2 and IQ3 quants + optimizations (#11595)

* vulkan: implement specialized MMV kernels for IQ2 quantizations

* vulkan: add MMV kernels for IQ3 quants

* vulkan: Increase MMV batch size and unroll IQ LUT setup

* vulkan: fix init_iq_shmem for WG sizes larger than tables

* vulkan: common batch size for all I-quants
This commit is contained in:
Rémy O
2025-02-28 09:42:52 +01:00
committed by GitHub
parent 9c42b1718c
commit 438a83926a
9 changed files with 509 additions and 42 deletions

View File

@@ -1,5 +1,7 @@
#version 450
#extension GL_EXT_control_flow_attributes : enable
#include "types.comp"
#include "generic_binary_head.comp"
#include "dequant_funcs.comp"