vulkan: add specific MMV kernels for IQ2 and IQ3 quants + optimizations (#11595)
* vulkan: implement specialized MMV kernels for IQ2 quantizations
* vulkan: add MMV kernels for IQ3 quants
* vulkan: Increase MMV batch size and unroll IQ LUT setup
* vulkan: fix init_iq_shmem for WG sizes larger than tables (see the sketch below)
* vulkan: common batch size for all I-quants
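The init_iq_shmem fix comes down to a strided, bounds-checked copy of a quantization lookup table into shared memory, so invocations past the end of the table simply copy nothing. The following is a minimal GLSL sketch of that pattern, not the upstream code: the table name, its 8-entry size, and the workgroup size are illustrative stand-ins for the much larger IQ2/IQ3 grids.

#version 450
#extension GL_EXT_control_flow_attributes : enable

layout(local_size_x = 32, local_size_y = 1, local_size_z = 1) in;

// Illustrative 8-entry table standing in for the much larger IQ2/IQ3 grids.
const uint TABLE_SIZE = 8;
const uint table_const[TABLE_SIZE] = uint[TABLE_SIZE](1u, 2u, 3u, 4u, 5u, 6u, 7u, 8u);
shared uint table_shmem[TABLE_SIZE];

void init_iq_shmem(uvec3 wgsize)
{
    // Strided copy of the constant table into shared memory. Invocations whose
    // index falls past the end of the table copy nothing, which keeps the setup
    // correct when the workgroup is larger than the table.
    [[unroll]] for (uint i = 0; i < TABLE_SIZE; i += wgsize.x) {
        const uint idx = i + gl_LocalInvocationIndex;
        if (idx < TABLE_SIZE) {
            table_shmem[idx] = table_const[idx];
        }
    }
    barrier();
}

void main()
{
    init_iq_shmem(gl_WorkGroupSize);
    // ... a real MMV kernel body would read table_shmem from here on.
}

The [[unroll]] hint is why GL_EXT_control_flow_attributes shows up in the diff below.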
@@ -1,5 +1,7 @@
 #version 450
 
+#extension GL_EXT_control_flow_attributes : enable
+
 #include "types.comp"
 #include "generic_binary_head.comp"
 #include "dequant_funcs.comp"