llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-10-28 08:31:25 +00:00

Files

Rémy O 438a83926a vulkan: add specific MMV kernels for IQ2 and IQ3 quants + optimizations (#11595)

* vulkan: implement specialized MMV kernels for IQ2 quantizations

* vulkan: add MMV kernels for IQ3 quants

* vulkan: Increase MMV batch size and unroll IQ LUT setup

* vulkan: fix init_iq_shmem for WG sizes larger than tables

* vulkan: common batch size for all I-quants

2025-02-28 09:42:52 +01:00

cmake

cmake: Fix ggml backend dependencies and installation (#11818)

2025-02-27 09:42:48 +02:00

include

ggml-cpu: Support s390x SIMD Instruction Set (#12019)

2025-02-22 21:39:24 +00:00

src

vulkan: add specific MMV kernels for IQ2 and IQ3 quants + optimizations (#11595)

2025-02-28 09:42:52 +01:00

.gitignore

vulkan : cmake integration (#8119)

2024-07-13 18:12:39 +02:00

CMakeLists.txt

cmake: Fix ggml backend dependencies and installation (#11818)

2025-02-27 09:42:48 +02:00