llama.cpp/ggml-opencl.cpp at 0bc2cdfc875fa7877d8e01c8bb17066f99c08f21

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-10-30 08:42:00 +00:00


LostRuins 96a712ca1b Porting the improved K-Quant CUDA kernels to OpenCL (#1966)

* Added broken new q4k quant

* xx + ib0

* Fix q2_k fast kernel

* Use preprocessor for QK_K

* Add q6_k fast matmul kernel

* ported q3k speedup successfully

* ported q2k and q5k speedups

* remove old dot kernels and template

* fixed global const struct types

* fixing address spaces

* fixed string too long CI issue

---------

Co-authored-by: 0cc4m <picard12@live.de>

2023-06-29 05:56:43 +02:00

67 KiB
