llama.cpp

Mirror of https://github.com/ggml-org/llama.cpp.git, last synced 2025-11-21 12:16:57 +00:00

Files

shaofeiqi 4db5641210 opencl: add kernel to handle mat mul in attention to improve encoding speed (#17181)

* Add mul_mm_f16_f32_kq_kqv kernel

* Add ggml_cl_mul_mat_kq_kqv_adreno func

* fix whitespace

* remove unused variable

* remove redundant

* refactor and clean up

* remove trailing whitespace

2025-11-15 17:33:10 -08:00

Name | Last commit | Date
kernels | opencl: add kernel to handle mat mul in attention to improve encoding speed (#17181) | 2025-11-15 17:33:10 -08:00
CMakeLists.txt | opencl: add kernel to handle mat mul in attention to improve encoding speed (#17181) | 2025-11-15 17:33:10 -08:00
ggml-opencl.cpp | opencl: add kernel to handle mat mul in attention to improve encoding speed (#17181) | 2025-11-15 17:33:10 -08:00