llama.cpp/ggml-opencl.cpp at f93af02488179b9c52d0d391b08ae4c4d891b8d3

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-10-30 08:42:00 +00:00

shibe2 665018c749 CLBlast: Add broadcast support for matrix multiplication (#3402)

Broadcast src0 into src1 across dimensions 2 and 3 when needed.
This is required for models that use GQA.

2023-10-02 21:26:15 +02:00
