llama.cpp/ggml-opencl.cpp
shibe2 665018c749 CLBlast: Add broadcast support for matrix multiplication (#3402)
Broadcast src0 into src1 across dimensions 2 and 3 when needed.
This is required for models that use grouped-query attention (GQA).
2023-10-02 21:26:15 +02:00

68 KiB
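
The commit message above describes broadcasting src0 into src1 across dimensions 2 and 3. The following is a minimal sketch of the index arithmetic such a broadcast typically involves, not the code from ggml-opencl.cpp. It assumes ggml-style extent names (ne02, ne03 for src0; ne12, ne13 for src1), that each src1 extent is a whole multiple of the matching src0 extent, and that consecutive src1 slices share one src0 slice. Both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not taken from ggml-opencl.cpp): maps a src1 batch
// index along one dimension to the src0 batch index that is broadcast to it.
// Assumption: ne_src1 is a whole multiple of ne_src0.
inline int64_t broadcast_src0_index(int64_t i_src1, int64_t ne_src0, int64_t ne_src1) {
    assert(ne_src0 > 0 && ne_src1 % ne_src0 == 0);
    const int64_t r = ne_src1 / ne_src0;  // broadcast ratio along this dimension
    return i_src1 / r;                    // r consecutive src1 slices share one src0 slice
}

// For every (i13, i12) batch of src1, returns the flat src0 batch index
// i03 * ne02 + i02 that the batch would be multiplied with.
inline std::vector<int64_t> src0_batch_for_each_src1_batch(
        int64_t ne02, int64_t ne03, int64_t ne12, int64_t ne13) {
    std::vector<int64_t> out;
    out.reserve(static_cast<std::size_t>(ne12 * ne13));
    for (int64_t i13 = 0; i13 < ne13; ++i13) {
        const int64_t i03 = broadcast_src0_index(i13, ne03, ne13);
        for (int64_t i12 = 0; i12 < ne12; ++i12) {
            const int64_t i02 = broadcast_src0_index(i12, ne02, ne12);
            out.push_back(i03 * ne02 + i02);
        }
    }
    return out;
}
```

Under these assumptions, a GQA-shaped case with 2 src0 slices and 8 src1 slices in dimension 2 gives a ratio of 4, so src1 slices 0 to 3 pair with src0 slice 0 and slices 4 to 7 pair with src0 slice 1. When the extents match, the ratio is 1 and the mapping reduces to the identity, which is the non-broadcast case.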