Mirror of https://github.com/ggml-org/llama.cpp.git (synced 2025-10-30 08:42:00 +00:00)
Broadcast src0 into src1 across dimensions 2 and 3 when src0 has fewer entries than src1 along those dimensions. This is required for models that use grouped-query attention (GQA).
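As a rough illustration of this kind of broadcasting, the sketch below maps a batch index in src1 to the src0 index it would read from along one dimension. It is not taken from the commit: the helper name `broadcast_index` and its parameters are hypothetical, and it assumes the src1 extent is an exact multiple of the src0 extent, so that each src0 slice is shared by a contiguous group of src1 slices.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of ggml): given index i1 along one
 * broadcast dimension of src1, return the index of the src0 slice to use.
 *
 *   ne0: extent of src0 along this dimension (e.g. number of KV heads)
 *   ne1: extent of src1 along this dimension (e.g. number of query heads)
 *
 * Assumption: ne1 is an exact multiple of ne0.
 */
static int64_t broadcast_index(int64_t i1, int64_t ne0, int64_t ne1) {
    assert(ne0 > 0 && ne1 % ne0 == 0);
    assert(i1 >= 0 && i1 < ne1);
    const int64_t r = ne1 / ne0; /* how many src1 slices share one src0 slice */
    return i1 / r;
}
```

With 8 KV heads and 32 query heads, for example, `broadcast_index(5, 8, 32)` returns 1, so query heads 4 through 7 all read KV head 1. The same mapping would be applied independently to dimension 2 and dimension 3. When both extents are equal, the ratio is 1 and the mapping is the identity, which covers the case where no broadcast is needed.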
68 KiB