llama.cpp/ggml/src/ggml-impl.h at 9e2d9b8b93f44d573800d4e1686b1317fb07ab57

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-08 10:07:01 +00:00

Jeff Bolz e56abd2098 vulkan: Implement topk_moe fused shader, ported from CUDA (#16641)

This is similar to the CUDA shader from #16130, but doesn't use shared memory
and handles different subgroup sizes.

2025-10-18 12:22:57 +02:00