llama.cpp/tests/test-backend-ops.cpp at 7ea15bb64c81e3813eb0babf9a57e1bc5697f569

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-13 10:57:15 +00:00

Aman Gupta 48e2fa9fb7 CUDA: add fp kernel for larger batch size MoE (#16512)

* CUDA: kernel for larger batch sizes for MoE

* WIP

* WIP

* WIP

* WIP

* WIP

* WIP

* fixup

* tests

* Move mmq_ids_helper to mmid

* cleanup

* Remove redundant checks

2025-10-14 13:15:15 +02:00

277 KiB
