llama.cpp/tests/test-backend-ops.cpp at fa882fd2b1bcb663de23af06fdc391489d05b007

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-01 09:01:57 +00:00

Aman Gupta 48e2fa9fb7 CUDA: add fp kernel for larger batch size MoE (#16512)

* CUDA: kernel for larger batch sizes for MoE

* WIP

* WIP

* WIP

* WIP

* WIP

* WIP

* fixup

* tests

* Move mmq_ids_helper to mmid

* cleanup

* Remove redundant checks

2025-10-14 13:15:15 +02:00

277 KiB
