llama.cpp/tests/test-backend-ops.cpp at 17304cbcc1dd24de7741cbe57925d58e90a98ac1

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-10-30 08:42:00 +00:00

Aman Gupta 48e2fa9fb7 CUDA: add fp kernel for larger batch size MoE (#16512)

* CUDA: kernel for larger batch sizes for MoE

* WIP

* WIP

* WIP

* WIP

* WIP

* WIP

* fixup

* tests

* Move mmq_ids_helper to mmid

* cleanup

* Remove redundant checks

2025-10-14 13:15:15 +02:00

277 KiB
