Commit Graph

2 Commits

Author SHA1 Message Date
Johannes Gäßler
658987cfc9 CUDA: noncont MMVQ + batched bs1 MUL_MAT_ID (#13014)
* CUDA: noncont MMVQ + batched bs1 MUL_MAT_ID

* fix logic for RoPE support, CUDA graphs
2025-04-22 21:27:40 +02:00
Johannes Gäßler
c3ea58aca4 CUDA: remove DMMV, consolidate F16 mult mat vec (#10318) 2024-11-17 09:09:55 +01:00