llama.cpp/ggml-cuda/mma.cuh at a94e6ff8774b7c9f950d9545baf0ce35e8d1ed2f - llama.cpp

CS348Project/llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-12 10:47:01 +00:00

Johannes Gäßler bdcb8f4222 CUDA: int8 tensor cores for MMQ (q4_K, q5_K, q6_K) (#7860)

2024-06-11 08:26:07 +02:00

5.2 KiB
