llama.cpp/ggml-cuda/mma.cuh at 5326bcceeb7dd34f16d0fe61b134d1e074a8e65d

CS348Project/llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-20 12:07:33 +00:00

Johannes Gäßler bdcb8f4222 CUDA: int8 tensor cores for MMQ (q4_K, q5_K, q6_K) (#7860)

2024-06-11 08:26:07 +02:00

5.2 KiB
