llama.cpp/ggml-cuda/mmq.cuh at ff794f55355a912604fa3b54d0c6fa8ca4e4533e - llama.cpp - Gitea - Peisong Xiao

CS348Project/llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-18 11:46:58 +00:00

Johannes Gäßler bdcb8f4222 CUDA: int8 tensor cores for MMQ (q4_K, q5_K, q6_K) (#7860)

2024-06-11 08:26:07 +02:00

73 KiB
