llama.cpp/ggml-cuda/mmq.cuh at 6fcd1331efbfbb89c8c96eba2321bb7b4d0c40e4 - llama.cpp - Gitea - Peisong Xiao

CS348Project/llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-02 09:12:03 +00:00

Johannes Gäßler bdcb8f4222 CUDA: int8 tensor cores for MMQ (q4_K, q5_K, q6_K) (#7860)

2024-06-11 08:26:07 +02:00

73 KiB
