llama.cpp/ggml-cuda.cu at a128c38de862431f1aae9ccc40b792fbc1b8b682 - llama.cpp - Gitea - Peisong Xiao

CS348Project/llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-13 10:57:15 +00:00

Johannes Gäßler 3fe81781e3 CUDA: faster q8_0 -> f16 dequantization (#4895)

2024-01-12 20:38:54 +01:00

414 KiB
