llama.cpp/ggml-cuda.cu at 6d66ef96eb83e90e4d0ba24e30b790c5936c22e5 - llama.cpp - Gitea - Peisong Xiao

CS348Project/llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-01 09:01:57 +00:00

Johannes Gäßler 1cd06fa25e CUDA: launch_bounds, small q4_K, q5_K mmq refactor (#2596)

2023-08-14 10:41:22 +02:00

238 KiB
