llama.cpp/ggml-cuda.cu at ec1b100720a66bd0bdcc6fa65d49145c07c25eec - llama.cpp - Gitea - Peisong Xiao

CS348Project/llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-01 09:01:57 +00:00

Johannes Gäßler 1cd06fa25e CUDA: launch_bounds, small q4_K, q5_K mmq refactor (#2596)

2023-08-14 10:41:22 +02:00

238 KiB
