llama.cpp/ggml/src/ggml-cuda/template-instances/fattn-vec-instance-q4_1-f16.cu at bd0af02fc96c2057726f33c0f0daf7bb8f3e462a

CS348Project/llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-11 10:36:54 +00:00

Johannes Gäßler 75a3a6c2cd CUDA: refactor and deduplicate vector FA kernels (#16208)

2025-09-27 18:45:07 +02:00

 // This file has been autogenerated by generate_cu_files.py, do not edit manually.
 #include "../fattn-vec.cuh"
 DECL_FATTN_VEC_CASE( 64, GGML_TYPE_Q4_1, GGML_TYPE_F16);
 DECL_FATTN_VEC_CASE(128, GGML_TYPE_Q4_1, GGML_TYPE_F16);
 DECL_FATTN_VEC_CASE(256, GGML_TYPE_Q4_1, GGML_TYPE_F16);
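
The three lines above read like explicit template instantiations, one per head size (64, 128, 256) for the Q4_1 key / F16 value pairing. The actual definition of DECL_FATTN_VEC_CASE lives in ../fattn-vec.cuh and is not shown here, so the following is only a minimal sketch of that general technique in plain C++. Every name in it (sketch_type, sketch_vec_case, SKETCH_DECL_VEC_CASE) is hypothetical and is not part of ggml.

```cpp
// Hypothetical stand-ins for illustration only; these are NOT the ggml definitions.
enum sketch_type { SKETCH_TYPE_F16 = 0, SKETCH_TYPE_Q4_1 = 1 };

// A template that would normally be declared in a shared header.
// The body is a placeholder that just encodes its template arguments.
template <int D, sketch_type type_K, sketch_type type_V>
int sketch_vec_case() {
    return D * 100 + static_cast<int>(type_K) * 10 + static_cast<int>(type_V);
}

// Macro that expands to an explicit instantiation definition, forcing the
// compiler to emit code for exactly this combination in this translation unit.
#define SKETCH_DECL_VEC_CASE(D, type_K, type_V) \
    template int sketch_vec_case<D, type_K, type_V>()

// One instantiation per head size, mirroring the layout of the file above.
SKETCH_DECL_VEC_CASE( 64, SKETCH_TYPE_Q4_1, SKETCH_TYPE_F16);
SKETCH_DECL_VEC_CASE(128, SKETCH_TYPE_Q4_1, SKETCH_TYPE_F16);
SKETCH_DECL_VEC_CASE(256, SKETCH_TYPE_Q4_1, SKETCH_TYPE_F16);
```

Putting each type pairing in its own small generated file, as this sketch assumes, lets the build compile the instantiations as separate translation units instead of one large one.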