llama.cpp/ggml/src/ggml-cuda/fattn.cuh at bd0af02fc96c2057726f33c0f0daf7bb8f3e462a

CS348Project/llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-16 11:27:03 +00:00

Johannes Gäßler 13aeb7aef2 CUDA: refactor FA support/selection code (#15454)

2025-08-20 23:14:14 +02:00


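#include "common.cuh"

// Runs the flash attention (FA) operation described by dst on the CUDA backend.
void ggml_cuda_flash_attn_ext(ggml_backend_cuda_context & ctx, ggml_tensor * dst);
// Reports whether the FA operation described by dst is supported on the given CUDA device.
bool ggml_cuda_flash_attn_ext_supported(int device, const ggml_tensor * dst);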
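
The two declarations above suggest a "query support, then dispatch" pattern. The following is a minimal sketch of how a caller might combine them, not the actual llama.cpp call site. The struct definitions and function bodies are hypothetical stand-ins (the real `ggml_tensor` and `ggml_backend_cuda_context` come from ggml, and the real functions launch CUDA kernels), and `try_flash_attn` is an illustrative helper that does not exist in the repository. Only the two function signatures are taken from the header.

```cpp
#include <cassert>

// Hypothetical stand-ins so the sketch compiles without ggml or CUDA.
// The real types are far richer; these fields are assumptions for illustration.
struct ggml_tensor {
    bool fa_ok;     // assumption: pretend flag saying "an FA kernel exists for this op"
};
struct ggml_backend_cuda_context {
    int device;     // CUDA device index
    int fa_calls;   // assumption: counter used only to observe dispatches in this sketch
};

// Stub with the same signature as in fattn.cuh; the body is a placeholder.
bool ggml_cuda_flash_attn_ext_supported(int device, const ggml_tensor * dst) {
    (void) device;
    return dst->fa_ok;
}

// Stub with the same signature as in fattn.cuh; the body is a placeholder.
void ggml_cuda_flash_attn_ext(ggml_backend_cuda_context & ctx, ggml_tensor * dst) {
    (void) dst;
    ctx.fa_calls++;
}

// Hypothetical helper: dispatch only if the support query succeeds.
// Returns false so the caller can fall back to another code path.
bool try_flash_attn(ggml_backend_cuda_context & ctx, ggml_tensor * dst) {
    assert(dst != nullptr);
    if (!ggml_cuda_flash_attn_ext_supported(ctx.device, dst)) {
        return false;
    }
    ggml_cuda_flash_attn_ext(ctx, dst);
    return true;
}
```

Keeping the support check as a separate function lets a scheduler decide ahead of time whether an operation can run on a given device, without needing a backend context or launching any work.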