llama.cpp/ggml/src/ggml-cuda/softcap.cuh at 6d758839ff741d4966ca92b7f801b7a8b5b96364

CS348Project/llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-10 10:27:03 +00:00

Sigbjørn Skjæret 138b288b59 cuda : add softcap fusion (#14907)

2025-07-29 14:22:03 +02:00

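#include "common.cuh"
// Threads per block; the name suggests it is used when launching the softcap kernel.
#define CUDA_SOFTCAP_BLOCK_SIZE 256
// Entry point for the fused softcap op added in #14907; operands are taken from dst and src.
void ggml_cuda_op_softcap(ggml_backend_cuda_context & ctx, ggml_tensor * dst, ggml_tensor * src);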
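
The header only declares the entry point, so the kernel itself is not visible here. As a rough, host-only sketch of what a softcap op with a 256-thread block might compute, the C++ below assumes the common soft-capping formula `y = softcap * tanh(scale * x)` and a one-thread-per-element launch. The names `SOFTCAP_BLOCK_SIZE`, `softcap_num_blocks`, and `softcap_reference` are hypothetical and do not come from llama.cpp.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for CUDA_SOFTCAP_BLOCK_SIZE from the header.
constexpr int SOFTCAP_BLOCK_SIZE = 256;

// Blocks needed to cover k elements with one thread per element
// (ceiling division). Assumption: the real launch may differ.
inline int softcap_num_blocks(int k, int block_size = SOFTCAP_BLOCK_SIZE) {
    assert(k >= 0 && block_size > 0);
    return (k + block_size - 1) / block_size;
}

// CPU reference for an assumed fused soft cap:
//   y[i] = softcap * tanh(scale * x[i])
// The output is bounded to the open interval (-softcap, softcap)
// up to float rounding.
inline std::vector<float> softcap_reference(const std::vector<float> & x,
                                            float scale, float softcap) {
    std::vector<float> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = softcap * std::tanh(scale * x[i]);
    }
    return y;
}
```

A reference like this can be useful for checking a GPU result element by element; with `scale = 1.0f / softcap`, small inputs pass through nearly unchanged while large ones saturate near `±softcap`.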