Mirror of https://github.com/ggml-org/llama.cpp.git — synced 2025-11-17 11:37:10 +00:00
* CUDA: add conv_2d_dw
* better naming
* simplify using template
* Review: fix operation ordering in ggml-cuda, use __forceinline__, use more const
6 lines
155 B
Plaintext
#pragma once

#include "common.cuh"

// Threads per block used when launching the depthwise conv2d kernel
// (see the matching .cu implementation; 256 is a typical multiple-of-32 default).
#define CUDA_CONV2D_DW_BLOCK_SIZE 256

// Depthwise 2D convolution (GGML_OP_CONV_2D_DW) on the CUDA backend.
// Computes `dst` in place on the device using the backend context `ctx`
// (stream, device buffers). Operands are taken from `dst->src[...]` per
// the usual ggml op convention — TODO confirm against the .cu implementation,
// which is not visible from this header.
void ggml_cuda_op_conv2d_dw(ggml_backend_cuda_context & ctx, ggml_tensor * dst);