llama.cpp

CS348Project/llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-21 12:16:57 +00:00

Commit Graph

Author	SHA1	Message	Date
Akarshan Biswas	b66df9d9c9	CUDA: fix build error from ambiguous __half conversions in conv2d (#15690) * CUDA: fix build error from ambiguous __half conversions in conv2d Building conv2d with half precision failed because `__half` defines multiple implicit conversion operators (to float, int, short, etc.), causing ambiguous overload resolution when multiplying with float. Introduce a templated `to_float` helper that explicitly converts `__half` via `__half2float`, while passing through float unchanged. Use this helper in conv2d accumulation to ensure unambiguous and correct promotion to float. Fixes some build errors with half-precision kernels on CUDA. ggml-ci * CUDA: Replace custom to_float helper with unified ggml_cuda_cast and add half‑>float conversion * CUDA: Add missing convert.cuh header * CUDA: remove unnecessary extension in ggml_cuda_cast * CUDA: Address review comment, remove second type template argument	2025-09-01 06:55:06 +05:30
Johannes Gäßler	38ad381f9f	CUDA: use FP32 arithmetic for conv2d (#15683)	2025-08-30 16:20:32 +02:00
mnehete32	c97dc09391	CUDA: add conv2d (#15635) * CUDA: add conv2d * CUDA: conv2d - correct formatting and added const	2025-08-28 20:33:03 +02:00
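
The ambiguity described in b66df9d9c9 can be reproduced without the CUDA toolkit. The following is a minimal host-side C++ sketch, not the code from the commit: `fake_half` and `fake_half2float` are hypothetical stand-ins for `__half` and `__half2float`, and `dot` is a hypothetical stand-in for the conv2d accumulation loop. Only the `to_float` helper follows the commit message (explicit conversion for the half type, pass-through for float); the commit's later revisions replaced that helper with `ggml_cuda_cast`, which is not shown here.

```cpp
// Hypothetical stand-in for CUDA's __half: like the real type, it exposes
// several implicit conversion operators.
struct fake_half {
    float v;
    operator float() const { return v; }
    operator int()   const { return static_cast<int>(v); }
    operator short() const { return static_cast<short>(v); }
};

// Hypothetical stand-in for __half2float.
inline float fake_half2float(fake_half h) { return h.v; }

// Generic case: pass float (and other arithmetic types) through unchanged.
template <typename T>
inline float to_float(T x) { return static_cast<float>(x); }

// Half case: convert explicitly, so no implicit conversion has to be chosen.
template <>
inline float to_float<fake_half>(fake_half x) { return fake_half2float(x); }

// Hypothetical accumulation loop in the spirit of the conv2d inner product.
template <typename T>
float dot(const T * input, const float * kernel, int n) {
    float acc = 0.0f;
    for (int i = 0; i < n; ++i) {
        // Writing `input[i] * kernel[i]` directly is expected to be rejected
        // as ambiguous when T is fake_half: the float, int and short
        // conversions are all viable for the built-in operator*.
        acc += to_float(input[i]) * kernel[i];
    }
    return acc;
}
```

Calling `dot` with a `fake_half` array and with a `float` array holding the same values should give the same sum, because both paths promote to float before multiplying.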


3 Commits