Mirror of https://github.com/ggml-org/llama.cpp.git (synced 2025-11-02 09:12:03 +00:00)
* add bf16 support
* use convert_from_bf16_cuda instead of convert_unary_cuda for f32
* revert 7ec5085
* move functionality into convert_unary with constexpr
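The last commit folds the bf16 path into the generic unary conversion with a compile-time branch. A minimal host-side sketch of that pattern is below; the names (`convert_unary`, `bf16_t`, the helper functions) are illustrative stand-ins, not ggml's actual API, and the real code runs as a CUDA kernel:

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Minimal bf16 stand-in (illustrative; ggml defines its own ggml_bf16_t).
// bf16 is simply the high 16 bits of an IEEE-754 f32.
struct bf16_t { uint16_t bits; };

static float bf16_to_float(bf16_t v) {
    uint32_t u = uint32_t(v.bits) << 16;
    float f;
    std::memcpy(&f, &u, sizeof(f));
    return f;
}

static bf16_t float_to_bf16(float f) {
    uint32_t u;
    std::memcpy(&u, &f, sizeof(u));
    return bf16_t{uint16_t(u >> 16)};   // truncating conversion, no rounding
}

// One generic conversion loop; the per-type load is selected at compile
// time with `if constexpr`, so the bf16 case adds no runtime branch for
// the other types. This mirrors "move functionality into convert_unary
// with constexpr" under the assumptions stated above.
template <typename src_t, typename dst_t>
void convert_unary(const src_t * src, dst_t * dst, int64_t n) {
    for (int64_t i = 0; i < n; ++i) {
        float x;
        if constexpr (std::is_same_v<src_t, bf16_t>) {
            x = bf16_to_float(src[i]);  // bf16 source path
        } else {
            x = float(src[i]);          // f32/f16-like sources fall through
        }
        dst[i] = dst_t(x);
    }
}
```

Because the branch is `constexpr`, each instantiation compiles down to only its own load path, which is why a single templated kernel can replace a dedicated `convert_from_bf16_cuda` entry point.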