llama.cpp/ggml-vulkan.cpp at 9aa672490c848e45eaa704a554e0f1f6df995fc8

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-10-30 08:42:00 +00:00

Georgi Gerganov 9cb317f77e ggml : full ALiBi support (#7192)

* ggml : full ALiBi support

* ggml : update ggml_soft_max_ext() CUDA, SYCL

* ggml : ggml_flash_attn_ext() support ALiBi (CPU)

* ggml : ggml_flash_attn_ext() support ALiBi (Metal)

* ggml : fix warning

* ggml : ggml_flash_attn_ext() support ALiBi (CUDA)

ggml-ci

* ggml : fix assert message

* vulkan : add dev notes

* ggml : require mask when using ALiBi

ggml-ci

* convert : fix convert for refact models

2024-05-11 10:32:41 +03:00

381 KiB
