llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-10-28 08:31:25 +00:00

Files

Jeff Bolz 98197e5c98 vulkan: optimizations for deepseek prompt processing (#14555)

* vulkan: allow unclamped loads in coopmat2 mul_mat_id shader

* vulkan: increase coopmat2 mul_mat_id tile size

* vulkan: optimize mat_mul_id row_ids search to batch loads, and port to coopmat1 path

* vulkan: use smaller FA row size when head size is large. applies to both scalar and CM2 paths (CM1 isn't used due to shared memory limits)

2025-07-12 11:51:58 +02:00

cmake

ggml-cpu : rework weak alias on apple targets (#14146)

2025-06-16 13:54:15 +08:00

include

ggml : add ggml_scale_bias (#14417)

2025-07-09 18:16:12 +02:00

src

vulkan: optimizations for deepseek prompt processing (#14555)

2025-07-12 11:51:58 +02:00

.gitignore

vulkan : cmake integration (#8119)

2024-07-13 18:12:39 +02:00

CMakeLists.txt

ggml : remove kompute backend (#14501)

2025-07-03 07:48:32 +03:00