llama.cpp

Mirror of https://github.com/ggml-org/llama.cpp.git, synced 2025-10-27 08:21:30 +00:00

Files

muggle-stack 342c728d03 ggml : fix SpaceMit IME array out-of-bounds in task assignment (#16629)

Fix incorrect task-to-batch index calculation in the quantization phase.

The bug caused out-of-bounds access to the qnbitgemm_args array: gemm_idx
could exceed the valid batch range once compute_idx reached block_size_m,
leading to invalid pointer dereferences and SIGBUS errors.

Correctly map tasks to batches by dividing compute_idx by
per_gemm_block_count_m instead of block_size_m.

Example:
  batch_feature=1, gemm_m=30, block_size_m=4
  per_gemm_block_count_m = 8, task_count = 8

  For compute_idx = 4:
  Old: gemm_idx = 4/4 = 1 (out of bounds)
  New: gemm_idx = 4/8 = 0 (correct)
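
The index arithmetic in the example can be sketched in C++ as shown here. This is a minimal illustration, not the code from the patch: the variable names (compute_idx, block_size_m, per_gemm_block_count_m, gemm_idx) follow the commit message, while the helper functions, the TaskSlot struct, and the use of the remainder as the row-block index inside a batch entry are assumptions made for the sketch.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative sketch only. Function and struct names are hypothetical;
// variable names follow the commit message.

// Number of row blocks needed to cover gemm_m rows (ceiling division).
constexpr std::size_t block_count_m(std::size_t gemm_m, std::size_t block_size_m) {
    return (gemm_m + block_size_m - 1) / block_size_m;
}

struct TaskSlot {
    std::size_t gemm_idx;   // which batch entry the task belongs to
    std::size_t block_idx;  // assumed: row block inside that batch entry
};

// Corrected mapping: divide by the number of blocks per GEMM.
constexpr TaskSlot map_task(std::size_t compute_idx,
                            std::size_t per_gemm_block_count_m) {
    return TaskSlot{compute_idx / per_gemm_block_count_m,
                    compute_idx % per_gemm_block_count_m};
}

// Old mapping described in the commit: divide by the block size instead.
constexpr std::size_t old_gemm_idx(std::size_t compute_idx,
                                   std::size_t block_size_m) {
    return compute_idx / block_size_m;
}

// Replays the numbers from the commit message.
inline void check_commit_example() {
    const std::size_t batch_feature = 1, gemm_m = 30, block_size_m = 4;
    const std::size_t per_gemm_block_count_m = block_count_m(gemm_m, block_size_m);
    const std::size_t task_count = batch_feature * per_gemm_block_count_m;

    for (std::size_t compute_idx = 0; compute_idx < task_count; ++compute_idx) {
        // Every task must land inside the batch_feature-sized argument array.
        assert(map_task(compute_idx, per_gemm_block_count_m).gemm_idx < batch_feature);
    }
    // The old formula already leaves the array at compute_idx = 4.
    assert(old_gemm_idx(4, block_size_m) >= batch_feature);
}
```

Calling check_commit_example() walks all 8 tasks of the example: the corrected mapping keeps every gemm_idx at 0, whereas the old formula yields 1 for compute_idx = 4, which is past the end of a one-element argument array.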

Tested on SpaceMit K1 RISC-V64 with qwen2.5:0.5b model.

Co-authored-by: muggle <mingjun.rong@spacemit.com>

2025-10-17 13:01:23 +03:00

cmake

ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (#15094)

2025-08-07 13:45:41 +02:00

include

cpu : add FLOOR, CEIL, ROUND and TRUNC unary operators (#16083)

2025-10-15 21:24:51 +02:00

src

ggml : fix SpaceMit IME array out-of-bounds in task assignment (#16629)

2025-10-17 13:01:23 +03:00

.gitignore

vulkan : cmake integration (#8119)

2024-07-13 18:12:39 +02:00

CMakeLists.txt

ggml webgpu: profiling, CI updates, reworking of command submission (#16452)

2025-10-07 13:48:56 -07:00