Retire the ggml_mul_mat() branch for transposed src0 (#500)

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-10-30 08:42:00 +00:00

* Retire the ggml_mul_mat() branch for transposed src0

- It can always be made contiguous with ggml_cpy()
- The code is now simplified
- The results are deterministic with respect to the number of threads
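
The idea behind these points can be sketched in plain C. This is an illustration only, not the ggml code: `copy_transposed` and `mul_mat_rows` are hypothetical helpers standing in for what ggml_cpy() and the remaining ggml_mul_mat() path do conceptually, and the row-major float layout is an assumption. A transposed operand is first materialized into a contiguous buffer, after which a single row-by-row dot-product routine suffices.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not the ggml API): copy the transpose of a row-major
 * rows x cols matrix into dst, which becomes a contiguous cols x rows matrix. */
static void copy_transposed(const float *src, size_t rows, size_t cols, float *dst) {
    assert(src != NULL && dst != NULL);
    for (size_t c = 0; c < cols; c++) {
        for (size_t r = 0; r < rows; r++) {
            dst[c * rows + r] = src[r * cols + c];
        }
    }
}

/* Hypothetical helper: a is m x k, b is n x k, both contiguous and row-major.
 * dst[i*m + j] = dot(row j of a, row i of b). Each output element is one
 * sequential dot product with a fixed summation order. */
static void mul_mat_rows(const float *a, size_t m,
                         const float *b, size_t n,
                         size_t k, float *dst) {
    assert(a != NULL && b != NULL && dst != NULL);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < m; j++) {
            float sum = 0.0f;
            for (size_t l = 0; l < k; l++) {
                sum += a[j * k + l] * b[i * k + l];
            }
            dst[i * m + j] = sum;
        }
    }
}
```

Because every output element here is computed by exactly one dot product in a fixed order, splitting the outer loops across threads would not change the floating-point result. That is one plausible reading of the determinism claim above, not a statement about how ggml.c partitions the work.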

* SIMD-ify dequantize_row_q4_0() for ARM_NEON (#502)

* Attempt to SIMD-ify dequantize_row_q4_0() for ARM_NEON

* Fix dequantization - forgot to interleave the quants
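
For context on the interleaving fix, here is a scalar sketch of 4-bit block dequantization. The block layout is an assumption made for illustration (32 values per block, one float scale, two 4-bit quants packed per byte with the low nibble first, and an offset of 8); the names `block_q4_0_sketch` and `dequantize_row_q4_0_sketch` are hypothetical and this is not the code from ggml.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QK 32 /* assumed number of values per block */

/* Assumed block layout for this sketch only. */
typedef struct {
    float   d;          /* per-block scale */
    uint8_t qs[QK / 2]; /* two 4-bit quants per byte, low nibble first */
} block_q4_0_sketch;

/* Dequantize k values (k must be a multiple of QK) from x into y. */
static void dequantize_row_q4_0_sketch(const block_q4_0_sketch *x, float *y, size_t k) {
    assert(k % QK == 0);
    const size_t nb = k / QK;
    for (size_t i = 0; i < nb; i++) {
        const float d = x[i].d;
        for (size_t l = 0; l < QK; l += 2) {
            const uint8_t byte = x[i].qs[l / 2];
            const int lo = byte & 0x0F; /* element l     */
            const int hi = byte >> 4;   /* element l + 1 */
            y[i * QK + l + 0] = (float)(lo - 8) * d;
            y[i * QK + l + 1] = (float)(hi - 8) * d;
        }
    }
}
```

Under this layout, a vectorized version that extracts all low nibbles into one register and all high nibbles into another has to interleave the two registers before storing, so that the output alternates low, high, low, high as in the scalar loop. Skipping that step would write the values in the wrong order, which appears to be the kind of bug the fix above addresses.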

Author: Georgi Gerganov
Date: 2023-03-25 19:47:21 +02:00
Committed by: GitHub
Parent: 502a400192
Commit: ecbe466a36

1 changed file with 277 additions and 756 deletions

ggml.c (1033 lines changed)

File diff suppressed because it is too large
