llama.cpp/ggml.c at 2bed4aa3f37cb4e39e16e9ec7b595a7738fd5faf

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-03 09:22:01 +00:00


Reinforce-II 780e24a22e ggml : parallelize FP32 conversion when using BLAS (#5045)

* allow the GGML_TASK_INIT phase to run multithreaded

* multithreaded dequantize in mul_mat when using a BLAS library

* minor fixes

* update outdated comment
* fix coding style

* simplify code

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

2024-01-22 15:15:08 +02:00

654 KiB
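
The commit message above describes splitting the FP32 conversion (dequantization) across threads before the BLAS call. A minimal sketch of one common way to partition such work is shown here. It is not taken from ggml.c: the `demo_row_t` type, the per-row scale format, and the `demo_dequantize_rows` function are hypothetical and exist only for illustration, and the interleaved row assignment is an assumption about how the work could be divided.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical quantized row for illustration only (not a real ggml type):
 * one float scale plus a pointer to n_cols int8 values. */
typedef struct {
    float         scale; /* per-row scale factor */
    const int8_t *q;     /* n_cols quantized values */
} demo_row_t;

/* Convert the rows assigned to thread `ith` (out of `nth` threads) to FP32.
 * Rows are interleaved: thread ith handles rows ith, ith + nth, ith + 2*nth, ...
 * so no two threads ever write to the same output row. */
void demo_dequantize_rows(const demo_row_t *rows, float *dst,
                          size_t n_rows, size_t n_cols,
                          int ith, int nth) {
    assert(nth > 0 && ith >= 0 && ith < nth);
    for (size_t r = (size_t) ith; r < n_rows; r += (size_t) nth) {
        float *out = dst + r * n_cols; /* row-major FP32 destination */
        for (size_t c = 0; c < n_cols; ++c) {
            out[c] = rows[r].scale * (float) rows[r].q[c];
        }
    }
}
```

Each worker thread would call `demo_dequantize_rows` with its own `ith` and the shared `nth`. Because the row sets are disjoint, no locking is needed during the conversion itself; the threads only have to synchronize once before the single-threaded BLAS matrix multiply reads the FP32 buffer.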
