llama.cpp/ggml.c at 214b6a35702a489e3738acd81fad6d46182d3036

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-15 11:17:31 +00:00

Georgi Gerganov 214b6a3570 ggml : adjust mul_mat_f16 work memory (#1226)

* llama : minor - remove explicit int64_t cast

* ggml : reduce memory buffer for F16 mul_mat when not using cuBLAS

* ggml : add asserts to guard for incorrect wsize

2023-04-29 18:43:28 +03:00

408 KiB
