llama.cpp/ggml-metal.m at 7704db252108d3ec69be4fdcaee4d834ea5e8fa8

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-17 11:37:10 +00:00

Kawrakow ca82cf7bac metal : more optimizations (#2959)

* Very minor speedup via simd-group synchronization in f16 x f32

* Another very minor speedup on metal

* Quite significant PP speedup on metal

* Another attempt

* Minor

* Massive improvement for TG for fp16

* ~4-5% improvement for Q8_0 TG on metal

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

2023-09-03 11:06:22 +03:00

61 KiB
