Mirror of https://github.com/ggml-org/llama.cpp.git, synced 2025-10-27 08:21:30 +00:00
* Leverage the existing GGML_F32_VEC helpers to broadcast the fill value across SIMD registers and store in vector-sized chunks, while retaining the scalar tail for leftover elements and non-SIMD builds.

* Vectorize additional f32 helper loops

* Normalize f32 helper tails for ggml vec ops

---------

Co-authored-by: Aaron <shelhamer.aaron@gmail.com>
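
For illustration only, the broadcast-and-store pattern with a scalar tail described in the first bullet could look roughly like the sketch below. It is not the code from this change: the `SKETCH_*` macros and `sketch_vec_set_f32` are hypothetical stand-ins for the GGML_F32_VEC helpers, and the SSE path is just one assumed backend. On targets without SSE the macros stay undefined and only the scalar loop runs.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the GGML_F32_VEC helpers (assumption: SSE only). */
#if defined(__SSE__)
#include <xmmintrin.h>
#define SKETCH_VEC_WIDTH 4                      /* floats per SIMD register */
typedef __m128 sketch_vec_t;
#define SKETCH_VEC_SET1(v)     _mm_set1_ps(v)   /* broadcast one float to all lanes */
#define SKETCH_VEC_STORE(p, r) _mm_storeu_ps((p), (r)) /* unaligned store */
#endif

/* Fill x[0..n) with v: vector-sized chunks first, then a scalar tail. */
static void sketch_vec_set_f32(size_t n, float *x, float v) {
    size_t i = 0;

#if defined(SKETCH_VEC_WIDTH)
    /* Broadcast the fill value once, outside the loop. */
    const sketch_vec_t vv = SKETCH_VEC_SET1(v);

    /* Largest multiple of the vector width that fits in n. */
    const size_t np = n - (n % SKETCH_VEC_WIDTH);

    for (; i < np; i += SKETCH_VEC_WIDTH) {
        SKETCH_VEC_STORE(x + i, vv);
    }
#endif

    /* Scalar tail: leftover elements, or the whole array on non-SIMD builds. */
    for (; i < n; ++i) {
        x[i] = v;
    }
}
```

Sharing the index `i` between the two loops means the scalar loop doubles as both the tail handler and the non-SIMD fallback, so no separate code path is needed when the vector macros are absent.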