llama.cpp/ggml.c at de731963441ff128248259e1b99573d75264d210

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-03 09:22:01 +00:00


Justine Tunney 934266c0e0 ggml : rewrite silu and softmax for cpu (#7154)

This change upstreams llamafile's vectorized expf() functions. This lets
us compute softmax and silu more accurately than the short[65536] lookup
table that GGML previously used to make this operation go faster. We can
support aarch64 and sse2+ with a worst-case rounding error of 2 ulp. It
makes make -j8 tests && ./tests/test-backend-ops -o SOFT_MAX -b CPU perf
go 1.5x faster for SSE2+FMA, 1.9x faster for AVX2+FMA, and 2.1x faster
on AVX512.

2024-05-17 09:58:52 +03:00

751 KiB
