llama.cpp/ggml.h at 331343ab0e74ba97a8b2dba169969c19518bfd6d

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-06 09:46:50 +00:00

Georgi Gerganov e95b6554b4 ggml : add Q8_0 quantization for intermediate results (#951)

* ggml : add Q8_0 quantization for intermediate results

* quantize-stats : fix test + add it to Makefile default

* Q8: use int8_t, AVX/AVX2 optimizations

* ggml : fix quantize_row_q8_0() ARM_NEON rounding

* minor : updates after rebase to latest master

* quantize-stats : delete obsolete strings

* ggml : fix q4_1 dot func

---------

Co-authored-by: Stephan Walter <stephan@walter.name>

2023-04-15 17:53:22 +03:00

24 KiB
