llama.cpp/examples/quantize/quantize.cpp at 0ead1f1072fdc70720f92e008a294dc74a826b1d

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-10-30 08:42:00 +00:00

jiez 1966eb2615 quantize : add '--keep-split' to quantize model into shards (#6688)

* Implement '--keep-split' to quantize model into several shards

* Add test script

* Update examples/quantize/quantize.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Split model correctly even if tensor id is out-of-order

* Update llama_model_quantize_params

* Fix preci failures

---------

Co-authored-by: z5269887 <z5269887@unsw.edu.au>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

2024-04-25 13:29:35 +03:00

17 KiB
