llama.cpp/gguf-py/gguf/lazy.py at jg/cuda-fa-np-runtime

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-10-31 08:51:55 +00:00

compilade 3a14e00366 gguf-py : simplify support for quant types (#8838)

* gguf-py : use classes for quants

* convert_hf : simplify internal quantization type selection

* gguf-py : fix flake8 lint

* gguf-py : fix BF16 numpy view type

* gguf-py : remove LlamaFileTypeMap

Too specific to 'llama.cpp', and would be a maintenance burden
to keep up to date.

* gguf-py : add generic quantize and dequantize functions

The quant classes no longer need to be known,
only the target or the source type,
for 'quantize' and 'dequantize', respectively.
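
A minimal sketch of the dispatch idea described above, not the actual gguf-py code: quant classes register themselves under a type enum, and generic `quantize` / `dequantize` functions look the class up from the target or source type. The names `QType`, `Quant`, `Q8Toy`, and the toy 8-bit scheme are all hypothetical and exist only for illustration.

```python
from enum import IntEnum
from typing import Dict, List, Sequence, Tuple


class QType(IntEnum):
    """Hypothetical stand-in for a quantization type enum."""
    F32 = 0
    Q8_TOY = 1
    Q4_TOY = 2  # deliberately left without an implementation


class Quant:
    """Base class; each subclass registers itself under its QType."""
    registry: Dict[QType, type] = {}

    def __init_subclass__(cls, qtype: QType, **kwargs) -> None:
        super().__init_subclass__(**kwargs)
        Quant.registry[qtype] = cls

    @classmethod
    def quantize(cls, values: Sequence[float]):
        raise NotImplementedError

    @classmethod
    def dequantize(cls, packed):
        raise NotImplementedError


class Q8Toy(Quant, qtype=QType.Q8_TOY):
    """Toy scheme (an assumption): one float scale plus ints in [-127, 127]."""

    @classmethod
    def quantize(cls, values: Sequence[float]) -> Tuple[float, List[int]]:
        amax = max((abs(v) for v in values), default=0.0)
        scale = amax / 127 if amax else 1.0
        return scale, [round(v / scale) for v in values]

    @classmethod
    def dequantize(cls, packed: Tuple[float, List[int]]) -> List[float]:
        scale, q = packed
        return [scale * x for x in q]


def _lookup(qtype: QType) -> type:
    """Find the class registered for qtype, or fail clearly."""
    try:
        return Quant.registry[qtype]
    except KeyError:
        raise NotImplementedError(f"{qtype.name} is not implemented") from None


def quantize(values: Sequence[float], qtype: QType):
    """Generic entry point: the caller only names the target type."""
    if qtype == QType.F32:
        return list(values)
    return _lookup(qtype).quantize(values)


def dequantize(packed, qtype: QType) -> List[float]:
    """Generic entry point: the caller only names the source type."""
    if qtype == QType.F32:
        return list(packed)
    return _lookup(qtype).dequantize(packed)
```

With this shape, a caller writes `quantize(values, QType.Q8_TOY)` and never imports `Q8Toy` directly, which is the property the commit message describes.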

2024-08-08 13:33:09 -04:00

8.4 KiB
