llama.cpp/gguf-py/gguf/quants.py at 0996149911458ce9821aa49e10db4e7c1187486d

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-11 10:36:54 +00:00

Francis Couture-Harpin 0996149911 convert-hf : allow converting the weird BitNet 1.3B

Its FFN size is 5460 which is not convenient.
The offending tensors are kept in F16,
which makes the final model 5.01 bpw.
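
A rough sketch of why a row size of 5460 is awkward for block-wise quantization. Block quantization formats pack a fixed number of elements per block, so a row has to split into whole blocks. The commit does not say which block size is involved, so the candidate sizes below (32, 64, 256) are assumptions chosen for illustration, and `fits_block_size` is a hypothetical helper, not a function from quants.py.

```python
def fits_block_size(row_size: int, block_size: int) -> bool:
    """Return True when a row of `row_size` elements splits into whole blocks."""
    return row_size % block_size == 0


FFN_SIZE = 5460  # FFN size quoted in the commit message

# Assumed candidate block sizes; the commit message does not name one.
CANDIDATE_BLOCK_SIZES = (32, 64, 256)

for block_size in CANDIDATE_BLOCK_SIZES:
    leftover = FFN_SIZE % block_size
    status = "fits" if fits_block_size(FFN_SIZE, block_size) else "does not fit"
    print(f"block size {block_size}: {status} (leftover elements: {leftover})")
```

Under these assumed sizes, 5460 leaves a remainder every time, which is consistent with the commit falling back to F16 for the affected tensors instead of quantizing them.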

2024-06-27 02:06:28 -04:00
