llama.cpp/gguf-py/gguf/tensor_mapping.py at e112b610a1a75cb7fa8351e1a933e2e7a755a5ce

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-10-31 08:51:55 +00:00

Eddie-Wang e112b610a1 llama : add support for BitnetForCausalLM (#7931)

* hf bitnet v1

* hf bitnet e2e v2

* finish bitnet e2e

* finish f16 hf bitnet e2e

* remove unused

* finish bitnet i2 e2e

* move i2s to quantize v1

* move i2 to quantize

* clean code

* clean code 2

* fix codestyle

* fix code

* fix

* fix code

* fix merge

* remove unused

* change table name

* fix whitespace

* delete redundant

* i2_s to absmax

* finish i2_s/i8_s vec_dot x86 simd

* i2s->q22

* fix code

* remove block scale

* add dequantize

* fix seq

* update avx2

* remove q2_2

* remove q22_grid

* fix whitespace

* reuse llm_build_kv

* fix bo

---------

Co-authored-by: root <root@wangjinheng>

2024-06-23 21:27:57 +03:00

24 KiB
