llama.cpp/convert-persimmon-to-gguf.py at llm-build-context

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-10-31 08:51:55 +00:00

Phillip Kravtsov 0e797c2fc5 llm : support Adept Persimmon 8B (#3410)

* Produces garbage output

* wip: correct tensors up to RoPE

* correct tensors thru RoPE

* Correct outputs through masked & softmax'd KQ

* fp32 works

* Rename adept->persimmon

* Produces correct outputs

* clean up convert scripts

* remove printing logic from ggml.c

* remove prints from llama.cpp & fix merge

* trivial cleanups

* Add offload funcs

* update conversion script to directly take adept artifacts rather than .safetensors file

* Fix norm eps bug

* Support sqr and concat on metal, persimmon-8b-q4 runs correctly

* Small changes from review

* Formatting changes

* Minor changes to conversion script

* Remove old script

* Fix editorconfig formatting

* Fix build

* add overlooked offload code ggml-ci

2023-10-07 10:12:43 +03:00

4.7 KiB
