* Use hidden_size_per_head
* Fix num heads
* Fix array
* Fix loading weights
* Support old GGUF converted by the previous version of llama.cpp
* Update src/llama-model.cpp
* Move shared parameter definitions outside the loop
* Do not calculate n_embd_head_k and n_embd_head_v as n_embd / n_head
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>