13 Commits

Author SHA1 Message Date
Francis Couture-Harpin
183eeb5518 imatrix : avoid loading model to convert or combine imatrix 2025-07-12 16:50:10 -04:00
Francis Couture-Harpin
50f53b3e40 imatrix : warn when writing partial data, to help guess dataset coverage
Also make the legacy format store partial data
by using neutral values for missing data.
This matches what is done at read-time for the new format,
and so should get the same quality in case the old format is still used.
2025-07-12 16:50:10 -04:00
Francis Couture-Harpin
42423ec4d3 imatrix : add warning when legacy format is written 2025-07-12 15:19:51 -04:00
Francis Couture-Harpin
e33de128c7 common : move string_remove_suffix from quantize and imatrix
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-06-23 16:24:06 -04:00
Francis Couture-Harpin
43cd2b3eb5 imatrix : support 3d tensors with MUL_MAT 2025-06-23 12:20:55 -04:00
Francis Couture-Harpin
1a9454a3d2 imatrix : avoid returning from void function save_imatrix 2025-06-18 16:44:41 -04:00
Francis Couture-Harpin
ba6f6be6ce imatrix : don't use FMA explicitly
This should make comparisons between the formats easier
because this matches the behavior of the previous version.
2025-06-18 16:33:37 -04:00
Francis Couture-Harpin
2c0945027a Merge branch 'master' into compilade/imatrix-batched-chunks 2025-06-18 16:32:35 -04:00
Georgi Gerganov
745aa5319b llama : deprecate llama_kv_self_ API (#14030)
* llama : deprecate llama_kv_self_ API

ggml-ci

* llama : allow llama_memory_(nullptr)

ggml-ci

* memory : add flag for optional data clear in llama_memory_clear

ggml-ci
2025-06-06 14:11:15 +03:00
Bartowski
efb8b47eda imatrix : Add --parse-special for enabling parsing of special tokens in imatrix calculation (#13389)
* Add --parse-special for enabling parsing of special tokens in imatrix calculation

* whitespace
2025-05-09 11:53:58 +02:00
Georgi Gerganov
51fb96b1ff context : remove logits_all flag (#13284)
* context : remove logits_all flag

ggml-ci

* llama : remove logits_all flag + reorder llama_context_params

ggml-ci
2025-05-08 14:26:50 +03:00
Johannes Gäßler
3e959f0976 imatrix: fix oob writes if src1 is not contiguous (#13286) 2025-05-04 00:50:37 +02:00
Diego Devesa
1d36b3670b llama : move end-user examples to tools directory (#13249)
* llama : move end-user examples to tools directory

---------

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2025-05-02 20:27:13 +02:00