Francis Couture-Harpin
183eeb5518
imatrix : avoid loading model to convert or combine imatrix
2025-07-12 16:50:10 -04:00
Francis Couture-Harpin
50f53b3e40
imatrix : warn when writing partial data, to help guess dataset coverage
...
Also make the legacy format store partial data
by using neutral values for missing data.
This matches what is done at read-time for the new format,
and so should get the same quality in case the old format is still used.
2025-07-12 16:50:10 -04:00
Francis Couture-Harpin
42423ec4d3
imatrix : add warning when legacy format is written
2025-07-12 15:19:51 -04:00
Francis Couture-Harpin
e33de128c7
common : move string_remove_suffix from quantize and imatrix
...
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-06-23 16:24:06 -04:00
Francis Couture-Harpin
43cd2b3eb5
imatrix : support 3d tensors with MUL_MAT
2025-06-23 12:20:55 -04:00
Francis Couture-Harpin
1a9454a3d2
imatrix : avoid returning from void function save_imatrix
2025-06-18 16:44:41 -04:00
Francis Couture-Harpin
ba6f6be6ce
imatrix : don't use FMA explicitly
...
This should make comparisons between the formats easier
because this matches the behavior of the previous version.
2025-06-18 16:33:37 -04:00
Francis Couture-Harpin
2c0945027a
Merge branch 'master' into compilade/imatrix-batched-chunks
2025-06-18 16:32:35 -04:00
Georgi Gerganov
745aa5319b
llama : deprecate llama_kv_self_ API (#14030)
...
* llama : deprecate llama_kv_self_ API
ggml-ci
* llama : allow llama_memory_(nullptr)
ggml-ci
* memory : add flag for optional data clear in llama_memory_clear
ggml-ci
2025-06-06 14:11:15 +03:00
Bartowski
efb8b47eda
imatrix : Add --parse-special for enabling parsing of special tokens in imatrix calculation (#13389)
...
* Add --parse-special for enabling parsing of special tokens in imatrix calculation
* whitespace
2025-05-09 11:53:58 +02:00
Georgi Gerganov
51fb96b1ff
context : remove logits_all flag (#13284)
...
* context : remove logits_all flag
ggml-ci
* llama : remove logits_all flag + reorder llama_context_params
ggml-ci
2025-05-08 14:26:50 +03:00
Johannes Gäßler
3e959f0976
imatrix: fix oob writes if src1 is not contiguous (#13286)
2025-05-04 00:50:37 +02:00
Diego Devesa
1d36b3670b
llama : move end-user examples to tools directory (#13249)
...
* llama : move end-user examples to tools directory
---------
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2025-05-02 20:27:13 +02:00