llama.cpp

Mirror of https://github.com/ggml-org/llama.cpp.git, synced 2025-10-28 08:31:25 +00:00

Author	SHA1	Message	Date
Francis Couture-Harpin	183eeb5518	imatrix : avoid loading model to convert or combine imatrix	2025-07-12 16:50:10 -04:00
Francis Couture-Harpin	50f53b3e40	imatrix : warn when writing partial data, to help guess dataset coverage. Also make the legacy format store partial data by using neutral values for missing data. This matches what is done at read-time for the new format, and so should get the same quality in case the old format is still used.	2025-07-12 16:50:10 -04:00
Francis Couture-Harpin	42423ec4d3	imatrix : add warning when legacy format is written	2025-07-12 15:19:51 -04:00
Francis Couture-Harpin	e33de128c7	common : move string_remove_suffix from quantize and imatrix Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>	2025-06-23 16:24:06 -04:00
Francis Couture-Harpin	43cd2b3eb5	imatrix : support 3d tensors with MUL_MAT	2025-06-23 12:20:55 -04:00
Francis Couture-Harpin	1a9454a3d2	imatrix : avoid returning from void function save_imatrix	2025-06-18 16:44:41 -04:00
Francis Couture-Harpin	ba6f6be6ce	imatrix : don't use FMA explicitly. This should make comparisons between the formats easier because this matches the behavior of the previous version.	2025-06-18 16:33:37 -04:00
Francis Couture-Harpin	2c0945027a	Merge branch 'master' into compilade/imatrix-batched-chunks	2025-06-18 16:32:35 -04:00
Georgi Gerganov	745aa5319b	llama : deprecate llama_kv_self_ API (#14030) * llama : deprecate llama_kv_self_ API ggml-ci * llama : allow llama_memory_(nullptr) ggml-ci * memory : add flag for optional data clear in llama_memory_clear ggml-ci	2025-06-06 14:11:15 +03:00
Bartowski	efb8b47eda	imatrix : Add --parse-special for enabling parsing of special tokens in imatrix calculation (#13389) * Add --parse-special for enabling parsing of special tokens in imatrix calculation * whitespace	2025-05-09 11:53:58 +02:00
Georgi Gerganov	51fb96b1ff	context : remove logits_all flag (#13284) * context : remove logits_all flag ggml-ci * llama : remove logits_all flag + reorder llama_context_params ggml-ci	2025-05-08 14:26:50 +03:00
Johannes Gäßler	3e959f0976	imatrix: fix oob writes if src1 is not contiguous (#13286)	2025-05-04 00:50:37 +02:00
Diego Devesa	1d36b3670b	llama : move end-user examples to tools directory (#13249) * llama : move end-user examples to tools directory --------- Co-authored-by: Xuan Son Nguyen <son@huggingface.co>	2025-05-02 20:27:13 +02:00

13 Commits