Xuan-Son Nguyen 
							
						 
					 
					
						
						
							
						
						3c3635d2f2 
					 
					
						
						
							
							server : speed up tests (#15836)
						
						
						
						
						* server : speed up tests
* clean up
* restore timeout_seconds in some places
* flake8
* explicit offline 
						
						
					 
					
						2025-09-06 14:45:24 +02:00 
						 
				 
			
				
					
						
							
							
								Piotr Wilkin (ilintar) 
							
						 
					 
					
						
						
							
						
						9e2b1e83c6 
					 
					
						
						
							
							scripts : add Jinja tester PySide6 simple app (#15756)
						
						
						
						
						* feat: add Jinja tester PySide6 simple app
* Linter fixes
* Pylint fixes
* Whitespace
* Add commandline support; add formatter; add extensions
* Remove testing actions
* Silence flake8 warnings for commandline mode
* Apply suggestions from code review
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Fix trailing whitespace/newline logic
* Update scripts/jinja/jinja-tester.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update scripts/jinja/jinja-tester.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
						
						
					 
					
						2025-09-05 01:05:12 +02:00 
						 
				 
			
				
					
						
							
							
								Johannes Gäßler 
							
						 
					 
					
						
						
							
						
						e81b8e4b7f 
					 
					
						
						
							
							llama: use FA + max. GPU layers by default (#15434)
						
						
						
						
						* llama: use max. GPU layers by default, auto -fa
* ggml-backend: abort instead of segfault 
						
						
					 
					
						2025-08-30 16:32:10 +02:00 
						 
				 
			
				
					
						
							
							
								Johannes Gäßler 
							
						 
					 
					
						
						
							
						
						3d16b29c3b 
					 
					
						
						
							
							scripts: strip "AMD Instinct" from GPU name (#15668)
						
						
						
						
					 
					
						2025-08-29 22:04:08 +02:00 
						 
				 
			
				
					
						
							
							
								Aman Gupta 
							
						 
					 
					
						
						
							
						
						55042b3692 
					 
					
						
						
							
							scripts: add sqlite3 check for compare-commits.sh (#15633)
						
						
						
						
					 
					
						2025-08-28 19:23:22 +08:00 
						 
				 
			
				
					
						
							
							
								Johannes Gäßler 
							
						 
					 
					
						
						
							
						
						9ef536907d 
					 
					
						
						
							
							scripts: fix compare-llama-bench.py (#15521)
						
						
						
						
					 
					
						2025-08-23 13:58:58 +03:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						9ebebef62f 
					 
					
						
						
							
							llama : remove KV cache defragmentation logic (#15473)
						
						
						
						
						ggml-ci 
						
						
					 
					
						2025-08-22 12:22:13 +03:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						60212f1ead 
					 
					
						
						
							
							sync : ggml  
						
						
						
						
					 
					
						2025-08-18 22:06:44 +03:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						f0c541d315 
					 
					
						
						
							
							scripts : update sync scripts  
						
						
						
						
					 
					
						2025-08-18 22:06:44 +03:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						3973163bff 
					 
					
						
						
							
							sync : ggml  
						
						
						
						
						ggml-ci 
						
						
					 
					
						2025-08-14 14:59:27 +03:00 
						 
				 
			
				
					
						
							
							
								Johannes Gäßler 
							
						 
					 
					
						
						
							
						
						4850b52aed 
					 
					
						
						
							
							server-bench: external OAI servers, sqlite (#15179)
						
						
						
						
						* server-bench: external OAI servers, sqlite
* Update scripts/server-bench.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update scripts/server-bench.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update scripts/server-bench.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* raise_for_status
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
						
						
					 
					
						2025-08-08 23:04:36 +02:00 
						 
				 
			
				
					
						
							
							
								Johannes Gäßler 
							
						 
					 
					
						
						
							
						
						20638e4f16 
					 
					
						
						
							
							scripts: fix crash when --tool is not set (#15133)
						
						
						
						
					 
					
						2025-08-07 08:50:30 +02:00 
						 
				 
			
				
					
						
							
							
								R0CKSTAR 
							
						 
					 
					
						
						
							
						
						3025b621d1 
					 
					
						
						
							
							llama-bench: rename DB table name from test to llama_bench (#15003)
						
						
						
						
						Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
						
						
					 
					
						2025-08-02 17:20:40 +08:00 
						 
				 
			
				
					
						
							
							
								R0CKSTAR 
							
						 
					 
					
						
						
							
						
						484b2091ce 
					 
					
						
						
							
							compare-commits.sh: support both llama-bench and test-backend-ops (#14392)
						
						
						
						
						* compare-commits.sh: support both llama-bench and test-backend-ops
Signed-off-by: Xiaodong Ye <yeahdongcn@gmail.com>
* Speed up the build by specifying -j 12
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* Remove build_number from test-backend-ops db
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* Apply suggestion from @JohannesGaessler
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
* Refine tool selection logic
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* Address review comments
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
---------
Signed-off-by: Xiaodong Ye <yeahdongcn@gmail.com>
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
						
						
					 
					
						2025-08-01 08:47:27 +08:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						e32a4ec60e 
					 
					
						
						
							
							sync : ggml  
						
						
						
						
						ggml-ci 
						
						
					 
					
						2025-07-30 17:33:11 +03:00 
						 
				 
			
				
					
						
							
							
								Johannes Gäßler 
							
						 
					 
					
						
						
							
						
						bbd0f91779 
					 
					
						
						
							
							server-bench: make seed choice configurable (#14929)
						
						
						
						
						* server-bench: make seed choice configurable
* Update scripts/server-bench.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update scripts/server-bench.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* fix error formatting
* Update scripts/server-bench.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
						
						
					 
					
						2025-07-29 10:40:50 +02:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						1f45f2890e 
					 
					
						
						
							
							sync : ggml  
						
						
						
						
					 
					
						2025-07-28 08:15:01 +03:00 
						 
				 
			
				
					
						
							
							
								Aman Gupta 
							
						 
					 
					
						
						
							
						
						446595b9b3 
					 
					
						
						
							
							Docs: add instructions for adding backends (#14889)
						
						
						
						
					 
					
						2025-07-27 09:36:43 +08:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						2df255da3c 
					 
					
						
						
							
							sync : ggml  
						
						
						
						
						ggml-ci 
						
						
					 
					
						2025-07-24 20:27:23 +03:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						b17230917c 
					 
					
						
						
							
							sync : ggml  
						
						
						
						
					 
					
						2025-07-19 11:46:50 +03:00 
						 
				 
			
				
					
						
							
							
								Johannes Gäßler 
							
						 
					 
					
						
						
							
						
						5cae766541 
					 
					
						
						
							
							scripts: synthetic prompt mode for server-bench.py (#14695)
						
						
						
						
					 
					
						2025-07-16 09:33:28 +02:00 
						 
				 
			
				
					
						
							
							
								Johannes Gäßler 
							
						 
					 
					
						
						
							
						
						494c5899cb 
					 
					
						
						
							
							scripts: benchmark for HTTP server throughput (#14668)
						
						
						
						
						* scripts: benchmark for HTTP server throughput
* fix server connection reset 
						
						
					 
					
						2025-07-14 13:14:30 +02:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						8eff95544e 
					 
					
						
						
							
							sync : ggml  
						
						
						
						
					 
					
						2025-07-12 16:13:27 +03:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						215535701d 
					 
					
						
						
							
							sync : ggml  
						
						
						
						
						ggml-ci 
						
						
					 
					
						2025-07-12 14:25:44 +03:00 
						 
				 
			
				
					
						
							
							
								Aman Gupta 
							
						 
					 
					
						
						
							
						
						11ee0fea2a 
					 
					
						
						
							
							Docs: script to auto-generate ggml operations docs (#14598)
						
						
						
						
						* Docs: script to auto-generate ggml operations docs
* Review: formatting changes + change github action
* Use built-in types instead of typing
* docs : add BLAS and Metal ops
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
						
						
					 
					
						2025-07-10 23:29:01 +08:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						d4cdd9c1c3 
					 
					
						
						
							
							ggml : remove kompute backend (#14501)
						
						
						
						
						ggml-ci 
						
						
					 
					
						2025-07-03 07:48:32 +03:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						e17991c466 
					 
					
						
						
							
							sync : ggml  
						
						
						
						
						ggml-ci 
						
						
					 
					
						2025-07-02 20:08:45 +03:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						f61c05d4b1 
					 
					
						
						
							
							sync : ggml  
						
						
						
						
						ggml-ci 
						
						
					 
					
						2025-07-01 11:06:39 +03:00 
						 
				 
			
				
					
						
							
							
								Vedran Miletić 
							
						 
					 
					
						
						
							
						
						e9b6350e61 
					 
					
						
						
							
							scripts : make the shell scripts cross-platform (#14341)
						
						
						
						
					 
					
						2025-06-30 10:17:18 +02:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						06cbedfca1 
					 
					
						
						
							
							sync : ggml  
						
						
						
						
						ggml-ci 
						
						
					 
					
						2025-06-20 21:02:47 +03:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						d03172cc79 
					 
					
						
						
							
							sync : ggml  
						
						
						
						
						ggml-ci 
						
						
					 
					
						2025-06-18 09:59:21 +03:00 
						 
				 
			
				
					
						
							
							
								Aman Gupta 
							
						 
					 
					
						
						
							
						
						2e42be42bd 
					 
					
						
						
							
							compare-llama-bench: add option to plot (#14169)
						
						
						
						
						* compare llama-bench: add option to plot
* Address review comments: convert case + add type hints
* Add matplotlib to requirements
* fix tests
* Improve comment and fix assert condition for test
* Add back default test_name, add --plot_log_scale
* use log_scale regardless of x_values 
						
						
					 
					
						2025-06-14 10:34:20 +02:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						ae92c1855b 
					 
					
						
						
							
							sync : ggml  
						
						
						
						
						ggml-ci 
						
						
					 
					
						2025-06-10 18:39:33 +03:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						b8e2194efc 
					 
					
						
						
							
							sync : ggml  
						
						
						
						
						ggml-ci 
						
						
					 
					
						2025-06-10 09:21:56 +03:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						f3a4b1659c 
					 
					
						
						
							
							sync : ggml  
						
						
						
						
						ggml-ci 
						
						
					 
					
						2025-06-01 13:43:57 +03:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						53f925074d 
					 
					
						
						
							
							sync : vendor (#13901)
						
						
						
						
						* sync : vendor
ggml-ci
* cont : fix httplib version
ggml-ci
* cont : fix lint
* cont : fix lint
* vendor : move to common folder /vendor
ggml-ci
* cont : fix lint
* cont : move httplib to /vendor + use json_fwd.hpp
ggml-ci
* cont : fix server build
ggml-ci
* cont : add missing headers
ggml-ci
* cont : header clean-up
ggml-ci 
						
						
					 
					
						2025-05-30 16:25:45 +03:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						1c49c70d07 
					 
					
						
						
							
							sync : ggml  
						
						
						
						
					 
					
						2025-05-27 18:05:33 +03:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						a26c4cc11e 
					 
					
						
						
							
							scripts : add option to compare commits in Debug (#13806)
						
						
						
						
						* scripts : add option to compare commits in Debug
* cont : reuse existing CMAKE_OPTS 
						
						
					 
					
						2025-05-26 22:24:01 +03:00 
						 
				 
			
				
					
						
							
							
								Olivier Chafik 
							
						 
					 
					
						
						
							
						
						f5cd27b71d 
					 
					
						
						
							
							server: streaming of tool calls and thoughts when --jinja is on (#12379)
						
						
						
						
						* add common_json w/ support for truncated json healing
* add common_chat_msg_diff
* partial common_chat_parse
* refactor parser w/ optionals
* server: wire chat diffs in stream mode
* fix trigger of thinking models (must happen after thoughts are closed)
* fix functionary v3.2 raw python!
* rename: common_chat_syntax (now contains format)
* rm common_regex.at_start
* don't return empty <think></think>
* accommodate yet another deepseek r1 distill fantasy syntax (`<|tool▁calls|>`)
* fix QwQ 32B tool call parsing after thoughts (hermes2)
* better logs for grammar triggers
* consume spaces after parse_json_tool_calls
* fix required tool calls w/ thinking models that have pre-opened thinking tags
* fix thinking model's initial trigger + test qwq's template
* run most test_tool_call tests in stream + non-stream modes
* make functionary v3.2 parsing more strict (differentiate first match from others)
* send final diff from server, to close off raw python arguments
* support partial content streaming in Generic mode
* tool-call: allow content prelude before hermes2 tool calls (for Qwen2.5)
* Update function-calling.md
* Update tool_bench.py
* chat-parser: remove input from exception (llm output may contain PII)
---------
Co-authored-by: ochafik <ochafik@google.com>
Co-authored-by: Olivier Chafik <ochafik@users.noreply.github.com>
						
						
					 
					
						2025-05-25 01:48:08 +01:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						d30cb5a7fa 
					 
					
						
						
							
							sync : ggml  
						
						
						
						
						ggml-ci 
						
						
					 
					
						2025-05-19 13:29:56 +03:00 
						 
				 
			
				
					
						
							
							
								Sigbjørn Skjæret 
							
						 
					 
					
						
						
							
						
						be1d4a13db 
					 
					
						
						
							
							scripts : fix compare-llama-bench.py show parameter (#13514)
						
						
						
						
					 
					
						2025-05-14 08:41:01 +02:00 
						 
				 
			
				
					
						
							
							
								Sigbjørn Skjæret 
							
						 
					 
					
						
						
							
						
						bf79371120 
					 
					
						
						
							
							scripts : support arbitrary input file formats in compare-llama-bench.py (#13455)
						
						
						
						
					 
					
						2025-05-13 15:31:12 +02:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						1e2809bc4b 
					 
					
						
						
							
							sync : ggml  
						
						
						
						
					 
					
						2025-05-13 14:02:28 +03:00 
						 
				 
			
				
					
						
							
							
								Sigbjørn Skjæret 
							
						 
					 
					
						
						
							
						
						09232370fc 
					 
					
						
						
							
							scripts : exit compare-llama-bench.py gracefully when there's nothing to compare (#13451)
						
						
						
						
					 
					
						2025-05-11 16:20:39 +02:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						d879433824 
					 
					
						
						
							
							sync : ggml  
						
						
						
						
						ggml-ci 
						
						
					 
					
						2025-05-07 17:28:36 +03:00 
						 
				 
			
				
					
						
							
							
								Diego Devesa 
							
						 
					 
					
						
						
							
						
						1d36b3670b 
					 
					
						
						
							
							llama : move end-user examples to tools directory (#13249)
						
						
						
						
						* llama : move end-user examples to tools directory
---------
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
						
						
					 
					
						2025-05-02 20:27:13 +02:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						b34443923c 
					 
					
						
						
							
							sync : ggml (#13268)
						
						
						
						
						* vulkan : kernels for depthwise 2D convolution (CONV_2D_DW) (ggml/1204)
* vulkan : add kernels for depthwise 2d convolution (OP_CONV_2D_DW)
* review: remove src_x/y < 0 checks; add performance tests
* sync : ggml
ggml-ci
* vulkan : fix lint (#0)
---------
Co-authored-by: Acly <aclysia@gmail.com>
						
						
					 
					
						2025-05-02 20:54:30 +03:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						b1dd4d08e8 
					 
					
						
						
							
							sync : ggml  
						
						
						
						
						ggml-ci 
						
						
					 
					
						2025-05-01 20:15:34 +03:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						8d33d740c3 
					 
					
						
						
							
							sync : ggml  
						
						
						
						
					 
					
						2025-05-01 10:00:39 +03:00 
						 
				 
			
				
					
						
							
							
								Johannes Gäßler 
							
						 
					 
					
						
						
							
						
						19e899ce21 
					 
					
						
						
							
							scripts: n_depth for compare-llama-bench [no ci] (#13201)
						
						
						
						
					 
					
						2025-04-29 23:32:04 +02:00