| Xuan-Son Nguyen | 3c3635d2f2 | server : speed up tests (#15836)
* server : speed up tests
* clean up
* restore timeout_seconds in some places
* flake8
* explicit offline | 2025-09-06 14:45:24 +02:00 |  |
| Olivier Chafik | e121edc432 | server: add --reasoning-budget 0 to disable thinking (incl. qwen3 w/ enable_thinking:false) (#13771)
---------
Co-authored-by: ochafik <ochafik@google.com>
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com> | 2025-05-26 00:30:51 +01:00 |  |
| Olivier Chafik | d785f9c1fd | server: fix/test add_generation_prompt (#13770)
Co-authored-by: ochafik <ochafik@google.com> | 2025-05-25 10:45:49 +01:00 |  |
| Olivier Chafik | aa48e373f2 | server: inject date_string in llama 3.x template + fix date for firefunction v2 (#12802)
* Inject date_string in llama 3.x + fix for functionary v2
https://github.com/ggml-org/llama.cpp/issues/12729
* move/fix detection of functionary v3.1 before llama 3.x, fix & test their non-tool mode
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* generate more tokens in test_completion_with_required_tool_tiny_fast to avoid truncation
---------
Co-authored-by: ochafik <ochafik@google.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> | 2025-05-15 02:39:51 +01:00 |  |