Xuan-Son Nguyen | 3c3635d2f2 | server : speed up tests (#15836)
* server : speed up tests
* clean up
* restore timeout_seconds in some places
* flake8
* explicit offline
| 2025-09-06 14:45:24 +02:00 |

Olivier Chafik | e121edc432 | server: add --reasoning-budget 0 to disable thinking (incl. qwen3 w/ enable_thinking:false) (#13771)
---------
Co-authored-by: ochafik <ochafik@google.com>
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
| 2025-05-26 00:30:51 +01:00 |

Olivier Chafik | d785f9c1fd | server: fix/test add_generation_prompt (#13770)
Co-authored-by: ochafik <ochafik@google.com>
| 2025-05-25 10:45:49 +01:00 |

Olivier Chafik | aa48e373f2 | server: inject date_string in llama 3.x template + fix date for firefunction v2 (#12802)
* Inject date_string in llama 3.x + fix for functionary v2
https://github.com/ggml-org/llama.cpp/issues/12729
* move/fix detection of functionary v3.1 before llama 3.x, fix & test their non-tool mode
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* generate more tokens in test_completion_with_required_tool_tiny_fast to avoid truncation
---------
Co-authored-by: ochafik <ochafik@google.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
| 2025-05-15 02:39:51 +01:00 |