mirror of https://github.com/ggml-org/llama.cpp.git, synced 2025-10-30 08:42:00 +00:00
			
		
		
		
	 9731134296
			
		
	
* server: tests: add models endpoint scenario
* server: /v1/models add some metadata
* server: tests: add debug field in context before scenario
* server: tests: download model from HF, add batch size
* server: tests: add passkey test
* server: tests: add group attention params
* server: do not truncate prompt tokens if self-extend through group attention is enabled
* server: logs: do not truncate log values
* server: tests - passkey - first good working value of nga
* server: tests: fix server timeout
* server: tests: fix passkey, add doc, fix regex content matching, fix timeout
* server: tests: fix regex content matching
* server: tests: schedule slow tests on master
* server: metrics: fix when no prompt processed
* server: tests: self-extend add llama-2-7B and Mixtral-8x7B-v0.1
* server: tests: increase timeout for completion
* server: tests: keep only the PHI-2 test
* server: tests: passkey add a negative test
		
			
				
	
	
		
52 lines, 1.8 KiB, Gherkin
		
	
	
	
	
	
			
		
		
	
	
@llama.cpp
@security
Feature: Security

  Background: Server startup with an api key defined
    Given a server listening on localhost:8080
    And   a model file tinyllamas/stories260K.gguf from HF repo ggml-org/models
    And   a server api key llama.cpp
    Then  the server is starting
    Then  the server is healthy

  Scenario Outline: Completion with some user api key
    Given a prompt test
    And   a user api key <api_key>
    And   4 max tokens to predict
    And   a completion request with <api_error> api error

    Examples: Prompts
      | api_key   | api_error |
      | llama.cpp | no        |
      | llama.cpp | no        |
      | hackme    | raised    |
      |           | raised    |

  Scenario Outline: OAI Compatibility
    Given a system prompt test
    And   a user prompt test
    And   a model test
    And   2 max tokens to predict
    And   streaming is disabled
    And   a user api key <api_key>
    Given an OAI compatible chat completions request with <api_error> api error

    Examples: Prompts
      | api_key   | api_error |
      | llama.cpp | no        |
      | llama.cpp | no        |
      | hackme    | raised    |

  Scenario Outline: CORS Options
    When an OPTIONS request is sent from <origin>
    Then CORS header <cors_header> is set to <cors_header_value>

    Examples: Headers
      | origin          | cors_header                      | cors_header_value |
      | localhost       | Access-Control-Allow-Origin      | localhost         |
      | web.mydomain.fr | Access-Control-Allow-Origin      | web.mydomain.fr   |
      | origin          | Access-Control-Allow-Credentials | true              |
      | web.mydomain.fr | Access-Control-Allow-Methods     | POST              |
      | web.mydomain.fr | Access-Control-Allow-Headers     | *                 |