mirror of https://github.com/ggml-org/llama.cpp.git (synced 2025-10-31 08:51:55 +00:00)
525213d2f5
* server: tests: init scenarios
  - health and slots endpoints
  - completion endpoint
  - OAI compatible chat completion requests w/ and without streaming
  - completion multi users scenario
  - multi users scenario on OAI compatible endpoint with streaming
  - multi users with total number of tokens to predict exceeds the KV Cache size
  - server wrong usage scenario, like in Infinite loop of "context shift" #3969
  - slots shifting
  - continuous batching
  - embeddings endpoint
  - multi users embedding endpoint: Segmentation fault #5655
  - OpenAI-compatible embeddings API
  - tokenize endpoint
  - CORS and api key scenario
* server: CI GitHub workflow

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
		
			
				
	
	
		
51 lines, 1.7 KiB, Gherkin
@llama.cpp
Feature: Security

  Background: Server startup with an api key defined
    Given a server listening on localhost:8080
    And   a model file stories260K.gguf
    And   a server api key llama.cpp
    Then  the server is starting
    Then  the server is healthy

  Scenario Outline: Completion with some user api key
    Given a prompt test
    And   a user api key <api_key>
    And   4 max tokens to predict
    And   a completion request with <api_error> api error

    Examples: Prompts
      | api_key   | api_error |
      | llama.cpp | no        |
      | llama.cpp | no        |
      | hackme    | raised    |
      |           | raised    |

  Scenario Outline: OAI Compatibility
    Given a system prompt test
    And   a user prompt test
    And   a model test
    And   2 max tokens to predict
    And   streaming is disabled
    And   a user api key <api_key>
    Given an OAI compatible chat completions request with <api_error> api error

    Examples: Prompts
      | api_key   | api_error |
      | llama.cpp | no        |
      | llama.cpp | no        |
      | hackme    | raised    |

  Scenario Outline: CORS Options
    When an OPTIONS request is sent from <origin>
    Then CORS header <cors_header> is set to <cors_header_value>

    Examples: Headers
      | origin          | cors_header                      | cors_header_value |
      | localhost       | Access-Control-Allow-Origin      | localhost         |
      | web.mydomain.fr | Access-Control-Allow-Origin      | web.mydomain.fr   |
      | origin          | Access-Control-Allow-Credentials | true              |
      | web.mydomain.fr | Access-Control-Allow-Methods     | POST              |
      | web.mydomain.fr | Access-Control-Allow-Headers     | *                 |
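
The api key examples in the two outlines above reduce to a small decision table: a request whose key matches the server key succeeds, and any other key (including an empty one) is expected to raise an api error. A minimal Python sketch of that table follows. It is not the step implementation used by this test suite. The helper names `build_completion_request` and `expected_api_error` are hypothetical, and the `Authorization: Bearer <key>` header, the `/completion` path, and the `n_predict` field are assumptions about how a client would talk to the server. The request is only constructed, never sent.

```python
import json
import urllib.request


def build_completion_request(prompt, n_predict, user_api_key=None,
                             base_url="http://localhost:8080"):
    """Build (but do not send) a completion request.

    Assumption: the key travels as an "Authorization: Bearer <key>" header,
    and the endpoint path and JSON field names are as written here.
    """
    body = json.dumps({"prompt": prompt, "n_predict": n_predict}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if user_api_key:
        # An empty or missing key sends no Authorization header at all.
        headers["Authorization"] = f"Bearer {user_api_key}"
    return urllib.request.Request(
        f"{base_url}/completion", data=body, headers=headers, method="POST"
    )


def expected_api_error(server_api_key, user_api_key):
    """Mirror the Examples tables: 'no' when the keys match, else 'raised'."""
    return "no" if user_api_key == server_api_key else "raised"
```

For the rows in the Examples tables, `expected_api_error("llama.cpp", "llama.cpp")` gives `"no"`, while `"hackme"` and the empty key both give `"raised"`.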