mirror of https://github.com/ggml-org/llama.cpp.git

commit b8ad1b66b2

* server: allow json array in prompt or content

  We accept an array of strings and numbers representing tokens, in addition to the current string-valued prompt or content. This allows direct token input: special tokens can be processed and inserted on the frontend while the JSON data is constructed, before it is sent to the server, so the server no longer needs to know about or parse special tokens in textual input. With this, we can use the EOS and BOS tokens that llama-2-chat models expect (see the request sketch after this list).

* server: use tokenizePrompt(json) and default "" if empty prompt
* server: fix prompt check
* server: tokenize endpoint no longer adds BOS
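As a rough illustration of the new request shape, here is a minimal sketch of a completion request whose prompt mixes raw token IDs with plain text. It assumes a llama.cpp server listening on localhost:8080; the token ID 1 (BOS) is the llama-2 value and is only a placeholder, since the actual IDs depend on the loaded model.

```python
# Minimal sketch: POST a /completion request whose "prompt" is a JSON array
# mixing token IDs (numbers) and text (strings). Assumes a llama.cpp server
# on localhost:8080; token ID 1 (BOS) is a llama-2 placeholder and depends
# on the loaded model.
import json
import urllib.request

payload = {
    # BOS token ID first, then plain text; the server tokenizes the string
    # parts and splices the numeric token IDs in directly.
    "prompt": [1, "[INST] Write a haiku about llamas. [/INST]"],
    "n_predict": 64,
}

req = urllib.request.Request(
    "http://localhost:8080/completion",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["content"])
```

Since the tokenize endpoint no longer adds BOS, a client doing its own tokenization has to prepend the BOS ID itself, which the mixed array makes straightforward.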