Ruikai Peng 
							
						 
					 
					
						
						
							
						
						66aba7aca9 
					 
					
						
						
							
							run : avoid double tokenization (#14327)
						
						
						
						
						* run : avoid double tokenization by adopting common_tokenize heuristic
* build : fix windows gcc and clang warnings
* lint : fixed trailing whitespace
* run : fix is_first flag 
						
						
					 
					
						2025-06-23 01:28:06 +08:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						f1f5e82df6 
					 
					
						
						
							
							examples : fix is_first logic for tokenization (#14329)
						
						
						
						
						ggml-ci 
						
						
					 
					
						2025-06-22 20:10:07 +03:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						745aa5319b 
					 
					
						
						
							
							llama : deprecate llama_kv_self_ API (#14030)
						
						
						
						
						* llama : deprecate llama_kv_self_ API
ggml-ci
* llama : allow llama_memory_(nullptr)
ggml-ci
* memory : add flag for optional data clear in llama_memory_clear
ggml-ci 
						
						
					 
					
						2025-06-06 14:11:15 +03:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						53f925074d 
					 
					
						
						
							
							sync : vendor (#13901)
						
						
						
						
						* sync : vendor
ggml-ci
* cont : fix httplib version
ggml-ci
* cont : fix lint
* cont : fix lint
* vendor : move to common folder /vendor
ggml-ci
* cont : fix lint
* cont : move httplib to /vendor + use json_fwd.hpp
ggml-ci
* cont : fix server build
ggml-ci
* cont : add missing headers
ggml-ci
* cont : header clean-up
ggml-ci 
						
						
					 
					
						2025-05-30 16:25:45 +03:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						797f2ac062 
					 
					
						
						
							
							kv-cache : simplify the interface (#13660)
						
						
						
						
						* kv-cache : simplify the interface
ggml-ci
* context : revert llama_batch_allocr position change
ggml-ci 
						
						
					 
					
						2025-05-21 15:11:13 +03:00 
						 
				 
			
				
					
						
							
							
								R0CKSTAR 
							
						 
					 
					
						
						
							
						
						0527771dd8 
					 
					
						
						
							
							llama-run: add support for downloading models from ModelScope (#13370)
						
						
						
						
						Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
						
						
					 
					
						2025-05-09 10:25:50 +01:00 
						 
				 
			
				
					
						
							
							
								Diego Devesa 
							
						 
					 
					
						
						
							
						
						1d36b3670b 
					 
					
						
						
							
							llama : move end-user examples to tools directory (#13249)
						
						
						
						
						* llama : move end-user examples to tools directory
---------
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
						
						
					 
					
						2025-05-02 20:27:13 +02:00