Kerfuffle
6e08281e58
Extend llama_kv_cache_seq_rm to allow matching any sequence (#3843)
						
* Extend llama_kv_cache_seq_rm to allow matching any sequence
* Replace llama_kv_cache_tokens_rm with llama_kv_cache_clear
Use llama_kv_cache_clear for cache clearing
Change calls to llama_kv_cache_tokens_rm that want to delete by position to use llama_kv_cache_seq_rm functionality 
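
The behaviour described in this commit message can be illustrated with a toy model. This is a minimal sketch, not the llama.cpp implementation: `ToyCell`, `ToyCache`, `toy_seq_rm` and `toy_clear` are hypothetical names, and the conventions that a negative sequence id matches any sequence and that negative positions mean an open-ended range are assumptions made for illustration.

```cpp
#include <cstdint>
#include <limits>
#include <set>
#include <vector>

// Hypothetical stand-in for one KV cache cell: a position plus the
// set of sequence ids that currently reference it.
struct ToyCell {
    int32_t pos = -1;            // -1 marks an empty cell
    std::set<int32_t> seq_ids;
};

// Hypothetical stand-in for the cache itself.
struct ToyCache {
    std::vector<ToyCell> cells;
};

// Remove seq_id from every cell whose position lies in [p0, p1).
// Assumptions for this sketch: seq_id < 0 matches any sequence,
// p0 < 0 means "from the start", p1 < 0 means "to the end".
void toy_seq_rm(ToyCache & cache, int32_t seq_id, int32_t p0, int32_t p1) {
    if (p0 < 0) p0 = 0;
    if (p1 < 0) p1 = std::numeric_limits<int32_t>::max();

    for (ToyCell & cell : cache.cells) {
        if (cell.pos < p0 || cell.pos >= p1) {
            continue;            // out of range (also skips empty cells)
        }
        if (seq_id < 0) {
            cell.seq_ids.clear();        // match any sequence
        } else {
            cell.seq_ids.erase(seq_id);  // match one sequence only
        }
        if (cell.seq_ids.empty()) {
            cell.pos = -1;       // no sequence uses the cell any more
        }
    }
}

// Clearing the whole cache is a separate, unconditional operation.
void toy_clear(ToyCache & cache) {
    for (ToyCell & cell : cache.cells) {
        cell.pos = -1;
        cell.seq_ids.clear();
    }
}
```

In this sketch, a call such as `toy_seq_rm(cache, -1, n_keep, -1)` plays the role of deleting by position regardless of sequence, while `toy_clear(cache)` covers the plain cache-clearing case.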
						
						
					 
					
2023-10-29 11:31:40 -06:00

Georgi Gerganov
6961c4bd0b
batched-bench : print params at start
						
						
						
						
					 
					
2023-10-25 10:26:27 +03:00

Georgi Gerganov
0e89203b51
speculative : add tree-based sampling example (#3624)
						
						* sampling : one sequence per sampling context
ggml-ci
* speculative : add tree-based sampling support
ggml-ci
* speculative : reuse the n_parallel CLI param
* speculative : refactor sampling
* examples : fix build after sampling refactoring
ggml-ci
* batched : fix n_seq_id
* sampling : fix malloc
ggml-ci
* swift : fix build
ggml-ci
* swift : try to fix build
ggml-ci
* prompts : add assistant.txt
* common : add llama_batch_add() and llama_batch_clear() helpers
* speculative : minor refactor
ggml-ci
* minor : comments + rename
ggml-ci
* speculative : fix off-by-one for n_drafted
* speculative : fix the n_drafted fix + p constants 
						
						
					 
					
2023-10-18 16:21:57 +03:00

Georgi Gerganov
8c70a5ff25
batched : add bench tool (#3545)
						
						* batched : add bench tool
* batched : minor fix table
* batched-bench : add readme + n_kv_max is now configurable
* batched-bench : init warm-up batch
* batched-bench : pass custom set of PP, TG and PL
* batched-bench : add mmq CLI arg 
						
						
					 
					
						2023-10-11 21:25:33 +03:00