Diego Devesa 
							
						 
					 
					
						
						
							
						
						7eee341bee 
					 
					
						
						
							
							common : use common_ prefix for common library functions (#9805)
						
						
						
						* common : use common_ prefix for common library functions
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
						
						
					 
					
						2024-10-10 22:57:42 +02:00 
						 
				 
			
				
					
						
							
							
								Diego Devesa 
							
						 
					 
					
						
						
							
						
						0e9f760eb1 
					 
					
						
						
							
							rpc : add backend registry / device interfaces (#9812)
						
						
						
						* rpc : add backend registry / device interfaces
* llama : add llama_supports_rpc API
* ggml_backend_rpc_start_rpc_server -> ggml_backend_rpc_start_server 
						
						
					 
					
						2024-10-10 20:14:55 +02:00 
						 
				 
			
				
					
						
							
							
								Xuan Son Nguyen 
							
						 
					 
					
						
						
							
						
						458367a906 
					 
					
						
						
							
							server : better security control for public deployments (#9776)
						
						
						
						* server : more explicit endpoint access settings
* protect /props endpoint
* fix tests
* update server docs
* fix typo
* fix tests 
						
						
					 
					
						2024-10-08 13:27:04 +02:00 
						 
				 
			
				
					
						
							
							
								Daniel Kleine 
							
						 
					 
					
						
						
							
						
						133c7b46b3 
					 
					
						
						
							
							Fixed RNG seed docs (#9723)
						
						
						
						* Update README.md
fixed RNG seed info
* changed print format to unsigned 
						
						
					 
					
						2024-10-04 10:54:44 +02:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						f4d2b8846a 
					 
					
						
						
							
							llama : add reranking support (#9510)
						
						
						
						* py : add XLMRobertaForSequenceClassification [no ci]
* py : fix scalar-tensor conversion [no ci]
* py : fix position embeddings chop [no ci]
* llama : read new cls tensors [no ci]
* llama : add classification head (wip) [no ci]
* llama : add "rank" pooling type
ggml-ci
* server : add rerank endpoint
ggml-ci
* llama : avoid ggml_repeat during classification
* rerank : cleanup + comments
* server : accept /rerank endpoint in addition to /v1/rerank [no ci]
* embedding : parse special tokens
* jina : support v1 reranker
* vocab : minor style
ggml-ci
* server : initiate tests for later
ggml-ci
* server : add docs
* llama : add comment [no ci]
* llama : fix uninitialized tensors
* ci : add rerank tests
ggml-ci
* add reranking test
* change test data
* Update examples/server/server.cpp
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
* add `--reranking` argument
* update server docs
* llama : fix comment [no ci]
ggml-ci
---------
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
						
						
					 
					
						2024-09-28 17:42:03 +03:00 
						 
				 
			
				
					
						
							
							
								Xuan Son Nguyen 
							
						 
					 
					
						
						
							
						
						afbbfaa537 
					 
					
						
						
							
							server : add more env vars, improve gen-docs (#9635)
						
						
						
						* server : add more env vars, improve gen-docs
* update server docs
* LLAMA_ARG_NO_CONTEXT_SHIFT 
						
						
					 
					
						2024-09-25 14:05:13 +02:00 
						 
				 
			
				
					
						
							
							
								Xuan Son Nguyen 
							
						 
					 
					
						
						
							
						
						0b3bf966f4 
					 
					
						
						
							
							server : add --no-context-shift option (#9607)
						
						
						
						* server : add --no-context-shift option
* small fix
* Update examples/server/tests/features/embeddings.feature
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* tests : minor fix
* revert usage of GGML_ASSERT
* update server documentation
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
						
						
					 
					
						2024-09-23 22:23:54 +02:00 
						 
				 
			
				
					
						
							
							
								Bert Wagner 
							
						 
					 
					
						
						
							
						
						8b836ae731 
					 
					
						
						
							
							arg : add env variable for parallel (#9513)
						
						
						
						* add env variable for parallel
* Update README.md with env: LLAMA_ARG_N_PARALLEL
						
						
					 
					
						2024-09-17 16:35:38 +03:00 
						 
				 
			
				
					
						
							
							
								Vinesh Janarthanan 
							
						 
					 
					
						
						
							
						
						441b72b91f 
					 
					
						
						
							
							main : option to disable context shift (#9484)
						
						
						
						* added cli arg to disable context shift
* reverted precommit
* updated README.md for main
* white space
* allow disabling context shift in the server
* Update common/arg.cpp
no-context-shift only works for main example
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* added server example to --no-context-shift args
* removed server changes
* white space
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
						
						
					 
					
						2024-09-16 09:20:01 +03:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						6262d13e0b 
					 
					
						
						
							
							common : reimplement logging (#9418)
						
						
						
						https://github.com/ggerganov/llama.cpp/pull/9418  
					
						2024-09-15 20:46:12 +03:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						0abc6a2c25 
					 
					
						
						
							
							llama : llama_perf + option to disable timings during decode (#9355)
						
						
						
						* llama : llama_perf + option to disable timings during decode
ggml-ci
* common : add llama_arg
* Update src/llama.cpp
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
* perf : separate functions in the API
ggml-ci
* perf : safer pointer handling + naming update
ggml-ci
* minor : better local var name
* perf : abort on invalid sampler pointer
ggml-ci
---------
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
						
						
					 
					
						2024-09-13 09:53:38 +03:00 
						 
				 
			
				
					
						
							
							
								Xuan Son Nguyen 
							
						 
					 
					
						
						
							
						
						6cd4e03444 
					 
					
						
						
							
							arg : bring back missing ifdef (#9411)
						
						
						
						* arg : bring back missing ifdef
* replace with llama_supports_gpu_offload 
						
						
					 
					
						2024-09-10 22:41:29 +02:00 
						 
				 
			
				
					
						
							
							
								matteo 
							
						 
					 
					
						
						
							
						
						8d300bd35f 
					 
					
						
						
							
							enable --special arg for llama-server (#9419)
						
						
						
						Co-authored-by: matteo serva <matteo.serva@gmail.com>
						
						
					 
					
						2024-09-10 22:40:59 +02:00 
						 
				 
			
				
					
						
							
							
								slaren 
							
						 
					 
					
						
						
							
						
						49006c67b4 
					 
					
						
						
							
							llama : move random seed generation to the samplers (#9398)
						
						
						
						* llama_sampler_penalties : clamp penalty_last_n to zero 
						
						
					 
					
						2024-09-10 18:04:25 +02:00 
						 
				 
			
				
					
						
							
							
								Xuan Son Nguyen 
							
						 
					 
					
						
						
							
						
						bfe76d4a17 
					 
					
						
						
							
							common : move arg parser code to arg.cpp (#9388)
						
						
						
						* common : move arg parser to arg.cpp
* better categorize args
* add cmake
* missing climits
* missing cstdarg
* common : more explicit includes
* fix build
* refactor gpt_params_parse
* update server readme
* fix test
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
						
						
					 
					
						2024-09-09 23:36:09 +02:00