c875e03f96 | Radoslav Gerganov | 2025-03-28 09:41:47 +02:00
rpc : update README for cache usage

ab6ab8f809 | Radoslav Gerganov | 2025-03-28 08:18:04 +02:00
rpc : send hash when tensor data is above some fixed threshold (#12496)

    * rpc : send hash when tensor data is above some fixed threshold
      ref #10095
    * rpc : put cache under $HOME/.cache/llama.cpp
    * try to fix win32 build
    * another try to fix win32 build
    * remove llama as dependency

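The idea behind this commit: when a tensor payload crosses a size threshold, the client sends a content hash instead of the raw bytes, and the server answers from its local cache under $HOME/.cache/llama.cpp, signaling a miss if the data is unknown so the client retransmits in full. Below is a minimal C++ sketch of the client-side decision; the FNV-1a hash, the 1 MiB cutoff, and the function names are illustrative assumptions, not the actual ggml-rpc.cpp code.

    #include <cstddef>
    #include <cstdint>

    // FNV-1a 64-bit: a simple, fast content hash (assumed here for illustration).
    static uint64_t fnv1a_64(const uint8_t * data, size_t len) {
        uint64_t h = 0xcbf29ce484222325ULL;   // FNV offset basis
        for (size_t i = 0; i < len; i++) {
            h ^= data[i];
            h *= 0x100000001b3ULL;            // FNV prime
        }
        return h;
    }

    static const size_t HASH_THRESHOLD = 1024 * 1024;  // hypothetical 1 MiB cutoff

    // Returns true if only the hash was sent; on a server-side cache miss the
    // caller falls back to transferring the raw bytes.
    static bool try_send_hash(const uint8_t * data, size_t len) {
        if (len < HASH_THRESHOLD) {
            return false;                     // small payload: send the bytes directly
        }
        uint64_t h = fnv1a_64(data, len);
        (void) h;                             // RPC transport elided in this sketch
        return true;
    }
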
86bf31cfe6 | Radoslav Gerganov | 2024-12-23 10:39:30 +02:00
rpc-server : add support for the SYCL backend (#10934)

9f40989351 | Diego Devesa | 2024-11-03 19:34:08 +01:00
ggml : move CPU backend to a separate file (#10144)

0e9f760eb1 | Diego Devesa | 2024-10-10 20:14:55 +02:00
rpc : add backend registry / device interfaces (#9812)

    * rpc : add backend registry / device interfaces
    * llama : add llama_supports_rpc API
    * ggml_backend_rpc_start_rpc_server -> ggml_backend_rpc_start_server

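The llama_supports_rpc API added here gives callers a runtime check for whether the build includes the RPC backend. A minimal usage sketch, assuming only that the function is declared in llama.h as the commit body states:

    #include <cstdio>
    #include "llama.h"

    int main() {
        // Reports whether this build has the RPC backend compiled in.
        if (llama_supports_rpc()) {
            printf("RPC backend available\n");
        } else {
            printf("built without RPC support\n");
        }
        return 0;
    }
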
841713e1e4 | Radoslav Gerganov | 2024-10-03 13:00:52 +03:00
rpc : enable vulkan (#9714)

    closes #8536

5ed087573e | Antonis Makropoulos | 2024-09-09 14:21:38 +03:00
readme : add LLMUnity to UI projects (#9381)

    * add LLMUnity to UI projects
    * add newline to examples/rpc/README.md to fix editorconfig-checker unit test

54f376d0b9 | Radoslav Gerganov | 2024-09-09 11:04:39 +03:00
rpc : update README [no ci] (#9320)

    Update README with instructions on how to offload model layers to
    both local and remote devices

b72942fac9 | Georgi Gerganov | 2024-08-09 23:03:21 +03:00
Merge commit from fork

f3f65429c4 | Georgi Gerganov | 2024-06-26 18:33:02 +03:00
llama : reorganize source code + improve CMake (#8006)

    * scripts : update sync [no ci]
    * files : relocate [no ci]
    * ci : disable kompute build [no ci]
    * cmake : fixes [no ci]
    * server : fix mingw build
      ggml-ci
    * cmake : minor [no ci]
    * cmake : link math library [no ci]
    * cmake : build normal ggml library (not object library) [no ci]
    * cmake : fix kompute build
      ggml-ci
    * make,cmake : fix LLAMA_CUDA + replace GGML_CDEF_PRIVATE
      ggml-ci
    * move public backend headers to the public include directory (#8122)
        * move public backend headers to the public include directory
        * nix test
        * spm : fix metal header
      ---------
      Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
    * scripts : fix sync paths [no ci]
    * scripts : sync ggml-blas.h [no ci]
    ---------
    Co-authored-by: slaren <slarengh@gmail.com>

1c641e6aac | Olivier Chafik | 2024-06-13 00:41:52 +01:00
build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)

    * `main`/`server`: rename to `llama` / `llama-server` for consistency w/ homebrew
    * server: update refs -> llama-server
      gitignore llama-server
    * server: simplify nix package
    * main: update refs -> llama
      fix examples/main ref
    * main/server: fix targets
    * update more names
    * Update build.yml
    * rm accidentally checked in bins
    * update straggling refs
    * Update .gitignore
    * Update server-llm.sh
    * main: target name -> llama-cli
    * Prefix all example bins w/ llama-
    * fix main refs
    * rename {main->llama}-cmake-pkg binary
    * prefix more cmake targets w/ llama-
    * add/fix gbnf-validator subfolder to cmake
    * sort cmake example subdirs
    * rm bin files
    * fix llama-lookup-* Makefile rules
    * gitignore /llama-*
    * rename Dockerfiles
    * rename llama|main -> llama-cli; consistent RPM bin prefixes
    * fix some missing -cli suffixes
    * rename dockerfile w/ llama-cli
    * rename(make): llama-baby-llama
    * update dockerfile refs
    * more llama-cli(.exe)
    * fix test-eval-callback
    * rename: llama-cli-cmake-pkg(.exe)
    * address gbnf-validator unused fread warning (switched to C++ / ifstream)
    * add two missing llama- prefixes
    * Updating docs for eval-callback binary to use new `llama-` prefix.
    * Updating a few lingering doc references for rename of main to llama-cli
    * Updating `run-with-preset.py` to use new binary names.
      Updating docs around `perplexity` binary rename.
    * Updating documentation references for lookup-merge and export-lora
    * Updating two small `main` references missed earlier in the finetune docs.
    * Update apps.nix
    * update grammar/README.md w/ new llama-* names
    * update llama-rpc-server bin name + doc
    * Revert "update llama-rpc-server bin name + doc"
      This reverts commit e474ef1df4
    ---------
    Co-authored-by: HanClinto <hanclinto@gmail.com>

fe1e3917cf | slaren | 2024-06-09 01:43:39 +02:00
Revert "[SYCL] Update rpc-server.cpp to include SYCL backend (#7682)" (#7808)

    This reverts commit 9422c5e34b

9422c5e34b | nickp27 | 2024-06-02 12:13:54 +03:00
[SYCL] Update rpc-server.cpp to include SYCL backend (#7682)

    * Update rpc-server.cpp to include SYCL backend
      Draft PR to address inclusion of SYCL backend for RPC server
    * Update rpc-server.cpp

f4bd8b3d26 | Radoslav Gerganov | 2024-05-17 17:25:44 +03:00
rpc : set SO_REUSEADDR for the server socket (#7320)

    ref: #7293

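Why SO_REUSEADDR matters here: without it, restarting rpc-server can fail with EADDRINUSE while old connections linger in TIME_WAIT. A minimal POSIX sketch of the pattern (not the actual ggml-rpc.cpp code):

    #include <cstdint>
    #include <cstring>
    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int open_server_socket(uint16_t port) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd < 0) return -1;

        // Allow rebinding the port immediately after a restart.
        int yes = 1;
        setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes));

        sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family      = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port        = htons(port);

        if (bind(fd, (sockaddr *) &addr, sizeof(addr)) < 0 || listen(fd, 1) < 0) {
            close(fd);
            return -1;
        }
        return fd;
    }
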
9afdffe70e | Radoslav Gerganov | 2024-05-16 12:04:08 +03:00
rpc : get available mem for the CPU backend

    This can be overridden with the -m command line option
    ref: #7293

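This is the value the server advertises to clients unless overridden with -m. A Linux-only sketch of how a CPU backend can query free memory; the real implementation is platform-specific, and this helper is an illustrative assumption:

    #include <cstddef>
    #include <sys/sysinfo.h>

    static size_t get_free_mem() {
        struct sysinfo info;
        if (sysinfo(&info) != 0) {
            return 0;  // fall back to reporting nothing on failure
        }
        // freeram is expressed in units of mem_unit bytes.
        return (size_t) info.freeram * (size_t) info.mem_unit;
    }
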
3b3963c55c | Radoslav Gerganov | 2024-05-16 09:58:29 +03:00
rpc : add command line arg for specifying backend memory

    ref: #7293

5e31828d3e | Radoslav Gerganov | 2024-05-14 14:27:19 +03:00
ggml : add RPC backend (#6829)

    * ggml : add RPC backend
      The RPC backend proxies all operations to a remote server which runs a
      regular backend (CPU, CUDA, Metal, etc.).
    * set TCP_NODELAY
    * add CI workflows
    * Address review comments
    * fix warning
    * implement llama_max_devices() for RPC
    * Address review comments
    * Address review comments
    * wrap sockfd into a struct
    * implement get_alignment and get_max_size
    * add get_device_memory
    * fix warning
    * win32 support
    * add README
    * readme : trim trailing whitespace
    * Address review comments
    * win32 fix
    * Address review comments
    * fix compile warnings on macos

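The commit body notes that TCP_NODELAY is set on the connection. The RPC protocol exchanges many small request/response messages, and Nagle's algorithm would batch and delay them; a minimal sketch of disabling it (not the actual ggml-rpc.cpp code):

    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <sys/socket.h>

    static bool set_no_delay(int sockfd) {
        // Disable Nagle's algorithm so small RPC messages are sent immediately.
        int flag = 1;
        return setsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(flag)) == 0;
    }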