Radoslav Gerganov 
							
						 
					 
					
						
						
							
						
						bde7cd3cd9 
					 
					
						
						
							
							llama : offload to RPC in addition to other backends (#7640)
						
						
						
						
						* llama : offload to RPC in addition to other backends
- fix copy_tensor being called on the src buffer instead of the dst buffer
- always initialize views in the view_src buffer
- add RPC backend to Makefile build
- add endpoint to all RPC object names
* add rpc-server to Makefile
* Update llama.cpp
Co-authored-by: slaren <slarengh@gmail.com>
						
						
					 
					
						2024-06-03 20:03:26 +03:00 
						 
				 
			
				
					
						
							
							
								Radoslav Gerganov 
							
						 
					 
					
						
						
							
						
						2b737caae1 
					 
					
						
						
							
							rpc : resource management rework (#7562)
						
						
						
						
						* rpc : resource management rework
* address review comments 
						
						
					 
					
						2024-05-28 18:13:36 +03:00 
						 
				 
			
				
					
						
							
							
								Radoslav Gerganov 
							
						 
					 
					
						
						
							
						
						db10f01310 
					 
					
						
						
							
							rpc : track allocated buffers (#7411)
						
						
						
						
						* rpc : track allocated buffers
ref: #7407 
* rpc : pack rpc_tensor tightly 
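
The effect of packing a structure tightly can be illustrated with a minimal sketch. The two-field layout below is hypothetical and is not the real `rpc_tensor` definition; it only shows how a packed layout drops the alignment padding that a native layout may insert, which matters when a struct is sent over the wire as raw bytes.

```python
import struct

# Hypothetical record: one uint8 tag followed by one uint64 value.
# This is NOT the real rpc_tensor layout; it only illustrates packing.
NATIVE = struct.Struct("@BQ")  # native alignment: padding may follow the uint8
PACKED = struct.Struct("<BQ")  # little-endian, standard sizes, no padding


def encode(tag: int, value: int) -> bytes:
    """Serialize the record without padding bytes."""
    return PACKED.pack(tag, value)


def decode(data: bytes) -> tuple:
    """Deserialize a record produced by encode()."""
    return PACKED.unpack(data)


if __name__ == "__main__":
    print("native size:", NATIVE.size)
    print("packed size:", PACKED.size)
```

With a packed layout both ends agree on the byte size of the record regardless of the compiler's alignment rules, at the cost of possibly unaligned field access.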
						
						
					 
					
						2024-05-20 16:36:55 +03:00 
						 
				 
			
				
					
						
							
							
								Radoslav Gerganov 
							
						 
					 
					
						
						
							
						
						f4bd8b3d26 
					 
					
						
						
							
							rpc : set SO_REUSEADDR for the server socket (#7320)
						
						
						
						
						ref: #7293  
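
For readers unfamiliar with the option named in this commit, the following is a minimal sketch of enabling `SO_REUSEADDR` on a listening socket. It uses Python's standard `socket` module rather than the project's C++ code, and the helper name `make_listener` is made up for illustration.

```python
import socket


def make_listener(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Create a listening TCP socket with SO_REUSEADDR enabled.

    make_listener is a hypothetical helper, not part of the RPC backend.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Set the option before bind(); it lets the server rebind to an address
    # that still has connections in TIME_WAIT from a previous run.
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))  # port 0 asks the OS for any free port
    srv.listen(1)
    return srv


if __name__ == "__main__":
    listener = make_listener()
    print("listening on", listener.getsockname())
    listener.close()
```

Without the option, restarting a server shortly after it exits can fail with an "address already in use" error.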
						
						
					 
					
						2024-05-17 17:25:44 +03:00 
						 
				 
			
				
					
						
							
							
								Radoslav Gerganov 
							
						 
					 
					
						
						
							
						
						3b3963c55c 
					 
					
						
						
							
							rpc : add command line arg for specifying backend memory  
						
						
						
						
						ref: #7293  
						
						
					 
					
						2024-05-16 09:58:29 +03:00 
						 
				 
			
				
					
						
							
							
								Radoslav Gerganov 
							
						 
					 
					
						
						
							
						
						5e31828d3e 
					 
					
						
						
							
							ggml : add RPC backend (#6829)
						
						
						
						
						* ggml : add RPC backend
The RPC backend proxies all operations to a remote server that runs a
regular backend (CPU, CUDA, Metal, etc.).
* set TCP_NODELAY
* add CI workflows
* Address review comments
* fix warning
* implement llama_max_devices() for RPC
* Address review comments
* Address review comments
* wrap sockfd into a struct
* implement get_alignment and get_max_size
* add get_device_memory
* fix warning
* win32 support
* add README
* readme : trim trailing whitespace
* Address review comments
* win32 fix
* Address review comments
* fix compile warnings on macos 
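
The "set TCP_NODELAY" item above refers to a standard TCP socket option. As a minimal sketch (in Python rather than the project's C++, with a hypothetical helper name `make_client_socket`), enabling it looks like this:

```python
import socket


def make_client_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled.

    make_client_socket is a hypothetical helper, not part of the RPC backend.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY sends small writes immediately instead of coalescing them,
    # which usually lowers latency for request/response style protocols.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


if __name__ == "__main__":
    s = make_client_socket()
    print("TCP_NODELAY:", s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY))
    s.close()
```

A protocol that forwards many small operations to a remote server, as the commit description says the RPC backend does, is the kind of workload where this option tends to help. The measured benefit for this backend is not stated in the log.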
						
						
					 
					
						2024-05-14 14:27:19 +03:00