Mirror of https://github.com/ggml-org/llama.cpp.git (synced 2025-10-31 08:51:55 +00:00)

Commit 553a5c3a9f
RPC_CMD_SET_TENSOR always returns an empty response, and we send this command 4 times per token. We can improve token generation (TG) speed by not waiting for this empty response. The performance impact of this change depends on the network latency.
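A minimal sketch of the idea, using plain POSIX sockets. The function names, enum values, and wire format below are assumptions for illustration, not the actual llama.cpp RPC backend code: the point is only that a command whose reply is always empty can return right after the request is written instead of blocking for a full round trip.

```cpp
// Sketch only: hypothetical client helper, assuming the server sends no reply
// for fire-and-forget commands such as RPC_CMD_SET_TENSOR.
#include <cstdint>
#include <vector>
#include <sys/types.h>
#include <sys/socket.h>

enum rpc_cmd : uint8_t {
    RPC_CMD_SET_TENSOR = 1, // illustrative id, not the real wire value
    RPC_CMD_GET_TENSOR = 2, // illustrative id, not the real wire value
};

// Write the whole buffer, handling short writes.
static bool send_all(int sockfd, const void * data, size_t size) {
    const uint8_t * p = static_cast<const uint8_t *>(data);
    while (size > 0) {
        ssize_t n = send(sockfd, p, size, 0);
        if (n <= 0) return false;
        p    += n;
        size -= n;
    }
    return true;
}

// Read exactly `size` bytes.
static bool recv_all(int sockfd, void * data, size_t size) {
    uint8_t * p = static_cast<uint8_t *>(data);
    while (size > 0) {
        ssize_t n = recv(sockfd, p, size, 0);
        if (n <= 0) return false;
        p    += n;
        size -= n;
    }
    return true;
}

// Send a command and its payload. When wait_for_reply is false (used for
// commands whose reply is always empty, e.g. RPC_CMD_SET_TENSOR), return as
// soon as the request is written instead of blocking on a round trip. This
// assumes the server side is changed to send no reply for such commands.
static bool send_rpc_cmd(int sockfd, rpc_cmd cmd,
                         const std::vector<uint8_t> & input,
                         std::vector<uint8_t> & output,
                         bool wait_for_reply) {
    uint8_t  cmd_byte   = cmd;
    uint64_t input_size = input.size();
    if (!send_all(sockfd, &cmd_byte,   sizeof(cmd_byte)))   return false;
    if (!send_all(sockfd, &input_size, sizeof(input_size))) return false;
    if (!send_all(sockfd, input.data(), input.size()))      return false;

    if (!wait_for_reply) {
        output.clear();
        return true; // skip the empty response: saves one network round trip
    }

    uint64_t output_size = 0;
    if (!recv_all(sockfd, &output_size, sizeof(output_size))) return false;
    output.resize(output_size);
    return recv_all(sockfd, output.data(), output_size);
}
```

With 4 SET_TENSOR calls per generated token, not blocking on the empty reply can save up to 4 network round trips per token, so the gain grows with link latency.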