* rpc : report actual free memory
Report the actual free memory of every device instead of fixed
values, so that llama-cli users get an accurate memory breakdown
when using RPC devices.
* drop --mem in rpc-server
Update the README to cover the new ability to expose multiple
devices from a single server.
Co-authored-by: Diego Devesa <slarengh@gmail.com>
* rpc : add support for multiple devices
Allow rpc-server to expose multiple devices from a single endpoint.
Change the RPC protocol to include a device identifier where needed.
closes: #15210
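
A minimal sketch of the idea, not the actual protocol code: a request
carries a device index, and the server resolves it against the list of
devices it exposes. All type and function names here
(`device_memory`, `get_device_memory_req`, `rpc_server_state`,
`handle_get_device_memory`) are hypothetical and chosen for
illustration only; the stored memory values stand in for whatever the
backend reports at request time.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical types for illustration; not the real ggml RPC structs.
struct device_memory {
    std::size_t free_bytes;   // memory currently available on the device
    std::size_t total_bytes;  // total memory of the device
};

// Hypothetical request: the index selects one of the exposed devices.
struct get_device_memory_req {
    std::uint32_t device;
};

// Hypothetical server state: one entry per device behind the endpoint.
// A real server would query the backend on demand instead of caching.
struct rpc_server_state {
    std::vector<device_memory> devices;
};

// Returns the memory of the requested device, or std::nullopt when the
// index does not refer to a device exposed by this server.
std::optional<device_memory> handle_get_device_memory(
        const rpc_server_state & server,
        const get_device_memory_req & req) {
    if (req.device >= server.devices.size()) {
        return std::nullopt;
    }
    return server.devices[req.device];
}
```

Rejecting an out-of-range index explicitly keeps a malformed request
from reading past the device list, which matters once a single
endpoint serves more than one device.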
* fixes
* use ggml_backend_reg_t
* address review comments
* fix llama-bench backend report
* address review comments, change device naming
* fix cmd order