9 Commits

Author SHA1 Message Date
Radoslav Gerganov
41386cf365 rpc : report actual free memory (#16616)
* rpc : report actual free memory

Start reporting the free memory on every device instead of using
fixed values. Now llama-cli users can get a nice memory breakdown
when using RPC devices.

* drop --mem in rpc-server
2025-10-17 18:02:52 +03:00
Radoslav Gerganov
c61ae20d05 rpc : update documentation (#16441)
Update the README file to match the newly added functionality of
exposing multiple devices from a single server.

Co-authored-by: Diego Devesa <slarengh@gmail.com>
2025-10-07 06:59:13 +00:00
Radoslav Gerganov
898acba681 rpc : add support for multiple devices (#16276)
* rpc : add support for multiple devices

Allow rpc-server to expose multiple devices from a single endpoint.
Change RPC protocol to include device identifier where needed.

closes: #15210

* fixes

* use ggml_backend_reg_t

* address review comments

* fix llama-bench backend report

* address review comments, change device naming

* fix cmd order
2025-10-04 12:49:16 +03:00
Radoslav Gerganov
918b26f197 rpc : fix regression when --device is used (#15981)
Fix regression introduced with commit 50f4281a6
2025-09-14 12:28:18 +03:00
Diego Devesa
50f4281a6f llama : allow using iGPUs with --device (#15951)
* llama : allow using iGPUs with --device

* mtmd : allow iGPU

* rpc-server : allow iGPU
2025-09-13 16:49:49 +02:00
Percy Piper
c508256db2 rpc : Fix build on OpenBSD (#13541)
2025-05-25 15:35:53 +03:00
Diego Devesa
27ebfcacba llama : do not crash if there is no CPU backend (#13395)
* llama : do not crash if there is no CPU backend

* add checks to examples
2025-05-09 13:02:07 +02:00
Diego Devesa
9fdfcdaedd rpc : use backend registry, support dl backends (#13304)
2025-05-04 21:25:43 +02:00
Diego Devesa
1d36b3670b llama : move end-user examples to tools directory (#13249)
* llama : move end-user examples to tools directory

---------

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2025-05-02 20:27:13 +02:00