	rpc : update README for cache usage
@@ -72,3 +72,14 @@ $ bin/llama-cli -m ../models/tinyllama-1b/ggml-model-f16.gguf -p "Hello, my name
 
 This way you can offload model layers to both local and remote devices.
 
+### Local cache
+
+The RPC server can use a local cache to store large tensors and avoid transferring them over the network.
+This can speed up model loading significantly, especially when using large models.
+To enable the cache, use the `-c` option:
+
+```bash
+$ bin/rpc-server -c
+```
+
+By default, the cache is stored in the `$HOME/.cache/llama.cpp/rpc` directory and can be controlled via the `LLAMA_CACHE` environment variable.
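As a usage note beyond the diff itself: the `-c` flag and the `LLAMA_CACHE` override described above can be combined when launching the server. This is a minimal sketch; the cache directory path is an arbitrary example, assuming `rpc-server` honors `LLAMA_CACHE` as stated in the added text.

```bash
# Enable the local tensor cache and place it at a custom location.
# /mnt/ssd/rpc-cache is only an example path; by default the cache
# lives under $HOME/.cache/llama.cpp/rpc.
$ LLAMA_CACHE=/mnt/ssd/rpc-cache bin/rpc-server -c
```

On subsequent loads of the same model, cached tensors are read from this directory instead of being transferred over the network again.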