Implement llama-pull tool

Complete llama-pull tool with documentation

Signed-off-by: Eric Curtin <eric.curtin@docker.com>
2025-11-13 10:57:15 +00:00 · 2025-09-20 17:24:35 +01:00
parent 7f766929ca
commit 17ca6ed540
4 changed files with 136 additions and 0 deletions
--- a/tools/pull/README.md
+++ b/tools/pull/README.md
@@ -0,0 +1,43 @@
+# llama-pull - Model Download Tool
+
+A command-line tool for downloading AI models from HuggingFace and Docker Hub for use with llama.cpp.
+
+## Usage
+
+```bash
+# Download from HuggingFace
+llama-pull -hf <user>/<model>[:<quant>]
+
+# Download from Docker Hub
+llama-pull -dr [<repo>/]<model>[:<quant>]
+```
+
+## Options
+
+- `-hf, --hf-repo REPO` - Download model from HuggingFace repository
+- `-dr, --docker-repo REPO` - Download model from Docker Hub
+- `--hf-token TOKEN` - HuggingFace token for private repositories
+- `-h, --help` - Show help message
+
+## Examples
+
+```bash
+# Download a HuggingFace model (the repository must contain GGUF files)
+llama-pull -hf ggml-org/gemma-3-1b-it-GGUF
+
+# Download a Docker model (ai/ repo is default)
+llama-pull -dr gemma3
+
+# Download with specific quantization
+llama-pull -hf bartowski/Llama-3.2-1B-Instruct-GGUF:Q4_K_M
+```
+
+## Model Storage
+
+Downloaded models are stored in the standard llama.cpp cache directory, where other llama.cpp tools can find them:
+- Linux: `~/.cache/llama.cpp/`
+- macOS: `~/Library/Caches/llama.cpp/`
+
+## Requirements
+
+- Built with the CMake option `LLAMA_CURL=ON` (default) for download functionality
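
The `-dr` reference format documented in the README above, `[<repo>/]<model>[:<quant>]` with `ai/` as the default repo, can be illustrated with a small shell sketch. `parse_docker_ref` is a hypothetical helper written for this note, not part of llama-pull, and `myorg/mymodel` is a made-up reference. The sketch assumes the tool splits on the first `/` and the last `:`; the actual parsing in the C++ source may differ.

```bash
# Hypothetical helper (not part of llama-pull): split a Docker Hub reference
# of the form [<repo>/]<model>[:<quant>] into repo, model, and quant.
# The body runs in a subshell so its variables do not leak to the caller.
parse_docker_ref() (
    ref=$1
    quant=
    # Optional ":<quant>" suffix
    case $ref in
        *:*) quant=${ref##*:}; ref=${ref%:*} ;;
    esac
    # Optional "<repo>/" prefix; the README states that ai/ is the default
    case $ref in
        */*) repo=${ref%%/*}; model=${ref#*/} ;;
        *)   repo=ai; model=$ref ;;
    esac
    printf '%s %s %s\n' "$repo" "$model" "${quant:-<none>}"
)

parse_docker_ref gemma3                  # prints: ai gemma3 <none>
parse_docker_ref myorg/mymodel:Q4_K_M    # prints: myorg mymodel Q4_K_M
```

The `<none>` placeholder only marks a missing quantization in this sketch. Which quantization llama-pull selects when none is given is not stated in the README.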