# llama-pull - Model Download Tool
A command-line tool for downloading AI models from HuggingFace and Docker Hub for use with llama.cpp.
## Usage
```bash
# Download from HuggingFace
llama-pull -hf <user>/<model>[:<quant>]

# Download from Docker Hub
llama-pull -dr [<repo>/]<model>[:<quant>]
```
## Options
- `-hf, --hf-repo REPO` - Download model from HuggingFace repository
- `-dr, --docker-repo REPO` - Download model from Docker Hub
- `--hf-token TOKEN` - HuggingFace token for private repositories (see the private repository example below)
- `-h, --help` - Show help message
## Examples
```bash
# Download a HuggingFace model
llama-pull -hf microsoft/DialoGPT-medium

# Download a Docker model (ai/ repo is the default)
llama-pull -dr gemma3

# Download with a specific quantization
llama-pull -hf bartowski/Llama-3.2-1B-Instruct-GGUF:Q4_K_M
```
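For private HuggingFace repositories, the download can be authenticated with `--hf-token`. A minimal sketch; the repository name and token below are placeholders, not real values:

```bash
# Download from a private HuggingFace repository
# (your-org/private-model-GGUF and hf_XXXXXXXXXXXXXXXX are placeholders)
llama-pull -hf your-org/private-model-GGUF:Q4_K_M --hf-token hf_XXXXXXXXXXXXXXXX
```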
## Model Storage
Downloaded models are stored in the standard llama.cpp cache directory:

- Linux/macOS: `~/.cache/llama.cpp/`

The models can then be used with other llama.cpp tools.
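Because the cache is shared, other llama.cpp tools that accept the same `-hf` syntax can reuse a previously pulled model without downloading it again. For example, assuming `llama-cli` is also built:

```bash
# Reuses the cached file from the earlier llama-pull invocation
llama-cli -hf bartowski/Llama-3.2-1B-Instruct-GGUF:Q4_K_M -p "Hello"
```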
## Requirements
- Built with `LLAMA_USE_CURL=ON` (default) for download functionality
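A typical build from the llama.cpp source tree might look like the following; this is a sketch assuming the standard CMake workflow and that the tool's build target is named `llama-pull`:

```bash
# Configure with CURL support (on by default) and build only this tool
cmake -B build -DLLAMA_USE_CURL=ON
cmake --build build --config Release --target llama-pull
```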