llama-pull - Model Download Tool

A command-line tool for downloading GGUF models from HuggingFace and Docker Hub for use with llama.cpp.

Usage

# Download from HuggingFace
llama-pull -hf <user>/<model>[:<quant>]

# Download from Docker Hub
llama-pull -dr [<repo>/]<model>[:<quant>]

Options

  • -hf, --hf-repo REPO - Download model from HuggingFace repository
  • -dr, --docker-repo REPO - Download model from Docker Hub (the ai/ repository is used when none is given)
  • --hf-token TOKEN - HuggingFace token for private repositories
  • -h, --help - Show help message

Examples

# Download a HuggingFace model (the repository must provide GGUF files)
llama-pull -hf bartowski/Llama-3.2-1B-Instruct-GGUF

# Download a Docker Hub model (the ai/ repository is used by default)
llama-pull -dr gemma3

# Download with specific quantization
llama-pull -hf bartowski/Llama-3.2-1B-Instruct-GGUF:Q4_K_M
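
For a private HuggingFace repository, a token can be passed with --hf-token. The repository name below is only a placeholder; if llama-pull follows the common llama.cpp download options, the HF_TOKEN environment variable may also be honored.

# Download from a private repository (placeholder repository name)
llama-pull -hf your-org/your-private-model-GGUF:Q4_K_M --hf-token hf_xxxxxxxxxxxx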

Model Storage

Downloaded models are stored in the standard llama.cpp cache directory:

  • Linux/macOS: ~/.cache/llama.cpp/

The cached models can then be used directly by other llama.cpp tools.
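
For instance, other llama.cpp tools that accept the same -hf syntax (such as llama-cli or llama-server) should find the already-downloaded file in this cache instead of fetching it again. A minimal sketch, assuming the quantized model from the examples above has been pulled:

# Run a previously pulled model; the cached GGUF is reused rather than re-downloaded
llama-cli -hf bartowski/Llama-3.2-1B-Instruct-GGUF:Q4_K_M -p "Hello"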

Requirements

  • llama.cpp must be built with LLAMA_USE_CURL=ON (the default) for download functionality
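
As a sketch, a typical build that satisfies this requirement might look like the following; the option name is taken from the requirement above, and the llama-pull target name is assumed from the tool's directory:

# Configure with CURL-based downloading enabled (already the default)
cmake -B build -DLLAMA_USE_CURL=ON
# Build the pull tool (target name assumed)
cmake --build build --target llama-pull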