llama-pull - Model Download Tool

A command-line tool for downloading GGUF models from HuggingFace and Docker Hub for use with llama.cpp.

Usage

# Download from HuggingFace
llama-pull -hf <user>/<model>[:<quant>]

# Download from Docker Hub
llama-pull -dr [<repo>/]<model>[:<quant>]

Options

  • -hf, --hf-repo REPO - Download model from HuggingFace repository
  • -dr, --docker-repo REPO - Download model from Docker Hub (the ai/ repository is used when none is given)
  • --hf-token TOKEN - HuggingFace token for private repositories
  • -h, --help - Show help message

Examples

# Download a HuggingFace model (the repository must provide GGUF files)
llama-pull -hf bartowski/Llama-3.2-1B-Instruct-GGUF

# Download a Docker Hub model (the ai/ repository is used by default)
llama-pull -dr gemma3

# Download with specific quantization
llama-pull -hf bartowski/Llama-3.2-1B-Instruct-GGUF:Q4_K_M
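
For a private HuggingFace repository, a token can be passed with --hf-token. The repository name below is only a placeholder; if llama-pull follows the common llama.cpp download options, the HF_TOKEN environment variable may also be honored.

# Download from a private repository (placeholder repository name)
llama-pull -hf your-org/your-private-model-GGUF:Q4_K_M --hf-token hf_xxxxxxxxxxxx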

Model Storage

Downloaded models are stored in the standard llama.cpp cache directory:

  • Linux/macOS: ~/.cache/llama.cpp/

The cached models can then be used directly by other llama.cpp tools.
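
For instance, other llama.cpp tools that accept the same -hf syntax (such as llama-cli or llama-server) should find the already-downloaded file in this cache instead of fetching it again. A minimal sketch, assuming the quantized model from the examples above has been pulled:

# Run a previously pulled model; the cached GGUF is reused rather than re-downloaded
llama-cli -hf bartowski/Llama-3.2-1B-Instruct-GGUF:Q4_K_M -p "Hello"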

Requirements

  • llama.cpp must be built with LLAMA_USE_CURL=ON (the default) for download functionality
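
As a sketch, a typical build that satisfies this requirement might look like the following; the option name is taken from the requirement above, and the llama-pull target name is assumed from the tool's directory:

# Configure with CURL-based downloading enabled (already the default)
cmake -B build -DLLAMA_USE_CURL=ON
# Build the pull tool (target name assumed)
cmake --build build --target llama-pull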