# llama-pull - Model Download Tool

A command-line tool for downloading AI models from HuggingFace and Docker Hub for use with llama.cpp.

## Usage

```bash
# Download from HuggingFace
llama-pull -hf <user>/<model>[:<quant>]

# Download from Docker Hub
llama-pull -dr [<repo>/]<model>[:<quant>]
```

## Options

- `-hf, --hf-repo REPO` - Download model from HuggingFace repository
- `-dr, --docker-repo REPO` - Download model from Docker Hub
- `--hf-token TOKEN` - HuggingFace token for private repositories (see the example below)
- `-h, --help` - Show help message
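
For private or gated repositories, pass a token with `--hf-token`. A minimal sketch, using a hypothetical private repository name and keeping the token in an environment variable rather than typing it inline:

```bash
# Hypothetical private repository; substitute your own.
# Reading the token from an environment variable keeps it out of shell history.
llama-pull -hf some-org/private-model-GGUF --hf-token "$HF_TOKEN"
```
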
## Examples

```bash
# Download a HuggingFace model
llama-pull -hf microsoft/DialoGPT-medium

# Download a Docker model (the ai/ repository is the default)
llama-pull -dr gemma3

# Download with a specific quantization
llama-pull -hf bartowski/Llama-3.2-1B-Instruct-GGUF:Q4_K_M
```

## Model Storage

Downloaded models are stored in the standard llama.cpp cache directory:

- Linux/macOS: `~/.cache/llama.cpp/`

Once downloaded, the models can be used directly by other llama.cpp tools.
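
A sketch of that workflow, assuming `llama-cli` is built alongside `llama-pull`: tools that accept the same `-hf` syntax resolve the repository reference to the cached file instead of downloading it again.

```bash
# Pre-download the model into the cache
llama-pull -hf bartowski/Llama-3.2-1B-Instruct-GGUF:Q4_K_M

# The same repository reference now resolves to the cached file
llama-cli -hf bartowski/Llama-3.2-1B-Instruct-GGUF:Q4_K_M -p "Hello"
```
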

## Requirements

- Built with `LLAMA_USE_CURL=ON` (default) for download functionality
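
As a sketch, a CMake build with downloads enabled might look like the following; the `llama-pull` target name is assumed from the tool's name, and the flag's default may vary between releases:

```bash
# Configure with CURL support (usually on by default) and build the tool
cmake -B build -DLLAMA_USE_CURL=ON
cmake --build build --config Release --target llama-pull
```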