# Gemma 3 vision
> [!IMPORTANT]
>
> This is very experimental, only used for demo purposes.
## Quick start

You can use a pre-quantized model from [ggml-org](https://huggingface.co/ggml-org)'s Hugging Face account:
```bash
# build
cmake -B build
cmake --build build --target llama-mtmd-cli

# alternatively, install from brew (macOS)
brew install llama.cpp

# run it
llama-mtmd-cli -hf ggml-org/gemma-3-4b-it-GGUF
llama-mtmd-cli -hf ggml-org/gemma-3-12b-it-GGUF
llama-mtmd-cli -hf ggml-org/gemma-3-27b-it-GGUF

# note: the 1B model does not support vision
```
## How to get mmproj.gguf?

Simply add the `--mmproj` flag when converting the model via `convert_hf_to_gguf.py`:
```bash
cd gemma-3-4b-it
python ../llama.cpp/convert_hf_to_gguf.py --outfile model.gguf --outtype f16 --mmproj .
# output file: mmproj-model.gguf
```
## How to run it?

What you need:
- The text model GGUF, which can be converted using `convert_hf_to_gguf.py`
- The mmproj file from the step above
- An image file
```bash
# build
cmake -B build
cmake --build build --target llama-mtmd-cli

# run it
./build/bin/llama-mtmd-cli -m {text_model}.gguf --mmproj mmproj.gguf --image your_image.jpg
```
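
A sketch of how this might look wrapped in a script, assuming the same file names as above and that `llama-mtmd-cli` accepts a `-p` flag to override the default prompt (model path, image path, and prompt below are placeholders, not files shipped with the repo):

```shell
#!/usr/bin/env sh
# Hypothetical helper: run the Gemma 3 vision demo on one image.
# Adjust MODEL, MMPROJ, and IMAGE to match your local files.
MODEL=./model.gguf          # text model converted with convert_hf_to_gguf.py
MMPROJ=./mmproj-model.gguf  # multimodal projector from the step above
IMAGE=./your_image.jpg      # any image file

./build/bin/llama-mtmd-cli \
    -m "$MODEL" \
    --mmproj "$MMPROJ" \
    --image "$IMAGE" \
    -p "Describe this image in one sentence."
```

This is only a convenience wrapper around the command shown above; the flags `-m`, `--mmproj`, and `--image` are the ones the document itself uses.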