# export-lora

Apply LoRA adapters to a base model and export the resulting merged model.
```
usage: llama-export-lora [options]

options:
  -m,    --model                  model path from which to load base model (default '')
         --lora FNAME             path to LoRA adapter  (can be repeated to use multiple adapters)
         --lora-scaled FNAME S    path to LoRA adapter with user defined scaling S  (can be repeated to use multiple adapters)
  -t,    --threads N              number of threads to use during computation (default: 4)
  -o,    --output FNAME           output file (default: 'ggml-lora-merged-f16.gguf')
```

For example:
```bash
./bin/llama-export-lora \
    -m open-llama-3b-v2.gguf \
    -o open-llama-3b-v2-english2tokipona-chat.gguf \
    --lora lora-open-llama-3b-v2-english2tokipona-chat-LATEST.gguf
```
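The merged GGUF is a standalone model and can be loaded like any other. As a quick sanity check (assuming `llama-cli` was built in the same `bin` directory; the prompt is only illustrative):

```bash
./bin/llama-cli \
    -m open-llama-3b-v2-english2tokipona-chat.gguf \
    -p "Translate to Toki Pona: hello"
```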
Multiple LoRA adapters can be applied by passing `--lora FNAME` or `--lora-scaled FNAME S` multiple times:
```bash
./bin/llama-export-lora \
    -m your_base_model.gguf \
    -o your_merged_model.gguf \
    --lora-scaled lora_task_A.gguf 0.5 \
    --lora-scaled lora_task_B.gguf 0.5
```
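In the usual LoRA formulation (a sketch of what the scale `S` is intended to do, not necessarily the exact implementation), each adapter contributes a low-rank update $B_i A_i$ to the base weights $W$, and `S` weights that contribution:

$$
W' = W + \sum_i S_i \, B_i A_i
$$

With `0.5` for both adapters above, each task's update is applied at half strength. Since the merged output defaults to f16, a common follow-up is quantization, e.g. with `llama-quantize` (assuming it is built alongside):

```bash
./bin/llama-quantize your_merged_model.gguf your_merged_model-Q4_K_M.gguf Q4_K_M
```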