mirror of
				https://github.com/ggml-org/llama.cpp.git
				synced 2025-10-31 08:51:55 +00:00 
			
		
		
		
	 d283d02bf2
			
		
	
	d283d02bf2
	
	
	
		
			
			Warning types fixed (observed under MSYS2 GCC 14.2.0): * format '%ld' expects argument of type 'long int', but argument has type 'size_t' * llama.cpp/ggml/src/ggml-vulkan/vulkan-shaders/vulkan-shaders-gen.cpp:81:46: warning: missing initializer for member '_STARTUPINFOA::lpDesktop' [-Wmissing-field-initializers] (emitted for all struct field except first)
export-lora
Apply LORA adapters to base model and export the resulting model.
usage: llama-export-lora [options]
options:
  -m,    --model                  model path from which to load base model (default '')
         --lora FNAME             path to LoRA adapter  (can be repeated to use multiple adapters)
         --lora-scaled FNAME S    path to LoRA adapter with user defined scaling S  (can be repeated to use multiple adapters)
  -t,    --threads N              number of threads to use during computation (default: 4)
  -o,    --output FNAME           output file (default: 'ggml-lora-merged-f16.gguf')
For example:
./bin/llama-export-lora \
    -m open-llama-3b-v2.gguf \
    -o open-llama-3b-v2-english2tokipona-chat.gguf \
    --lora lora-open-llama-3b-v2-english2tokipona-chat-LATEST.gguf
Multiple LORA adapters can be applied by passing multiple --lora FNAME or --lora-scaled FNAME S command line parameters:
./bin/llama-export-lora \
    -m your_base_model.gguf \
    -o your_merged_model.gguf \
    --lora-scaled lora_task_A.gguf 0.5 \
    --lora-scaled lora_task_B.gguf 0.5