2016f07bd1 · Xuan-Son Nguyen · 2025-04-20 23:29:36 +02:00
convert : experimental support for --mmproj flag (#13023)
  * convert : experimental support for `--mmproj` flag
  * fix bad ctrl+f replace
  * fix style
  * split into subclasses TextModel and VisionModel
  * rename Model --> ModelBase
  * small fix
  * correct CLIP_VISION arch name (because existing GGUF already use it)
  * Apply suggestions from code review
  * fix Mistral3Model
  * fix typo
  Co-authored-by: compilade <git@compilade.net>

6602304814 · Jeffrey Morgan · 2025-04-20 12:15:41 +02:00
llava: fix errors in clip.h on certain compilers (#13030)

37b9f0d29d · Xuan-Son Nguyen · 2025-04-19 09:15:45 +02:00
clip : refactor, add image_manipulation and llava_uhd classes (#13011)
  * clip : refactor, add `image_manipulation` and `llava_uhd`
  * refactor llava-1.6 preprocessing
  * simplify logic for llava-1.5
  * missing include

b9154ecff9 · Xuan-Son Nguyen · 2025-04-18 10:04:51 +02:00
mtmd : add methods to access mtmd_image_tokens (#12906)
  * mtmd : add more api around mtmd_image_tokens
  * mtmd : ability to calc image hash
  * shared_ptr for mtmd_image_tokens
  * move hash to user-defined ID (fixed)
  * fix prompt_modified
  * rm redundant data member
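A hedged sketch of how accessors around the opaque mtmd_image_tokens type are used; the getter names below are modeled on mtmd.h and are assumptions, not necessarily the exact set this PR introduced:

    #include "mtmd.h" // examples/llava/mtmd.h

    // Query an image chunk without reaching into mtmd internals.
    void inspect(const mtmd_image_tokens * img) {
        size_t n  = mtmd_image_tokens_get_n_tokens(img); // tokens the image expands to
        size_t nx = mtmd_image_tokens_get_nx(img);       // token grid width
        size_t ny = mtmd_image_tokens_get_ny(img);       // token grid height
        auto   id = mtmd_image_tokens_get_id(img);       // user-defined ID (replaced the image-hash idea)
        (void) n; (void) nx; (void) ny; (void) id;
    }
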
						 
				 
			
				
					
						
							
							
d6d2c2ab8c · Russyyds · 2025-04-14 19:18:20 +02:00
Add performance print for gemma3 in example (#12929)

e59ea539b8 · Matt Clayton · 2025-04-12 07:29:03 +02:00
llava: Fix cpu-only clip image encoding segfault (#12907)
  * llava: Fix cpu-only clip image encoding
  * clip : no smart ptr for ggml_backend_t
  * Fix for backend_ptr push_back
  Co-authored-by: Xuan Son Nguyen <son@huggingface.co>

0c50923944 · Xuan-Son Nguyen · 2025-04-11 12:09:39 +02:00
clip : use smart pointer (⚠️ breaking change) (#12869)
  * clip : use smart pointers
  * fix warmup
  * add forward declaration
  * missing include
  * fix include (2)
  * composite
  * simplify batch ptr
  * fix conflict
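For readers hit by the breaking change: the pattern is the standard C++ owned-handle idiom. A minimal sketch, with clip_model_load's signature declared locally as an assumption (the loader API was also in flux around this time):

    #include <memory>

    struct clip_ctx; // opaque handle from clip.h
    extern "C" clip_ctx * clip_model_load(const char * fname, int verbosity); // assumed signature
    extern "C" void       clip_free(clip_ctx * ctx);

    struct clip_ctx_deleter {
        void operator()(clip_ctx * ctx) const { clip_free(ctx); }
    };
    using clip_ctx_ptr = std::unique_ptr<clip_ctx, clip_ctx_deleter>;

    int main() {
        clip_ctx_ptr ctx(clip_model_load("mmproj.gguf", /*verbosity=*/1));
        // clip_free() now runs on every exit path; callers that previously
        // held and freed raw clip_ctx pointers are the ones broken by this.
    }
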
						 
				 
			
				
					
						
							
							
8b9cc7cdd8 · Xuan-Son Nguyen · 2025-04-10 22:57:16 +02:00
llava : introduce libmtmd (#12849)
  * wip llava2
  * migrated gemma3 to llava2
  * add timings
  * correct pre/postfix
  * fix missing include
  * fix compilation unused var warn
  * update llava2_tokenize
  * change name llava2 --> mtmd
  * improve api
  * refine helpers
  * Update examples/llava/mtmd.cpp
  Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

65a69e6e1b · Xuan-Son Nguyen · 2025-04-09 10:09:53 +02:00
clip : do not print ftype (#12832)

b32efad2bc · Matt Clayton · 2025-04-08 22:01:58 +02:00
llava: improve clip_ctx destructor to not memleak load_image_size (#12834)
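The shape of the fix, reduced to a sketch (the member name comes from the commit title; the surrounding struct is simplified and not the real clip.cpp definition):

    struct clip_image_size { int width, height; };

    struct clip_ctx {
        clip_image_size * load_image_size = nullptr; // set by callers before encoding
        ~clip_ctx() {
            delete load_image_size; // previously never freed: one leak per loaded context
        }
    };
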
						 
				 
			
				
					
						
							
							
2dabf759e7 · dm4 · 2025-04-08 15:49:13 +02:00
llava: add more helper functions to check projector types in clip context (#12824)
  Signed-off-by: dm4 <sunrisedm4@gmail.com>
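These helpers follow the existing clip_is_* predicate style in clip.h. A hedged usage sketch; clip_is_minicpmv predates this commit, and the exact helpers added here may differ:

    #include "clip.h" // examples/llava/clip.h

    // Branch on projector type instead of poking at clip internals.
    bool needs_image_slicing(const clip_ctx * ctx) {
        // minicpmv-style projectors slice large images into grids;
        // clip_is_minicpmv() returns the minicpmv version, or 0
        return clip_is_minicpmv(ctx) != 0;
    }
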
						 
				 
			
				
					
						
							
							
f1e3eb4249 · Sergey Fedorov · 2025-04-05 17:46:00 +02:00
common : fix includes in arg.cpp and gemma3-cli.cpp (#12766)
  * arg.cpp: add a missing include
  * gemma3-cli.cpp: fix cinttypes include

0364178ca2 · Xuan-Son Nguyen · 2025-04-05 17:17:40 +02:00
clip : refactor clip_init, add tests (#12757)
  * refactor clip_init
  * fix loading file
  * fix style
  * test ok
  * better test with report
  * add missing headers
  * clarify
  * add KEY_MM_PATCH_MERGE_TYPE
  * remove bool has_* pattern
  * Apply suggestions from code review
  * Update examples/llava/clip.cpp
  * use ggml_soft_max_ext
  * refactor logging system
  * add minicpm-v-o 2.6 for testing
  * use nullptr everywhere
  * fix Yi-VL model
  Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

267c1399f1 · Xuan-Son Nguyen · 2025-04-01 23:44:05 +02:00
common : refactor downloading system, handle mmproj with -hf option (#12694)
  * (wip) refactor downloading system [no ci]
  * fix all examples
  * fix mmproj with -hf
  * gemma3: update readme
  * only handle mmproj in llava example
  * fix multi-shard download
  * windows: fix problem with std::min and std::max
  * fix 2

1a85949067 · Sigbjørn Skjæret · 2025-03-31 11:28:30 +02:00
llava : proper description fix (#12668)

f52d59d771 · Sigbjørn Skjæret · 2025-03-31 11:07:07 +02:00
llava : fix clip loading GGUFs with missing description (#12660)

02082f1519 · Ivy233 · 2025-03-26 15:06:04 +01:00
clip: Fix llama-llava-clip-quantize-cli quantization error under CUDA backend (#12566)
  * [Fix] When clip-quantize-cli was built and run in a CUDA environment, ggml_fp16_to_fp32 errored out trying to access video memory; quantization has to run on the CPU backend. After the fix it always runs on the CPU backend and is no longer bound to CUDA.
  * [Fix] Roll back the signature and implementation of clip_model_load, and change the call in clip_model_quantize to clip_init.

e0dbec0bc6 · Georgi Gerganov · 2025-03-13 12:35:44 +02:00
llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)
  * llama : refactor llama_context, llama_kv_cache, llm_build_context
  * graph : don't mutate the KV cache during defrag
  * context : reduce virtuals + remove test function
  * context : move interface implementation to source file + factory
  * graph : move KV cache build functions to llama_context impl
  * graph : remove model reference from build_pooling
  * graph : remove llama_model reference
  * kv_cache : provide rope factors
  * graph : rework inputs to use only unique_ptr, remove attn input abstraction
  * context : remove llama_context_i abstraction
  * context : clean-up
  * graph : clean-up
  * llama : remove redundant keywords (struct, enum)
  * model : adapt gemma3
  * graph : restore same attention ops as on master
  * llama : remove TODO + fix indent

7841fc723e · Xuan-Son Nguyen · 2025-03-12 09:30:24 +01:00
llama : Add Gemma 3 support (+ experimental vision capability) (#12343)
  * llama : Add Gemma 3 text-only support
  * fix python coding style
  * fix compile on ubuntu
  * python: fix style
  * fix ubuntu compile
  * fix build on ubuntu (again)
  * fix ubuntu build, finally
  * clip : Experimental support for Gemma 3 vision (#12344)
  * fix build
  * PRId64

96e1280839 · Xuan-Son Nguyen · 2025-03-11 09:20:16 +01:00
clip : bring back GPU support (#12322)
  * clip : bring back GPU support
  * use n_gpu_layers param
  * fix double free
  * ggml_backend_init_by_type
  * clean up

8352cdc87b · tc-mb · 2025-03-10 10:33:24 +02:00
llava : fix bug in minicpm-v code (#11513)
  * fix bug in minicpm-v code
  * update readme of minicpm-v

e9b2f84f14 · Aaron Teo · 2025-03-06 09:33:21 +01:00
llava: add big-endian conversion for image encoder (#12218)
  Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

84d5f4bc19 · Alex Brooks · 2025-02-28 11:31:47 +00:00
Update granite vision docs for 3.2 model (#12105)
  Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

a800ae46da · Ting Lou · 2025-02-26 15:26:52 +01:00
llava : add struct for FFI bindgen (#12079)
  * add struct for FFI bindgen
  * Apply suggestions from code review
  Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>

4d1051a40f · Alex Brooks · 2025-02-25 10:46:05 +01:00
Add Doc for Converting Granite Vision -> GGUF (#12006)
  * Add example docs for granite vision
  Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

7a2c913e66 · Alex Brooks · 2025-02-24 17:09:51 +01:00
llava : Add Granite Vision Support (#11794)
  * Add super wip scripts for multimodal granite gguf
  * Add example for converting mmgranite to gguf
  * remove hardcoded path
  * Add vision feature layer to gguf params
  * Clean up llava surgery and remove name substitution hacks
  * Add transformers llava next tensor name mapping
  * Make siglip / openclip mutually exclusive
  * Fix projector linear substitution
  * Fix linear 2 substitution index
  * Increase max flattened gridpoints to 64
  * Fix hardcoded concat for multiple feature layers
  * Pull vision feature layers out of gguf keys
  * fix num gridpoints and use all layers
  * Avoid dropping last image encoder layer in llava models
  * Use 10 for max number of patches
  * Standardize vision feature layers
  * Cleanup logs
  * Update comment for vision feature layer init
  * Update notes for alternative to legacy llm conversion script
  * Fix notes rendering
  * Add v prefix to vision feature layer log
  * Use current defaults for feature layer
  * Use constant for max gridpoints / feat layers, style fixes
  * clarify non-negative feature layers
  * Remove CLIP_API from func signature
  * Use MAX_IMAGE_FEATURE_LAYERS const in layer calc
  * Clarify feature layers are non negative ints and not uint
  * Fix condition for reading feature layers
  * pop last llava layer when feature layers are unset
  * Fix unset vision layer 0
  * Update examples/llava/clip.cpp
  * Reenable assertion for out of bounds get_rows
  * Use std vector for gridpoints and feature layers
  * Calculate max feature layer at load time
  * Include base patch for granite vision allocation
  * Fix trailing whitespace
  * Add max num patches = 10 back for minicpmv
  * Use unordered set to store feature layers
  * Use max feature layer for postnorm
  * Apply suggestions from code review
  Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
  Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>

36c258ee92 · Ting Lou · 2025-02-22 15:28:28 +01:00
llava: build clip image from pixels (#11999)
  * llava: export function `clip_build_img_from_pixels` to build image from pixels decoded by other libraries instead of stb_image.h for better performance
  * Apply suggestions from code review
  Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
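A hedged sketch of the exported entry point named above; the parameter list (RGB bytes plus dimensions) is an assumption based on the commit message, with the declarations inlined so the sketch stands alone:

    #include <vector>

    struct clip_image_u8; // opaque image type from clip.h
    extern "C" clip_image_u8 * clip_image_u8_init();
    extern "C" bool clip_build_img_from_pixels(const unsigned char * rgb_pixels,
                                               int nx, int ny, clip_image_u8 * img); // assumed signature

    int main() {
        // pixels decoded by another library (libjpeg-turbo, OpenCV, ...),
        // skipping the stb_image.h decode path entirely
        std::vector<unsigned char> rgb(224 * 224 * 3, 0);
        clip_image_u8 * img = clip_image_u8_init();
        clip_build_img_from_pixels(rgb.data(), 224, 224, img);
        // img can now go through the usual clip preprocessing
    }
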
						 
				 
			
				
					
						
							
							
ee02ad02c5 · Alex Brooks · 2025-02-21 08:11:03 +02:00
clip : fix visual encoders with no CLS (#11982)
  Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

68ff663a04 · Georgi Gerganov · 2025-02-15 16:40:57 +02:00
repo : update links to new url (#11886)
  * repo : update links to new url
  * cont : more urls

1ec208083c · SAMI · 2025-02-05 10:45:40 +03:00
llava: add quantization for the visual projector LLAVA, Qwen2VL (#11644)
  * Added quantization for visual projector
  * Added README
  * Fixed the clip quantize implementation in the file
  * Fixed the gcc warning regarding minor linting
  * Removed trailing whitespace

0cec062a63 · piDack · 2025-02-02 09:48:46 +02:00
llama : add support for GLM-Edge and GLM-Edge-V series models (#10573)
  * add glm edge chat model
  * use config partial_rotary_factor as rope ratio
  * support for glm edge model
  * vision model support
  * remove debug info
  * fix format
  * llava.cpp trailing whitespace
  * remove unused AutoTokenizer
  * Update src/llama.cpp for not contain <|end|> or </s>
  * add edge template
  * fix chat template
  * fix conflict
  * fix ci err
  * fix format err
  * fix template err
  * 9b hf chat support
  * format
  * format clip.cpp
  * fix format
  * Apply suggestions from code review
  * Update examples/llava/clip.cpp
  * minor : style
  Co-authored-by: liyuhang <yuhang.li@zhipuai.cn>
  Co-authored-by: piDack <pcdack@hotmail.co>
  Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
  Co-authored-by: liyuhang <yuhang.li@aminer.cn>
  Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

3e3357fd77 · tc-mb · 2025-01-22 09:35:48 +02:00
llava : support Minicpm-omni (#11289)
  * init
  * add readme
  * update readme
  * no use make
  * update readme
  * update fix code
  * fix editorconfig-checker
  * no change convert py
  * use clip_image_u8_free

afa8a9ec9b · Georgi Gerganov · 2025-01-12 11:32:42 +02:00
llama : add llama_vocab, functions -> methods, naming (#11110)
  * llama : functions -> methods (#11110)
  * llama : add struct llama_vocab to the API (#11156)
  * hparams : move vocab params to llama_vocab (#11159)
  * vocab : more pimpl (#11165)
  * vocab : minor tokenization optimizations (#11160)
  * lora : update API names (#11167)
  * llama : update API names to use correct prefix (#11174)
  * vocab : llama_vocab_add_[be]os -> llama_vocab_get_add_[be]os (#11174)
  * vocab : llama_vocab_n_vocab -> llama_vocab_n_tokens (#11174)
  Co-authored-by: Diego Devesa <slarengh@gmail.com>
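The net effect for API consumers: vocab queries move from the model to a llama_vocab handle, with llama_vocab_* names. A minimal sketch against the post-refactor llama.h:

    #include "llama.h"
    #include <cstdio>

    int main() {
        llama_model * model = llama_model_load_from_file("model.gguf",
                                                         llama_model_default_params());
        if (!model) return 1;

        // vocab queries now go through a llama_vocab handle
        const llama_vocab * vocab = llama_model_get_vocab(model);
        printf("n_tokens: %d\n", llama_vocab_n_tokens(vocab));    // was llama_n_vocab(model)
        printf("add_bos:  %d\n", llama_vocab_get_add_bos(vocab)); // was llama_add_bos_token(...)

        llama_model_free(model);
        return 0;
    }
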
						 
				 
			
				
					
						
							
							
53ff6b9b9f · Johannes Gäßler · 2025-01-07 18:01:58 +01:00
GGUF: C++ refactor, backend support, misc fixes (#11030)
  * GGUF: C++ refactor, backend support, misc fixes
  * remove ggml_tensor.backend
  * update CODEOWNERS [no ci]
  * remove gguf_get_data from API
  * revise GGUF API data types

47182dd03f · Georgi Gerganov · 2025-01-06 10:55:18 +02:00
llama : update llama_model API names (#11063)
  * llama : deprecate llama_free_model, add llama_model_free
  * llama : change `llama_load_model_from_file` -> `llama_model_load_from_file`
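A migration sketch for the two renames, assuming the post-rename llama.h:

    #include "llama.h"

    int main() {
        // old: llama_load_model_from_file(path, params), kept as a deprecated alias
        llama_model * model = llama_model_load_from_file("model.gguf",
                                                         llama_model_default_params());
        if (!model) return 1;

        // old: llama_free_model(model)
        llama_model_free(model);
        return 0;
    }
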
						 
				 
			
				
					
						
							
							
d408bb9268 · Georgi Gerganov · 2024-12-19 18:47:15 +02:00
clip : disable GPU support (#10896)

0bf2d10c55 · Georgi Gerganov · 2024-12-18 19:27:21 +02:00
tts : add OuteTTS support (#10784)
  * server : add "tokens" output
  * server : output embeddings for all tokens when pooling = none
  * server : be explicit about the pooling type in the tests
  * server : do not normalize embeddings when there is no pooling
  * llama : add OuteTTS support (wip)
  * wip
  * extract features
  * first conv
  * group norm
  * resnet conv
  * resnet
  * attn
  * pos net
  * layer norm
  * convnext
  * head
  * hann window
  * fix n_embd + remove llama.cpp hacks
  * compute hann window
  * fft
  * spectrum processing
  * clean-up
  * tts : receive input text and generate codes
  * clip : fix new conv name
  * tts : minor fix
  * tts : add header + minor fixes
  * tts : add mathematical constant
  * tts : fix sampling + cut initial noise
  * tts : fixes
  * tts : update default samplers
  * tts : text pre-processing
  * tts : outetts-voc -> wavtokenizer-dec
  * tts : remove hardcoded constants
  * tts : fix tensor shapes
  * llama : refactor wavtokenizer tensors
  * llama : update WavTokenizer to non-causal attn
  * llama : handle no-vocab detokenization
  * tts : add Python example for OuteTTS (wip)
  * tts : extend python example to generate spectrogram
  * server : fix rebase artifacts
  * tts : enable "return_tokens" in Python example
  * tts : minor fixes
  * common : support HF download for vocoder

4ddd199f6f · Bartowski · 2024-12-15 21:43:25 +01:00
llava : Allow locally downloaded models for QwenVL (#10833)
  * Allow locally downloaded models for QwenVL
  * Define model_path
  * rm trailing space
  Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>

ba1cb19cdd · HimariO · 2024-12-14 14:43:46 +02:00
llama : add Qwen2VL support + multimodal RoPE (#10361)
  * Barebone Qwen2VL LLM convertor
  * Add Qwen2VL cli entrypoint
  * [WIP] add qwen2vl arch
  * Verify m-rope output
  * Add vl-rope/2d-rope support for qwen2vl ViT
  * update qwen2vl cli tool
  * update 5D tensor op workaround
  * [WIP] qwen2vl vision model
  * make batch and clip utils compatible with qwen2vl
  * [WIP] create inference workflow, gguf convert script bug fix
  * correcting vision-rope behavior, add the missing last layer back to ViT
  * add arg parser to qwen2vl_surgery
  * replace variable size array with vector
  * cuda-gdb cmake preset
  * add fp32 mrope, vision rope kernel
  * add fp16 support for qwen2vl and m-rope
  * add `GGML_ROPE_TYPE_MROPE`, `GGML_ROPE_TYPE_VISION`
  * fix rope op mode switching, outdated func args
  * update `llama_hparams`
  * update to keep up with upstream changes
  * resolve linter, test errors
  * add makefile entry, update special image padding token
  * add mrope unit test, fix a few compiler warnings
  * rename `mrope` related functions, params
  * minor updates on debug util, bug fixes
  * add `m-rope` testcase to `test-backend-ops`
  * Apply suggestions from code review
  * fix trailing whitespace
  * store `llama_hparams.rope_sections` with fixed size array
  * update position id tensor size check in GGML_OP_ROPE
  * minor updates
  * update `ggml_backend_*_supports_op` of unsupported backends
  * remove old `rope_section` compare operator
  Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
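The two rope-mode flags named above live in ggml.h; a small sketch of how a caller selects between them (the surrounding plumbing is illustrative only):

    #include "ggml.h"

    // m-rope splits the position into sections (e.g. temporal/height/width);
    // GGML_ROPE_TYPE_VISION is the 2D variant used by the Qwen2VL ViT.
    int pick_rope_type(bool is_vision_tower) {
        return is_vision_tower ? GGML_ROPE_TYPE_VISION : GGML_ROPE_TYPE_MROPE;
    }
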
						 
				 
			
				
					
						
							
							
01e6d9bb71 · piDack · 2024-12-04 01:26:37 +01:00
clip : add sycl support (#10574)
  Co-authored-by: piDack <pcdack@hotmail.co>

7cc2d2c889 · Diego Devesa · 2024-11-29 21:54:58 +01:00
ggml : move AMX to the CPU backend (#10570)
  Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

678d7994f4 · Ting Lou · 2024-11-29 01:09:46 +01:00
llava: return false instead of exit (#10546)

d9d54e498d · Georgi Gerganov · 2024-11-25 09:58:41 +02:00
speculative : refactor and add a simpler example (#10362)
  * speculative : refactor and add a simpler example
  * speculative : clean-up and add comments and TODOs [no ci]
  * speculative : manage context in common_speculative
  * speculative : simplify
  * speculative : simplify (cont)
  * speculative : add --draft-min CLI arg
  * speculative : minor fixup
  * make : build fixes
  * speculative : do not redraft previous drafts
  * speculative : fix the draft sampling
  * speculative : fix compile warning
  * common : refactor args
  * common : change defaults [no ci]
  * common : final touches

9f40989351 · Diego Devesa · 2024-11-03 19:34:08 +01:00
ggml : move CPU backend to a separate file (#10144)

cda0e4b648 · Xuan Son Nguyen · 2024-10-18 23:18:01 +02:00
llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch (#9745)
  * refactor llama_batch_get_one
  * adapt all examples
  * fix simple.cpp
  * fix llama_bench
  * fix
  * fix context shifting
  * free batch before return
  * use common_batch_add, reuse llama_batch in loop
  * null terminated seq_id list
  * fix save-load-state example
  * fix perplexity
  * correct token pos in llama_batch_allocr
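What the refactor means for callers, sketched with the common.h helper mentioned in the bullets (names as in today's tree; error handling elided):

    #include "common.h"
    #include "llama.h"
    #include <vector>

    void feed_tokens(llama_context * ctx, const std::vector<llama_token> & tokens) {
        llama_batch batch = llama_batch_init(/*n_tokens=*/512, /*embd=*/0, /*n_seq_max=*/1);
        common_batch_clear(batch);
        for (size_t i = 0; i < tokens.size(); ++i) {
            // an explicit position and seq_id per token replace the removed
            // all_pos_0 / all_pos_1 / all_seq_id batch-wide fields
            common_batch_add(batch, tokens[i], (llama_pos) i, { 0 }, /*logits=*/false);
        }
        batch.logits[batch.n_tokens - 1] = true; // request logits for the last token only
        llama_decode(ctx, batch);
        llama_batch_free(batch);
    }
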
						 
				 
			
				
					
						
							
							
dbf18e4de9 · Daniel Bevenius · 2024-10-16 20:24:05 +03:00
llava : fix typo in error message [no ci] (#9884)

7eee341bee · Diego Devesa · 2024-10-10 22:57:42 +02:00
common : use common_ prefix for common library functions (#9805)
  Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
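A few of the renames as they surface in common.h afterwards (signatures per today's tree; treat this as a sketch, not an exhaustive list):

    #include "common.h"

    void example(llama_context * ctx) {
        common_params params; // was: gpt_params

        // common helpers gain the common_ prefix:
        std::vector<llama_token> toks = common_tokenize(ctx, "hello", /*add_special=*/true);
        std::string piece = common_token_to_piece(ctx, toks.front());

        (void) params; (void) piece;
    }
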
						 
				 
			
				
					
						
							
							
cad341d889 · Georgi Gerganov · 2024-10-01 16:00:25 +03:00
metal : reduce command encoding overhead (#9698)
  * metal : reduce command encoding overhead
  * metal : add comments

511636df0c · compilade · 2024-09-30 14:13:16 -04:00
ci : reduce severity of unused Pyright ignore comments (#9697)

6262d13e0b · Georgi Gerganov · 2024-09-15 20:46:12 +03:00
common : reimplement logging (#9418)
  https://github.com/ggerganov/llama.cpp/pull/9418
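The visible surface of the rewrite is the printf-style macro family in common/log.h; a hedged sketch of typical use:

    #include "log.h" // common/log.h

    int main() {
        LOG_INF("loading model '%s'\n", "model.gguf");
        LOG_WRN("n_ctx (%d) is smaller than recommended\n", 512);
        LOG_ERR("failed to open file\n");
        LOG_DBG("only shown at higher verbosity\n");
        return 0;
    }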