Tatsuya Tanaka 
							
						 
					 
					
						
						
							
						
						ceda28ef8e 
					 
					
						
						
							
							llava : remove duplicate include (#13207)
						
						
						
						
					 
					
						2025-04-30 15:25:20 +02:00 
						 
				 
			
				
					
						
							
							
								HimariO 
							
						 
					 
					
						
						
							
						
						ca2bb89eac 
					 
					
						
						
							
							clip : Add Qwen2.5VL support (#12402)
						
						
						
						
* implement vision model architecture, gguf converter
* handle window attention inputs
* add debug utils
* fix a few incorrect tensor memory layouts
* move position id remap out of ggml to avoid int32 cuda operations
* cleaning up
* ignore transformers Qwen2_5_xxx type check
* remove rarely used `qwen2vl-cli` debug functions
* remove commented-out code blocks
* fix attn weight scaling after rebase
* add `PROJECTOR_TYPE_QWEN2_5_VL`
* remove `KEY_USE_GLU_MLP`, `KEY_USE_RMS_NORM`
* replace `KEY_FULLATTN_BLK_IDX` with `KEY_WIN_ATTN_PATTERN`
* remove `attn_window_size` from gguf
* fix model conversion
* clean up
* fix merging problem
* add test
---------
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
						
						
					 
					
						2025-04-27 10:10:34 +02:00 
						 
				 
			
				
					
						
							
							
								Xuan-Son Nguyen 
							
						 
					 
					
						
						
							
						
						4753791e70 
					 
					
						
						
							
							clip : improve projector naming (#13118)
						
						
						
						
						* clip : improve projector naming
* no more kv has_llava_projector
* rm unused kv
* rm more unused 
						
						
					 
					
						2025-04-26 22:39:47 +02:00 
						 
				 
			
				
					
						
							
							
								Xuan-Son Nguyen 
							
						 
					 
					
						
						
							
						
						13be08daf9 
					 
					
						
						
							
							clip : remove boi/eoi embeddings for GLM-edge model (#13081)
						
						
						
						
					 
					
						2025-04-24 22:17:04 +02:00 
						 
				 
			
				
					
						
							
							
								Xuan-Son Nguyen 
							
						 
					 
					
						
						
							
						
						ecda2ec4b3 
					 
					
						
						
							
							mtmd : Support Pixtral 12B (#13065)
						
						
						
						
						* add pixtral text model (vision is wip)
* cgraph ok, just missing 2D RoPE
* fix bad rebase
* first working version
* fix problem with img_break token
* support dynamic image size
* update docs
* update test script 
						
						
					 
					
						2025-04-23 20:21:59 +02:00 
						 
				 
			
				
					
						
							
							
								Xuan-Son Nguyen 
							
						 
					 
					
						
						
							
						
						dc39a5e7a8 
					 
					
						
						
							
							mtmd : support SmolVLM (version 1 and 2) (#13050)
						
						
						
						
						* mtmd : support SmolVLM (version 1 and 2)
* correct chat template
* fix n_patches
* scale_factor is an int
* add more models to test 
						
						
					 
					
						2025-04-22 16:24:54 +02:00 
						 
				 
			
				
					
						
							
							
								Xuan-Son Nguyen 
							
						 
					 
					
						
						
							
						
						2016f07bd1 
					 
					
						
						
							
							convert : experimental support for --mmproj flag (#13023)
						
						
						
						
						* convert : experimental support for `--mmproj` flag
* fix bad ctrl+f replace
* fix style
* split into subclasses TextModel and VisionModel
* rename Model --> ModelBase
* small fix
* correct CLIP_VISION arch name (because existing GGUF already use it)
* Apply suggestions from code review
Co-authored-by: compilade <git@compilade.net>
* fix Mistral3Model
* fix typo
Co-authored-by: compilade <git@compilade.net>
---------
Co-authored-by: compilade <git@compilade.net>
						
						
					 
					
						2025-04-20 23:29:36 +02:00 
						 
				 
			
				
					
						
							
							
								Xuan-Son Nguyen 
							
						 
					 
					
						
						
							
						
						0c50923944 
					 
					
						
						
							
							clip : use smart pointer (⚠️ breaking change) (#12869)
						
						
						
						
						* clip : use smart pointers
* fix warmup
* add forward declaration
* missing include
* fix include (2)
* composite
* simplify batch ptr
* fix conflict 
						
						
					 
					
						2025-04-11 12:09:39 +02:00 
						 
				 
			
				
					
						
							
							
								Xuan-Son Nguyen 
							
						 
					 
					
						
						
							
						
						8b9cc7cdd8 
					 
					
						
						
							
							llava : introduce libmtmd (#12849)
						
						
						
						
						* wip llava2
* migrated gemma3 to llava2
* add timings
* correct pre/postfix
* fix missing include
* fix compilation unused var warn
* update llava2_tokenize
* change name llava2 --> mtmd
* improve api
* refine helpers
* Update examples/llava/mtmd.cpp
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
						
						
					 
					
						2025-04-10 22:57:16 +02:00 
						 
				 
			
				
					
						
							
							
								Xuan-Son Nguyen 
							
						 
					 
					
						
						
							
						
						0364178ca2 
					 
					
						
						
							
							clip : refactor clip_init, add tests (#12757)
						
						
						
						
						* refactor clip_init
* fix loading file
* fix style
* test ok
* better test with report
* add missing headers
* clarify
* add KEY_MM_PATCH_MERGE_TYPE
* remove bool has_* pattern
* Apply suggestions from code review
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Update examples/llava/clip.cpp
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* use ggml_soft_max_ext
* refactor logging system
* add minicpm-v-o 2.6 for testing
* use nullptr everywhere
* fix Yi-VL model
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
						
						
					 
					
						2025-04-05 17:17:40 +02:00