Xuan-Son Nguyen 
							
						 
					 
					
						
						
							
						
						eaea325324 
					 
					
						
						
							
							clip : fix model size display ( #13153 )  
						
						
						
						
					 
					
						2025-04-28 21:23:19 +02:00 
						 
				 
			
				
					
						
							
							
								Xuan-Son Nguyen 
							
						 
					 
					
						
						
							
						
						5fa9e63be8 
					 
					
						
						
							
							clip : refactor set input for cgraph + fix qwen2.5vl input ( #13136 )  
						
						
						
						
						* clip : refactor set input for cgraph
* more strict assert
* minicpmv : use clip_n_mmproj_embd instead of copying the same code everywhere
* split qwen2 and qwen2.5 code blocks
* minor style fix 
						
						
					 
					
						2025-04-28 12:18:59 +02:00 
						 
				 
			
				
					
						
							
							
								LostRuins Concedo 
							
						 
					 
					
						
						
							
						
						59e991c23c 
					 
					
						
						
							
							Fixes Qwen2.5VL segfault during inference with  https://github.com/ggml-org/llama.cpp/pull/12402  as has_qwen2vl_merger migration was incomplete ( #13133 )  
						
						
						
						
					 
					
						2025-04-27 12:43:37 +02:00 
						 
				 
			
				
					
						
							
							
								HimariO 
							
						 
					 
					
						
						
							
						
						ca2bb89eac 
					 
					
						
						
							
							clip : Add Qwen2.5VL support ( #12402 )  
						
						
						
						
						* implement vision model architecture, gguf converter
* handle window attention inputs
* add debug utils
* fix few incorrect tensor memory layout
* move position id remap out of ggml to avoid int32 cuda operations
* cleaning up
* ignore transformers Qwen2_5_xxx type check
* remove not so often use `qwen2vl-cli` debug functions
* remove commented-out code blocks
* fix attn weight scaling after rebase
* add `PROJECTOR_TYPE_QWEN2_5_VL`
* remove `KEY_USE_GLU_MLP`, `KEY_USE_RMS_NORM`
* replace `KEY_FULLATTN_BLK_IDX` with `KEY_WIN_ATTN_PATTERN`
* remove `attn_window_size` from gguf
* fix model conversion
* clean up
* fix merging problem
* add test
---------
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
						
						
					 
					
						2025-04-27 10:10:34 +02:00 
						 
				 
			
				
					
						
							
							
								Xuan-Son Nguyen 
							
						 
					 
					
						
						
							
						
						4753791e70 
					 
					
						
						
							
							clip : improve projector naming ( #13118 )  
						
						
						
						
						* clip : improve projector naming
* no more kv has_llava_projector
* rm unused kv
* rm more unused 
						
						
					 
					
						2025-04-26 22:39:47 +02:00 
						 
				 
			
				
					
						
							
							
								Xuan-Son Nguyen 
							
						 
					 
					
						
						
							
						
						edb18b6e8f 
					 
					
						
						
							
							clip : fix pixtral on some GPU backends ( #13097 )  
						
						
						
						
						* clip : fix pixtral on some GPU backends
* refactor inp_raw set
* rm outdated comment
* fix dynamic size
* add TODO 
						
						
					 
					
						2025-04-25 14:31:42 +02:00 
						 
				 
			
				
					
						
							
							
								Xuan-Son Nguyen 
							
						 
					 
					
						
						
							
						
						13be08daf9 
					 
					
						
						
							
							clip : remove boi/eoi embeddings for GLM-edge model ( #13081 )  
						
						
						
						
					 
					
						2025-04-24 22:17:04 +02:00 
						 
				 
			
				
					
						
							
							
								Xuan-Son Nguyen 
							
						 
					 
					
						
						
							
						
						ecda2ec4b3 
					 
					
						
						
							
							mtmd : Support Pixtral 12B ( #13065 )  
						
						
						
						
						* add pixtral text model (vision is wip)
* cgraph ok, just missing 2D RoPE
* fix bad rebase
* first working version
* fix problem with img_break token
* support dynamic image size
* update docs
* update test script 
						
						
					 
					
						2025-04-23 20:21:59 +02:00 
						 
				 
			
				
					
						
							
							
								Xuan-Son Nguyen 
							
						 
					 
					
						
						
							
						
						dc39a5e7a8 
					 
					
						
						
							
							mtmd : support SmolVLM (version 1 and 2) ( #13050 )  
						
						
						
						
						* mtmd : support SmolVLM (version 1 and 2)
* correct chat template
* fix n_patches
* scale_factor is an int
* add more models to test 
						
						
					 
					
						2025-04-22 16:24:54 +02:00 
						 
				 
			
				
					
						
							
							
								Xuan-Son Nguyen 
							
						 
					 
					
						
						
							
						
						37b9f0d29d 
					 
					
						
						
							
							clip : refactor, add image_manipulation and llava_uhd classes ( #13011 )  
						
						
						
						
						* clip : refactor, add `image_manipulation` and `llava_uhd`
* refactor llava-1.6 preprocessing
* simplify logic for llava-1.5
* missing include 
						
						
					 
					
						2025-04-19 09:15:45 +02:00 
						 
				 
			
				
					
						
							
							
								Matt Clayton 
							
						 
					 
					
						
						
							
						
						e59ea539b8 
					 
					
						
						
							
							llava: Fix cpu-only clip image encoding segfault ( #12907 )  
						
						
						
						
						* llava: Fix cpu-only clip image encoding
* clip : no smart ptr for ggml_backend_t
* Fix for backend_ptr push_back
---------
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
						
						
					 
					
						2025-04-12 07:29:03 +02:00 
						 
				 
			
				
					
						
							
							
								Xuan-Son Nguyen 
							
						 
					 
					
						
						
							
						
						0c50923944 
					 
					
						
						
							
							clip : use smart pointer ( ⚠️  breaking change) ( #12869 )  
						
						
						
						
						* clip : use smart pointers
* fix warmup
* add forward declaration
* missing include
* fix include (2)
* composite
* simplify batch ptr
* fix conflict 
						
						
					 
					
						2025-04-11 12:09:39 +02:00 
						 
				 
			
				
					
						
							
							
								Xuan-Son Nguyen 
							
						 
					 
					
						
						
							
						
						8b9cc7cdd8 
					 
					
						
						
							
							llava : introduce libmtmd ( #12849 )  
						
						
						
						
						* wip llava2
* migrated gemma3 to llava2
* add timings
* correct pre/postfix
* fix missing include
* fix compilation unused var warn
* update llava2_tokenize
* change name llava2 --> mtmd
* improve api
* refine helpers
* Update examples/llava/mtmd.cpp
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
						
						
					 
					
						2025-04-10 22:57:16 +02:00 
						 
				 
			
				
					
						
							
							
								Xuan-Son Nguyen 
							
						 
					 
					
						
						
							
						
						65a69e6e1b 
					 
					
						
						
							
							clip : do not print ftype ( #12832 )  
						
						
						
						
					 
					
						2025-04-09 10:09:53 +02:00 
						 
				 
			
				
					
						
							
							
								Matt Clayton 
							
						 
					 
					
						
						
							
						
						b32efad2bc 
					 
					
						
						
							
							llava: improve clip_ctx destructor to not memleak load_image_size ( #12834 )  
						
						
						
						
					 
					
						2025-04-08 22:01:58 +02:00 
						 
				 
			
				
					
						
							
							
								dm4 
							
						 
					 
					
						
						
							
						
						2dabf759e7 
					 
					
						
						
							
							llava: add more helper functions to check projector types in clip context ( #12824 )  
						
						
						
						
						Signed-off-by: dm4 <sunrisedm4@gmail.com>
						
						
					 
					
						2025-04-08 15:49:13 +02:00 
						 
				 
			
				
					
						
							
							
								Xuan-Son Nguyen 
							
						 
					 
					
						
						
							
						
						0364178ca2 
					 
					
						
						
							
							clip : refactor clip_init, add tests ( #12757 )  
						
						
						
						
						* refactor clip_init
* fix loading file
* fix style
* test ok
* better test with report
* add missing headers
* clarify
* add KEY_MM_PATCH_MERGE_TYPE
* remove bool has_* pattern
* Apply suggestions from code review
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Update examples/llava/clip.cpp
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* use ggml_soft_max_ext
* refactor logging system
* add minicpm-v-o 2.6 for testing
* use nullptr everywhere
* fix Yi-VL model
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
						
						
					 
					
						2025-04-05 17:17:40 +02:00 
						 
				 
			
				
					
						
							
							
								Sigbjørn Skjæret 
							
						 
					 
					
						
						
							
						
						1a85949067 
					 
					
						
						
							
							llava : proper description fix ( #12668 )  
						
						
						
						
					 
					
						2025-03-31 11:28:30 +02:00 
						 
				 
			
				
					
						
							
							
								Sigbjørn Skjæret 
							
						 
					 
					
						
						
							
						
						f52d59d771 
					 
					
						
						
							
							llava : fix clip loading GGUFs with missing description ( #12660 )  
						
						
						
						
					 
					
						2025-03-31 11:07:07 +02:00 
						 
				 
			
				
					
						
							
							
								Ivy233 
							
						 
					 
					
						
						
							
						
						02082f1519 
					 
					
						
						
							
							clip: Fix llama-llava-clip-quantize-cli quantization error under CUDA backend ( #12566 )  
						
						
						
						
						* [Fix] Running clip-quantize-cli in a CUDA environment caused ggml_fp16_to_fp32 to fail when it tried to access video memory, so quantization has to run on the CPU backend.
After the fix, quantization automatically runs on the CPU backend and is no longer bound to CUDA.
* [Fix] Roll back the signature and implementation of clip_model_load, and change the call in clip_model_quantize to clip_init.
						
						
					 
					
						2025-03-26 15:06:04 +01:00 
						 
				 
			
				
					
						
							
							
								Xuan-Son Nguyen 
							
						 
					 
					
						
						
							
						
						7841fc723e 
					 
					
						
						
							
							llama : Add Gemma 3 support (+ experimental vision capability) ( #12343 )  
						
						
						
						
						* llama : Add Gemma 3 text-only support
* fix python coding style
* fix compile on ubuntu
* python: fix style
* fix ubuntu compile
* fix build on ubuntu (again)
* fix ubuntu build, finally
* clip : Experimental support for Gemma 3 vision (#12344)
* clip : Experimental support for Gemma 3 vision
* fix build
* PRId64 
						
						
					 
					
						2025-03-12 09:30:24 +01:00 
						 
				 
			
				
					
						
							
							
								Xuan-Son Nguyen 
							
						 
					 
					
						
						
							
						
						96e1280839 
					 
					
						
						
							
							clip : bring back GPU support ( #12322 )  
						
						
						
						
						* clip : bring back GPU support
* use n_gpu_layers param
* fix double free
* ggml_backend_init_by_type
* clean up 
						
						
					 
					
						2025-03-11 09:20:16 +01:00 
						 
				 
			
				
					
						
							
							
								tc-mb 
							
						 
					 
					
						
						
							
						
						8352cdc87b 
					 
					
						
						
							
							llava : fix bug in minicpm-v code ( #11513 )  
						
						
						
						
						* fix bug in minicpm-v code
* update readme of minicpm-v 
						
						
					 
					
						2025-03-10 10:33:24 +02:00 
						 
				 
			
				
					
						
							
							
								Alex Brooks 
							
						 
					 
					
						
						
							
						
						7a2c913e66 
					 
					
						
						
							
							llava : Add Granite Vision Support ( #11794 )  
						
						
						
						
						* Add super wip scripts for multimodal granite gguf
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Add example for converting mmgranite to gguf
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* remove hardcoded path
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Add vision feature layer to gguf params
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Clean up llava surgery and remove name substitution hacks
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Add transformers llava next tensor name mapping
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Make siglip / openclip mutually exclusive
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Fix projector linear substitution
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Fix linear 2 substitution index
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Increase max flattened gridpoints to 64
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Fix hardcoded concat for multiple feature layers
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Pull vision feature layers out of gguf keys
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* fix num gridpoints and use all layers
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Avoid dropping last image encoder layer in llava models
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Use 10 for max number of patches
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Standardize vision feature layers
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Cleanup logs
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Update comment for vision feature layer init
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Update notes for alternative to legacy llm conversion script
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Fix notes rendering
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Add v prefix to vision feature layer log
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Use current defaults for feature layer
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Use constant for max gridpoints / feat layers, style fixes
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* clarify non-negative feature layers
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Remove CLIP_API from func signature
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* USE MAX_IMAGE_FEATURE_LAYERS const in layer calc
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Clarify feature layers are non negative ints and not uint
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Fix condition for reading feature layers
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* pop last llava layer when feature layers are unset
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Fix unset vision layer 0
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Update examples/llava/clip.cpp
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
* Reenable assertion for out of bounds get_rows
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Use std vector for gridpoints and feature layers
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Calculate max feature layer at load time
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Include base patch for granite vision allocation
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Fix trailing whitespace
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Add max num patches = 10 back for minicpmv
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Use unordered set to store feature layers
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Use max feature layer for postnorm
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
* Apply suggestions from code review
---------
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
						
						
					 
					
						2025-02-24 17:09:51 +01:00 
						 
				 
			
				
					
						
							
							
								Ting Lou 
							
						 
					 
					
						
						
							
						
						36c258ee92 
					 
					
						
						
							
							llava: build clip image from pixels ( #11999 )  
						
						
						
						
						* llava: export function `clip_build_img_from_pixels` to build image from pixels decoded by other libraries instead of stb_image.h for better performance
* Apply suggestions from code review
---------
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
						
						
					 
					
						2025-02-22 15:28:28 +01:00 
						 
				 
			
				
					
						
							
							
								Alex Brooks 
							
						 
					 
					
						
						
							
						
						ee02ad02c5 
					 
					
						
						
							
							clip : fix visual encoders with no CLS ( #11982 )  
						
						
						
						
						Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
						
						
					 
					
						2025-02-21 08:11:03 +02:00 
						 
				 
			
				
					
						
							
							
								SAMI 
							
						 
					 
					
						
						
							
						
						1ec208083c 
					 
					
						
						
							
							llava: add quantization for the visual projector LLAVA, Qwen2VL ( #11644 )  
						
						
						
						
						* Added quantization for visual projector
* Added README
* Fixed the clip quantize implementation in the file
* Fixed the gcc warning regarding minor linting
* Removed trailing whitespace 
						
						
					 
					
						2025-02-05 10:45:40 +03:00 
						 
				 
			
				
					
						
							
							
								piDack 
							
						 
					 
					
						
						
							
						
						0cec062a63 
					 
					
						
						
							
							llama : add support for GLM-Edge and GLM-Edge-V series models ( #10573 )  
						
						
						
						
						* add glm edge chat model
* use config partial_rotary_factor as rope ratio
* support for glm edge model
* vision model support
* remove debug info
* fix format
* llava.cpp trailing whitespace
* remove unused AutoTokenizer
* Update src/llama.cpp for not contain <|end|> or </s>
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
* add edge template
* fix chat template
* fix conflict
* fix conflict
* fix ci err
* fix format err
* fix template err
* 9b hf chat support
* format
* format clip.cpp
* fix format
* Apply suggestions from code review
* Apply suggestions from code review
* Update examples/llava/clip.cpp
* fix format
* minor : style
---------
Co-authored-by: liyuhang <yuhang.li@zhipuai.cn>
Co-authored-by: piDack <pcdack@hotmail.co>
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
Co-authored-by: liyuhang <yuhang.li@aminer.cn>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
						
						
					 
					
						2025-02-02 09:48:46 +02:00 
						 
				 
			
				
					
						
							
							
								tc-mb 
							
						 
					 
					
						
						
							
						
						3e3357fd77 
					 
					
						
						
							
							llava : support Minicpm-omni ( #11289 )  
						
						
						
						
						* init
* add readme
* update readme
* no use make
* update readme
* update fix code
* fix editorconfig-checker
* no change convert py
* use clip_image_u8_free 
						
						
					 
					
						2025-01-22 09:35:48 +02:00 
						 
				 
			
				
					
						
							
							
								Johannes Gäßler 
							
						 
					 
					
						
						
							
						
						53ff6b9b9f 
					 
					
						
						
							
							GGUF: C++ refactor, backend support, misc fixes ( #11030 )  
						
						
						
						
						* GGUF: C++ refactor, backend support, misc fixes
remove ggml_tensor.backend
update CODEOWNERS [no ci]
remove gguf_get_data from API
revise GGUF API data types 
						
						
					 
					
						2025-01-07 18:01:58 +01:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						d408bb9268 
					 
					
						
						
							
							clip : disable GPU support ( #10896 )  
						
						
						
						
						ggml-ci 
						
						
					 
					
						2024-12-19 18:47:15 +02:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						0bf2d10c55 
					 
					
						
						
							
							tts : add OuteTTS support ( #10784 )  
						
						
						
						
						* server : add "tokens" output
ggml-ci
* server : output embeddings for all tokens when pooling = none
ggml-ci
* server : be explicit about the pooling type in the tests
ggml-ci
* server : do not normalize embeddings when there is no pooling
ggml-ci
* llama : add OuteTTS support (wip)
* wip
* extract features
* first conv
* group norm
* resnet conv
* resnet
* attn
* pos net
* layer norm
* convnext
* head
* hann window
* fix n_embd + remove llama.cpp hacks
* compute hann window
* fft
* spectrum processing
* clean-up
* tts : receive input text and generate codes
* clip : fix new conv name
* tts : minor fix
* tts : add header + minor fixes
ggml-ci
* tts : add mathematical constant
ggml-ci
* tts : fix sampling + cut initial noise
* tts : fixes
* tts : update default samplers
ggml-ci
* tts : text pre-processing
* tts : outetts-voc -> wavtokenizer-dec
* tts : remove hardcoded constants
ggml-ci
* tts : fix tensor shapes
* llama : refactor wavtokenizer tensors
ggml-ci
* cont
ggml-ci
* cont [no ci]
* llama : update WavTokenizer to non-causal attn
* llama : handle no-vocab detokenization
* tts : add Python example for OuteTTS (wip)
* tts : extend python example to generate spectrogram
ggml-ci
* server : fix rebase artifacts
* tts : enable "return_tokens" in Python example
ggml-ci
* tts : minor fixes
* common : support HF download for vocoder 
						
						
					 
					
						2024-12-18 19:27:21 +02:00 
						 
				 
			
				
					
						
							
							
								HimariO 
							
						 
					 
					
						
						
							
						
						ba1cb19cdd 
					 
					
						
						
							
							llama : add Qwen2VL support + multimodal RoPE ( #10361 )  
						
						
						
						
						* Barebone Qwen2VL LLM convertor
* Add Qwen2VL cli entrypoint
* [WIP] add qwen2vl arch
* Verify m-rope output
* Add vl-rope/2d-rope support for qwen2vl ViT
* update qwen2vl cli tool
* update 5D tensor op workaround
* [WIP] qwen2vl vision model
* make batch and clip utils compatible with qwen2vl
* [WIP] create inference workflow, gguf convert script but fix
* correcting vision-rope behavior, add the missing last layer back to ViT
* add arg parser to qwen2vl_surgery
* replace variable size array with vector
* cuda-gdb cmake preset
* add fp32 mrope, vision rope kernel
* add fp16 support for qwen2vl and m-rope
* add `GGML_ROPE_TYPE_MROPE`, `GGML_ROPE_TYPE_VISION`
* fix rope op mode switching, out dated func args
* update `llama_hparams`
* update to keep up stream changes
* resolve linter, test errors
* add makefile entry, update special image padding token
* add mrope unit test, fix few compiler warnings
* rename `mrope` related function, params
* minor updates on debug util, bug fixes
* add `m-rope` testcase to `test-backend-ops`
* Apply suggestions from code review
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* fix trailing whitespace
* store `llama_hparams.rope_sections` with fixed size array
* update position id tensor size check in GGML_OP_ROPE
* minor updates
* update `ggml_backend_*_supports_op` of unsupported backends
* remove old `rope_section` compare operator
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
						
						
					 
					
						2024-12-14 14:43:46 +02:00 
						 
				 
			
				
					
						
							
							
								piDack 
							
						 
					 
					
						
						
							
						
						01e6d9bb71 
					 
					
						
						
							
							clip : add sycl support ( #10574 )  
						
						
						
						
						Co-authored-by: piDack <pcdack@hotmail.co>
						
						
					 
					
						2024-12-04 01:26:37 +01:00 
						 
				 
			
				
					
						
							
							
								Ting Lou 
							
						 
					 
					
						
						
							
						
						678d7994f4 
					 
					
						
						
							
							llava: return false instead of exit ( #10546 )  
						
						
						
						
					 
					
						2024-11-29 01:09:46 +01:00 
						 
				 
			
				
					
						
							
							
								Diego Devesa 
							
						 
					 
					
						
						
							
						
						9f40989351 
					 
					
						
						
							
							ggml : move CPU backend to a separate file ( #10144 )  
						
						
						
						
					 
					
						2024-11-03 19:34:08 +01:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						cad341d889 
					 
					
						
						
							
							metal : reduce command encoding overhead ( #9698 )  
						
						
						
						
						* metal : reduce command encoding overhead
ggml-ci
* metal : add comments 
						
						
					 
					
						2024-10-01 16:00:25 +03:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						6262d13e0b 
					 
					
						
						
							
							common : reimplement logging ( #9418 )  
						
						
						
						
						https://github.com/ggerganov/llama.cpp/pull/9418  
					
						2024-09-15 20:46:12 +03:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						d6a04f872d 
					 
					
						
						
							
							ggml : hide ggml_object, ggml_cgraph, ggml_hash_set ( #9408 )  
						
						
						
						
						* ggml : hide ggml_object, ggml_cgraph, ggml_hash_set
ggml-ci
* ggml : add ggml-impl.h to backends
* ggml : fix compiler warnings
ggml-ci
* ggml : add assert upon adding nodes 
						
						
					 
					
						2024-09-12 14:23:49 +03:00 
						 
				 
			
				
					
						
							
							
								tc-mb 
							
						 
					 
					
						
						
							
						
						7ea8d80d53 
					 
					
						
						
							
							llava : the function "clip" should be int ( #9237 )  
						
						
						
						
					 
					
						2024-08-30 07:21:57 +02:00 
						 
				 
			
				
					
						
							
							
								Justine Tunney 
							
						 
					 
					
						
						
							
						
						436787f170 
					 
					
						
						
							
							llama : fix time complexity of string replacement ( #9163 )  
						
						
						
						
						This change fixes a bug where replacing text in a very long string could
cause llama.cpp to hang indefinitely. This is because the algorithm used
was quadratic, due to memmove() when s.replace() is called in a loop. It
seems most search results and LLM responses actually provide the O(n**2)
algorithm, which is a great tragedy. Using a builder string fixes things 
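
The builder-string approach described in this commit message can be sketched as follows. This is a minimal illustration, not the code from the commit: the function names `replace_all_linear` and `replace_all_self_check` are hypothetical, and only standard `std::string` calls are used. Each byte of the input is copied into the output buffer once, so no tail of the string is shifted again for every match, which is what makes the in-place `s.replace()` loop quadratic.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper: replace every occurrence of `search` in `s` with
// `replacement` by appending pieces to a separate builder string.
void replace_all_linear(std::string & s, const std::string & search, const std::string & replacement) {
    if (search.empty()) {
        return; // an empty pattern would match everywhere; leave `s` unchanged
    }
    std::string builder;
    builder.reserve(s.size());
    std::string::size_type last = 0; // start of the not-yet-copied tail
    std::string::size_type pos  = 0;
    while ((pos = s.find(search, last)) != std::string::npos) {
        builder.append(s, last, pos - last); // text between two matches
        builder.append(replacement);
        last = pos + search.size();          // continue after the match
    }
    builder.append(s, last, std::string::npos); // remaining tail
    s = std::move(builder);
}

// Small self-check a caller could run once.
inline void replace_all_self_check() {
    std::string s = "a-b-c";
    replace_all_linear(s, "-", "+");
    assert(s == "a+b+c");
}
```

Because the search resumes after each match in the original string, a replacement that itself contains the search text is not matched again.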
						
						
					 
					
						2024-08-26 09:09:53 +03:00 
						 
				 
			
				
					
						
							
							
								fairydreaming 
							
						 
					 
					
						
						
							
						
						f63f603c87 
					 
					
						
						
							
							llava : zero-initialize clip_ctx structure fields with aggregate initialization 908)  
						
						
						
						
						Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com>
						
						
					 
					
						2024-08-21 09:45:49 +02:00 
						 
				 
			
				
					
						
							
							
								Changyeon Kim 
							
						 
					 
					
						
						
							
						
						2f3c1466ff 
					 
					
						
						
							
							llava: Add ACC OP for GPU acceleration to the Vulkan backend in the LLAVA CLIP model. ( #8984 )  
						
						
						
						
						* llava: Add ACC OP for GPU acceleration to the Vulkan backend in the LLAVA CLIP model.
- The CLIP model now prioritizes the Vulkan backend over the CPU when Vulkan is available.
- A GGML_OP_ACC shader has been added.
- The encoding performance of the CLIP model improved from 4.2s on the CPU to 0.9s on the GPU.
Signed-off-by: Changyeon Kim <cyzero.kim@samsung.com>
* fix-up coding style.
Signed-off-by: Changyeon Kim <cyzero.kim@samsung.com>
* Fix-up the missing initial parameter to resolve the compilation warning.
Signed-off-by: Changyeon Kim <cyzero.kim@samsung.com>
* [fix] Add missing parameters.
Signed-off-by: Changyeon Kim <cyzero.kim@samsung.com>
* [fix] Use nb1 and nb2 for dst.
Signed-off-by: Changyeon Kim <cyzero.kim@samsung.com>
* Fix check results ggml_acc call
---------
Signed-off-by: Changyeon Kim <cyzero.kim@samsung.com>
Co-authored-by: 0cc4m <picard12@live.de>
						
						
					 
					
						2024-08-20 21:00:00 +02:00 
						 
				 
			
				
					
						
							
							
								tc-mb 
							
						 
					 
					
						
						
							
						
						d565bb2fd5 
					 
					
						
						
							
							llava : support MiniCPM-V-2.6 ( #8967 )  
						
						
						
						
						* init
* rename
* add run android for termux in readme
* add android readme
* add instructions in readme
* change name in readme
* Update README.md
* fixed line
* add result in readme
* random pos_embed
* add positions index
* change for ollama
* change for ollama
* better pos_embed in clip
* support ollama
* update cmakelist
* update cmakelist
* rename wrapper
* clear code
* replace and organize code
* add link
* sync master
* fix warnings
* fix warnings
* fix bug in bicubic resize when the image needs to be resized smaller
* receive review comments and modify
* receive review comments and modify
* put all code into llava dir
* fix quality problem in pr code
* change n_layer
* add space in "-1"
* imitate reshape bug of python code
* fix bug in clip
* fix issues for merging
* fix llama-minicpmv-cli in cmake file
* change pr readme
* fix code review
* remove in line 33 directory in the /cmakelists.txt (not in example, in the main dir
* fix cmakefile
* add warn
* fix KEY_HAS_MINICPMV_PROJ
* remove load_image_size into clip_ctx
* remove the extern "C", MINICPMV_API
* fix uhd code for review comment
* delete minicpmv-wrapper in pr
* remove uhd_image_embed
* Modify 2 notes
* support minicpmv2.6
* modify convert script of minicpmv
* modify convert
* modify convert
* add readme
* add resampler of v2.6
* modify clip
* modify readme
* fix type-check
* fix type-check
* fix type-check
* fix type-check
* modify convert script and readme
* fix convert script and readme
* fix convert
* fix num in convert
* fix type-check
---------
Co-authored-by: Hongji Zhu <fireyoucan@gmail.com>
Co-authored-by: harvestingmoon <leewenyeong@gmail.com>
						
						
					 
					
						2024-08-16 16:34:41 +03:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						45a55b91aa 
					 
					
						
						
							
							llama : better replace_all (cont) ( #8926 )  
						
						
						
						
						* llama : better replace_all (cont)
ggml-ci
* code : deduplicate replace_all
ggml-ci 
						
						
					 
					
						2024-08-09 18:23:52 +03:00 
						 
				 
			
				
					
						
							
							
								tc-mb 
							
						 
					 
					
						
						
							
						
						3071c0a5f2 
					 
					
						
						
							
							llava : support MiniCPM-V-2.5 ( #7599 )  
						
						
						
						
						* init
* rename
* add run android for termux in readme
* add android readme
* add instructions in readme
* change name in readme
* Update README.md
* fixed line
* add result in readme
* random pos_embed
* add positions index
* change for ollama
* change for ollama
* better pos_embed in clip
* support ollama
* update cmakelist
* update cmakelist
* rename wrapper
* clear code
* replace and organize code
* add link
* sync master
* fix warnings
* fix warnings
* fix bug in bicubic resize when the image needs to be resized smaller
* receive review comments and modify
* receive review comments and modify
* put all code into llava dir
* fix quality problem in pr code
* change n_layer
* add space in "-1"
* imitate reshape bug of python code
* fix bug in clip
* fix issues for merging
* fix llama-minicpmv-cli in cmake file
* change pr readme
* fix code review
* remove in line 33 directory in the /cmakelists.txt (not in example, in the main dir)
* fix cmakefile
* add warn
* fix KEY_HAS_MINICPMV_PROJ
* remove load_image_size into clip_ctx
* remove the extern "C", MINICPMV_API
* fix uhd code for review comment
* delete minicpmv-wrapper in pr
* remove uhd_image_embed
* Modify 2 notes
* clip : style changes
* del common.h in clip
* fix Type-Check error
* fix Type-Check error
* fix Type-Check error
* fix Type-Check error
* fix makefile error
* fix ubuntu-make error
* try fix clip
* try fix 1
---------
Co-authored-by: Hongji Zhu <fireyoucan@gmail.com>
Co-authored-by: harvestingmoon <leewenyeong@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
						
						
					 
					
						2024-08-09 13:33:53 +03:00 
						 
				 
			
				
					
						
							
							
								slaren 
							
						 
					 
					
						
						
							
						
						2b1f616b20 
					 
					
						
						
							
							ggml : reduce hash table reset cost ( #8698 )  
						
						... 
						
						
						
						* ggml : reduce hash table reset cost
* fix unreachable code warnings after GGML_ASSERT(false)
* GGML_ASSERT(false) -> GGML_ABORT("fatal error")
* GGML_ABORT use format string 
						
						
					 
					
						2024-07-27 04:41:55 +02:00 
						 
				 
			
				
					
						
							
							
								hipudding 
							
						 
					 
					
						
						
							
						
						1bdd8ae19f 
					 
					
						
						
							
							[CANN] Add Ascend NPU backend ( #6035 )  
						
						... 
						
						
						
						* [CANN] Add Ascend NPU backend
Ascend is a full-stack AI computing infrastructure for industry
applications and services based on Huawei Ascend processors and
software.
CANN (Compute Architecture of Neural Networks), developed by
Huawei, is a heterogeneous computing architecture for AI.
Co-authored-by: wangshuai09 <391746016@qq.com>
* delete trailing whitespaces
* Modify the code based on review comment
* Rename LLAMA_CANN to GGML_CANN
* Make ggml-common.h private
* add ggml_cann prefix for acl funcs
* Add logging for CANN backend
* Delete Trailing whitespace
---------
Co-authored-by: wangshuai09 <391746016@qq.com>
						
						
					 
					
						2024-07-17 14:23:50 +03:00 
						 
				 
			
				
					
						
							
							
								Daniel Bevenius 
							
						 
					 
					
						
						
							
						
						9b31a40c6d 
					 
					
						
						
							
							clip : suppress unused variable warnings ( #8105 )  
						
						... 
						
						
						
						* clip : suppress unused variable warnings
This commit suppresses unused variable warnings for the variable e in
the catch blocks.
The motivation for this change is to suppress the warnings that are
generated on Windows when using the MSVC compiler. The warnings are
not displayed when using GCC because GCC will mark all catch parameters
as used.
Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
* squash! clip : suppress unused variable warnings
Remove e (/*e*/) instead of using GGML_UNUSED.
---------
Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
						
						
					 
					
						2024-06-27 01:50:09 +02:00 
						 
				 
			
				
					
						
							
							
								Andrei 
							
						 
					 
					
						
						
							
						
						d11afd6652 
					 
					
						
						
							
							llava : fix moondream support ( #7163 )  
						
						... 
						
						
						
						* Revert "Revert "llava : add support for moondream vision language model (#6899)""
This reverts commit 9da243b36a 
						
						
					 
					
						2024-05-10 09:41:10 +03:00