Matt Clayton · b32efad2bc · 2025-04-08 22:01:58 +02:00
llava: improve clip_ctx destructor to not memleak load_image_size (#12834)

dm4 · 2dabf759e7 · 2025-04-08 15:49:13 +02:00
llava: add more helper functions to check projector types in clip context (#12824)
Signed-off-by: dm4 <sunrisedm4@gmail.com>

Xuan-Son Nguyen · 0364178ca2 · 2025-04-05 17:17:40 +02:00
clip : refactor clip_init, add tests (#12757)
* refactor clip_init
* fix loading file
* fix style
* test ok
* better test with report
* add missing headers
* clarify
* add KEY_MM_PATCH_MERGE_TYPE
* remove bool has_* pattern
* Apply suggestions from code review
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Update examples/llava/clip.cpp
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* use ggml_soft_max_ext
* refactor logging system
* add minicpm-v-o 2.6 for testing
* use nullptr everywhere
* fix Yi-VL model
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

Sigbjørn Skjæret · 1a85949067 · 2025-03-31 11:28:30 +02:00
llava : proper description fix (#12668)

Sigbjørn Skjæret · f52d59d771 · 2025-03-31 11:07:07 +02:00
llava : fix clip loading GGUFs with missing description (#12660)

Ivy233 · 02082f1519 · 2025-03-26 15:06:04 +01:00
clip: Fix llama-llava-clip-quantize-cli quantization error under CUDA backend (#12566)
* [Fix] Compiling clip-quantize-cli and running it in a CUDA environment caused ggml_fp16_to_fp32 to report an error when trying to access video memory; quantization needs to run on the CPU backend. After the fix it automatically runs on the CPU backend and is no longer bound to CUDA.
* [Fix] Roll back the signature and implementation of clip_model_load, and change the call in clip_model_quantize to clip_init.

Xuan-Son Nguyen · 7841fc723e · 2025-03-12 09:30:24 +01:00
llama : Add Gemma 3 support (+ experimental vision capability) (#12343)
* llama : Add Gemma 3 text-only support
* fix python coding style
* fix compile on ubuntu
* python: fix style
* fix ubuntu compile
* fix build on ubuntu (again)
* fix ubuntu build, finally
* clip : Experimental support for Gemma 3 vision (#12344)
* clip : Experimental support for Gemma 3 vision
* fix build
* PRId64

Xuan-Son Nguyen · 96e1280839 · 2025-03-11 09:20:16 +01:00
clip : bring back GPU support (#12322)
* clip : bring back GPU support
* use n_gpu_layers param
* fix double free
* ggml_backend_init_by_type
* clean up

tc-mb · 8352cdc87b · 2025-03-10 10:33:24 +02:00
llava : fix bug in minicpm-v code (#11513)
* fix bug in minicpm-v code
* update readme of minicpm-v

Alex Brooks · 7a2c913e66 · 2025-02-24 17:09:51 +01:00
llava : Add Granite Vision Support (#11794)
* Add super wip scripts for multimodal granite gguf
* Add example for converting mmgranite to gguf
* remove hardcoded path
* Add vision feature layer to gguf params
* Clean up llava surgery and remove name substitution hacks
* Add transformers llava next tensor name mapping
* Make siglip / openclip mutually exclusive
* Fix projector linear substitution
* Fix linear 2 substitution index
* Increase max flattened gridpoints to 64
* Fix hardcoded concat for multiple feature layers
* Pull vision feature layers out of gguf keys
* fix num gridpoints and use all layers
* Avoid dropping last image encoder layer in llava models
* Use 10 for max number of patches
* Standardize vision feature layers
* Cleanup logs
* Update comment for vision feature layer init
* Update notes for alternative to legacy llm conversion script
* Fix notes rendering
* Add v prefix to vision feature layer log
* Use current defaults for feature layer
* Use constant for max gridpoints / feat layers, style fixes
* clarify non-negative feature layers
* Remove CLIP_API from func signature
* Use MAX_IMAGE_FEATURE_LAYERS const in layer calc
* Clarify feature layers are non-negative ints and not uint
* Fix condition for reading feature layers
* pop last llava layer when feature layers are unset
* Fix unset vision layer 0
* Update examples/llava/clip.cpp
* Reenable assertion for out of bounds get_rows
* Use std vector for gridpoints and feature layers
* Calculate max feature layer at load time
* Include base patch for granite vision allocation
* Fix trailing whitespace
* Add max num patches = 10 back for minicpmv
* Use unordered set to store feature layers
* Use max feature layer for postnorm
* Apply suggestions from code review
---------
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>

Ting Lou · 36c258ee92 · 2025-02-22 15:28:28 +01:00
llava: build clip image from pixels (#11999)
* llava: export function `clip_build_img_from_pixels` to build an image from pixels decoded by other libraries instead of stb_image.h, for better performance
* Apply suggestions from code review
---------
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>

Alex Brooks · ee02ad02c5 · 2025-02-21 08:11:03 +02:00
clip : fix visual encoders with no CLS (#11982)
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

SAMI · 1ec208083c · 2025-02-05 10:45:40 +03:00
llava: add quantization for the visual projector LLAVA, Qwen2VL (#11644)
* Added quantization for visual projector
* Added README
* Fixed the clip quantize implementation in the file
* Fixed the gcc warning regarding minor linting
* Removed trailing whitespace

piDack · 0cec062a63 · 2025-02-02 09:48:46 +02:00
llama : add support for GLM-Edge and GLM-Edge-V series models (#10573)
* add glm edge chat model
* use config partial_rotary_factor as rope ratio
* support for glm edge model
* vision model support
* remove debug info
* fix format
* llava.cpp trailing whitespace
* remove unused AutoTokenizer
* Update src/llama.cpp for not containing <|end|> or </s>
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
* add edge template
* fix chat template
* fix conflict
* fix conflict
* fix ci err
* fix format err
* fix template err
* 9b hf chat support
* format
* format clip.cpp
* fix format
* Apply suggestions from code review
* Apply suggestions from code review
* Update examples/llava/clip.cpp
* fix format
* minor : style
---------
Co-authored-by: liyuhang <yuhang.li@zhipuai.cn>
Co-authored-by: piDack <pcdack@hotmail.co>
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
Co-authored-by: liyuhang <yuhang.li@aminer.cn>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

tc-mb · 3e3357fd77 · 2025-01-22 09:35:48 +02:00
llava : support Minicpm-omni (#11289)
* init
* add readme
* update readme
* no use make
* update readme
* update fix code
* fix editorconfig-checker
* no change convert py
* use clip_image_u8_free

Johannes Gäßler · 53ff6b9b9f · 2025-01-07 18:01:58 +01:00
GGUF: C++ refactor, backend support, misc fixes (#11030)
* GGUF: C++ refactor, backend support, misc fixes
remove ggml_tensor.backend
update CODEOWNERS [no ci]
remove gguf_get_data from API
revise GGUF API data types

Georgi Gerganov · d408bb9268 · 2024-12-19 18:47:15 +02:00
clip : disable GPU support (#10896)
ggml-ci

Georgi Gerganov · 0bf2d10c55 · 2024-12-18 19:27:21 +02:00
tts : add OuteTTS support (#10784)
* server : add "tokens" output
ggml-ci
* server : output embeddings for all tokens when pooling = none
ggml-ci
* server : be explicit about the pooling type in the tests
ggml-ci
* server : do not normalize embeddings when there is no pooling
ggml-ci
* llama : add OuteTTS support (wip)
* wip
* extract features
* first conv
* group norm
* resnet conv
* resnet
* attn
* pos net
* layer norm
* convnext
* head
* hann window
* fix n_embd + remove llama.cpp hacks
* compute hann window
* fft
* spectrum processing
* clean-up
* tts : receive input text and generate codes
* clip : fix new conv name
* tts : minor fix
* tts : add header + minor fixes
ggml-ci
* tts : add mathematical constant
ggml-ci
* tts : fix sampling + cut initial noise
* tts : fixes
* tts : update default samplers
ggml-ci
* tts : text pre-processing
* tts : outetts-voc -> wavtokenizer-dec
* tts : remove hardcoded constants
ggml-ci
* tts : fix tensor shapes
* llama : refactor wavtokenizer tensors
ggml-ci
* cont
ggml-ci
* cont [no ci]
* llama : update WavTokenizer to non-causal attn
* llama : handle no-vocab detokenization
* tts : add Python example for OuteTTS (wip)
* tts : extend python example to generate spectrogram
ggml-ci
* server : fix rebase artifacts
* tts : enable "return_tokens" in Python example
ggml-ci
* tts : minor fixes
* common : support HF download for vocoder

HimariO · ba1cb19cdd · 2024-12-14 14:43:46 +02:00
llama : add Qwen2VL support + multimodal RoPE (#10361)
* Barebone Qwen2VL LLM converter
* Add Qwen2VL cli entrypoint
* [WIP] add qwen2vl arch
* Verify m-rope output
* Add vl-rope/2d-rope support for qwen2vl ViT
* update qwen2vl cli tool
* update 5D tensor op workaround
* [WIP] qwen2vl vision model
* make batch and clip utils compatible with qwen2vl
* [WIP] create inference workflow, gguf convert script but fix
* correcting vision-rope behavior, add the missing last layer back to ViT
* add arg parser to qwen2vl_surgery
* replace variable size array with vector
* cuda-gdb cmake preset
* add fp32 mrope, vision rope kernel
* add fp16 support for qwen2vl and m-rope
* add `GGML_ROPE_TYPE_MROPE`, `GGML_ROPE_TYPE_VISION`
* fix rope op mode switching, outdated func args
* update `llama_hparams`
* update to keep up with stream changes
* resolve linter, test errors
* add makefile entry, update special image padding token
* add mrope unit test, fix a few compiler warnings
* rename `mrope` related functions, params
* minor updates on debug util, bug fixes
* add `m-rope` testcase to `test-backend-ops`
* Apply suggestions from code review
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* fix trailing whitespace
* store `llama_hparams.rope_sections` with fixed size array
* update position id tensor size check in GGML_OP_ROPE
* minor updates
* update `ggml_backend_*_supports_op` of unsupported backends
* remove old `rope_section` compare operator
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

piDack · 01e6d9bb71 · 2024-12-04 01:26:37 +01:00
clip : add sycl support (#10574)
Co-authored-by: piDack <pcdack@hotmail.co>

Ting Lou · 678d7994f4 · 2024-11-29 01:09:46 +01:00
llava: return false instead of exit (#10546)

Diego Devesa · 9f40989351 · 2024-11-03 19:34:08 +01:00
ggml : move CPU backend to a separate file (#10144)

Georgi Gerganov · cad341d889 · 2024-10-01 16:00:25 +03:00
metal : reduce command encoding overhead (#9698)
* metal : reduce command encoding overhead
ggml-ci
* metal : add comments

Georgi Gerganov · 6262d13e0b · 2024-09-15 20:46:12 +03:00
common : reimplement logging (#9418)
https://github.com/ggerganov/llama.cpp/pull/9418

Georgi Gerganov · d6a04f872d · 2024-09-12 14:23:49 +03:00
ggml : hide ggml_object, ggml_cgraph, ggml_hash_set (#9408)
* ggml : hide ggml_object, ggml_cgraph, ggml_hash_set
ggml-ci
* ggml : add ggml-impl.h to backends
* ggml : fix compiler warnings
ggml-ci
* ggml : add assert upon adding nodes

tc-mb · 7ea8d80d53 · 2024-08-30 07:21:57 +02:00
llava : the function "clip" should be int (#9237)

Justine Tunney · 436787f170 · 2024-08-26 09:09:53 +03:00
llama : fix time complexity of string replacement (#9163)
This change fixes a bug where replacing text in a very long string could cause llama.cpp to hang indefinitely. The algorithm used was quadratic, due to memmove() when s.replace() is called in a loop. It seems most search results and LLM responses actually provide the O(n**2) algorithm, which is a great tragedy. Using a builder string fixes things.

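The builder-string fix that commit describes can be sketched as follows. This is a minimal standalone version; the function name and exact signature are illustrative, not a verbatim copy of the llama.cpp helper.

```cpp
#include <string>

// Linear-time replace-all: instead of calling std::string::replace in a
// loop (each call memmoves the remainder of the string, so the whole pass
// degrades to O(n^2) on long inputs), append the pieces into a separate
// builder string in one forward scan.
static std::string replace_all(const std::string & s, const std::string & search, const std::string & replace) {
    if (search.empty()) {
        return s; // avoid an infinite loop on an empty needle
    }
    std::string builder;
    builder.reserve(s.size());
    size_t last = 0;
    size_t pos  = 0;
    while ((pos = s.find(search, last)) != std::string::npos) {
        builder.append(s, last, pos - last); // unchanged chunk before the match
        builder.append(replace);             // then the replacement text
        last = pos + search.size();
    }
    builder.append(s, last, std::string::npos); // tail after the last match
    return builder;
}
```

Each input character is copied at most once, so the pass is linear in the length of `s` regardless of how many matches there are.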
fairydreaming · f63f603c87 · 2024-08-21 09:45:49 +02:00
llava : zero-initialize clip_ctx structure fields with aggregate initialization 908)
Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com>

Changyeon Kim · 2f3c1466ff · 2024-08-20 21:00:00 +02:00
llava: Add ACC OP for GPU acceleration to the Vulkan backend in the LLAVA CLIP model. (#8984)
* llava: Add ACC OP for GPU acceleration to the Vulkan backend in the LLAVA CLIP model.
- The CLIP model now prioritizes the Vulkan backend over the CPU when Vulkan is available.
- A GGML_OP_ACC shader has been added.
- The encoding performance of the CLIP model improved from 4.2s on the CPU to 0.9s on the GPU.
* fix up coding style.
* Fix up the missing initial parameter to resolve the compilation warning.
* [fix] Add missing parameters.
* [fix] Use nb1 and nb2 for dst.
* Fix check results ggml_acc call
---------
Signed-off-by: Changyeon Kim <cyzero.kim@samsung.com>
Co-authored-by: 0cc4m <picard12@live.de>

tc-mb · d565bb2fd5 · 2024-08-16 16:34:41 +03:00
llava : support MiniCPM-V-2.6 (#8967)
* init
* rename
* add run android for termux in readme
* add android readme
* add instructions in readme
* change name in readme
* Update README.md
* fixed line
* add result in readme
* random pos_embed
* add positions index
* change for ollama
* change for ollama
* better pos_embed in clip
* support ollama
* update cmakelist
* update cmakelist
* rename wrapper
* clear code
* replace and organize code
* add link
* sync master
* fix warnings
* fix warnings
* fix bug in bicubic resize when needing to resize image smaller
* receive review comments and modify
* receive review comments and modify
* put all code into llava dir
* fix quality problem in pr code
* change n_layer
* add space in "-1"
* imitate reshape bug of python code
* fix bug in clip
* fix issues for merging
* fix llama-minicpmv-cli in cmake file
* change pr readme
* fix code review
* remove line 33 directory in the /cmakelists.txt (not in example, in the main dir)
* fix cmakefile
* add warn
* fix KEY_HAS_MINICPMV_PROJ
* remove load_image_size into clip_ctx
* remove the extern "C", MINICPMV_API
* fix uhd code for review comment
* delete minicpmv-wrapper in pr
* remove uhd_image_embed
* Modify 2 notes
* support minicpmv2.6
* modify convert script of minicpmv
* modify convert
* modify convert
* add readme
* add resampler of v2.6
* modify clip
* modify readme
* fix type-check
* fix type-check
* fix type-check
* fix type-check
* modify convert script and readme
* fix convert script and readme
* fix convert
* fix num in convert
* fix type-check
---------
Co-authored-by: Hongji Zhu <fireyoucan@gmail.com>
Co-authored-by: harvestingmoon <leewenyeong@gmail.com>

Georgi Gerganov · 45a55b91aa · 2024-08-09 18:23:52 +03:00
llama : better replace_all (cont) (#8926)
* llama : better replace_all (cont)
ggml-ci
* code : deduplicate replace_all
ggml-ci

tc-mb · 3071c0a5f2 · 2024-08-09 13:33:53 +03:00
llava : support MiniCPM-V-2.5 (#7599)
* init
* rename
* add run android for termux in readme
* add android readme
* add instructions in readme
* change name in readme
* Update README.md
* fixed line
* add result in readme
* random pos_embed
* add positions index
* change for ollama
* change for ollama
* better pos_embed in clip
* support ollama
* update cmakelist
* update cmakelist
* rename wrapper
* clear code
* replace and organize code
* add link
* sync master
* fix warnings
* fix warnings
* fix bug in bicubic resize when needing to resize image smaller
* receive review comments and modify
* receive review comments and modify
* put all code into llava dir
* fix quality problem in pr code
* change n_layer
* add space in "-1"
* imitate reshape bug of python code
* fix bug in clip
* fix issues for merging
* fix llama-minicpmv-cli in cmake file
* change pr readme
* fix code review
* remove line 33 directory in the /cmakelists.txt (not in example, in the main dir)
* fix cmakefile
* add warn
* fix KEY_HAS_MINICPMV_PROJ
* remove load_image_size into clip_ctx
* remove the extern "C", MINICPMV_API
* fix uhd code for review comment
* delete minicpmv-wrapper in pr
* remove uhd_image_embed
* Modify 2 notes
* clip : style changes
* del common.h in clip
* fix Type-Check error
* fix Type-Check error
* fix Type-Check error
* fix Type-Check error
* fix makefile error
* fix ubuntu-make error
* try fix clip
* try fix 1
---------
Co-authored-by: Hongji Zhu <fireyoucan@gmail.com>
Co-authored-by: harvestingmoon <leewenyeong@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

slaren · 2b1f616b20 · 2024-07-27 04:41:55 +02:00
ggml : reduce hash table reset cost (#8698)
* ggml : reduce hash table reset cost
* fix unreachable code warnings after GGML_ASSERT(false)
* GGML_ASSERT(false) -> GGML_ABORT("fatal error")
* GGML_ABORT use format string

hipudding · 1bdd8ae19f · 2024-07-17 14:23:50 +03:00
[CANN] Add Ascend NPU backend (#6035)
* [CANN] Add Ascend NPU backend
Ascend is a full-stack AI computing infrastructure for industry applications and services based on Huawei Ascend processors and software. CANN (Compute Architecture of Neural Networks), developed by Huawei, is a heterogeneous computing architecture for AI.
Co-authored-by: wangshuai09 <391746016@qq.com>
* delete trailing whitespaces
* Modify the code based on review comments
* Rename LLAMA_CANN to GGML_CANN
* Make ggml-common.h private
* add ggml_cann prefix for acl funcs
* Add logging for CANN backend
* Delete trailing whitespace
---------
Co-authored-by: wangshuai09 <391746016@qq.com>

Daniel Bevenius · 9b31a40c6d · 2024-06-27 01:50:09 +02:00
clip : suppress unused variable warnings (#8105)
* clip : suppress unused variable warnings
This commit suppresses unused variable warnings for the variable e in the catch blocks.
The motivation for this change is to suppress the warnings that are generated on Windows when using the MSVC compiler. The warnings are not displayed when using GCC because GCC marks all catch parameters as used.
* squash! clip : suppress unused variable warnings
Remove e (/*e*/) instead of using GGML_UNUSED.
---------
Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>

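The fix that commit describes can be illustrated with a small standalone sketch; `try_parse_int` is a hypothetical example function, not code from clip.cpp.

```cpp
#include <stdexcept>
#include <string>

// Leaving the catch parameter unnamed (the name survives only as a comment)
// silences MSVC's unused-variable warning when the exception object is never
// read; GCC already marks catch parameters as used, which is why the warning
// never appeared there.
static bool try_parse_int(const std::string & s, int & out) {
    try {
        out = std::stoi(s); // throws std::invalid_argument on non-numeric input
        return true;
    } catch (const std::exception & /*e*/) {
        return false;
    }
}
```

The alternative the commit rejected was naming the parameter and marking it with GGML_UNUSED; dropping the name entirely is shorter and works on every compiler.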
Andrei · d11afd6652 · 2024-05-10 09:41:10 +03:00
llava : fix moondream support (#7163)
* Revert "Revert "llava : add support for moondream vision language model (#6899)""
This reverts commit 9da243b36a

Georgi Gerganov · 9da243b36a · 2024-05-08 22:14:39 +03:00
Revert "llava : add support for moondream vision language model (#6899)"
This reverts commit 46e12c4692

vik · 46e12c4692 · 2024-04-25 22:38:31 +03:00
llava : add support for moondream vision language model (#6899)
* add support for moondream vision language model
This required making the following changes to the CLIP model:
1. Support for patch embedding bias.
2. Make class embedding and pre-layernorm optional.
3. Add support for post-layernorm.
* Update examples/llava/clip.cpp
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

Daniel Bevenius · 4ab99d8d47 · 2024-04-25 15:38:14 +03:00
clip : rename lerp function to avoid conflict (#6894)
This commit renames the lerp (linear interpolation) function in clip.cpp to avoid a conflict with the lerp function in the <cmath> standard C++ library when using C++20. The motivation for this change is to enable projects that use C++20 to compile clip.cpp without having to resort to patching it. The lerp function was added to cmath in C++20 (202002L), which is why this is not causing any issue at the moment, as llama.cpp currently uses C++11 (or C++17 in the case of SYCL).
Refs: https://en.cppreference.com/w/cpp/numeric/lerp
Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>

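The clash that commit avoids can be sketched as follows; the renamed helper is shown as `clip_lerp` for illustration, and the formula matches the basic definition of std::lerp.

```cpp
#include <cmath> // under C++20 (202002L) this header declares std::lerp

// A file-scope helper named plain "lerp" can become ambiguous with std::lerp
// once the translation unit is compiled as C++20 (e.g. via a using-directive
// or unqualified lookup in client code); renaming the helper sidesteps the
// conflict without requiring downstream projects to patch clip.cpp.
static float clip_lerp(float s, float e, float t) {
    return s + (e - s) * t; // linear interpolation between s and e at parameter t
}
```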
Justine Tunney · 89b0bf0d5d · 2024-04-21 15:19:04 +03:00
llava : use logger in llava-cli (#6797)
This change removes printf() logging so llava-cli is shell scriptable.

Ziang Wu · 66ba560256 · 2024-03-28 16:33:10 +02:00
llava : fix MobileVLM (#6364)
* fix empty bug
* Update MobileVLM-README.md
added more results on devices
* Update MobileVLM-README.md
* Update MobileVLM-README.md
* Update MobileVLM-README.md
* Update MobileVLM-README.md
* Update MobileVLM-README.md
* Update MobileVLM-README.md
* Update examples/llava/MobileVLM-README.md
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Update MobileVLM-README.md
remove gguf links
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

								slaren 
							
						 
					 
					
						
						
							
						
						280345968d 
					 
					
						
						
							
							cuda : rename build flag to LLAMA_CUDA ( #6299 )  
						
						
						
						
					 
					
						2024-03-26 01:16:01 +01:00 
						 
				 
			
				
					
						
							
							
								Ziang Wu 
							
						 
					 
					
						
						
							
						
						272935b281 
					 
					
						
						
							
							llava : add MobileVLM_V2 backup ( #6175 )  
						
						... 
						
						
						
						* Add MobileVLM_V2 backup
* Update MobileVLM-README.md
* Update examples/llava/MobileVLM-README.md
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com >
* Update examples/llava/convert-image-encoder-to-gguf.py
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com >
* clip : fix whitespace
* fix definition mistake in clip.cpp
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com > 
						
						
					 
					
						2024-03-20 17:02:32 +02:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						d795988d9e 
					 
					
						
						
							
							Revert "llava : add a MobileVLM_V2-1.7B backup ( #6152 )"  
						
						... 
						
						
						
						This reverts commit f8c4e745e1 
						
						
					 
					
						2024-03-20 13:29:49 +02:00 
						 
				 
			
				
					
						
							
							
								Ziang Wu 
							
						 
					 
					
						
						
							
						
						f8c4e745e1 
					 
					
						
						
							
							llava : add a MobileVLM_V2-1.7B backup ( #6152 )  
						
						... 
						
						
						
						* Add MobileVLM_V2 backup
* Update MobileVLM-README.md
* Update examples/llava/MobileVLM-README.md
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com >
* Update examples/llava/convert-image-encoder-to-gguf.py
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com >
* clip : fix whitespace
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com > 
						
						
					 
					
						2024-03-20 13:20:37 +02:00 
						 
				 
			
				
					
						
							
							
								Felix 
							
						 
					 
					
						
						
							
						
						104f5e0fc1 
					 
					
						
						
							
							clip : fix memory leak ( #6138 )  
						
						
						
						
					 
					
						2024-03-18 17:40:22 +02:00 
						 
				 
			
				
					
						
							
							
								Ting Lou 
							
						 
					 
					
						
						
							
						
						4e9a7f7f7f 
					 
					
						
						
							
							llava : change API to pure C style for Rust FFI bindgen ( #6079 )  
						
						... 
						
						
						
						Co-authored-by: Lou Ting <louting.t@alibaba-inc.com > 
						
						
					 
					
						2024-03-15 16:31:05 +02:00 
						 
				 
			
				
					
						
							
							
								Steve Grubb 
							
						 
					 
					
						
						
							
						
						6e0438da3c 
					 
					
						
						
							
							gguf : fix resource leaks ( #6061 )  
						
						... 
						
						
						
There are several places where a gguf context is allocated, but a call
to gguf_free is missing in some error paths. Also, on Linux, llama-bench
was missing an fclose. 
						
						
					 
					
						2024-03-14 20:29:32 +02:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						5b09797321 
					 
					
						
						
							
							ggml : remove old quantization functions ( #5942 )  
						
						... 
						
						
						
						* ggml : remove old quantization functions
ggml-ci
* ggml : simplify ggml_quantize_chunk
ggml-ci
* ggml : restrict correctness
ggml-ci
* ggml : remove hist data from the quantization API
ggml-ci
* tests : remove hist usage in test-backend-ops
ggml-ci
* vulkan : remove hist and fix typo 
						
						
					 
					
						2024-03-09 15:53:59 +02:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						1387cf60f7 
					 
					
						
						
							
							llava : remove extra cont ( #5587 )  
						
						
						
						
					 
					
						2024-02-19 15:23:17 +02:00