commit 5215b91e93
Author: Xuan-Son Nguyen
Date:   2025-05-05 12:54:44 +02:00

clip : fix confused naming ffn_up and ffn_down (#13290)

* clip : fix confused naming ffn_up and ffn_down
* rm ffn_i/o/g naming
* rename n_embd, n_ff
* small fix
* no check n_ff

commit ae803bfc3d
Author: Sigbjørn Skjæret
Date:   2025-05-05 12:34:26 +02:00

convert : bailingmoe : set yarn metadata if present (#13312)

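As an aside, "set X metadata if present" changes in convert_hf_to_gguf.py generally follow the shape sketched below. The gguf-py writer methods are real; the `rope_scaling` keys read from the HF config and the function wrapper are assumptions for illustration, not this PR's exact diff.

```python
import gguf

def set_yarn_metadata_if_present(writer: gguf.GGUFWriter, hparams: dict) -> None:
    # Only emit YaRN rope-scaling keys when the HF config actually carries
    # them; configs without "rope_scaling" are left untouched.
    rope_scaling = hparams.get("rope_scaling") or {}
    if rope_scaling.get("type") == "yarn":
        writer.add_rope_scaling_type(gguf.RopeScalingType.YARN)
        writer.add_rope_scaling_factor(rope_scaling["factor"])
        writer.add_rope_scaling_orig_ctx_len(rope_scaling["original_max_position_embeddings"])
```
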
commit 3bf785f3ef
Author: ymcki
Date:   2025-05-03 17:39:51 +02:00

llama : Llama-3_1-Nemotron-Ultra-253B-v1 support (#12843)

commit 2f567611c0
Author: Jared Van Bortel
Date:   2025-05-02 11:42:30 -04:00

llama-model : support Qwen2 embedding models and pooling_mode_lasttoken (#13245)

commit 7d2123484e
Author: Jared Van Bortel
Date:   2025-05-02 11:41:54 -04:00

convert : use correct context length for nomic-embed-text-v2 (#13216)

commit 074e42ab31
Author: Xuan-Son Nguyen
Date:   2025-05-02 17:17:15 +02:00

convert : converting mmproj for Qwen2/2.5VL from convert_hf_to_gguf (#13209)

* wip
* qwen2.5vl ok
* vision: fix models missing "text_config"
* add test
* fix test repo name
* fix 32B model
* Revert "fix 32B model"
This reverts commit 651752f1ae

commit dcf886007d
Author: Xuan-Son Nguyen
Date:   2025-05-02 08:45:10 +02:00

convert : explicitly disable trust_remote_code for AutoConfig (#13246)

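For reference, disabling remote code when reading a config with transformers looks like the minimal sketch below; the model path is a placeholder, and this shows the general API rather than the PR's exact diff.

```python
from transformers import AutoConfig

# With trust_remote_code=False, transformers refuses to execute Python code
# shipped inside the model repo and only loads natively supported configs.
config = AutoConfig.from_pretrained("path/to/model", trust_remote_code=False)
print(config.architectures)
```
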
commit 8936784f7a
Author: Xuan-Son Nguyen
Date:   2025-05-01 17:05:42 +02:00

mtmd : add **vision** support for Mistral Small 3.1 (#13231)

* convert ok
* load ok, missing patch merger
* ah sheet it works
* update llava/readme
* add test
* fix test

commit 3e168bede4
Author: Xuan-Son Nguyen
Date:   2025-04-30 16:56:24 +02:00

convert : improve model arch handling (#13122)

* convert : improve model arch handling
* use AutoConfig
* rm trust_remote_code
* Update convert_hf_to_gguf.py
* fix self.block_count for vision
* fix NomicBertModel

commit 07c2e2f76c
Author: Xuan-Son Nguyen
Date:   2025-04-30 13:06:15 +02:00

convert : correct typo image_mean --> image_std (#13208)

commit 5f5e39e1ba
Author: AT
Date:   2025-04-28 22:52:15 +03:00

model : Nomic Embed Text V2 with Mixture-of-Experts (MoE) architecture (#12466)

* Nomic Embed Text V2 with Mixture-of-Experts (MoE) architecture
- Adds MoE-based embedding model supporting multilingual embeddings.
- Selects architecture variant based on hyperparameter detection (MoE layers).
- Removes unnecessary subclass initialization checks for clarity.
https://www.nomic.ai/blog/posts/nomic-embed-text-v2
Co-authored-by: Jared Van Bortel <jared@nomic.ai>
* fix tokenizer
* don't rename this tensor
---------
Co-authored-by: Jared Van Bortel <jared@nomic.ai>

commit ced44be342
Author: matteo
Date:   2025-04-27 21:57:32 +02:00

llama-chat : fix wrong template in GLM4-0414 (#13140)

* fix wrong template in GLM4-0414
* fix spaces
* no bos token since it is already in the template
* moved the chatglm4 check to higher priority
* restored template for old GLM models
* moved the GLM4 template check to the correct place with correct check

commit ca2bb89eac
Author: HimariO
Date:   2025-04-27 10:10:34 +02:00

clip : Add Qwen2.5VL support (#12402)

* implement vision model architecture, GGUF converter
* handle window attention inputs
* add debug utils
* fix a few incorrect tensor memory layouts
* move position id remap out of ggml to avoid int32 cuda operations
* cleaning up
* ignore transformers Qwen2_5_xxx type check
* remove rarely used `qwen2vl-cli` debug functions
* remove commented-out code blocks
* fix attn weight scaling after rebase
* add `PROJECTOR_TYPE_QWEN2_5_VL`
* remove `KEY_USE_GLU_MLP`, `KEY_USE_RMS_NORM`
* replace `KEY_FULLATTN_BLK_IDX` with `KEY_WIN_ATTN_PATTERN`
* remove `attn_window_size` from gguf
* fix model conversion
* clean up
* fix merging problem
* add test
---------
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>

commit ecda2ec4b3
Author: Xuan-Son Nguyen
Date:   2025-04-23 20:21:59 +02:00

mtmd : Support Pixtral 12B (#13065)

* add pixtral text model (vision is wip)
* cgraph ok, just missing 2D RoPE
* fix bad rebase
* first working version
* fix problem with img_break token
* support dynamic image size
* update docs
* update test script

commit eb1776b15a
Author: piDack
Date:   2025-04-23 16:59:14 +02:00

convert : Append mult-eos,half-rope,bos to GLM4-0414 and Z (#13021)

* append mult-eos,half-rope,bos to GLM4-0414
* remove unset var

commit dc39a5e7a8
Author: Xuan-Son Nguyen
Date:   2025-04-22 16:24:54 +02:00

mtmd : support SmolVLM (version 1 and 2) (#13050)

* mtmd : support SmolVLM (version 1 and 2)
* correct chat template
* fix n_patches
* scale_factor is an int
* add more models to test

commit 2016f07bd1
Author: Xuan-Son Nguyen
Date:   2025-04-20 23:29:36 +02:00

convert : experimental support for --mmproj flag (#13023)

* convert : experimental support for `--mmproj` flag
* fix bad ctrl+f replace
* fix style
* split into subclasses TextModel and VisionModel
* rename Mode --> ModelBase
* small fix
* correct CLIP_VISION arch name (because existing GGUF already use it)
* Apply suggestions from code review
Co-authored-by: compilade <git@compilade.net>
* fix Mistral3Model
* fix typo
Co-authored-by: compilade <git@compilade.net>
---------
Co-authored-by: compilade <git@compilade.net>

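A hedged usage sketch: the `--mmproj` flag itself comes from this PR, but treating it as a plain switch and the model path are assumptions, not confirmed semantics.

```console
# Assumed invocation: convert a vision model and additionally emit its
# multimodal projector (mmproj) as a separate GGUF.
$ python convert_hf_to_gguf.py path/to/vision-model --mmproj
```
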
commit daa422881a
Author: Juk Armstrong
Date:   2025-04-15 09:49:57 +03:00

llama : DeepSeek V2/V3 MLA implementation (#12801)

* Merged using squash to remove all noise commit messages
* Force flash attention off for `LLM_ARCH_DEEPSEEK2` - embedding too large
* Removed 3 conts (2x RoPE and 1x RMS-norm)
* Changed to use `<cmath>` instead of `<math.h>`
* Reverted removal of the 3 conts
* Used `reshape` in `llm_graph_context::build_attn_mha()`
* Use `k_pe = ggml_reshape`
* Removed the 3 conts again
* Removed the 3D views of `wk_b` and `wv_b`, and just save as 3D in GGUF
* Removed MQA optimisation from `build_attn_mha()` as no gains now
* Simplified `is_mla` branch in `llm_build_deepseek2()`
* Removed `build_attn_mla` and added `nullptr` to all `build_attn` calls
* Fixed call to `build_attn` in `llm_build_t5_enc`

commit 06bb53ad9b
Author: Yuxuan Zhang
Date:   2025-04-11 12:10:10 +02:00

llama-model : add Glm4Model implementation for GLM-4-0414 (#12867)

* GLM-4-0414
* use original one
* Using with tensor map
* fix bug
* change order
* change order
* format with flake8

commit ec6c09d0fa
Author: Daniel Han
Date:   2025-04-11 09:49:09 +02:00

convert : Llama4 RoPE fix (#12889)

commit 5b1f13cb64
Author: Xuan-Son Nguyen
Date:   2025-04-11 09:23:37 +02:00

convert : proper tensor name mapping for llama4 (#12870)

* Llama-4 mapping
* remove hacky renaming
---------
Co-authored-by: Daniel Han <danielhanchen@gmail.com>

commit 64eda5deb9
Author: Xuan-Son Nguyen
Date:   2025-04-10 17:24:44 +02:00

convert : ability to lazy-load safetensors remotely without downloading to disk (#12820)

* gguf util : add SafetensorRemote
* fix style
* convert: add --remote option
* convert : allow using lazy remote tensors
It's a bit slow for now since everything is blocking and single-threaded.
* correct metadata.name
* small style fix
* support HF_TOKEN
* convert : use writeable buffer for remote lazy tensors
* convert : fix flake8 lint regarding lambda assignment
* multithreaded download
* multithread: print debug
* fix style
* Revert "multithreaded download"
This reverts commit 42fc895ace
---------
Co-authored-by: Francis Couture-Harpin <git@compilade.net>

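The core idea behind lazy remote loading can be sketched as below. This illustrates the safetensors on-disk layout (an 8-byte little-endian header length, then a JSON header with dtypes, shapes, and byte offsets) fetched via HTTP range requests; it is not the actual SafetensorRemote class, and the URL is a placeholder.

```python
import json
import struct

import requests

url = "https://huggingface.co/ORG/MODEL/resolve/main/model.safetensors"  # placeholder

# A safetensors file begins with an 8-byte little-endian header length,
# followed by a JSON header describing every tensor. Two small range
# requests are enough to enumerate tensors without downloading weights.
head = requests.get(url, headers={"Range": "bytes=0-7"}).content
(header_len,) = struct.unpack("<Q", head)
header_bytes = requests.get(url, headers={"Range": f"bytes=8-{8 + header_len - 1}"}).content
header = json.loads(header_bytes)

for name, meta in header.items():
    if name != "__metadata__":
        print(name, meta["dtype"], meta["shape"])  # tensor data stays on the server
```
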
commit d3bd7193ba
Author: Bo Zheng
Date:   2025-04-09 11:47:36 +02:00

llama : Support Qwen3 and Qwen3MoE (#12828)

* add qwen3 & qwen3moe support.
* fix
---------
Co-authored-by: bozheng-hit <dsoul0621@gmail.com>

commit 1466621e73
Author: Xuan-Son Nguyen
Date:   2025-04-07 23:06:44 +02:00

llama : Support llama 4 text-only (#12791)

* llama4 conversion
* initial support, no chat template
* clean up a bit
* fix tokenizer conversion
* correct hparams
* try this
* fix shexp
* ffn_inp_normed
* chat template
* clean up model conversion
* add_bos
* add scale_before_ffn
* fix order
* weight_before_ffn
* llm_graph_input_attn_temp
* add chunk attn mask
* build_inp_attn_scale()
* add comment about ggml_repeat
* clarify comments
* fix build

commit 5936a616e4
Author: Sigbjørn Skjæret
Date:   2025-04-01 14:37:13 +02:00

convert : BailingMoE : fix qkv split when head_dim is 0 (#12687)

NOTE: Ling-lite-base is broken, see https://huggingface.co/inclusionAI/Ling-lite-base/discussions/2

commit 35782aeedb
Author: Sigbjørn Skjæret
Date:   2025-03-31 23:09:48 +02:00

convert : BailingMoE : avoid setting rope_dim to 0 (#12678)

commit 403fbacbbc
Author: Sigbjørn Skjæret
Date:   2025-03-31 16:36:25 +02:00

convert : Qwerky : use lora_rank_tokenshift and lora_rank_decay if present (#12667)

commit 2c3f8b850a
Author: Sigbjørn Skjæret
Date:   2025-03-30 22:21:03 +02:00

llama : support BailingMoE (Ling) (#12634)

commit b3de7cac73
Author: Juyoung Suk
Date:   2025-03-30 20:38:33 +02:00

llama : add Trillion 7B model support (#12556)

* Support Trillion 7B
* Update llama.h
* Update llama.h
* Update llama-vocab.cpp for Trillion
* Update llama-vocab.cpp

commit f125b8dccf
Author: Si1w
Date:   2025-03-27 12:49:15 +02:00

llama : add PLM GGUF Conversion & Inference Support (#12457)

* add edgellm model arch [conversation feature doesn't work]
* remove output.weight layer for edgellm arch
* [Model] update the name of the model
* update the name of model arch in convert gguf
* [Model] Refactor the model arch into llama-model
* [Bug] Fix the bug in create attn kv
* [Code] Fix editorconfig errors
* [Code] Remove trailing whitespace
* [Code] Remove trailing whitespace
* [Code] Change the order of model arch in list
* [Code] Fix flake8 lint errors
* Remove trailing white space
* [Code] Remove call in model arch

commit d5c6309d91
Author: Csaba Kecskemeti
Date:   2025-03-27 11:11:23 +01:00

convert : Support Qwen2_5_VLForConditionalGeneration (#12595)

commit df4d20cd53
Author: Georgi Gerganov
Date:   2025-03-26 08:21:05 -04:00

convert : fix squeeze for ssm_conv tensors (#12573)

* convert : fix squeeze for ssm_conv tensors
* convert : match ssm_conv tensors by type
---------
Co-authored-by: Francis Couture-Harpin <git@compilade.net>

commit 53af4dba42
Author: Sigbjørn Skjæret
Date:   2025-03-25 23:03:10 +01:00

convert: fix Mistral3/Gemma3 model hparams init (#12571)

* Fix Mistral3/Gemma3 model hparams init
* set positional args correctly
* use existing hparams if passed

commit 00d53800e0
Author: compilade
Date:   2025-03-24 11:47:24 +01:00

llama-vocab : add SuperBPE pre-tokenizer (#12532)

commit 732b5fbf5e
Author: Bartowski
Date:   2025-03-20 08:36:37 +02:00

convert : avoid calls to tokenizer.added_tokens_decoder (#12473)

tokenizer.added_tokens_decoder builds and returns a fresh dict on every access (~0.04 s on average), which causes massive slowdowns when a tokenizer has a huge number of added tokens.

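A hedged sketch of the pattern behind this fix (the model name is a placeholder): hoist the property access out of any loop and reuse the result, since each access rebuilds the dict.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder model

# tokenizer.added_tokens_decoder builds a fresh dict on every access, so
# touching it once per token id multiplies the ~0.04 s cost. Fetch it once.
added_tokens = tokenizer.added_tokens_decoder
for token_id, added_token in added_tokens.items():
    print(token_id, added_token.content)
```
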
commit 108e53c2f1
Author: Sigbjørn Skjæret
Date:   2025-03-19 09:08:49 +01:00

llama : add support for GPT2, Bloom and CodeShell tied word embeddings (#12456)

* Add support for GPT2, Bloom and CodeShell tied word embeddings
* Deduplicate tied word embeddings weights
* Workaround for incorrect weight map
It appears transformer.wte.weight is in the weight map even though the weights are not there; remove it if output weights are encountered first.
* check++
* fatfingers--

commit 29fff308c7
Author: Xuan-Son Nguyen
Date:   2025-03-18 19:16:19 +01:00

llama : support converting Mistral Small text-only (#12450)

commit 7dfad387e3
Author: Molly Sophia
Date:   2025-03-18 07:27:50 +08:00

llama: Add support for RWKV v7 architecture (#12412)

* ggml: Add op l2_norm
* ggml: Add op rwkv_wkv7
* llama: Add support for RWKV7 and ARWKV7 models
* llama: fix inference with RWKV6Qwen2
* llama: add more (a)rwkv7 variants in size
* Apply code-format changes
* fix MUSA build
* llama: fix shape error with rwkv using llama-parallel
---------
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

commit 7841fc723e
Author: Xuan-Son Nguyen
Date:   2025-03-12 09:30:24 +01:00

llama : Add Gemma 3 support (+ experimental vision capability) (#12343)

* llama : Add Gemma 3 text-only support
* fix python coding style
* fix compile on ubuntu
* python: fix style
* fix ubuntu compile
* fix build on ubuntu (again)
* fix ubuntu build, finally
* clip : Experimental support for Gemma 3 vision (#12344)
* clip : Experimental support for Gemma 3 vision
* fix build
* PRId64

commit c43a3e7996
Author: Xuan-Son Nguyen
Date:   2025-02-28 12:44:11 +01:00

llama : add Phi-4-mini support (supersede #12099) (#12108)

* Added Phi-4-mini-instruct support
* Update regex per ngxson
* Change the vocab base to Xenova/gpt-4o
* fix conversion update script
* no need to check longrope
* minor style fix
* fix python style
---------
Co-authored-by: Nicholas Sparks <nisparks@microsoft.com>

commit 68ff663a04
Author: Georgi Gerganov
Date:   2025-02-15 16:40:57 +02:00

repo : update links to new url (#11886)

* repo : update links to new url
ggml-ci
* cont : more urls
ggml-ci

commit 0cec062a63
Author: piDack
Date:   2025-02-02 09:48:46 +02:00

llama : add support for GLM-Edge and GLM-Edge-V series models (#10573)

* add glm edge chat model
* use config partial_rotary_factor as rope ratio
* support for glm edge model
* vision model support
* remove debug info
* fix format
* llava.cpp trailing whitespace
* remove unused AutoTokenizer
* Update src/llama.cpp for not contain <|end|> or </s>
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
* add edge template
* fix chat template
* fix conflict
* fix conflict
* fix ci err
* fix format err
* fix template err
* 9b hf chat support
* format
* format clip.cpp
* fix format
* Apply suggestions from code review
* Apply suggestions from code review
* Update examples/llava/clip.cpp
* fix format
* minor : style
---------
Co-authored-by: liyuhang <yuhang.li@zhipuai.cn>
Co-authored-by: piDack <pcdack@hotmail.co>
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
Co-authored-by: liyuhang <yuhang.li@aminer.cn>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

commit ec7f3ac9ab
Author: Xuan Son Nguyen
Date:   2025-01-20 14:35:07 +01:00

llama : add support for Deepseek-R1-Qwen distill model (#11310)

* llama : add support for Deepseek-R1-Qwen distill model
* coding style

commit 4dbc8b9cb7
Author: RunningLeon
Date:   2025-01-16 20:10:38 +02:00

llama : add internlm3 support (#11233)

* support internlm3
* fix lint

commit 2739a71e4b
Author: Daniel Bevenius
Date:   2025-01-11 05:50:33 +01:00

convert : sort print supported models [no ci] (#11179)

This commit sorts the list of supported models when printing them out.
The motivation for this change is to make it easier to find a specific
model in the list of supported models. For example:
```console
$ ./convert_hf_to_gguf.py --print-supported-models
Supported models:
- ArcticForCausalLM
- BaiChuanForCausalLM
- BaichuanForCausalLM
- BertForMaskedLM
- BertModel
- BitnetForCausalLM
- BloomForCausalLM
- BloomModel
- CamembertModel
- ChameleonForCausalLM
- ChameleonForConditionalGeneration
- ChatGLMForConditionalGeneration
- ChatGLMModel
- CodeShellForCausalLM
- Cohere2ForCausalLM
- CohereForCausalLM
- DbrxForCausalLM
- DeciLMForCausalLM
- DeepseekForCausalLM
- DeepseekV2ForCausalLM
- DeepseekV3ForCausalLM
- ExaoneForCausalLM
- FalconForCausalLM
- FalconMambaForCausalLM
- GPT2LMHeadModel
- GPTBigCodeForCausalLM
- GPTNeoXForCausalLM
- GPTRefactForCausalLM
- Gemma2ForCausalLM
- GemmaForCausalLM
- GraniteForCausalLM
- GraniteMoeForCausalLM
- GrokForCausalLM
- InternLM2ForCausalLM
- JAISLMHeadModel
- JinaBertForMaskedLM
- JinaBertModel
- LLaMAForCausalLM
- LlamaForCausalLM
- LlavaStableLMEpochForCausalLM
- MPTForCausalLM
- MT5ForConditionalGeneration
- MambaForCausalLM
- MambaLMHeadModel
- MiniCPM3ForCausalLM
- MiniCPMForCausalLM
- MistralForCausalLM
- MixtralForCausalLM
- NemotronForCausalLM
- NomicBertModel
- OLMoForCausalLM
- Olmo2ForCausalLM
- OlmoForCausalLM
- OlmoeForCausalLM
- OpenELMForCausalLM
- OrionForCausalLM
- Phi3ForCausalLM
- PhiForCausalLM
- PhiMoEForCausalLM
- PlamoForCausalLM
- QWenLMHeadModel
- Qwen2ForCausalLM
- Qwen2MoeForCausalLM
- Qwen2VLForConditionalGeneration
- RWForCausalLM
- RWKV6Qwen2ForCausalLM
- RobertaModel
- Rwkv6ForCausalLM
- StableLMEpochForCausalLM
- StableLmForCausalLM
- Starcoder2ForCausalLM
- T5EncoderModel
- T5ForConditionalGeneration
- T5WithLMHeadModel
- UMT5ForConditionalGeneration
- WavTokenizerDec
- XLMRobertaForSequenceClassification
- XLMRobertaModel
- XverseForCausalLM
``` 
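
The change above amounts to sorting the registered architecture names before printing; a self-contained sketch (`MODEL_CLASSES` is a stand-in for the script's real registry):

```python
# Stand-in registry; the real script collects these from its model classes.
MODEL_CLASSES = {"LlamaForCausalLM": None, "BertModel": None, "ArcticForCausalLM": None}

print("Supported models:")
for name in sorted(MODEL_CLASSES):
    print(f"- {name}")
```
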
commit ff3fcabc72
Author: Daniel Bevenius
Date:   2025-01-10 11:30:53 +01:00

convert : add --print-supported-models option (#11172)

* convert : add --print-supported-models option
This commit adds a new option to the convert_hf_to_gguf.py script to
print the supported models.
The motivation for this is that it can be useful to know which models
are supported by the script without having to look at the code.
Example usage:
```console
$ ./convert_hf_to_gguf.py --print-supported-models
Supported models:
- GPTNeoXForCausalLM
- BloomForCausalLM
- BloomModel
- MPTForCausalLM
- OrionForCausalLM
- BaichuanForCausalLM
- BaiChuanForCausalLM
- XverseForCausalLM
- FalconForCausalLM
- RWForCausalLM
- GPTBigCodeForCausalLM
- GPTRefactForCausalLM
- StableLmForCausalLM
- StableLMEpochForCausalLM
- LlavaStableLMEpochForCausalLM
- LLaMAForCausalLM
- LlamaForCausalLM
- MistralForCausalLM
- MixtralForCausalLM
- DeciLMForCausalLM
- BitnetForCausalLM
- GrokForCausalLM
- DbrxForCausalLM
- MiniCPMForCausalLM
- MiniCPM3ForCausalLM
- QWenLMHeadModel
- Qwen2ForCausalLM
- Qwen2VLForConditionalGeneration
- WavTokenizerDec
- Qwen2MoeForCausalLM
- GPT2LMHeadModel
- PhiForCausalLM
- Phi3ForCausalLM
- PhiMoEForCausalLM
- PlamoForCausalLM
- CodeShellForCausalLM
- InternLM2ForCausalLM
- BertModel
- BertForMaskedLM
- CamembertModel
- RobertaModel
- NomicBertModel
- XLMRobertaModel
- XLMRobertaForSequenceClassification
- GemmaForCausalLM
- Gemma2ForCausalLM
- Starcoder2ForCausalLM
- Rwkv6ForCausalLM
- RWKV6Qwen2ForCausalLM
- MambaForCausalLM
- MambaLMHeadModel
- FalconMambaForCausalLM
- CohereForCausalLM
- Cohere2ForCausalLM
- OLMoForCausalLM
- OlmoForCausalLM
- Olmo2ForCausalLM
- OlmoeForCausalLM
- JinaBertModel
- JinaBertForMaskedLM
- OpenELMForCausalLM
- ArcticForCausalLM
- DeepseekForCausalLM
- DeepseekV3ForCausalLM
- DeepseekV2ForCausalLM
- UMT5ForConditionalGeneration
- MT5ForConditionalGeneration
- T5ForConditionalGeneration
- T5WithLMHeadModel
- T5EncoderModel
- JAISLMHeadModel
- ChatGLMModel
- ChatGLMForConditionalGeneration
- NemotronForCausalLM
- ExaoneForCausalLM
- GraniteForCausalLM
- GraniteMoeForCausalLM
- ChameleonForCausalLM
- ChameleonForConditionalGeneration
```
* squash! convert : add --print-supported-models option
Fix flake8 error.

commit ee7136c6d1
Author: Molly Sophia
Date:   2025-01-10 09:58:08 +08:00

llama: add support for QRWKV6 model architecture (#11001)

* WIP: Add support for RWKV6Qwen2
* RWKV: Some graph simplification
* Add support for RWKV6Qwen2 with cpu and cuda GLA
* RWKV6[QWEN2]: Concat lerp weights together to reduce cpu overhead
* Fix some typos
* code format changes
* Fix wkv test & add gla test
* Fix cuda warning
* Update README.md
* Update ggml/src/ggml-cuda/gla.cu
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Fix fused lerp weights loading with RWKV6
* better sanity check skipping for QRWKV6 in llama-quant
thanks @compilade
Co-authored-by: compilade <git@compilade.net>
---------
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: compilade <git@compilade.net>

commit f8feb4b01a
Author: Pierrick Hymbert
Date:   2025-01-09 11:21:41 +01:00

model: Add support for PhiMoE arch (#11003)

* model: support phimoe
* python linter
* doc: minor
Co-authored-by: ThiloteE <73715071+ThiloteE@users.noreply.github.com>
* doc: minor
Co-authored-by: ThiloteE <73715071+ThiloteE@users.noreply.github.com>
* doc: add phimoe as supported model
ggml-ci
---------
Co-authored-by: ThiloteE <73715071+ThiloteE@users.noreply.github.com>

commit 9394bbd484
Author: fairydreaming
Date:   2025-01-04 21:06:11 +01:00

llama : Add support for DeepSeek V3 (#11049)

* convert : extend DEEPSEEK2 model architecture to support DeepseekV3ForCausalLM by adding EXPERT_WEIGHTS_NORM and EXPERT_GATING_FUNC model parameters and FFN_EXP_PROBS_B tensor type
* vocab : add DeepSeek V3 pre-tokenizer regexes
* unicode : handle ACCENT_MARK and SYMBOL categories in regex
* llama : add DeepSeek V3 chat template, handle new model parameters and tensor types
---------
Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com>

commit 46be942214
Author: DAN™
Date:   2025-01-04 16:33:31 +02:00

llama : add support for the cohere2 model architecture (#10900)