07e4351ce6 | Xuan-Son Nguyen | 2025-05-30 12:24:37 +02:00
convert : allow partial update to the chkhsh pre-tokenizer list (#13847)
  * convert : allow partial update to the chkhsh pre-tokenizer list
  * code style
  * update tokenizer out
  * rm inp/out files for models not having gguf
  * fixed hash for glm
  * skip nomic-bert-moe test
  * Update convert_hf_to_gguf_update.py
  * fix minerva-7b hash
  * rm redundant import

f7873fc698 | Alex Fanthome | 2025-05-28 15:49:28 +02:00
tests : change umlaut test (#11600)

d2a4ef05c6 | Sigbjørn Skjæret | 2025-05-10 22:08:07 +02:00
vocab : add ByteDance-Seed/Seed-Coder (#13423)

ecda2ec4b3 | Xuan-Son Nguyen | 2025-04-23 20:21:59 +02:00
mtmd : Support Pixtral 12B (#13065)
  * add pixtral text model (vision is wip)
  * cgraph ok, just missing 2D RoPE
  * fix bad rebase
  * first working version
  * fix problem with img_break token
  * support dynamic image size
  * update docs
  * update test script

06bb53ad9b | Yuxuan Zhang | 2025-04-11 12:10:10 +02:00
llama-model : add Glm4Model implementation for GLM-4-0414 (#12867)
  * GLM-4-0414
  * use original one
  * Using with tensor map
  * fix bug
  * change order
  * change order
  * format with flask8

1466621e73 | Xuan-Son Nguyen | 2025-04-07 23:06:44 +02:00
llama : Support llama 4 text-only (#12791)
  * llama4 conversion
  * initial support, no chat template
  * clean up a bit
  * fix tokenizer conversion
  * correct hparams
  * try this
  * fix shexp
  * ffn_inp_normed
  * chat template
  * clean up model conversion
  * add_bos
  * add scale_before_ffn
  * fix order
  * weight_before_ffn
  * llm_graph_input_attn_temp
  * add chunk attn mask
  * build_inp_attn_scale()
  * add comment about ggml_repeat
  * clarify comments
  * fix build

2c3f8b850a | Sigbjørn Skjæret | 2025-03-30 22:21:03 +02:00
llama : support BailingMoE (Ling) (#12634)

b3de7cac73 | Juyoung Suk | 2025-03-30 20:38:33 +02:00
llama : add Trillion 7B model support (#12556)
  * Support Trillion 7B
  * Update llama.h
  * Update llama.h
  * Update llama-vocab.cpp for Trillion
  * Update llama-vocab.cpp

00d53800e0 | compilade | 2025-03-24 11:47:24 +01:00
llama-vocab : add SuperBPE pre-tokenizer (#12532)

c43a3e7996 | Xuan-Son Nguyen | 2025-02-28 12:44:11 +01:00
llama : add Phi-4-mini support (supersede #12099) (#12108)
  * Added Phi-4-mini-instruct support
  * Update regex per ngxson
  * Change the vocab base to Xenova/gpt-4o
  * fix conversion update script
  * no need to check longrope
  * minor style fix
  * fix python style
  Co-authored-by: Nicholas Sparks <nisparks@microsoft.com>

68ff663a04 | Georgi Gerganov | 2025-02-15 16:40:57 +02:00
repo : update links to new url (#11886)
  * repo : update links to new url
  * cont : more urls

ec7f3ac9ab | Xuan Son Nguyen | 2025-01-20 14:35:07 +01:00
llama : add support for Deepseek-R1-Qwen distill model (#11310)
  * llama : add support for Deepseek-R1-Qwen distill model
  * coding style

9394bbd484 | fairydreaming | 2025-01-04 21:06:11 +01:00
llama : Add support for DeepSeek V3 (#11049)
  * convert : extend DEEPSEEK2 model architecture to support DeepseekV3ForCausalLM by adding EXPERT_WEIGHTS_NORM and EXPERT_GATING_FUNC model parameters and FFN_EXP_PROBS_B tensor type
  * vocab : add DeepSeek V3 pre-tokenizer regexes
  * unicode : handle ACCENT_MARK and SYMBOL categories in regex
  * llama : add DeepSeek V3 chat template, handle new model parameters and tensor types
  Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com>

b92a14a841 | Yun Dou | 2024-12-23 01:35:44 +01:00
llama : support InfiniAI Megrez 3b (#10893)
  * Support InfiniAI Megrez 3b
  * Fix tokenizer_clean_spaces for megrez

7ae33a616f | Billel Mokeddem | 2024-12-23 00:09:58 +02:00
llama : add Falcon3 support (#10883)
  * Add Falcon3 model support
  * Add fix for adding bos to added special tokens
  * Add comment explaining the logic behind the if statement
  * Add a log message to better track the when the following line of code is triggered
  * Update log to only print when input and output characters are different
  * Fix handling pre-normalized tokens
  * Refactoring

4da69d1abd | Diego Devesa | 2024-12-18 01:36:46 +01:00
Revert "llama : add Falcon3 support (#10864)" (#10876)
  This reverts commit 382bc7f2e8.

382bc7f2e8 | Billel Mokeddem | 2024-12-17 17:24:56 +02:00
llama : add Falcon3 support (#10864)

a0974156f3 | Valentin Mamedov | 2024-12-15 19:02:46 +02:00
llama : add Deepseek MoE v1 & GigaChat models (#10827)
  * Add deepseek v1 arch & gigachat template
  * improve template code
  * add readme
  * delete comments
  * remove comment
  * fix format
  * lint llama.cpp
  * fix order of deepseek and deepseek2, move gigachat temlate to the end of func
  * fix order of deepseek and deepseek2 in constants; mark shared exp as deepseek arch need
  * remove comments
  * move deepseek above deepseek2
  * change placement of gigachat chat template

784a14aa49 | Sukriti Sharma | 2024-12-07 09:02:14 +02:00
convert : add support for Roberta embeddings (#10695)

6fe6247831 | Riccardo Orlando | 2024-12-05 20:30:59 +02:00
llama : add Minerva 7B model support (#10673)
  * Support for Minerva 7B
  * Update convert_hf_to_gguf_update.py

d405804be8 | Daniel Bevenius | 2024-12-05 09:47:55 +02:00
py : update outdated copy-paste instructions [no ci] (#10667)
  This commit updates the copy-paste instruction in convert_hf_to_gguf_update.py to reflect that convert_hf_to_gguf.py will have already been updated with the new get_vocab_base_pre() function when this script completes.

bc5ba007b2 | Georgi Gerganov | 2024-10-25 10:13:46 +03:00
server : check that the prompt fits in the slot's context (#10030)

f4d2b8846a | Georgi Gerganov | 2024-09-28 17:42:03 +03:00
llama : add reranking support (#9510)
  * py : add XLMRobertaForSequenceClassification [no ci]
  * py : fix scalar-tensor conversion [no ci]
  * py : fix position embeddings chop [no ci]
  * llama : read new cls tensors [no ci]
  * llama : add classigication head (wip) [no ci]
  * llama : add "rank" pooling type
  * server : add rerank endpoint
  * llama : aboud ggml_repeat during classification
  * rerank : cleanup + comments
  * server : accept /rerank endpoint in addition to /v1/rerank [no ci]
  * embedding : parse special tokens
  * jina : support v1 reranker
  * vocab : minor style
  * server : initiate tests for later
  * server : add docs
  * llama : add comment [no ci]
  * llama : fix uninitialized tensors
  * ci : add rerank tests
  * add reranking test
  * change test data
  * Update examples/server/server.cpp
    Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
  * add `--reranking` argument
  * update server docs
  * llama : fix comment [no ci]
  Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
  Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>

9a913110cf | nopperl | 2024-09-28 15:08:43 +03:00
llama : add support for Chameleon (#8543)
  * convert chameleon hf to gguf
  * add chameleon tokenizer tests
  * fix lint
  * implement chameleon graph
  * add swin norm param
  * return qk norm weights and biases to original format
  * implement swin norm
  * suppress image token output
  * rem tabs
  * add comment to conversion
  * fix ci
  * check for k norm separately
  * adapt to new lora implementation
  * fix layer input for swin norm
  * move swin_norm in gguf writer
  * add comment regarding special token regex in chameleon pre-tokenizer
  * Update src/llama.cpp
  * fix punctuation regex in chameleon pre-tokenizer (@compilade)
  * fix lint
  * trigger ci
  Co-authored-by: compilade <git@compilade.net>

c837981bba | daminho | 2024-09-12 14:28:20 +03:00
py : add Phi-1.5/Phi-2 tokenizer (#9361)
  * add phi2 tokenizer
  * add phi name to convert_hf_to_gguf_update.py
  * make tokenizer_pre consistent; llama.cpp work

8db003a19d | Pavel Zloi | 2024-09-11 15:29:51 +03:00
py : support converting local models (#7547)
  * Support of converting local models added to convert-hf-to-gguf-update.py
  * Description fixed
  * shutil added to imports

c679e0cb5c | Minsoo Cheong | 2024-08-16 09:35:18 +03:00
llama : add EXAONE model support (#9025)
  * add exaone model support
  * add chat template
  * fix whitespace
  * add ftype
  * add exaone pre-tokenizer in `llama-vocab.cpp`
  * fix lint
  * add `EXAONE` to supported models in `README.md`
  * fix space
  Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
  Co-authored-by: compilade <113953597+compilade@users.noreply.github.com>
  Co-authored-by: compilade <git@compilade.net>

6bda7ce6c3 | Esko Toivonen | 2024-08-15 10:17:12 +03:00
llama : add pre-tokenizer regexes for BLOOM and gpt3-finnish (#8850)

081fe431aa | Keke Han | 2024-07-22 19:43:43 +03:00
llama : fix codeshell support (#8599)
  * llama : fix codeshell support
  * llama : move codeshell after smollm below to respect the enum order

d94c6e0ccb | Jason Stillerman | 2024-07-22 17:43:01 +03:00
llama : add support for SmolLm pre-tokenizer (#8609)
  * Adding SmolLM Pre Tokenizer
  * Update convert_hf_to_gguf_update.py
  * Update src/llama.cpp
  * handle regex
  * removed .inp and out .out ggufs
  Co-authored-by: compilade <git@compilade.net>

566daa5a5b | Jiří Podivín | 2024-07-22 23:44:53 +10:00
*.py: Stylistic adjustments for python (#8233)
  * Superflous parens in conditionals were removed.
  * Unused args in function were removed.
  * Replaced unused `idx` var with `_`
  * Initializing file_format and format_version attributes
  * Renaming constant to capitals
  * Preventing redefinition of the `f` var
  Signed-off-by: Jiri Podivin <jpodivin@redhat.com>

940362224d | Michael Coppola | 2024-07-20 16:43:51 +03:00
llama : add support for Tekken pre-tokenizer (#8579)
  * llama : Added support for Tekken pre-tokenizer (#8577)
    Removed uneeded `vocab.tokenizer_clean_spaces` assignment
  * llama : fix order of pre-tokenizers
  * Tekken pre-tokenizer no longer uses clean_up_tokenization_spaces
  * Updated chkhsh for Tekken tokenizer
  Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

e235b267a2 | Georgi Gerganov | 2024-07-05 07:53:33 +03:00
py : switch to snake_case (#8305)
  * py : switch to snake_case
  * cont
  * cont
  * cont : fix link
  * gguf-py : use snake_case in scripts entrypoint export
  * py : rename requirements for convert_legacy_llama.py
    Needed for scripts/check-requirements.sh
  Co-authored-by: Francis Couture-Harpin <git@compilade.net>

01a5f06550 | ditsuke | 2024-07-04 15:39:13 +00:00
chore: Remove rebase artifacts

b0a46993df | ditsuke | 2024-07-04 15:39:13 +00:00
build(python): Package scripts with pip-0517 compliance