Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						b8e2194efc 
					 
					
						
						
							
							sync : ggml  
						
						
						
						
						ggml-ci 
						
						
							
 
						
					 
					
						2025-06-10 09:21:56 +03:00 
						 
				 
			
				
					
						
							
							
								Kai Pastor 
							
						 
					 
					
						
						
							
						
						1a3b5e80f7 
					 
					
						
						
							
							Add in-build ggml::ggml ALIAS library (ggml/1260)  
						
						
						
						
						Enable uniform linking with subproject and with find_package. 
						
						
							
						
					 
					
						2025-06-10 09:21:56 +03:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						1f63e75f3b 
					 
					
						
						
							
							metal : use less stack memory in FA kernel ( #14088 )  
						
						
						
						
						* metal : use less stack memory in FA kernel
ggml-ci
* cont : fix BF16 variant 
						
						
							
 
						
					 
					
						2025-06-09 23:05:02 +03:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						40cbf571c9 
					 
					
						
						
							
							kv-cache : fix shift and defrag logic ( #14081 )  
						
						
						
						
						* kv-cache : fix shift
ggml-ci
* cont : reset shift[i]
ggml-ci
* cont : fix defrag erasing cells that didn't move
ggml-ci 
						
						
							
 
						
					 
					
						2025-06-09 23:04:35 +03:00 
						 
				 
			
				
					
						
							
							
								Diego Devesa 
							
						 
					 
					
						
						
							
						
						7f4fbe5183 
					 
					
						
						
							
							llama : allow building all tests on windows when not using shared libs ( #13980 )  
						
						
						
						
						* llama : allow building all tests on windows when not using shared libraries
* add static windows build to ci
* tests : enable debug logs for test-chat
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
						
						
							
 
						
					 
					
						2025-06-09 20:03:09 +02:00 
						 
				 
			
				
					
						
							
							
								xctan 
							
						 
					 
					
						
						
							
						
						f470bc36be 
					 
					
						
						
							
							ggml-cpu : split arch-specific implementations ( #13892 )  
						
						
						
						
						* move ggml-cpu-aarch64 to repack
* split quantize_row_q8_0/1
* split helper functions
* split ggml_vec_dot_q4_0_q8_0
* split ggml_vec_dot_q4_1_q8_1
* split ggml_vec_dot_q5_0_q8_0
* split ggml_vec_dot_q5_1_q8_1
* split ggml_vec_dot_q8_0_q8_0
* split ggml_vec_dot_tq1_0_q8_K
* split ggml_vec_dot_tq2_0_q8_K
* split ggml_vec_dot_q2_K_q8_K
* split ggml_vec_dot_q3_K_q8_K
* split ggml_vec_dot_q4_K_q8_K
* split ggml_vec_dot_q5_K_q8_K
* split ggml_vec_dot_q6_K_q8_K
* split ggml_vec_dot_iq2_xxs_q8_K
* split ggml_vec_dot_iq2_xs_q8_K
* split ggml_vec_dot_iq2_s_q8_K
* split ggml_vec_dot_iq3_xxs_q8_K
* split ggml_vec_dot_iq3_s_q8_K
* split ggml_vec_dot_iq1_s_q8_K
* split ggml_vec_dot_iq1_m_q8_K
* split ggml_vec_dot_iq4_nl_q8_0
* split ggml_vec_dot_iq4_xs_q8_K
* fix typos
* fix missing prototypes
* rename ggml-cpu-quants.c
* rename ggml-cpu-traits
* rename arm folder
* move cpu-feats-x86.cpp
* rename ggml-cpu-hbm
* update arm detection macro in quants.c
* move iq quant tables
* split ggml_quantize_mat_q8_0/K
* split ggml_gemv_*
* split ggml_gemm_*
* rename namespace aarch64 to repack
* use weak aliases to replace test macros
* rename GGML_CPU_AARCH64 to GGML_CPU_REPACK
* rename more aarch64 to repack
* clean up rebase leftover
* fix compilation errors
* remove trailing spaces
* try to fix clang compilation errors
* try to fix clang compilation errors again
* try to fix clang compilation errors, 3rd attempt
* try to fix clang compilation errors, 4th attempt
* try to fix clang compilation errors, 5th attempt
* try to fix clang compilation errors, 6th attempt
* try to fix clang compilation errors, 7th attempt
* try to fix clang compilation errors, 8th attempt
* try to fix clang compilation errors, 9th attempt
* more cleanup
* fix compilation errors
* fix apple targets
* fix a typo in arm version of ggml_vec_dot_q4_K_q8_K
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
						
						
							
 
						
					 
					
						2025-06-09 16:47:13 +02:00 
						 
				 
			
				
					
						
							
							
								Diego Devesa 
							
						 
					 
					
						
						
							
						
						8f47e25f56 
					 
					
						
						
							
							cuda : fix device sync on buffer clear ( #14033 )  
						
						
						
						
							
 
						
					 
					
						2025-06-09 16:36:26 +02:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						201b31dc2e 
					 
					
						
						
							
							graph : fix geglu ( #14077 )  
						
						
						
						
						ggml-ci 
						
						
							
 
						
					 
					
						2025-06-09 17:17:31 +03:00 
						 
				 
			
				
					
						
							
							
								Xinpeng Dou 
							
						 
					 
					
						
						
							
						
						e21d2d4ae2 
					 
					
						
						
							
							CANN: Simplify the environment variable setting( #13104 )  
						
						
						
						
						* Simplify the environment variable setting to specify the memory pool type.
* Adjust the GGML_CANN_ASYNC_MODE setting to accept yes, enable, 1, or on (case-insensitive) as valid options.
* update
* fix CI
* update
* delete whitespace
* fix according to review
* update CANN.md
* update CANN.md 
						
						
							
 
						
					 
					
						2025-06-09 19:47:39 +08:00 
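The CANN commit above (e21d2d4ae2) says `GGML_CANN_ASYNC_MODE` accepts yes, enable, 1, or on, case-insensitively. A minimal sketch of that kind of check follows. The helper name `env_flag_enabled`, the whitespace trimming, and the treatment of an unset variable as disabled are assumptions for illustration; this is not the backend's actual code.

```python
import os

# Accepted spellings are taken from the commit message.
TRUTHY = {"yes", "enable", "1", "on"}


def env_flag_enabled(name, environ=None):
    """Return True if the variable holds one of the accepted values.

    Hypothetical helper: trimming whitespace and treating an unset
    variable as disabled are assumptions, not documented behaviour.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(name, "")
    return value.strip().lower() in TRUTHY
```

Passing a plain dictionary as `environ` lets the check be exercised without touching the real process environment.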
						 
				 
			
				
					
						
							
							
								R0CKSTAR 
							
						 
					 
					
						
						
							
						
						dc0623fddb 
					 
					
						
						
							
							webui: fix sidebar being covered by main content ( #14082 )  
						
						
						
						
						* webui: fix sidebar being covered by main content
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* webui: update index.html.gz
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
---------
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
						
						
							
						
					 
					
						2025-06-09 12:01:17 +02:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						87d34b381d 
					 
					
						
						
							
							server : fix LRU check ( #14079 )  
						
						
						
						
						ggml-ci 
						
						
							
 
						
					 
					
						2025-06-09 12:57:58 +03:00 
						 
				 
			
				
					
						
							
							
								Nicolò Scipione 
							
						 
					 
					
						
						
							
						
						b460d16ae8 
					 
					
						
						
							
							sycl: Add reorder to Q6_K mmvq implementation ( #13885 )  
						
						
						
						
						* Add Reorder to Q6_K mmvq implementation
* Address PR comments: clean up comments
* Remove unused parameter after refactoring q4_k
* Adding inline to function and removing unnecessary reference to int
---------
Signed-off-by: nscipione <nicolo.scipione@codeplay.com>
						
						
							
 
						
					 
					
						2025-06-09 11:47:07 +02:00 
						 
				 
			
				
					
						
							
							
								Đinh Trọng Huy 
							
						 
					 
					
						
						
							
						
						91a8ee6a6f 
					 
					
						
						
							
							add geglu activation function ( #14074 )  
						
						
						
						
						Co-authored-by: dinhhuy <huy.dinh@brains-tech.co.jp>
						
						
							
 
						
					 
					
						2025-06-09 05:15:31 +01:00 
						 
				 
			
				
					
						
							
							
								Yuanhao Ji 
							
						 
					 
					
						
						
							
						
						056eb74534 
					 
					
						
						
							
							CANN: Enable labeler for Ascend NPU ( #13914 )  
						
						
						
						
							
						
					 
					
						2025-06-09 11:20:06 +08:00 
						 
				 
			
				
					
						
							
							
								Diego Devesa 
							
						 
					 
					
						
						
							
						
						247e5c6e44 
					 
					
						
						
							
							cuda : fix buffer type check with integrated GPUs ( #14069 )  
						
						
						
						
							
 
						
					 
					
						2025-06-08 11:39:56 -07:00 
						 
				 
			
				
					
						
							
							
								吴小白 
							
						 
					 
					
						
						
							
						
						5787b5da57 
					 
					
						
						
							
							ci: add LoongArch cross-compile build ( #13944 )  
						
						
						
						
							
						
					 
					
						2025-06-07 10:39:11 -03:00 
						 
				 
			
				
					
						
							
							
								Akarshan Biswas 
							
						 
					 
					
						
						
							
						
						228f34c9ce 
					 
					
						
						
							
							SYCL: Implement few same quantized type copy kernels ( #13739 )  
						
						
						
						
						* SYCL: Implement few same quantized type copy kernels
* Use memcpy for copying contiguous tensors
ggml-ci
* feat(sycl): add contiguous tensor copy support and device checks
Adds a memcpy path for contiguous tensors of the same type to optimize data transfer. Updates device support checks to recognize contiguous tensor operations, improving compatibility and performance.
* refactor: replace specific block copy functions with template
The changes replace multiple redundant block copy functions (e.g., cpy_block_q8_0_q8_0, cpy_block_q5_0_q5_0) with a single templated function cpy_blck_q_q. This reduces code duplication by using a generic template that works for any block type, improving maintainability while preserving the same functionality. The template is instantiated with specific block types (e.g., block_q8_0) where needed.
* Exclude BF16 support for COPY tensors for now
ggml-ci
* perf: adjust SYCL copy kernel block sizes for efficiency
Use ceil_div to ensure full element coverage and update nd_range parameters to better align with SYCL block sizes, improving parallelism and device utilization in copy operations. 
						
						
							
 
						
					 
					
						2025-06-07 18:58:20 +05:30 
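The SYCL commit above (228f34c9ce) mentions using `ceil_div` so that the launch covers every element. The arithmetic can be sketched as below. The function `launch_shape` and its return values are hypothetical names for illustration; the real kernels set up SYCL `nd_range` objects, which this sketch does not model.

```python
def ceil_div(n, d):
    """Integer division rounded up, for positive d."""
    return (n + d - 1) // d


def launch_shape(num_elements, block_size):
    """Return (number of blocks, total work items launched).

    Hypothetical helper: rounding the block count up guarantees that
    num_blocks * block_size >= num_elements, so no tail element is skipped.
    """
    num_blocks = ceil_div(num_elements, block_size)
    return num_blocks, num_blocks * block_size
```

With 1000 elements and a block size of 256, truncating division gives 3 blocks (768 work items) and leaves 232 elements uncovered, while `ceil_div` gives 4 blocks. The extra work items in the last block then need a bounds check in the kernel.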
						 
				 
			
				
					
						
							
							
								Sigbjørn Skjæret 
							
						 
					 
					
						
						
							
						
						0974ad7a7c 
					 
					
						
						
							
							llama : fix llama_model_chat_template with template name (LLM_KV with suffix) ( #14050 )  
						
						
						
						
							
 
						
					 
					
						2025-06-07 14:13:12 +02:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						745aa5319b 
					 
					
						
						
							
							llama : deprecate llama_kv_self_ API ( #14030 )  
						
						
						
						
						* llama : deprecate llama_kv_self_ API
ggml-ci
* llama : allow llama_memory_(nullptr)
ggml-ci
* memory : add flag for optional data clear in llama_memory_clear
ggml-ci 
						
						
							
 
						
					 
					
						2025-06-06 14:11:15 +03:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						487a5e0401 
					 
					
						
						
							
							context : fix SWA-related warning for multiple sequences ( #14045 )  
						
						
						
						
							
 
						
					 
					
						2025-06-06 13:29:18 +03:00 
						 
				 
			
				
					
						
							
							
								Sigbjørn Skjæret 
							
						 
					 
					
						
						
							
						
						d17a809ef0 
					 
					
						
						
							
							llama : support multiple classifier outputs and labels ( #13940 )  
						
						
						
						
							
 
						
					 
					
						2025-06-06 09:03:25 +02:00 
						 
				 
			
				
					
						
							
							
								Sigbjørn Skjæret 
							
						 
					 
					
						
						
							
						
						1caae7fc6c 
					 
					
						
						
							
							gguf-py : add add_classifier_output_labels method to writer ( #14031 )  
						
						
						
						
						* add add_classifier_output_labels
* use add_classifier_output_labels 
						
						
							
						
					 
					
						2025-06-05 17:42:31 +02:00 
						 
				 
			
				
					
						
							
							
								Masato Nakasaka 
							
						 
					 
					
						
						
							
						
						669c13e0f6 
					 
					
						
						
							
							vulkan: Enable VK_KHR_cooperative_matrix extension for Intel Xe2 GPUs ( #14001 )  
						
						
						
						
						* allowing B580 and U9-288V
* experimenting code to detect Xe2
* allowing coopmat only for Xe2 GPUs
* fixed comment wording
* fixed comment wording
* removed unnecessary driver check 
						
						
							
 
						
					 
					
						2025-06-05 16:00:29 +02:00 
						 
				 
			
				
					
						
							
							
								pockers21 
							
						 
					 
					
						
						
							
						
						146b88e8b3 
					 
					
						
						
							
							ci: fix CUDA build failure on autodl cloud machines ( #14005 )  
						
						
						
						
						Replace CMAKE_CUDA_ARCHITECTURES=native with nvidia-smi detection
as 'native' fails on autodl cloud environments.
Co-authored-by: pockers21 <liyang2@uniontech.com>
						
						
							
						
					 
					
						2025-06-05 16:25:29 +03:00 
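The CI commit above (146b88e8b3) replaces `CMAKE_CUDA_ARCHITECTURES=native` with detection through `nvidia-smi`. The commit message does not show the exact command, so the sketch below only illustrates the conversion step, assuming the tool is queried for a compute capability such as `8.6` and the result is turned into an architecture number such as `86`. The function name and the input format are assumptions.

```python
def cuda_arch_from_compute_cap(smi_output):
    """Convert a compute capability like "8.6" into "86".

    Hypothetical helper: assumes one capability per line (one line per GPU)
    and uses the first GPU only.
    """
    lines = smi_output.strip().splitlines()
    if not lines:
        raise ValueError("no compute capability reported")
    major, minor = lines[0].strip().split(".")
    return f"{int(major)}{int(minor)}"
```

The resulting string could then be passed as `-DCMAKE_CUDA_ARCHITECTURES=86` in place of `native`.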
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						7f37b6cf1e 
					 
					
						
						
							
							memory : migrate from llama_kv_cache to more generic llama_memory ( #14006 )  
						
						
						
						
						* memory : merge llama_kv_cache into llama_memory + new `llama_memory` API
ggml-ci
* context : fix casts
ggml-ci 
						
						
							
 
						
					 
					
						2025-06-05 15:29:22 +03:00 
						 
				 
			
				
					
						
							
							
								Diego Devesa 
							
						 
					 
					
						
						
							
						
						3a077146a4 
					 
					
						
						
							
							llama : allow using mmap without PrefetchVirtualMemory, apply GGML_WIN_VER to llama.cpp sources ( #14013 )  
						
						
						
						
							
 
						
					 
					
						2025-06-05 11:57:42 +02:00 
						 
				 
			
				
					
						
							
							
								Olexandr88 
							
						 
					 
					
						
						
							
						
						d01d112abb 
					 
					
						
						
							
							readme : add badge ( #13938 )  
						
						
						
						
							
						
					 
					
						2025-06-05 10:50:55 +03:00 
						 
				 
			
				
					
						
							
							
								Sigbjørn Skjæret 
							
						 
					 
					
						
						
							
						
						9f47fa5792 
					 
					
						
						
							
							vocab : warn about missing mask token ( #14022 )  
						
						
						
						
							
 
						
					 
					
						2025-06-05 09:29:18 +02:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						9e31bec4fd 
					 
					
						
						
							
							context : fix pos_min initialization upon error decode ( #14008 )  
						
						
						
						
						ggml-ci 
						
						
							
 
						
					 
					
						2025-06-05 09:06:29 +03:00 
						 
				 
			
				
					
						
							
							
								Jeff Bolz 
							
						 
					 
					
						
						
							
						
						5a8ae3053c 
					 
					
						
						
							
							vulkan: automatically deduce size of push constants ( #13936 )  
						
						
						
						
							
 
						
					 
					
						2025-06-05 07:17:58 +02:00 
						 
				 
			
				
					
						
							
							
								Ervin Áron Tasnádi 
							
						 
					 
					
						
						
							
						
						0d3984424f 
					 
					
						
						
							
							ggml-vulkan: adds support for op CONV_TRANSPOSE_1D ( #13813 )  
						
						
						
						
						* * ggml-vulkan: adds op CONV_TRANSPOSE_1D
* test-backend-ops: adds more sophisticated tests for CONV_TRANSPOSE_1D
* Missing barrier added to shader.
Number of additional tests reduced to 108.
* * Fixes typo in variable name.
* Removes extra whitespaces.
* Adds int64->int32 casts to prevent possible warnings.
* Problem size reduced in tests to pass tests with llvmpipe.
* supports_op condition moved from unintended position 
						
						
							
 
						
					 
					
						2025-06-04 22:02:00 +02:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						3e63a58ef7 
					 
					
						
						
							
							kv-cache : refactor the update/defrag mechanism ( #13988 )  
						
						
						
						
						* kv-cache : refactor update mechanism
ggml-ci
* memory : improve status handling
* defrag : reset head + add comments
ggml-ci
* cont : minor fixes
ggml-ci 
						
						
							
 
						
					 
					
						2025-06-04 18:58:20 +03:00 
						 
				 
			
				
					
						
							
							
								Diego Devesa 
							
						 
					 
					
						
						
							
						
						2589ad3704 
					 
					
						
						
							
							ci : remove cuda 11.7 releases, switch runner to windows 2022 ( #13997 )  
						
						
						
						
							
 
						
					 
					
						2025-06-04 15:37:40 +02:00 
						 
				 
			
				
					
						
							
							
								Diego Devesa 
							
						 
					 
					
						
						
							
						
						482548716f 
					 
					
						
						
							
							releases : use dl backend for linux release, remove arm64 linux release ( #13996 )  
						
						
						
						
							
 
						
					 
					
						2025-06-04 13:15:54 +02:00 
						 
				 
			
				
					
						
							
							
								Xuan-Son Nguyen 
							
						 
					 
					
						
						
							
						
						3ac67535c8 
					 
					
						
						
							
							llama-graph : use ggml_repeat_4d ( #13998 )  
						
						
						
						
							
 
						
					 
					
						2025-06-04 10:11:26 +02:00 
						 
				 
			
				
					
						
							
							
								Johannes Gäßler 
							
						 
					 
					
						
						
							
						
						0b4be4c435 
					 
					
						
						
							
							CUDA: fix FTZ in FA for Gemma 3 ( #13991 )  
						
						
						
						
							
 
						
					 
					
						2025-06-04 08:57:05 +02:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						e0e806f52e 
					 
					
						
						
							
							kv-cache : fix unified::seq_rm to work with seq_id < 0 ( #13985 )  
						
						
						
						
						ggml-ci 
						
						
							
 
						
					 
					
						2025-06-04 09:50:32 +03:00 
						 
				 
			
				
					
						
							
							
								Jeff Bolz 
							
						 
					 
					
						
						
							
						
						7e00e60ef8 
					 
					
						
						
							
							vulkan: fix warnings in perf logger querypool code ( #13937 )  
						
						
						
						
							
						
					 
					
						2025-06-03 20:30:22 +02:00 
						 
				 
			
				
					
						
							
							
								Xuan-Son Nguyen 
							
						 
					 
					
						
						
							
						
						ea1431b0fa 
					 
					
						
						
							
							docs : add "Quick start" section for new users ( #13862 )  
						
						
						
						
						* docs : add "Quick start" section for non-technical users
* rm flox
* Update README.md 
						
						
							
						
					 
					
						2025-06-03 13:09:36 +02:00 
						 
				 
			
				
					
						
							
							
								lhez 
							
						 
					 
					
						
						
							
						
						71e74a3ac9 
					 
					
						
						
							
							opencl: add backend_synchronize ( #13939 )  
						
						
						
						
						* This is not needed by the normal use where the result is read
  using `tensor_get`, but it allows perf mode of `test-backend-ops`
  to properly measure performance. 
						
						
							
 
						
					 
					
						2025-06-02 16:54:58 -07:00 
						 
				 
			
				
					
						
							
							
								rmatif 
							
						 
					 
					
						
						
							
						
						bfb1e012a0 
					 
					
						
						
							
							OpenCL: Add concat, tsembd, upscale, tanh, pad and repeat ( #13840 )  
						
						
						
						
						* add concat, pad, repeat, tsembd, tanh, upscale
* small fixes 
						
						
							
 
						
					 
					
						2025-06-02 16:53:36 -07:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						3637576288 
					 
					
						
						
							
							server : disable speculative decoding for SWA models ( #13970 )  
						
						
						
						
						* server : use swa-full for draft context
ggml-ci
* server : disable speculative decoding for SWA models 
						
						
							
 
						
					 
					
						2025-06-02 21:34:40 +03:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						ea394d7ab1 
					 
					
						
						
							
							metal : use F32 accumulators in FA kernels ( #13975 )  
						
						
						
						
						ggml-ci 
						
						
							
 
						
					 
					
						2025-06-02 21:33:40 +03:00 
						 
				 
			
				
					
						
							
							
								Georgi Gerganov 
							
						 
					 
					
						
						
							
						
						5582c49c39 
					 
					
						
						
							
							gemma : more consistent attention scaling for v2 and v3 ( #13951 )  
						
						
						
						
						* gemma : fix attn scale for 27B
* cont : apply scale before attn
* cont : consistent attention scaling 
						
						
							
 
						
					 
					
						2025-06-02 20:54:26 +03:00 
						 
				 
			
				
					
						
							
							
								Olivier Chafik 
							
						 
					 
					
						
						
							
						
						c9bbc77931 
					 
					
						
						
							
							server: update deepseek reasoning format (pass reasoning_content as diffs) (#13933 )  
						
						
						
						
						* server: update deepseek reasoning format (now in reasoning_content diffs), add legacy option for compat
* update unit/test_tool_call.py::test_thoughts 
						
						
							
 
						
					 
					
						2025-06-02 10:15:44 -07:00 
						 
				 
			
				
					
						
							
							
								Xuan-Son Nguyen 
							
						 
					 
					
						
						
							
						
						bfd322796c 
					 
					
						
						
							
							mtmd : fix memory leak in mtmd_helper_eval_chunk_single ( #13961 )  
						
						
						
						
						* mtmd : fix memory in mtmd_helper_eval_chunk_single
* mtmd-cli : fix mem leak
* Update tools/mtmd/mtmd-cli.cpp
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
						
						
							
 
						
					 
					
						2025-06-02 16:29:28 +02:00 
						 
				 
			
				
					
						
							
							
								shalinib-ibm 
							
						 
					 
					
						
						
							
						
						093e3f1feb 
					 
					
						
						
							
							cmake : Handle mixed-case 'Power' strings in POWER CPU detection ( #13966 )  
						
						
						
						
						Some systems report the CPU implementation as "Power11" instead of "POWER11".
The existing CMake logic uses a case-sensitive regular expression to extract
the CPU generation, which fails when the casing doesn't exactly match "POWER".
This patch provides a fix by first converting the string to uppercase before applying the regex.
Signed-off-by: root <root@rheldb2v.pperf.tadn.ibm.com>
Co-authored-by: root <root@rheldb2v.pperf.tadn.ibm.com>
						
						
							
 
						
					 
					
						2025-06-02 15:18:36 +03:00 
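The cmake commit above (093e3f1feb) fixes POWER CPU detection by uppercasing the reported string before the case-sensitive regex runs. The same idea, sketched in Python rather than CMake, looks like this. The pattern `POWER([0-9]+)` and the function name are assumptions for illustration; the actual CMake expression is not shown in the commit message.

```python
import re

# Assumed pattern: the literal "POWER" followed by the generation number.
POWER_PATTERN = re.compile(r"POWER([0-9]+)")


def power_generation(cpu_string):
    """Extract the POWER generation, tolerating mixed-case input.

    Uppercasing first means "Power11" and "POWER11" both match the
    case-sensitive pattern.
    """
    match = POWER_PATTERN.search(cpu_string.upper())
    return int(match.group(1)) if match else None
```

Without the `upper()` call, `POWER_PATTERN.search("Power11")` finds nothing, which corresponds to the failure the commit describes.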
						 
				 
			
				
					
						
							
							
								Atharva Dubey 
							
						 
					 
					
						
						
							
						
						663445b0de 
					 
					
						
						
							
							sycl: quantize and reorder the input to q8_1 when reorder is enabled ( #13826 )  
						
						
						
						
						* [WIP]: fuse q8 quantization and reorder
* wip2: fuse q8 quantization and reorder
* working q8 reorder commit
* restored common.hpp
* remove debug prints
* remove unnecessary headers and remove trailing whitespace
* Update ggml/src/ggml-sycl/ggml-sycl.cpp
Co-authored-by: Alberto Cabrera Pérez <alberto.cabrera@intel.com>
---------
Co-authored-by: Alberto Cabrera Pérez <alberto.cabrera@intel.com>
						
						
							
 
						
					 
					
						2025-06-02 10:12:20 +01:00 
						 
				 
			
				
					
						
							
							
								Johannes Gäßler 
							
						 
					 
					
						
						
							
						
						7675c555a1 
					 
					
						
						
							
							gguf: fix failure on version == 0 ( #13956 )  
						
						
						
						
							
 
						
					 
					
						2025-06-01 18:08:05 +02:00 
						 
				 
			
				
					
						
							
							
								Sigbjørn Skjæret 
							
						 
					 
					
						
						
							
						
						5e1c3aed40 
					 
					
						
						
							
							convert : fix nomic-bert-moe mask token ( #13757 )  
						
						
						
						
							
 
						
					 
					
						2025-06-01 18:07:21 +02:00