llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-08 10:07:01 +00:00

Author	SHA1	Message	Date
ibrahimkhadraoui	b6df0a49d5	add bos False	2025-07-07 16:57:52 +04:00
ibrahimkhadraoui	ae937f442c	rm unused key	2025-07-07 16:57:36 +04:00
ibrahimkhadraoui	53446f7e42	rm unused MAMBA_CHUNK_SIZE	2025-07-07 15:29:56 +04:00
ibrahimkhadraoui	0ad3502839	rm extra space	2025-07-07 15:26:46 +04:00
ibrahim khadraoui	3afb2a89eb	Merge pull request #1 from tiiuae/injected-mup injected mup	2025-07-07 15:20:08 +04:00
younesbelkada	e96cc73390	clean ups	2025-07-07 15:13:06 +04:00
younesbelkada	a9f3a63dc1	injected mup	2025-07-07 15:00:25 +04:00
ibrahimkhadraoui	b3bc1fb237	Merge branch 'add-fh1-rebased' of https://github.com/tiiuae/llama.cpp-public into add-fh1-rebased	2025-07-07 14:36:55 +04:00
ibrahimkhadraoui	286e1fa569	fix rope_theta	2025-07-07 14:36:51 +04:00
ibrahimkhadraoui	97011d7a1f	mup_vec create as float64	2025-07-07 14:25:32 +04:00
ibrahimkhadraoui	49d7420964	inp_out_ids moved outside of layers loop	2025-07-07 14:18:48 +04:00
ibrahimkhadraoui	8c50893820	added some cb functions for debugging purposes	2025-07-07 14:10:45 +04:00
Younes B	6c39e775dd	fix conversion and d_inner	2025-07-07 10:56:49 +02:00
ibrahimkhadraoui	441d8d66bd	override modify_tensors instead of get_tensors	2025-07-07 12:00:57 +04:00
ibrahimkhadraoui	53304c84db	remove unused functions from gguf_writer.py	2025-07-07 11:18:14 +04:00
ibrahimkhadraoui	c4af0f3ca5	mamba_d_ssm added to d_inner find_hparam	2025-07-07 11:17:31 +04:00
ibrahimkhadraoui	c56ec07a9a	read arch from gguf.MODEL_ARCH	2025-07-07 10:34:46 +04:00
ibrahimkhadraoui	280dd2dcb7	falcon-h1 specific vocab resolved	2025-07-07 10:25:57 +04:00
ibrahimkhadraoui	7a25441e13	fixed multipliers	2025-07-04 17:41:03 +04:00
ibrahimkhadraoui	9760c8bc9d	conflict solve	2025-07-04 16:28:48 +04:00
ibrahimkhadraoui	2aa48dd853	Merge branch 'add-fh1-rebased' of https://github.com/tiiuae/llama.cpp-public into add-fh1-rebased	2025-07-04 16:25:54 +04:00
ibrahimkhadraoui	3ee7983961	fix vocab size	2025-07-04 16:25:27 +04:00
younesbelkada	250b4f1074	mix instead of max	2025-07-04 15:53:47 +04:00
younesbelkada	1fd0574adc	try	2025-07-04 15:50:43 +04:00
ibrahimkhadraoui	a6d0067dd7	Merge branch 'add-fh1-rebased' of https://github.com/tiiuae/llama.cpp-public into add-fh1-rebased	2025-07-04 15:37:44 +04:00
ibrahimkhadraoui	15138df48f	small fix ffn_norm	2025-07-04 15:37:40 +04:00
younesbelkada	6c7d9e26e7	fix	2025-07-04 15:25:59 +04:00
ibrahimkhadraoui	d22b4ea425	Merge branch 'add-fh1-rebased' of https://github.com/tiiuae/llama.cpp-public into add-fh1-rebased	2025-07-04 15:10:11 +04:00
ibrahimkhadraoui	2fe057cc40	Revert "fix" This reverts commit `243e4d1a50`.	2025-07-04 15:04:13 +04:00
younesbelkada	22de62cf56	fix	2025-07-04 15:02:14 +04:00
younesbelkada	cce35498d5	pre-norm -> norm	2025-07-04 14:58:33 +04:00
younesbelkada	243e4d1a50	fix	2025-07-04 14:55:31 +04:00
younesbelkada	1415cd8782	another fix	2025-07-04 14:49:59 +04:00
younesbelkada	a39a8423f7	merge	2025-07-04 14:48:22 +04:00
younesbelkada	50eadc7b33	fixes	2025-07-04 14:47:31 +04:00
ibrahimkhadraoui	071f4b7fd8	changed precision for multipliers float 32->64	2025-07-04 14:37:02 +04:00
ibrahimkhadraoui	8bea92261e	python fixes	2025-07-04 14:32:11 +04:00
younesbelkada	14c37ec047	more cleaning on python code	2025-07-03 18:09:30 +04:00
younesbelkada	fdd5cff4ba	minor fix	2025-07-03 17:12:05 +04:00
younesbelkada	0c93ef6a9c	more fixes	2025-07-03 15:26:33 +04:00
younesbelkada	03568c9358	fix	2025-07-03 15:10:18 +04:00
younesbelkada	71a6848e2d	another fix	2025-07-03 15:08:23 +04:00
younesbelkada	f897efdaf6	push more fixes	2025-07-03 15:05:01 +04:00
younesbelkada	991de6cbe4	v1	2025-07-03 14:49:56 +04:00
Nicolò Scipione	7b63a71a6b	Fix conditional enabling following arch checks for ggml-sycl (#14504) Signed-off-by: nscipione <nicolo.scipione@codeplay.com> b5819	2025-07-03 11:00:03 +02:00
Xuan-Son Nguyen	0c2ee38ab7	convert : correct gemma 3n conversion (#14450) * convert : correct gemma 3n conversion * rm redundant code	2025-07-03 10:03:06 +02:00
Georgi Gerganov	a70c8a0c4b	kv-cache : use ggml_set_rows (#14285) * kv-cache : use ggml_set_rows ggml-ci * graph : separate k and v indices ggml-ci * cont : remove redundant ifs ggml-ci * kv-cache : improve find_slot impl * kv-cache : bounds-check when accessing slot_info indices * kv-cache : add comments ggml-ci * ggml : add TODOs for adding GGML_OP_SET_ROWS support in the backends ggml-ci b5817	2025-07-03 10:53:35 +03:00
Georgi Gerganov	9067487c44	ggml : fix FA mask dim 2 and 3 (#14505) * ggml : fix FA mask dim 2 and 3 ggml-ci * backends : unsupport batched FA in CUDA and Vulkan ggml-ci * vulkan : disable FA for mask->ne[2] != 1 b5816	2025-07-03 10:46:57 +03:00
Georgi Gerganov	d4cdd9c1c3	ggml : remove kompute backend (#14501) ggml-ci b5815	2025-07-03 07:48:32 +03:00
Aman Gupta	55c2646b45	CUDA: add dynamic shared mem to softmax, refactor general usage (#14497) b5814	2025-07-03 07:45:11 +08:00

5863 Commits