ibrahimkhadraoui
b6df0a49d5
add bos False
2025-07-07 16:57:52 +04:00
ibrahimkhadraoui
ae937f442c
rm unused key
2025-07-07 16:57:36 +04:00
ibrahimkhadraoui
53446f7e42
rm unused MAMBA_CHUNK_SIZE
2025-07-07 15:29:56 +04:00
ibrahimkhadraoui
0ad3502839
rm extra space
2025-07-07 15:26:46 +04:00
ibrahim khadraoui
3afb2a89eb
Merge pull request #1 from tiiuae/injected-mup
injected mup
2025-07-07 15:20:08 +04:00
younesbelkada
e96cc73390
clean ups
2025-07-07 15:13:06 +04:00
younesbelkada
a9f3a63dc1
injected mup
2025-07-07 15:00:25 +04:00
ibrahimkhadraoui
b3bc1fb237
Merge branch 'add-fh1-rebased' of https://github.com/tiiuae/llama.cpp-public into add-fh1-rebased
2025-07-07 14:36:55 +04:00
ibrahimkhadraoui
286e1fa569
fix rope_theta
2025-07-07 14:36:51 +04:00
ibrahimkhadraoui
97011d7a1f
mup_vec create as float64
2025-07-07 14:25:32 +04:00
ibrahimkhadraoui
49d7420964
inp_out_ids moved outside of layers loop
2025-07-07 14:18:48 +04:00
ibrahimkhadraoui
8c50893820
added some cb functions for debugging purposes
2025-07-07 14:10:45 +04:00
Younes B
6c39e775dd
fix conversion and d_inner
2025-07-07 10:56:49 +02:00
ibrahimkhadraoui
441d8d66bd
override modify_tensors instead of get_tensors
2025-07-07 12:00:57 +04:00
ibrahimkhadraoui
53304c84db
remove unused functions from gguf_writer.py
2025-07-07 11:18:14 +04:00
ibrahimkhadraoui
c4af0f3ca5
mamba_d_ssm added to d_inner find_hparam
2025-07-07 11:17:31 +04:00
ibrahimkhadraoui
c56ec07a9a
read arch from gguf.MODEL_ARCH
2025-07-07 10:34:46 +04:00
ibrahimkhadraoui
280dd2dcb7
falcon-h1 specific vocab resolved
2025-07-07 10:25:57 +04:00
ibrahimkhadraoui
7a25441e13
fixed multipliers
2025-07-04 17:41:03 +04:00
ibrahimkhadraoui
9760c8bc9d
conflict solve
2025-07-04 16:28:48 +04:00
ibrahimkhadraoui
2aa48dd853
Merge branch 'add-fh1-rebased' of https://github.com/tiiuae/llama.cpp-public into add-fh1-rebased
2025-07-04 16:25:54 +04:00
ibrahimkhadraoui
3ee7983961
fix vocab size
2025-07-04 16:25:27 +04:00
younesbelkada
250b4f1074
mix instead of max
2025-07-04 15:53:47 +04:00
younesbelkada
1fd0574adc
try
2025-07-04 15:50:43 +04:00
ibrahimkhadraoui
a6d0067dd7
Merge branch 'add-fh1-rebased' of https://github.com/tiiuae/llama.cpp-public into add-fh1-rebased
2025-07-04 15:37:44 +04:00
ibrahimkhadraoui
15138df48f
small fix ffn_norm
2025-07-04 15:37:40 +04:00
younesbelkada
6c7d9e26e7
fix
2025-07-04 15:25:59 +04:00
ibrahimkhadraoui
d22b4ea425
Merge branch 'add-fh1-rebased' of https://github.com/tiiuae/llama.cpp-public into add-fh1-rebased
2025-07-04 15:10:11 +04:00
ibrahimkhadraoui
2fe057cc40
Revert "fix"
This reverts commit 243e4d1a50.
2025-07-04 15:04:13 +04:00
younesbelkada
22de62cf56
fix
2025-07-04 15:02:14 +04:00
younesbelkada
cce35498d5
pre-norm -> norm
2025-07-04 14:58:33 +04:00
younesbelkada
243e4d1a50
fix
2025-07-04 14:55:31 +04:00
younesbelkada
1415cd8782
another fix
2025-07-04 14:49:59 +04:00
younesbelkada
a39a8423f7
merge
2025-07-04 14:48:22 +04:00
younesbelkada
50eadc7b33
fixes
2025-07-04 14:47:31 +04:00
ibrahimkhadraoui
071f4b7fd8
changed precision for multipliers float 32->64
2025-07-04 14:37:02 +04:00
ibrahimkhadraoui
8bea92261e
python fixes
2025-07-04 14:32:11 +04:00
younesbelkada
14c37ec047
more cleaning on python code
2025-07-03 18:09:30 +04:00
younesbelkada
fdd5cff4ba
minor fix
2025-07-03 17:12:05 +04:00
younesbelkada
0c93ef6a9c
more fixes
2025-07-03 15:26:33 +04:00
younesbelkada
03568c9358
fix
2025-07-03 15:10:18 +04:00
younesbelkada
71a6848e2d
another fix
2025-07-03 15:08:23 +04:00
younesbelkada
f897efdaf6
push more fixes
2025-07-03 15:05:01 +04:00
younesbelkada
991de6cbe4
v1
2025-07-03 14:49:56 +04:00
Nicolò Scipione
7b63a71a6b
Fix conditional enabling following arch checks for ggml-sycl (#14504)
Signed-off-by: nscipione <nicolo.scipione@codeplay.com>
b5819
2025-07-03 11:00:03 +02:00
Xuan-Son Nguyen
0c2ee38ab7
convert : correct gemma 3n conversion (#14450)
* convert : correct gemma 3n conversion
* rm redundant code
2025-07-03 10:03:06 +02:00
Georgi Gerganov
a70c8a0c4b
kv-cache : use ggml_set_rows (#14285)
* kv-cache : use ggml_set_rows
ggml-ci
* graph : separate k and v indices
ggml-ci
* cont : remove redundant ifs
ggml-ci
* kv-cache : improve find_slot impl
* kv-cache : bounds-check when accessing slot_info indices
* kv-cache : add comments
ggml-ci
* ggml : add TODOs for adding GGML_OP_SET_ROWS support in the backends
ggml-ci
b5817
2025-07-03 10:53:35 +03:00
Georgi Gerganov
9067487c44
ggml : fix FA mask dim 2 and 3 (#14505)
* ggml : fix FA mask dim 2 and 3
ggml-ci
* backends : unsupport batched FA in CUDA and Vulkan
ggml-ci
* vulkan : disable FA for mask->ne[2] != 1
b5816
2025-07-03 10:46:57 +03:00
Georgi Gerganov
d4cdd9c1c3
ggml : remove kompute backend (#14501)
ggml-ci
b5815
2025-07-03 07:48:32 +03:00
Aman Gupta
55c2646b45
CUDA: add dynamic shared mem to softmax, refactor general usage (#14497)
b5814
2025-07-03 07:45:11 +08:00