younesbelkada
|
097df0ed85
|
remove final_norm
|
2025-07-08 11:26:04 +04:00 |
|
younesbelkada
|
adff470c8a
|
more cleanups and fixed conversion
|
2025-07-08 11:19:38 +04:00 |
|
younesbelkada
|
823696bab1
|
remove unneeded attributes
|
2025-07-08 11:15:21 +04:00 |
|
ibrahimkhadraoui
|
2834a4ac10
|
clean
|
2025-07-08 11:00:30 +04:00 |
|
younesbelkada
|
4bc9e0ca89
|
tensor not required
|
2025-07-08 10:56:34 +04:00 |
|
ibrahimkhadraoui
|
f266d145fc
|
added falcon-h1
|
2025-07-08 10:53:48 +04:00 |
|
ibrahimkhadraoui
|
d41f111462
|
Merge branch 'add-fh1-rebased' of https://github.com/tiiuae/llama.cpp-public into add-fh1-rebased
|
2025-07-08 10:48:07 +04:00 |
|
ibrahimkhadraoui
|
f028a43a91
|
Merge branch 'add-fh1-rebased' of https://github.com/tiiuae/llama.cpp-public into add-fh1-rebased
|
2025-07-08 10:48:01 +04:00 |
|
younesbelkada
|
a846d02327
|
remove todo
|
2025-07-08 10:44:59 +04:00 |
|
Younes B
|
2dee7cf964
|
Apply suggestions from code review
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
|
2025-07-08 10:43:50 +04:00 |
|
ibrahimkhadraoui
|
7846c67e5c
|
minor cleanups
|
2025-07-08 10:42:15 +04:00 |
|
younesbelkada
|
8555ee8b2c
|
more cleanups on python conversion;
|
2025-07-08 10:41:33 +04:00 |
|
younesbelkada
|
d473d42832
|
more cleanups
|
2025-07-08 10:39:12 +04:00 |
|
ibrahimkhadraoui
|
e63ee4649e
|
cleanup
|
2025-07-08 10:31:12 +04:00 |
|
ibrahimkhadraoui
|
da8a338531
|
Merge branch 'add-fh1-rebased' of https://github.com/tiiuae/llama.cpp-public into add-fh1-rebased
|
2025-07-08 10:23:18 +04:00 |
|
ibrahimkhadraoui
|
67b2664290
|
cleaning unused hparams
|
2025-07-08 10:20:17 +04:00 |
|
younesbelkada
|
7d7da0b37e
|
d_ssm -> d_inner;
|
2025-07-08 10:18:43 +04:00 |
|
younesbelkada
|
d2f46f18ac
|
moe cleanups
|
2025-07-07 17:36:22 +04:00 |
|
younesbelkada
|
68cb7845e9
|
more cleanups
|
2025-07-07 17:34:20 +04:00 |
|
Younes B
|
fd203302aa
|
Update src/llama-model-loader.cpp
|
2025-07-07 17:29:50 +04:00 |
|
younesbelkada
|
084873c215
|
some cleanups
|
2025-07-07 17:28:08 +04:00 |
|
younesbelkada
|
632861e6c1
|
some cleanups
|
2025-07-07 17:27:34 +04:00 |
|
younesbelkada
|
f74e266f04
|
fix comment
|
2025-07-07 17:23:47 +04:00 |
|
ibrahimkhadraoui
|
042e5ff90b
|
cleaning debug quant
|
2025-07-07 17:21:54 +04:00 |
|
ibrahimkhadraoui
|
624699c53f
|
cleaning debugging stuff
|
2025-07-07 17:20:24 +04:00 |
|
ibrahimkhadraoui
|
935d46fab0
|
changed ROPE_TYPE
|
2025-07-07 17:01:54 +04:00 |
|
ibrahimkhadraoui
|
b6df0a49d5
|
add bos False
|
2025-07-07 16:57:52 +04:00 |
|
ibrahimkhadraoui
|
ae937f442c
|
rm unused key
|
2025-07-07 16:57:36 +04:00 |
|
ibrahimkhadraoui
|
53446f7e42
|
rm unused MAMBA_CHUNK_SIZE
|
2025-07-07 15:29:56 +04:00 |
|
ibrahimkhadraoui
|
0ad3502839
|
rm extra space
|
2025-07-07 15:26:46 +04:00 |
|
ibrahim khadraoui
|
3afb2a89eb
|
Merge pull request #1 from tiiuae/injected-mup
injected mup
|
2025-07-07 15:20:08 +04:00 |
|
younesbelkada
|
e96cc73390
|
clean ups
|
2025-07-07 15:13:06 +04:00 |
|
younesbelkada
|
a9f3a63dc1
|
injected mup
|
2025-07-07 15:00:25 +04:00 |
|
ibrahimkhadraoui
|
b3bc1fb237
|
Merge branch 'add-fh1-rebased' of https://github.com/tiiuae/llama.cpp-public into add-fh1-rebased
|
2025-07-07 14:36:55 +04:00 |
|
ibrahimkhadraoui
|
286e1fa569
|
fix rope_theta
|
2025-07-07 14:36:51 +04:00 |
|
ibrahimkhadraoui
|
97011d7a1f
|
mup_vec create as float64
|
2025-07-07 14:25:32 +04:00 |
|
ibrahimkhadraoui
|
49d7420964
|
inp_out_ids moved outside of layers loop
|
2025-07-07 14:18:48 +04:00 |
|
ibrahimkhadraoui
|
8c50893820
|
added some cb functions for debugging purposes
|
2025-07-07 14:10:45 +04:00 |
|
Younes B
|
6c39e775dd
|
fix conversion and d_inner
|
2025-07-07 10:56:49 +02:00 |
|
ibrahimkhadraoui
|
441d8d66bd
|
override modify_tensors instead of get_tensors
|
2025-07-07 12:00:57 +04:00 |
|
ibrahimkhadraoui
|
53304c84db
|
remove unused functions from gguf_writer.py
|
2025-07-07 11:18:14 +04:00 |
|
ibrahimkhadraoui
|
c4af0f3ca5
|
mamba_d_ssm added to d_inner find_hparam
|
2025-07-07 11:17:31 +04:00 |
|
ibrahimkhadraoui
|
c56ec07a9a
|
read arch from gguf.MODEL_ARCH
|
2025-07-07 10:34:46 +04:00 |
|
ibrahimkhadraoui
|
280dd2dcb7
|
falcon-h1 specific vocab resolved
|
2025-07-07 10:25:57 +04:00 |
|
ibrahimkhadraoui
|
7a25441e13
|
fixed multipliers
|
2025-07-04 17:41:03 +04:00 |
|
ibrahimkhadraoui
|
9760c8bc9d
|
conflict solve
|
2025-07-04 16:28:48 +04:00 |
|
ibrahimkhadraoui
|
2aa48dd853
|
Merge branch 'add-fh1-rebased' of https://github.com/tiiuae/llama.cpp-public into add-fh1-rebased
|
2025-07-04 16:25:54 +04:00 |
|
ibrahimkhadraoui
|
3ee7983961
|
fix vocab size
|
2025-07-04 16:25:27 +04:00 |
|
younesbelkada
|
250b4f1074
|
mix instead of max
|
2025-07-04 15:53:47 +04:00 |
|
younesbelkada
|
1fd0574adc
|
try
|
2025-07-04 15:50:43 +04:00 |
|