Default Branch

945501f5ea · llama: fix leaked buffers for mmap + split files (#16765) · Updated 2025-10-27 08:17:31 +00:00

Branches

62a9f34bae · llama-graph : fix recurrent state copy · Updated 2025-06-10 04:26:30 +00:00    CS348Project

1140 behind · 3 ahead

c257a8871c · cont : fix defrag erasing cells that didn't move · Updated 2025-06-09 17:45:56 +00:00    CS348Project

1147 behind · 3 ahead

ac35e50c16 · Update tools/llama-bench/llama-bench.cpp · Updated 2025-05-31 22:38:37 +00:00    CS348Project

1207 behind · 3 ahead

d3a2eb592d · disable on windows · Updated 2025-05-31 21:17:18 +00:00    CS348Project

1198 behind · 12 ahead

9065ca71a2 · tests : sampling tests use min_keep == 0 · Updated 2025-05-27 08:30:41 +00:00    CS348Project

1253 behind · 3 ahead

108d484ab2 · tts : fix n_ubatch + make WavTokenizer cache-less · Updated 2025-05-22 18:58:10 +00:00    CS348Project

1297 behind · 1 ahead

b06a954bbc · llama_encode : only force non-causal attention for enc-dec models · Updated 2025-05-19 17:43:59 +00:00    CS348Project

1330 behind · 1 ahead

8282d74692 · bench : handle decode errors · Updated 2025-05-14 19:36:29 +00:00    CS348Project

1368 behind · 1 ahead

237acc7cd5 · server : update readme + return json for "meta" field · Updated 2025-05-14 12:30:12 +00:00    CS348Project

1377 behind · 2 ahead

78d70223c3 · metal : use FA-vec kernel up to batch size 20 · Updated 2025-05-13 07:38:06 +00:00    CS348Project

1394 behind · 3 ahead

6107303ab0 · llama : remove logits_all flag + reorder llama_context_params · Updated 2025-05-08 10:01:41 +00:00    CS348Project

1447 behind · 2 ahead

16843dba33 · metal : pad mm results · Updated 2025-05-04 06:13:52 +00:00    CS348Project

1483 behind · 1 ahead

15dea7bbdf · opt : remove print [no ci] · Updated 2025-05-02 18:25:29 +00:00    CS348Project

1487 behind · 4 ahead

65202d2985 · sync : ggml · Updated 2025-05-01 06:59:02 +00:00    CS348Project

1518 behind · 3 ahead

b710758323 · readme : update hot topics · Updated 2025-04-28 08:04:28 +00:00    CS348Project

1552 behind · 1 ahead

37ae6a281a · Fixes Qwen2.5VL segfault during inference with https://github.com/ggml-org/llama.cpp/pull/12402 as has_qwen2vl_merger migration was incomplete · Updated 2025-04-27 10:36:57 +00:00    CS348Project

1559 behind · 1 ahead

ed68474f76 · wip · Updated 2025-04-25 16:07:09 +00:00    CS348Project

1582 behind · 2 ahead

3fe362fe49 · gguf-py : use ThreadPoolExecutor when writing tensors · Updated 2025-04-12 04:00:51 +00:00    CS348Project

1638 behind · 4 ahead

098f0e5eea · test · Updated 2025-04-10 09:35:16 +00:00    CS348Project

1658 behind · 1 ahead

e9e1882d2d · rm tail space · Updated 2025-04-08 05:43:11 +00:00    CS348Project

1681 behind · 4 ahead