Default Branch

945501f5ea · llama: fix leaked buffers for mmap + split files (#16765) · Updated 2025-10-27 08:17:31 +00:00

Branches

e582f1ac63 · convert : fix no-lazy dtypes from direct safetensors · Updated 2025-09-09 18:33:01 +00:00    CS348Project

324
8

3f62ee8bee · metal : back to a single queue per device · Updated 2025-09-09 14:06:46 +00:00    CS348Project

328
9

7b717fb4b2 · Rewrite llama-run to use llama-server · Updated 2025-09-05 16:22:36 +00:00    CS348Project

365
1

9f2636b7dc · wip · Updated 2025-09-01 08:17:56 +00:00    CS348Project

414
1

d8c17629ac · examples : add compare-mlx · Updated 2025-09-01 06:10:01 +00:00    CS348Project

440
1

4317d5abf5 · wip · Updated 2025-08-28 10:55:21 +00:00    CS348Project

458
1

dc2187d48d · ggml : fix SSM_SCAN for n_groups > 1 · Updated 2025-08-27 21:37:04 +00:00    CS348Project

463
1

7a152de3bb · vulkan: enable Conv2D for Apple after MoltenVK fixed the bug · Updated 2025-08-23 13:57:15 +00:00    CS348Project

508
1

fb573f4440 · ggml-quants : avoid division by zero in make_q3_quants · Updated 2025-08-17 22:26:02 +00:00    CS348Project

578
2

220860aa0c · graph : use F32 accumulators for gpt-oss · Updated 2025-08-14 13:08:31 +00:00    CS348Project

605
1

d9b625edb6 · ggml-quants : handle imatrix for MXFP4 · Updated 2025-08-12 02:12:10 +00:00    CS348Project

631
1

2763dc8b53 · ggml-quants : handle zero amax for MXFP4 · Updated 2025-08-06 20:26:25 +00:00    CS348Project

669
2

ea5e55d03e · Merge branch 'master' into compilade/imatrix-neutral-prior · Updated 2025-08-05 17:34:40 +00:00    CS348Project

671
4

2ec70c964b · tests: Fix OPT_STEP_SGD test-backend-ops · Updated 2025-08-05 04:57:14 +00:00    CS348Project

677
4

145401c9e3 · context : fix logits size overflow for huge batches · Updated 2025-08-05 02:26:46 +00:00    CS348Project

676
2

342e7014db · imatrix : only warn about suffix when output format is unspecified · Updated 2025-08-04 19:12:27 +00:00    CS348Project

681
2

e549515cb3 · memory : handle kv_unified for hybrid models · Updated 2025-08-03 04:45:47 +00:00    CS348Project

690
1

91e67b8583 · imatrix : fix 3d tensor counts · Updated 2025-07-31 15:56:38 +00:00    CS348Project

718
4

b98f80a6b4 · server : test alternative LRU logic · Updated 2025-07-29 18:19:21 +00:00    CS348Project

739
1