CS348Project/llama.cpp
Mirror of https://github.com/ggml-org/llama.cpp.git (synced 2025-11-03 09:22:01 +00:00)
2ec7cda70675c04a9d08afd036e2d44219151390
llama.cpp/ggml
Max Krasnyansky
517b7170e1
cpu: introduce chunking for repack matmuls and enable matmul-id chunking on ARM64 (#16833)
Very similar implementation to the flash-attention chunking, with similar benefits.
2025-10-30 09:06:13 -07:00
..
cmake
ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (#15094)
2025-08-07 13:45:41 +02:00
include
model: add support for qwen3vl series (#16780)
2025-10-30 16:19:14 +01:00
src
cpu: introduce chunking for repack matmuls and enable matmul-id chunking on ARM64 (#16833)
2025-10-30 09:06:13 -07:00
.gitignore
vulkan : cmake integration (#8119)
2024-07-13 18:12:39 +02:00
CMakeLists.txt
Add experimental ggml-hexagon backend for the Hexagon NPU (#16547)
2025-10-22 13:47:09 -07:00