llama.cpp/ggml/include/ggml-backend.h at 725f23f1f3f0d3adf49f95d8dfa6e7c74adff149 - llama.cpp - Gitea - Peisong Xiao

CS348Project/llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-10-31 08:51:55 +00:00

David Huang 7f323a589f Add --no-op-offload to improve -ot pp perf in MoE models like llama4 400B (#13386)

2025-05-11 14:18:39 +02:00

20 KiB
