llama.cpp/llama.cpp at cd9aea63b577a83def84dbd6dcd90a6fa02af745

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-01 09:01:57 +00:00

Paul Tsochantaris e5ca3937c6 llama : do not cap thread count when MoE on CPU (#5419)

* Not capping thread count when MoE inference is running on CPU

* Whitespace

2024-02-09 12:48:06 +02:00
