Files
llama.cpp/ggml/src/ggml-cann
hipudding 7a395f67a7 CANN: Add support for async operator submission (#12864)
Submit operators using asynchronous threads to improve performance.

Use the environment variable GGML_CANN_ASYNC_MODE to control whether
asynchronous submission is enabled. It is disabled by default.

Testing shows a 10%–20% performance improvement in scenarios with
small parameter sizes, especially in quantized models.
2025-04-17 20:34:16 +08:00
..
2025-04-10 08:51:52 +08:00
2024-09-08 11:05:55 +03:00