mirror of
https://github.com/ggml-org/llama.cpp.git
synced 2025-10-28 08:31:25 +00:00
CANN: Support eager execution mode under ACL graph compilation (#15712)
* [CANN] Support eager execution mode under ACL graph compilation

Add support for running operators in eager mode while ACL graph compilation is enabled. This allows bypassing graph execution and directly submitting ops, which is useful for debugging and reducing graph build overhead in certain scenarios.

Signed-off-by: noemotiovon <757486878@qq.com>

* fix typo

Signed-off-by: noemotiovon <757486878@qq.com>

* rename to acl_graph_mode

Signed-off-by: noemotiovon <757486878@qq.com>

---------

Signed-off-by: noemotiovon <757486878@qq.com>
@@ -314,3 +314,7 @@ Controls automatic cleanup of the memory pool. This option is only effective whe
Converting the matmul weight format from ND to NZ can significantly improve performance on the 310I DUO NPU.
### GGML_CANN_DISABLE_ACL_GRAPH
When this variable is set, ACL graph execution is disabled and operators are executed in an op-by-op (eager) mode.
This mode is mainly intended for debugging or for cases where the overhead of graph construction and execution is not desirable.
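
As a rough illustration of the behavior described above, the following C++ sketch shows one way a program could turn the environment variable into an on/off switch for graph mode. The helper names `acl_graph_mode_from_value` and `acl_graph_mode_enabled` are hypothetical, and the rule that any non-empty value counts as "set" is an assumption; the CANN backend's actual parsing may differ.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper, not the backend's actual code: decide whether ACL graph
// mode stays enabled, given the raw value of GGML_CANN_DISABLE_ACL_GRAPH.
// Assumption: an unset or empty variable leaves graph mode on; any other
// value switches to op-by-op (eager) execution.
inline bool acl_graph_mode_from_value(const char * value) {
    return value == nullptr || std::string(value).empty();
}

// Looks the variable up in the process environment on every call.
inline bool acl_graph_mode_enabled() {
    return acl_graph_mode_from_value(std::getenv("GGML_CANN_DISABLE_ACL_GRAPH"));
}
```

Under these assumptions, launching a process with `GGML_CANN_DISABLE_ACL_GRAPH=1` in its environment would make `acl_graph_mode_enabled()` return `false`, while leaving the variable unset would keep graph mode on.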