llama : add --no-host to disable host buffers (#16310)

* implement --no-host to disable host buffer

* fix equal_mparams

* move no-host enumeration order together with other model params

---------

Co-authored-by: slaren <slarengh@gmail.com>
2025-10-27 08:21:30 +00:00 · 2025-10-06 12:55:53 -05:00
parent c08002a198
commit 3df2244df4
6 changed files with 56 additions and 10 deletions
--- a/include/llama.h
+++ b/include/llama.h
@@ -296,6 +296,7 @@ extern "C" {
        bool use_mlock;       // force system to keep model in RAM
        bool check_tensors;   // validate model tensor data
        bool use_extra_bufts; // use extra buffer types (used for weight repacking)
+        bool no_host;         // bypass host buffer allowing extra buffers to be used
    };

    // NOTE: changing the default values of parameters marked as [EXPERIMENTAL] may cause crashes or incorrect results in certain configurations