clip : use FA (#16837)

* clip : use FA
* cont : add warning about unsupported ops
* implement "auto" mode for clip flash attn
* clip : print more detailed op support info during warmup
* cont : remove obsolete comment [no ci]
* improve debugging message
* trailing space
* metal : remove stray return

---------

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2025-11-14 11:07:10 +00:00 · 2025-11-02 22:21:48 +02:00
parent cd5e3b5754
commit 2f966b8ed8
9 changed files with 194 additions and 43 deletions
--- a/tools/mtmd/mtmd.h
+++ b/tools/mtmd/mtmd.h
@@ -82,6 +82,7 @@ struct mtmd_context_params {
    enum ggml_log_level verbosity;
    const char * image_marker; // deprecated, use media_marker instead
    const char * media_marker;
+    enum llama_flash_attn_type flash_attn_type;
 };

 MTMD_API const char * mtmd_default_marker(void);
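
Note on the hunk above: the new `flash_attn_type` field lets a caller request flash attention for the vision encoder, disable it, or leave the choice to the library ("auto"). The following is a minimal sketch of how such a three-way setting could be resolved. It is not the code from this commit. The enum is a local stand-in whose enumerator names and values are assumed to match `llama.h`; `sketch_context_params` and `sketch_use_flash_attn` are hypothetical names introduced only for illustration; and the "auto" rule shown (use FA only if the backend reports support) is an assumption based on the commit message.

```c
#include <stdbool.h>

/* Local stand-in for the enum declared in llama.h.
 * Enumerator names and values are assumed, not taken from this diff. */
enum llama_flash_attn_type {
    LLAMA_FLASH_ATTN_TYPE_AUTO     = -1,
    LLAMA_FLASH_ATTN_TYPE_DISABLED =  0,
    LLAMA_FLASH_ATTN_TYPE_ENABLED  =  1,
};

/* Hypothetical, reduced stand-in for mtmd_context_params:
 * only the fields relevant to this sketch are kept. */
struct sketch_context_params {
    const char * media_marker;
    enum llama_flash_attn_type flash_attn_type;
};

/* Hypothetical helper: decide whether the encoder graph should use
 * flash attention. In "auto" mode the decision is assumed to follow
 * whether the backend supports the required ops. */
static bool sketch_use_flash_attn(const struct sketch_context_params * p,
                                  bool backend_supports_fa) {
    switch (p->flash_attn_type) {
        case LLAMA_FLASH_ATTN_TYPE_ENABLED:  return true;
        case LLAMA_FLASH_ATTN_TYPE_DISABLED: return false;
        case LLAMA_FLASH_ATTN_TYPE_AUTO:     return backend_supports_fa;
    }
    return false; /* unknown value: fall back to the non-FA path */
}
```

With these stand-ins, a caller would set `params.flash_attn_type = LLAMA_FLASH_ATTN_TYPE_AUTO` before creating the context and let the support check decide; an explicit `ENABLED` or `DISABLED` bypasses the check in this sketch.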