llama.cpp/ggml/src/ggml-impl.h at ac261bea669cb79d6cf754d110528b09d24ed524

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-06 09:46:50 +00:00

Jeff Bolz 10fcc41290 vulkan: Update topk_moe fusion to handle gpt's late softmax (#16656)

* vulkan: Update topk_moe fusion to handle gpt's late softmax

Based on #16649.

* Add ggml_check_edges

* Add sync logging to show fusion effects

* handle clamp added in #16655

* Update ggml/src/ggml-impl.h

Co-authored-by: Diego Devesa <slarengh@gmail.com>

2025-10-29 14:44:29 +01:00

23 KiB
