llama.cpp/ggml/src/ggml-sycl/softmax.hpp at 53ff6b9b9fb25ed0ec0a213e05534fe7c3d0040f

Mirror of https://github.com/ggml-org/llama.cpp.git, synced 2025-11-13 10:57:15 +00:00

luoyu-intel a9554e20b6 [SYCL] Fix WARP_SIZE=16 bug of Intel GPU (#8266)

* fix group_norm ut

* split softmax

* fix softmax

* add concat support condition

* revert debug code

* move QK_WARP_SIZE to presets.hpp

2024-07-05 13:06:13 +08:00
