llama.cpp/ggml/src/ggml-cuda
Diego Devesa a5e47592b6 cuda : optimize argmax (#10441)
* cuda : optimize argmax

* remove unused parameter

ggml-ci

* fixup : use full warps

ggml-ci

* Apply suggestions from code review

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

* fix ub

* ggml : check ne00 <= INT32_MAX in argmax and argsort

---------

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
2024-11-21 18:18:50 +01:00