llama.cpp/ggml-cuda/softmax.cu at c460ff1a1c7591f9a700dcc62f7dd69b66fc5f19

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-10 10:27:03 +00:00

DAN™ e00b4a8f81 Fix more int overflow during quant (PPL/CUDA). (#6563)

* Fix more int overflow during quant.

* Fix some more int overflow in softmax.

* Revert back to int64_t.
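
A rough illustration of the kind of overflow these notes refer to (a sketch under stated assumptions, not code taken from softmax.cu): a flat element offset computed as row * ncols + col can exceed the 32-bit int range for large tensors, while the same arithmetic carried out in int64_t does not. The names flat_index and fits_in_int32 are hypothetical helpers introduced only for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not from llama.cpp): flat offset of element (row, col)
// in a row-major matrix with ncols columns, computed entirely in 64 bits.
inline int64_t flat_index(int64_t row, int64_t ncols, int64_t col) {
    assert(row >= 0 && ncols >= 0 && col >= 0 && col < ncols);
    return row * ncols + col;
}

// Hypothetical helper: true if the offset would still fit in a 32-bit int.
inline bool fits_in_int32(int64_t v) {
    return v >= std::numeric_limits<int32_t>::min() &&
           v <= std::numeric_limits<int32_t>::max();
}
```

For example, flat_index(70000, 70000, 0) is 4900000000, which fits_in_int32 reports as out of range; evaluating the same product in plain int would overflow, which is undefined behavior for signed integers in C++.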

2024-04-29 00:38:44 +02:00
