llama.cpp/ggml-cuda.cu at 6028879f56a9b8c2ac1b0d14270f38998c8ec0f2

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-19 11:57:07 +00:00

slaren 7e2b9974d1 ggml-cuda : update rope implementation for parallel decoding (#3254)

* ggml-cuda : update rope implementation for parallel decoding

* better solution for p0 computation

* fix rope

* simpler rope implementation

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

2023-09-19 11:31:36 +03:00

268 KiB
