llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-10-27 08:21:30 +00:00

Files

Johannes Gäßler defe2158dd CUDA: mul_mat_v support for batch sizes > 1 (#14262)

* CUDA: mul_mat_v support for batch sizes > 1

* use 64 bit math for initial offset calculation

2025-06-23 13:11:31 +02:00

Add ggml_roll (ggml/1274)

2025-06-20 21:02:47 +03:00

.gitignore

2024-07-13 18:12:39 +02:00

CMakeLists.txt

2025-06-18 09:59:21 +03:00