llama.cpp/ggml.c

commit 3587a94987 — Francis Couture-Harpin
llama : use equal-sequence-length sub-batches for recurrent models

* ggml : simplify SSM-related operators
* llama : make recurrent state slot allocation contiguous
* llama : adapt internal uses of batches to llama_ubatch

Date: 2024-06-01 11:49:17 -04:00

File size: 740 KiB