Mirror of https://github.com/ggml-org/llama.cpp.git (synced 2025-11-14 11:07:10 +00:00)
* ggml : simplify SSM-related operators
* llama : make recurrent state slot allocation contiguous
* llama : adapt internal uses of batches to llama_ubatch