llama.cpp/ggml.c

commit 3587a94987 — Francis Couture-Harpin
llama : use equal-sequence-length sub-batches for recurrent models

* ggml : simplify SSM-related operators
* llama : make recurrent state slot allocation contiguous
* llama : adapt internal uses of batches to llama_ubatch

Date: 2024-06-01 11:49:17 -04:00

File size: 740 KiB