llama.cpp/ggml/include/ggml.h at c51daefc32814a770513ba75799ba2d9138e08fe

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-12 10:47:01 +00:00

Francis Couture-Harpin c51daefc32 llama : advanced batch splits

This includes equal-sequence-length batch splits which are useful
to simplify recurrent model operators.

* llama : always make recurrent state slots contiguous

* ggml : simplify mamba operators

2024-07-16 20:38:48 -04:00

88 KiB
