llama.cpp/ggml
Francis Couture-Harpin e9719576c4 ggml : also faster TQ1_0
Same optimization as for TQ2_0 by offsetting the sum instead of the weights.
This makes TQ1_0 almost as fast as Q8_0 on AVX2.
2024-07-31 00:08:48 -04:00
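
The commit message only names the trick, so here is a minimal sketch of what "offsetting the sum instead of the weights" can mean for a ternary dot product. It is not the actual ggml kernel. It assumes the ternary weights are unpacked as unsigned values in {0, 1, 2} standing for {-1, 0, 1}, and the function names `dot_offset_weights` and `dot_offset_sum` are hypothetical. Since sum((q - 1) * y) = sum(q * y) - sum(y), the per-weight subtraction can be replaced by one subtraction of the activation sum at the end.

```c
#include <stddef.h>
#include <stdint.h>

/* Baseline: shift every weight from {0,1,2} to {-1,0,1} before multiplying. */
static int32_t dot_offset_weights(const uint8_t *q, const int8_t *y, size_t n) {
    int32_t sum = 0;
    for (size_t i = 0; i < n; ++i) {
        sum += ((int32_t) q[i] - 1) * (int32_t) y[i];
    }
    return sum;
}

/* Same result: multiply the raw {0,1,2} values and subtract sum(y) once.
 * Assumption for illustration: the activation sum is computed here in the
 * loop; a real kernel could obtain it more cheaply, e.g. from sums stored
 * alongside the quantized activations. */
static int32_t dot_offset_sum(const uint8_t *q, const int8_t *y, size_t n) {
    int32_t sum  = 0;
    int32_t ysum = 0;
    for (size_t i = 0; i < n; ++i) {
        sum  += (int32_t) q[i] * (int32_t) y[i];
        ysum += (int32_t) y[i];
    }
    return sum - ysum;
}
```

Keeping the weights unsigned removes one subtraction per element from the inner loop, which is plausibly where the speedup on AVX2 comes from, since unsigned-by-signed byte multiply-add instructions can then consume the unpacked values directly.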