Johannes Gäßler
|
e81b8e4b7f
|
llama: use FA + max. GPU layers by default (#15434)
* llama: use max. GPU layers by default, auto -fa
* ggml-backend: abort instead of segfault
|
2025-08-30 16:32:10 +02:00 |
|
Georgi Gerganov
|
d2fcd91cf9
|
server : disable context shift by default (#15416)
* server : disable context shift by default
ggml-ci
* server : make scope of test parameters local
|
2025-08-19 16:46:37 +03:00 |
|
Xuan-Son Nguyen
|
6aa892ec2a
|
server : do not return error out of context (with ctx shift disabled) (#13577)
|
2025-05-16 21:50:00 +02:00 |
|
Diego Devesa
|
1d36b3670b
|
llama : move end-user examples to tools directory (#13249)
* llama : move end-user examples to tools directory
---------
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
|
2025-05-02 20:27:13 +02:00 |
|