Commit Graph

4 Commits

Author SHA1 Message Date
Johannes Gäßler
e81b8e4b7f llama: use FA + max. GPU layers by default (#15434)
* llama: use max. GPU layers by default, auto -fa

* ggml-backend: abort instead of segfault
2025-08-30 16:32:10 +02:00
Georgi Gerganov
d2fcd91cf9 server : disable context shift by default (#15416)
* server : disable context shift by default

ggml-ci

* server : make scope of test parameters local
2025-08-19 16:46:37 +03:00
Xuan-Son Nguyen
6aa892ec2a server : do not return error out of context (with ctx shift disabled) (#13577)
2025-05-16 21:50:00 +02:00
Diego Devesa
1d36b3670b llama : move end-user examples to tools directory (#13249)
* llama : move end-user examples to tools directory

---------

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2025-05-02 20:27:13 +02:00