llama.cpp/scripts at 229bf686287d18f82c44e89888cc662145ecfdb4 - llama.cpp - Gitea - Peisong Xiao

CS348Project/llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-10-31 08:51:55 +00:00


Max Krasnyansky 3eb2be1ca5 Hexagon Op queue & dispatch optimizations (#16820)

* hexagon: remove dspqueue callbacks and do all read processing inplace

* hexagon: there is no need to ref/deref the buffers at this point

We're not going to release the buffers without flushing the session queue.
So there is no need to inc/dec the refcounts for every request.
We also don't need to include those bufs in the response.

* hexagon: bump the thread count in the adb wrapper scripts

We can use more CPU cores now that the dedicated dspqueue polling threads are no longer used (i.e., no contention).
Also enable more aggressive polling for now, since we still map Flash Attention (and a few other kernels) to
the CPU and those dspqueue threads were keeping the CPU cores at higher clock freqs.

* hexagon: add lhez as the second code owner

2025-10-29 06:29:12 -07:00


scripts : make the shell scripts cross-platform (#14341)

2025-06-30 10:17:18 +02:00

scripts : add Jinja tester PySide6 simple app (#15756)

2025-09-05 01:05:12 +02:00

Hexagon Op queue & dispatch optimizations (#16820)

2025-10-29 06:29:12 -07:00

build-info.sh

llama : reorganize source code + improve CMake (#8006)

2024-06-26 18:33:02 +03:00

check-requirements.sh

scripts : make the shell scripts cross-platform (#14341)

2025-06-30 10:17:18 +02:00

compare-commits.sh

scripts: add sqlite3 check for compare-commits.sh (#15633)

2025-08-28 19:23:22 +08:00

compare-llama-bench.py

scripts: strip "AMD Instinct" from GPU name (#15668)

2025-08-29 22:04:08 +02:00

create_ops_docs.py

Docs: add instructions for adding backends (#14889)

2025-07-27 09:36:43 +08:00

debug-test.sh

scripts : make the shell scripts cross-platform (#14341)

2025-06-30 10:17:18 +02:00

fetch_server_test_models.py

llama : move end-user examples to tools directory (#13249)

2025-05-02 20:27:13 +02:00

gen-authors.sh

scripts : make the shell scripts cross-platform (#14341)

2025-06-30 10:17:18 +02:00

gen-unicode-data.py

py : type-check all Python scripts with Pyright (#8341)

2024-07-07 15:04:39 -04:00

get_chat_template.py

scripts: corrected encoding when getting chat template (#11866) (#11907)

2025-02-18 10:30:16 +01:00

get-flags.mk

build : pass all warning flags to nvcc via -Xcompiler (#5570)

2024-02-18 16:21:52 -05:00

get-hellaswag.sh

scripts : make the shell scripts cross-platform (#14341)

2025-06-30 10:17:18 +02:00

get-pg.sh

scripts : make the shell scripts cross-platform (#14341)

2025-06-30 10:17:18 +02:00

get-wikitext-2.sh

scripts : make the shell scripts cross-platform (#14341)

2025-06-30 10:17:18 +02:00

get-wikitext-103.sh

scripts : make the shell scripts cross-platform (#14341)

2025-06-30 10:17:18 +02:00

get-winogrande.sh

scripts : make the shell scripts cross-platform (#14341)

2025-06-30 10:17:18 +02:00

hf.sh

scripts : make the shell scripts cross-platform (#14341)

2025-06-30 10:17:18 +02:00

install-oneapi.bat

support SYCL backend windows build (#5208)

2024-01-31 08:08:07 +05:30

server-bench.py

llama: use FA + max. GPU layers by default (#15434)

2025-08-30 16:32:10 +02:00

sync_vendor.py

sync : vendor (#13901)

2025-05-30 16:25:45 +03:00

sync-ggml-am.sh

scripts : update sync scripts

2025-08-18 22:06:44 +03:00

sync-ggml.last

ggml : bump version to 0.9.4 (ggml/1363)

2025-09-30 13:53:55 +03:00

sync-ggml.sh

scripts : update sync scripts

2025-08-18 22:06:44 +03:00

tool_bench.py

server : speed up tests (#15836)

2025-09-06 14:45:24 +02:00

tool_bench.sh

scripts : make the shell scripts cross-platform (#14341)

2025-06-30 10:17:18 +02:00

verify-checksum-models.py

convert.py : add python logging instead of print() (#6511)

2024-05-03 22:36:41 +03:00

xxd.cmake

llama : move end-user examples to tools directory (#13249)

2025-05-02 20:27:13 +02:00