llama.cpp/llama.h at 8b94e799dfa482adf63419df4905dc79b37e179f

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-02 09:12:03 +00:00

Daniel Bevenius 3015851c5a llama : add getters for n_threads/n_threads_batch (#7464)

* llama : add getters for n_threads/n_threads_batch

This commit adds two new functions to the llama API. The functions
can be used to get the number of threads used for generating a single
token and the number of threads used for prompt and batch processing
(multiple tokens).

The motivation for this is that we want to be able to get the number of
threads that a context is using. The main use case is
testing/verification that the number of threads is set correctly.

Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>

* squash! llama : add getters for n_threads/n_threads_batch

Rename the getters to llama_n_threads and llama_n_threads_batch.

Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>

---------

Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>

2024-05-23 15:29:26 +03:00
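
The verification use case described in the commit message can be sketched with a small stand-in. This is an illustration only, not the llama.cpp implementation: `mock_context` and every `mock_*` function are hypothetical names invented here, and only the getter names llama_n_threads and llama_n_threads_batch come from the commit itself. The sketch assumes the context stores the two thread counts as plain unsigned fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a context: only the two thread counts. */
struct mock_context {
    uint32_t n_threads;       /* threads for generating a single token */
    uint32_t n_threads_batch; /* threads for prompt and batch processing */
};

/* Hypothetical setter, so the getters have something to read back. */
static inline void mock_set_n_threads(struct mock_context * ctx,
                                      uint32_t n_threads,
                                      uint32_t n_threads_batch) {
    ctx->n_threads       = n_threads;
    ctx->n_threads_batch = n_threads_batch;
}

/* Stand-in for the getter the commit names llama_n_threads. */
static inline uint32_t mock_n_threads(const struct mock_context * ctx) {
    return ctx->n_threads;
}

/* Stand-in for the getter the commit names llama_n_threads_batch. */
static inline uint32_t mock_n_threads_batch(const struct mock_context * ctx) {
    return ctx->n_threads_batch;
}

/* Verification helper: true when both counts match the expected values. */
static inline bool mock_threads_match(const struct mock_context * ctx,
                                      uint32_t expect,
                                      uint32_t expect_batch) {
    return mock_n_threads(ctx) == expect &&
           mock_n_threads_batch(ctx) == expect_batch;
}

/* Set the counts, then read them back through the getters. */
static inline void mock_demo(void) {
    struct mock_context ctx = { 0, 0 };
    mock_set_n_threads(&ctx, 4, 8);
    assert(mock_n_threads(&ctx) == 4);
    assert(mock_n_threads_batch(&ctx) == 8);
}
```

A test against the real library would presumably follow the same shape: configure the thread counts on a context, then compare what the two getters report with the values that were requested.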

53 KiB
