llama.cpp/examples/server/server.cpp at 4a2b196d03d52da31236390e9f5694a88d43d11d

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-10-31 08:51:55 +00:00

Files

Daniel Bevenius a2df2787b3 server : update help metrics processing/deferred (#11512 )

This commit updates the help text for the metrics `requests_processing`
and `requests_deferred` to be more grammatically correct.

Currently the returned metrics look like this:
```console
\# HELP llamacpp:requests_processing Number of request processing.
\# TYPE llamacpp:requests_processing gauge
llamacpp:requests_processing 0
\# HELP llamacpp:requests_deferred Number of request deferred.
\# TYPE llamacpp:requests_deferred gauge
llamacpp:requests_deferred 0
```

With this commit, the metrics will look like this:
```console
\# HELP llamacpp:requests_processing Number of requests processing.
\# TYPE llamacpp:requests_processing gauge
llamacpp:requests_processing 0
\# HELP llamacpp:requests_deferred Number of requests deferred.
\# TYPE llamacpp:requests_deferred gauge
llamacpp:requests_deferred 0
```
This is also consistent with the description of the metrics in the
server examples [README.md](https://github.com/ggerganov/llama.cpp/tree/master/examples/server#get-metrics-prometheus-compatible-metrics-exporter).

2025-01-31 06:04:53 +01:00

175 KiB

Raw Blame History

View Raw

175 KiB Raw Blame History

175 KiB

Raw Blame History