llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-10-31 08:51:55 +00:00

Files

Pierrick Hymbert d52d7819b8 server: concurrency fix + monitoring - add /metrics prometheus compatible endpoint (#5708)

* server: monitoring - add /metrics prometheus compatible endpoint

* server: concurrency issue, when 2 task are waiting for results, only one call thread is notified

* server: metrics - move to a dedicated struct

2024-02-25 13:49:43 +01:00
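
The concurrency issue named in the commit message (two tasks waiting for results, only one waiting thread notified) can be illustrated with a minimal sketch. This is not the server's actual code, which is C++; it is a Python analogue under stated assumptions. `ResultQueue`, `post`, `wait_for`, and `run_demo` are hypothetical names introduced for illustration only. The sketch assumes several threads wait on one shared condition variable, each for its own task id, and shows why waking all waiters avoids leaving the intended receiver asleep.

```python
import threading


class ResultQueue:
    """Hypothetical stand-in for a shared queue of task results."""

    def __init__(self):
        self._cond = threading.Condition()
        self._results = {}  # task_id -> result

    def post(self, task_id, result):
        with self._cond:
            self._results[task_id] = result
            # Wake every waiter; each one re-checks its own predicate.
            # With notify() alone, the single woken thread might be waiting
            # for a different task id, and the intended receiver would
            # stay asleep until some later notification.
            self._cond.notify_all()

    def wait_for(self, task_id, timeout=5.0):
        with self._cond:
            # wait_for() re-evaluates the predicate after every wake-up.
            ok = self._cond.wait_for(
                lambda: task_id in self._results, timeout=timeout
            )
            if not ok:
                raise TimeoutError(f"no result for task {task_id}")
            return self._results.pop(task_id)


def run_demo():
    """Start two waiters, post both results, return what each received."""
    queue = ResultQueue()
    received = {}

    def waiter(task_id):
        received[task_id] = queue.wait_for(task_id)

    threads = [threading.Thread(target=waiter, args=(i,)) for i in (1, 2)]
    for t in threads:
        t.start()
    # Post in the opposite order from the one the waiters were started in.
    queue.post(2, "result-2")
    queue.post(1, "result-1")
    for t in threads:
        t.join(timeout=5.0)
    return received
```

Calling `run_demo()` returns a dictionary mapping each task id to the result its waiter received. Notifying all waiters costs a few spurious wake-ups but keeps the logic simple; a per-task condition variable would be an alternative design.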

steps

server: concurrency fix + monitoring - add /metrics prometheus compatible endpoint (#5708)

2024-02-25 13:49:43 +01:00

environment.py

server: concurrency fix + monitoring - add /metrics prometheus compatible endpoint (#5708)

2024-02-25 13:49:43 +01:00

issues.feature

server: continue to update other slots on embedding concurrent request (#5699)

2024-02-24 19:16:04 +01:00

parallel.feature

server: continue to update other slots on embedding concurrent request (#5699)

2024-02-24 19:16:04 +01:00

security.feature

server: init functional tests (#5566)

2024-02-24 12:28:55 +01:00

server.feature

server: concurrency fix + monitoring - add /metrics prometheus compatible endpoint (#5708)

2024-02-25 13:49:43 +01:00

wrong_usages.feature

server: init functional tests (#5566)

2024-02-24 12:28:55 +01:00