Files

llama.cpp/examples/server/server.cpp (140 KiB)

Commit 99b71c068f by Xuan Son Nguyen, 2024-03-13 11:39:11 +01:00
Server: Use multi-task for embeddings endpoint (#6001)

* use multi-task for the embeddings endpoint
* specify types
* remove redundant {"n_predict", 0}
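
The commit title refers to handling a batched embeddings request as multiple tasks rather than a single one. The following is only a minimal, generic C++ sketch of that fan-out-and-collect idea; the names embed_one and embed_multi and the use of std::async are illustrative assumptions, not code taken from server.cpp.

```cpp
#include <future>
#include <string>
#include <vector>

// Hypothetical stand-in for producing one embedding vector for one input.
static std::vector<float> embed_one(const std::string & text) {
    // Placeholder: a real server would run the model here.
    return std::vector<float>(4, static_cast<float>(text.size()));
}

// Fan a multi-input request out into one task per input and collect the
// results in the original input order.
static std::vector<std::vector<float>> embed_multi(const std::vector<std::string> & inputs) {
    std::vector<std::future<std::vector<float>>> tasks;
    tasks.reserve(inputs.size());
    for (const auto & text : inputs) {
        tasks.push_back(std::async(std::launch::async, embed_one, text));
    }

    std::vector<std::vector<float>> results;
    results.reserve(tasks.size());
    for (auto & task : tasks) {
        results.push_back(task.get());
    }
    return results;
}

int main() {
    const auto embeddings = embed_multi({"hello", "world"});
    return embeddings.size() == 2 ? 0 : 1;
}
```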