mirror of
https://github.com/ggml-org/llama.cpp.git
synced 2025-10-27 08:21:30 +00:00
This commit updates the modelcard.template file used by the model conversion scripts for embedding models so that the recommended command includes the llama-server --embeddings flag. The motivation for this change was that when the model-conversion "tool" was used to upload the EmbeddingGemma models to Hugging Face, this flag was missing, and the embedding endpoint was therefore not available when copying and pasting the command.
49 lines
1.3 KiB
Plaintext
---
base_model:
- {base_model}
---
# {model_name} GGUF

Recommended way to run this model:

```sh
llama-server -hf {namespace}/{model_name}-GGUF --embeddings
```

Then the endpoint can be accessed at http://localhost:8080/embedding, for
example using `curl`:
```console
curl --request POST \
    --url http://localhost:8080/embedding \
    --header "Content-Type: application/json" \
    --data '{{"input": "Hello embeddings"}}' \
    --silent
```
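
The same request can also be issued from a script. The following is a minimal sketch using only the Python standard library; it assumes the server is running at the address used in the `curl` example, and the helper names `build_embedding_request` and `fetch_embedding` are hypothetical, chosen only for illustration:

```python
import json
import urllib.request

# Assumption: the server started above listens on the same address
# that the curl example uses.
EMBEDDING_URL = "http://localhost:8080/embedding"


def build_embedding_request(text, url=EMBEDDING_URL):
    """Build a POST request equivalent to the curl call (hypothetical helper)."""
    # JSON body with a single "input" field, as in the curl example.
    body = json.dumps(dict(input=text)).encode("utf-8")
    headers = dict([("Content-Type", "application/json")])
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def fetch_embedding(text, url=EMBEDDING_URL):
    """Send the request and return the decoded JSON response unchanged."""
    with urllib.request.urlopen(build_embedding_request(text, url)) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `fetch_embedding("Hello embeddings")` returns the parsed response. The response layout is not described here, so it is worth printing the result once before indexing into it.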

Alternatively, the `llama-embedding` command line tool can be used:
```sh
llama-embedding -hf {namespace}/{model_name}-GGUF --verbose-prompt -p "Hello embeddings"
```

#### embd_normalize
When a model uses pooling, or the pooling method is specified using `--pooling`,
the normalization can be controlled by the `embd_normalize` parameter.

The default value is `2` which means that the embeddings are normalized using
the Euclidean norm (L2). Other options are:
* -1 No normalization
* 0 Max absolute
* 1 Taxicab
* 2 Euclidean/L2
* \>2 P-Norm
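
To make the options above concrete, here is a rough sketch of what each value does to a vector. It uses the textbook definition of each norm and is not taken from the llama.cpp sources; the actual implementation may differ in details such as an extra scale factor for the max absolute mode. The function name `normalize_embedding` is hypothetical:

```python
import math


def normalize_embedding(values, embd_normalize=2):
    """Scale a vector according to the listed `embd_normalize` values.

    Assumption: textbook norms only; the real implementation may apply
    an additional scale factor (for example in the max absolute mode).
    """
    if embd_normalize < 0:
        # -1: no normalization, return an unchanged copy
        return list(values)
    if embd_normalize == 0:
        # 0: max absolute, divide by the largest magnitude
        norm = max((abs(v) for v in values), default=0.0)
    elif embd_normalize == 1:
        # 1: taxicab (L1), divide by the sum of magnitudes
        norm = sum(abs(v) for v in values)
    elif embd_normalize == 2:
        # 2: Euclidean (L2), divide by the square root of the sum of squares
        norm = math.sqrt(sum(v * v for v in values))
    else:
        # >2: general p-norm with p = embd_normalize
        p = embd_normalize
        norm = sum(abs(v) ** p for v in values) ** (1.0 / p)
    if norm == 0:
        # An all-zero (or empty) vector has no direction to preserve.
        return [0.0 for _ in values]
    return [v / norm for v in values]
```

For example, `normalize_embedding([3.0, 4.0])` returns `[0.6, 0.8]` under the default Euclidean mode, while passing `-1` leaves the vector as it is.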

The `embd_normalize` value can be passed in the request body to `llama-server`, for example:
```sh
--data '{{"input": "Hello embeddings", "embd_normalize": -1}}' \
```

For `llama-embedding`, pass `--embd-normalize <value>` instead, for example:
```sh
llama-embedding -hf {namespace}/{model_name}-GGUF --embd-normalize -1 -p "Hello embeddings"
```