This commit updates the modelcard.template file used in the model
conversion scripts for embedding models to include the llama-server
--embeddings flag in the recommended command to run the model.
The motivation for this change was that when using the model-conversion
"tool" to upload the EmbeddingGemma models to Hugging Face this flag was
missing and the embedding endpoint was therefore not available when
copy-pasting the command.
* model-conversion: add model card template for embeddings [no ci]
This commit adds a separate model card template (model repository
README.md template) for embedding models.
The motivation for this is that the server command for the embedding
model is a little different and some additional information can be useful
in the model card for embedding models which might not be directly
relevant for causal models.
* squash! model-conversion: add model card template for embeddings [no ci]
Fix pyright lint error.
* remove --pooling override and clarify embd_normalize usage