mirror of
				https://github.com/ggml-org/llama.cpp.git
				synced 2025-10-30 08:42:00 +00:00 
			
		
		
		
	server : output embeddings for all tokens when pooling = none (#10861)
* server : add "tokens" output ggml-ci * server : output embeddings for all tokens when pooling = none ggml-ci * server : update readme [no ci] * server : fix spacing [no ci] Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com> * server : be explicit about the pooling type in the tests ggml-ci * server : update /embeddings and /v1/embeddings endpoints ggml-ci * server : do not normalize embeddings when there is no pooling ggml-ci * server : update readme ggml-ci * server : fixes * tests : update server tests ggml-ci * server : update readme [no ci] * server : remove rebase artifact --------- Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
This commit is contained in:
		| @@ -1780,7 +1780,9 @@ void common_embd_normalize(const float * inp, float * out, int n, int embd_norm) | ||||
|             break; | ||||
|         case 0: // max absolute | ||||
|             for (int i = 0; i < n; i++) { | ||||
|                 if (sum < std::abs(inp[i])) sum = std::abs(inp[i]); | ||||
|                 if (sum < std::abs(inp[i])) { | ||||
|                     sum = std::abs(inp[i]); | ||||
|                 } | ||||
|             } | ||||
|             sum /= 32760.0; // make an int16 range | ||||
|             break; | ||||
|   | ||||
		Reference in New Issue
	
	Block a user
	 Georgi Gerganov
					Georgi Gerganov