	fixed typo (#178)
@@ -199,7 +199,7 @@ https://user-images.githubusercontent.com/271616/225014776-1d567049-ad71-4ef2-b0
 - We don't know yet how much the quantization affects the quality of the generated text
 - Probably the token sampling can be improved
 - The Accelerate framework is actually currently unused since I found that for tensor shapes typical for the Decoder,
-  there is no benefit compared to the ARM_NEON intrinsics implementation. Of course, it's possible that I simlpy don't
+  there is no benefit compared to the ARM_NEON intrinsics implementation. Of course, it's possible that I simply don't
   know how to utilize it properly. But in any case, you can even disable it with `LLAMA_NO_ACCELERATE=1 make` and the
   performance will be the same, since no BLAS calls are invoked by the current implementation
 
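For context, the `LLAMA_NO_ACCELERATE=1` setting quoted in the diff is an ordinary make variable passed on the command line. A minimal sketch of the two build invocations, assuming a checkout of the repository with its stock Makefile from that era:

```sh
# Default build: links the Accelerate framework on macOS (effectively a no-op
# at this point, since the implementation issues no BLAS calls).
make

# Build with Accelerate disabled, as described in the README excerpt above;
# performance should be identical for the same reason.
LLAMA_NO_ACCELERATE=1 make
```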
Author: moritzbrantner