Mirror of https://github.com/ggml-org/llama.cpp.git (synced 2025-10-30 08:42:00 +00:00)

	readme: add missing info (#1324)
Author: Pavol Rusnak
README.md:

@@ -18,10 +18,12 @@ The main goal of `llama.cpp` is to run the LLaMA model using 4-bit integer quant
 
 - Plain C/C++ implementation without dependencies
 - Apple silicon first-class citizen - optimized via ARM NEON and Accelerate framework
-- AVX2 support for x86 architectures
+- AVX, AVX2 and AVX512 support for x86 architectures
 - Mixed F16 / F32 precision
-- 4-bit integer quantization support
+- 4-bit, 5-bit and 8-bit integer quantization support
 - Runs on the CPU
+- OpenBLAS support
+- cuBLAS and CLBlast support
 
 The original implementation of `llama.cpp` was [hacked in an evening](https://github.com/ggerganov/llama.cpp/issues/33#issuecomment-1465108022).
 Since then, the project has improved significantly thanks to many contributions. This project is for educational purposes and serves
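The new "4-bit, 5-bit and 8-bit" line refers to the ggml integer quantization formats (q5_0, q5_1, q8_0) that had landed alongside the existing q4_0/q4_1. As a minimal sketch of how a model was converted with the quantize tool of that era (the model path and size below are illustrative, not taken from this commit):

    # Convert an F16 GGML model to 5-bit quantization; q4_0, q4_1,
    # q5_0, q5_1 and q8_0 were the format names accepted at the time.
    ./quantize ./models/7B/ggml-model-f16.bin ./models/7B/ggml-model-q5_0.bin q5_0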
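Likewise, the OpenBLAS, cuBLAS and CLBlast lines describe optional build-time backends. A sketch of the Makefile invocations used around this revision, assuming the flag names from that era of the codebase (one flag per build):

    make LLAMA_OPENBLAS=1   # accelerate prompt processing on CPU via OpenBLAS
    make LLAMA_CUBLAS=1     # NVIDIA GPU acceleration via cuBLAS
    make LLAMA_CLBLAST=1    # OpenCL device acceleration via CLBlast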