readme : add RVV,ZVFH,ZFH,ZICBOP support for RISC-V (#17259)
Signed-off-by: Wang Yang <yangwang@iscas.ac.cn>
.github/copilot-instructions.md (vendored), 2 lines changed
@@ -9,7 +9,7 @@ llama.cpp is a large-scale C/C++ project for efficient LLM (Large Language Model
 - **Size**: ~200k+ lines of code across 1000+ files
 - **Architecture**: Modular design with main library (`libllama`) and 40+ executable tools/examples
 - **Core dependency**: ggml tensor library (vendored in `ggml/` directory)
-- **Backends supported**: CPU (AVX/NEON optimized), CUDA, Metal, Vulkan, SYCL, ROCm, MUSA
+- **Backends supported**: CPU (AVX/NEON/RVV optimized), CUDA, Metal, Vulkan, SYCL, ROCm, MUSA
 - **License**: MIT

 ## Build Instructions
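The change above folds RVV into the CPU backend summary. For readers unfamiliar with the RISC-V naming, here is a minimal, purely illustrative C sketch (not code from this repository) of how the four extensions named in the commit title can be detected at compile time, using the standard `__riscv_*` feature-test macros that GCC and Clang define when the target `-march` string includes each extension:

```c
/* Illustrative sketch only: compile-time detection of the RISC-V extensions
 * named in this commit, via the __riscv_<ext> test macros from the RISC-V
 * C API specification. Not taken from the llama.cpp/ggml sources. */
#include <stdio.h>

int main(void) {
#if defined(__riscv)
    puts("target: RISC-V");
#  if defined(__riscv_v)
    puts("  RVV    : vector extension enabled");        /* 'v' in -march */
#  endif
#  if defined(__riscv_zfh)
    puts("  ZFH    : scalar half-precision floats");    /* '_zfh' in -march */
#  endif
#  if defined(__riscv_zvfh)
    puts("  ZVFH   : vector half-precision floats");    /* '_zvfh' in -march */
#  endif
#  if defined(__riscv_zicbop)
    puts("  ZICBOP : cache-block prefetch hints");      /* '_zicbop' in -march */
#  endif
#else
    puts("not a RISC-V target");
#endif
    return 0;
}
```

Built with, for example, `-march=rv64gcv_zfh_zvfh_zicbop`, all four branches compile in; removing an extension from the `-march` string drops the corresponding line.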
README.md
@@ -61,6 +61,7 @@ range of hardware - locally and in the cloud.
 - Plain C/C++ implementation without any dependencies
 - Apple silicon is a first-class citizen - optimized via ARM NEON, Accelerate and Metal frameworks
 - AVX, AVX2, AVX512 and AMX support for x86 architectures
+- RVV, ZVFH, ZFH and ZICBOP support for RISC-V architectures
 - 1.5-bit, 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, and 8-bit integer quantization for faster inference and reduced memory use
 - Custom CUDA kernels for running LLMs on NVIDIA GPUs (support for AMD GPUs via HIP and Moore Threads GPUs via MUSA)
 - Vulkan and SYCL backend support
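The README addition above is the user-facing summary of the change. As a hedged sketch (assumed flags, not prescribed by the commit), a native Release build on a RISC-V host could opt into the same extensions through the compiler's `-march` string:

```sh
# Assumed native build on a RISC-V host with a recent GCC or Clang.
# rv64gcv turns on RVV; the _zfh/_zvfh/_zicbop suffixes add the other extensions.
cmake -B build \
      -DCMAKE_BUILD_TYPE=Release \
      -DCMAKE_C_FLAGS="-march=rv64gcv_zfh_zvfh_zicbop" \
      -DCMAKE_CXX_FLAGS="-march=rv64gcv_zfh_zvfh_zicbop"
cmake --build build --config Release -j
```

The two `cmake` invocations are the project's standard build steps; only the `-march` value is specific to this sketch, and cross-compiling would additionally require a toolchain file.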