* Revert "devops : fix compile bug when the BASE_CUDA_DEV_CONTAINER is based on Ubuntu 24.04 (#15005)"
This reverts commit e4e915912c.
* devops: Allow pip to modify externally-managed python environment (system installation)
- Updated pip install commands to include the --break-system-packages
flag, ensuring compatibility when working with system-managed Python
environments (PEP 668).
- Note: The --break-system-packages option was introduced in 2023.
Ensure pip is updated to a recent version before using this flag.
fixes [#15004](https://github.com/danchev/llama.cpp/issues/15004)
* Changed the CI file to hw
* Changed the CI file to hw
* Added to sudoers for apt
* Removed the clone command and used checkout
* Added libcurl
* Added gcc-14
* Checking gcc --version
* added gcc-14 symlink
* added CC and C++ variables
* Added the gguf weight
* Changed the weights path
* Added system specification
* Removed white spaces
* ci: Replace Jenkins riscv native build Cloud-V pipeline with GitHub Actions workflow
Removed the legacy .devops/cloud-v-pipeline Jenkins CI configuration and introduced .github/workflows/build-riscv-native.yml for native RISC-V builds using GitHub Actions.
* removed trailing whitespaces
---------
Co-authored-by: Akif Ejaz <akifejaz40@gmail.com>
This commit adds support for MFMA instructions to MMQ. CDNA1/GFX908 CDNA2/GFX90a and CDNA3/GFX942 are supported by the MFMA-enabled code path added by this commit. The code path and stream-k is only enabled on CDNA3 for now as it fails to outperform blas in all cases on the other devices.
Blas is currently only consistently outperformed on CDNA3 due to issues in the amd-provided blas libraries.
This commit also improves the awareness of MMQ towards different warp sizes and as a side effect improves the performance of all quant formats besides q4_0 and q4_1, which regress slightly, on GCN gpus.
* musa: fix build warning (unused parameter)
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* musa: upgrade MUSA SDK version to rc4.0.1
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* musa: use mudnn::Unary::IDENTITY op to accelerate D2D memory copy
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* Update ggml/src/ggml-cuda/cpy.cu
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
* musa: remove MUDNN_CHECK_GEN and use CUDA_CHECK_GEN instead in MUDNN_CHECK
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
---------
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
* [CANN] Support ELU and CONV_TRANSPOSE_1D
* [CANN]Modification review comments
* [CANN]Modification review comments
* [CANN]name adjustment
* [CANN]remove lambda used in template
* [CANN]Use std::func instead of template
* [CANN]Modify the code according to the review comments
---------
Signed-off-by: noemotiovon <noemotiovon@gmail.com>