Neo Zhang Jianyu
2be72c2b12
SYCL: Update to oneAPI 2025.2 ( #16371 )
...
* update oneapi to 2025.2, use deep-learning-essentials to replace base-tool
* update to 2025.2 use deeplearn essi to replace base toolkit
* add missed dll
* add deep learning essentials
* add sycl-ls
---------
Co-authored-by: Zhang Jianyu <zhang.jianyu@outlook.com >
2025-10-02 10:16:25 +03:00
uvos
c8dedc9999
CI: reenable cdna in rocm docker builds ( #16376 )
2025-10-01 23:32:39 +02:00
uvos
1fe4e38cc2
ci: Properly install rocwmma for hip builds ( #16305 )
...
* CI: Properly install rocwmma for hip builds
on windows we now install rocwmma from ubuntu packages
* CI: update linux rocm docker build to use rocm 7.0
2025-10-01 20:18:03 +02:00
R0CKSTAR
d9e0e7c819
ci : fix musa docker build ( #16306 )
...
Signed-off-by: Xiaodong Ye <yeahdongcn@gmail.com >
2025-09-28 16:38:15 +02:00
R0CKSTAR
a86a580a66
musa: upgrade musa sdk to 4.3.0 ( #16240 )
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com >
2025-09-26 02:56:38 +02:00
Aaron Teo
5fb557653b
devops: fix s390x docker release failure ( #16231 )
2025-09-25 11:36:30 +08:00
Aaron Teo
4b9f4cb0f8
devops: add s390x containers ( #15915 )
...
* devops: add s390x dockerfile
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: add missing ninja
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: move s390x docker into cpu docker
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: rework s390x docker
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: copy more tools
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: add server build step
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: remove apt clean steps as distroless misses it
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: remove apt commands from distroless
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: fix shared libs in distroless
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: use correct libs path
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: fix shared libs
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: add collector stage
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: fix missing stage ref
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: fix permission issue
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: fix unknown model loading failures
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: attempt at fixing model loading failure
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: fix missing ggml shared object
failure to load model
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: remove move shared objects
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: move libggml-cpu and blas into bin
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: finalise hardened server stage
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: add cli target
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: fix typos
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: fix missing shared libraries in base
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: update debian target
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: formalise llama.cpp loc
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* Revert "devops: formalise llama.cpp loc"
This reverts commit 0a7664af84 .
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: formalise llama.cpp loc
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
(cherry picked from commit 0a7664af84 )
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: attempt at fixing missing dir
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: attempt at making it cache the build
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: fix copying process
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: make build dir an argument
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* Revert "devops: make build dir an argument"
This reverts commit 438698976b .
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: add build stage for gguf-py
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: move gguf-py installation into build stage
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: break system packages?
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: add rust compiler installer
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: fix rustc not found
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: remove cache mount to allow rustc to persist
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: move rustc installation to another layer
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: move gguf-py installation to full stage, fix copying
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: remove rustc installation in build
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: disable full target for now
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: attempting static build
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: merge s390x dockerfile into cpu for now
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: switch to gcc image for build step
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: remove build essentials
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: install openblas into base target
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: go back to s390x dockerfile
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: remove libggml and libblas
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: add full target
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: add break system packages
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: add libjpeg
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: add missing cmake dep
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: finalise docker images for s390x
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: add custom openblas patch
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: use libopenblas-dev instead of libopenblas-openmp-dev
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* devops: add s390x docker build
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
---------
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
2025-09-23 13:59:34 +08:00
Diego Devesa
dc381aa9a6
docker : enable rocWMMA in ROCm images, add gfx1151 ( #15997 )
2025-09-15 23:38:52 +02:00
Adam
0fa154e350
rocm.Dockerfile: added gfx1200,gfx1201 architectures to support AMD Radeon RX 9000 series ( #15994 )
...
* rocm.Dockerfile: added gfx1200,gfx1201 architectures to support AMD Radeon RX 9000 series
https://rocm.docs.amd.com/projects/install-on-linux/en/docs-6.4.1/reference/system-requirements.html#rdna-os
states the Radeon RX 9000 series is supported from Ubuntu 24.04.2, and the dockerfile is using 24.04 which is ROCm 6.4.
This fixed the `ROCm error: invalid device function` I was getting when trying to use the rocm container.
2025-09-14 20:43:54 +02:00
R0CKSTAR
b55f06e1aa
vulkan.Dockerfile: install vulkan SDK using tarball ( #15282 )
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com >
2025-08-23 08:58:57 +02:00
Dobri Danchev
618575c582
Fix broken build: require updated pip to support --break-system-packages ( #15357 )
...
* Revert "devops : fix compile bug when the BASE_CUDA_DEV_CONTAINER is based on Ubuntu 24.04 (#15005 )"
This reverts commit e4e915912c .
* devops: Allow pip to modify externally-managed python environment (system installation)
- Updated pip install commands to include the --break-system-packages
flag, ensuring compatibility when working with system-managed Python
environments (PEP 668).
- Note: The --break-system-packages option was introduced in 2023.
Ensure pip is updated to a recent version before using this flag.
fixes [#15004](https://github.com/danchev/llama.cpp/issues/15004)
2025-08-18 12:50:48 +02:00
simevo
e4e915912c
devops : fix compile bug when the BASE_CUDA_DEV_CONTAINER is based on Ubuntu 24.04 ( #15005 )
...
fixes #15004
Co-authored-by: Paolo Greppi <paolo.greppi@libpf.com >
2025-08-14 18:45:27 +03:00
Christian Kastner
646944cfa8
docker : Enable GGML_CPU_ALL_VARIANTS for ARM ( #15267 )
2025-08-14 16:22:58 +02:00
Ali Tariq
648ebcdb73
ci : Added CI with RISC-V RVV1.0 Hardware ( #14439 )
...
* Changed the CI file to hw
* Changed the CI file to hw
* Added to sudoers for apt
* Removed the clone command and used checkout
* Added libcurl
* Added gcc-14
* Checking gcc --version
* added gcc-14 symlink
* added CC and C++ variables
* Added the gguf weight
* Changed the weights path
* Added system specification
* Removed white spaces
* ci: Replace Jenkins riscv native build Cloud-V pipeline with GitHub Actions workflow
Removed the legacy .devops/cloud-v-pipeline Jenkins CI configuration and introduced .github/workflows/build-riscv-native.yml for native RISC-V builds using GitHub Actions.
* removed trailing whitespaces
---------
Co-authored-by: Akif Ejaz <akifejaz40@gmail.com >
2025-08-13 13:14:44 +03:00
diannao
2860d479b4
docker : add cann build pipline ( #14591 )
...
* docker: add cann build pipline
* docker: add cann build pipline
* docker: fix cann devops
* cann : fix multi card hccl
* Update ggml/src/ggml-cann/ggml-cann.cpp
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com >
* Update ggml-cann.cpp
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com >
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com >
2025-08-01 10:02:34 +08:00
deepsek
66906cd82a
HIP: Enable Matrix cores for MMQ Kernels, Enable stream-K for CDNA 3 ( #14624 )
...
This commit adds support for MFMA instructions to MMQ. CDNA1/GFX908, CDNA2/GFX90a and CDNA3/GFX942 are supported by the MFMA-enabled code path added by this commit. The code path and stream-k are only enabled on CDNA3 for now, as they fail to outperform blas in all cases on the other devices.
Blas is currently only consistently outperformed on CDNA3 due to issues in the amd-provided blas libraries.
This commit also improves the awareness of MMQ towards different warp sizes and as a side effect improves the performance of all quant formats besides q4_0 and q4_1, which regress slightly, on GCN gpus.
2025-07-27 00:28:14 +02:00
R0CKSTAR
3f4fc97f1d
musa: upgrade musa sdk to rc4.2.0 ( #14498 )
...
* musa: apply mublas API changes
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com >
* musa: update musa version to 4.2.0
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com >
* musa: restore MUSA graph settings in CMakeLists.txt
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com >
* musa: disable mudnnMemcpyAsync by default
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com >
* musa: switch back to non-mudnn images
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com >
* minor changes
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com >
* musa: restore rc in docker image tag
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com >
---------
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com >
2025-07-24 20:05:37 +01:00
Wroclaw
760b4484e3
nix : use optionalAttrs for env mkDerivation attrset argument ( #14726 )
2025-07-17 15:18:16 -07:00
Vedran Miletić
e9b6350e61
scripts : make the shell scripts cross-platform ( #14341 )
2025-06-30 10:17:18 +02:00
Svetlozar Georgiev
40643edb86
sycl: fix docker image ( #14144 )
2025-06-13 18:32:56 +02:00
R0CKSTAR
33983057d0
musa: Upgrade MUSA SDK version to rc4.0.1 and use mudnn::Unary::IDENTITY op to accelerate D2D memory copy ( #13647 )
...
* musa: fix build warning (unused parameter)
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com >
* musa: upgrade MUSA SDK version to rc4.0.1
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com >
* musa: use mudnn::Unary::IDENTITY op to accelerate D2D memory copy
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com >
* Update ggml/src/ggml-cuda/cpy.cu
Co-authored-by: Johannes Gäßler <johannesg@5d6.de >
* musa: remove MUDNN_CHECK_GEN and use CUDA_CHECK_GEN instead in MUDNN_CHECK
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com >
---------
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com >
Co-authored-by: Johannes Gäßler <johannesg@5d6.de >
2025-05-21 09:58:49 +08:00
Alberto Cabrera Pérez
f71f40a284
ci : upgraded oneAPI version in SYCL workflows and dockerfile ( #13532 )
2025-05-19 11:46:09 +01:00
Xuan-Son Nguyen
da84c04d8f
docker : do not build tests ( #13204 )
...
* docker : do not build tests
* include "ggml-cpu.h"
2025-04-30 10:44:07 +02:00
Rudi Servo
b0091ecc1e
docker : added all CPU to GPU images ( #12749 )
2025-04-10 01:17:12 +02:00
Chenguang Li
6e1c4cebdb
CANN: Support Opt CONV_TRANSPOSE_1D and ELU ( #12786 )
...
* [CANN] Support ELU and CONV_TRANSPOSE_1D
* [CANN]Modification review comments
* [CANN]Modification review comments
* [CANN]name adjustment
* [CANN]remove lambda used in template
* [CANN]Use std::func instead of template
* [CANN]Modify the code according to the review comments
---------
Signed-off-by: noemotiovon <noemotiovon@gmail.com >
2025-04-09 14:04:14 +08:00
Xuan-Son Nguyen
bd3f59f812
cmake : enable curl by default ( #12761 )
...
* cmake : enable curl by default
* no curl if no examples
* fix build
* fix build-linux-cross
* add windows-setup-curl
* fix
* shell
* fix path
* fix windows-latest-cmake*
* run: include_directories
* LLAMA_RUN_EXTRA_LIBS
* sycl: no llama_curl
* no test-arg-parser on windows
* clarification
* try riscv64 / arm64
* windows: include libcurl inside release binary
* add msg
* fix mac / ios / android build
* will this fix xcode?
* try clearing the cache
* add bunch of licenses
* revert clear cache
* fix xcode
* fix xcode (2)
* fix typo
2025-04-07 13:35:19 +02:00
Georgi Gerganov
68ff663a04
repo : update links to new url ( #11886 )
...
* repo : update links to new url
ggml-ci
* cont : more urls
ggml-ci
2025-02-15 16:40:57 +02:00
Georgi Gerganov
dbc2ec59b5
docker : drop to CUDA 12.4 ( #11869 )
...
* docker : drop to CUDA 12.4
* docker : update readme [no ci]
2025-02-14 14:48:40 +02:00
R0CKSTAR
bd6e55bfd3
musa: bump MUSA SDK version to rc3.1.1 ( #11822 )
...
* musa: Update MUSA SDK version to rc3.1.1
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com >
* musa: Remove workaround in PR #10042
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com >
---------
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com >
2025-02-13 13:28:18 +01:00
Xuan-Son Nguyen
d0c08040b6
ci : fix build CPU arm64 ( #11472 )
...
* ci : fix build CPU arm64
* failed, trying ubuntu 22
* vulkan: ubuntu 24
* vulkan : jammy --> noble
2025-01-29 00:02:56 +01:00
Nuno
d7d1eccacc
docker: allow installing pip packages system-wide ( #11437 )
...
Signed-off-by: rare-magma <rare-magma@posteo.eu >
2025-01-28 14:17:25 +00:00
Nuno
f643120bad
docker: add perplexity and bench commands to full image ( #11438 )
...
Signed-off-by: rare-magma <rare-magma@posteo.eu >
2025-01-28 10:42:32 +00:00
Xuan Son Nguyen
caf773f249
docker : fix ARM build and Vulkan build ( #11434 )
...
* ci : do not fail-fast for docker
* build arm64/amd64 separately
* fix pip
* no fast fail
* vulkan: try jammy
2025-01-26 22:45:32 +01:00
Nuno
6f53d8a6b4
docker: add missing vulkan library to base layer and update to 24.04 ( #11422 )
...
Signed-off-by: rare-magma <rare-magma@posteo.eu >
2025-01-26 18:22:43 +01:00
Diego Devesa
6e264a905b
docker : add GGML_CPU_ARM_ARCH arg to select ARM architecture to build for ( #11419 )
2025-01-25 17:22:41 +01:00
Diego Devesa
20a758155b
docker : fix CPU ARM build ( #11403 )
...
* docker : fix CPU ARM build
* add CURL to other builds
2025-01-25 15:22:29 +01:00
Rudi Servo
7c0e285858
devops : add docker-multi-stage builds ( #10832 )
2024-12-22 23:22:58 +01:00
Evgeny Kurnevsky
e52aba537a
nix: allow to override rocm gpu targets ( #10794 )
...
This allows reducing compile time when you are building for a single GPU.
2024-12-14 10:17:36 -08:00
Corentin REGAL
11e07fd63b
fix: graceful shutdown for Docker images ( #10815 )
2024-12-13 18:23:50 +01:00
Diego Devesa
59f4db1088
ggml : add predefined list of CPU backend variants to build ( #10626 )
...
* ggml : add predefined list of CPU backend variants to build
* update CPU dockerfiles
2024-12-04 14:45:40 +01:00
Diego Devesa
3420909dff
ggml : automatic selection of best CPU backend ( #10606 )
...
* ggml : automatic selection of best CPU backend
* amx : minor opt
* add GGML_AVX_VNNI to enable avx-vnni, fix checks
2024-12-01 16:12:41 +01:00
R0CKSTAR
249cd93da3
mtgpu: Add MUSA_DOCKER_ARCH in Dockerfiles && update cmake and make ( #10516 )
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com >
2024-11-26 17:00:41 +01:00
Xuan Son Nguyen
45abe0f74e
server : replace behave with pytest ( #10416 )
...
* server : replace behave with pytest
* fix test on windows
* misc
* add more tests
* more tests
* styling
* log less, fix embd test
* added all sequential tests
* fix coding style
* fix save slot test
* add parallel completion test
* fix parallel test
* remove feature files
* update test docs
* no cache_prompt for some tests
* add test_cache_vs_nocache_prompt
2024-11-26 16:20:18 +01:00
Johannes Gäßler
75207b3a88
docker: use GGML_NATIVE=OFF ( #10368 )
2024-11-18 00:21:53 +01:00
Romain Biessy
57f8355b29
sycl: Update Intel docker images to use DPC++ 2025.0 ( #10305 )
2024-11-15 13:10:45 +02:00
Chenguang Li
231f9360d9
cann: dockerfile and doc adjustment ( #10302 )
...
Co-authored-by: noemotiovon <noemotiovon@gmail.com >
2024-11-15 15:09:35 +08:00
Diego Devesa
ae8de6d50a
ggml : build backends as libraries ( #10256 )
...
* ggml : build backends as libraries
---------
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com >
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com >
Co-authored-by: R0CKSTAR <xiaodong.ye@mthreads.com >
2024-11-14 18:04:35 +01:00
R0CKSTAR
cf8e0a3bb9
musa: add docker image support ( #9685 )
...
* mtgpu: add docker image support
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com >
* mtgpu: enable docker workflow
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com >
---------
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com >
2024-10-10 20:10:37 +02:00
serhii-nakon
6f1d9d71f4
Fix Docker ROCM builds, use AMDGPU_TARGETS instead of GPU_TARGETS ( #9641 )
...
* Fix Docker ROCM builds, use AMDGPU_TARGETS instead of GPU_TARGETS
* Set ROCM_DOCKER_ARCH as a string, because otherwise it builds incorrectly and causes an OOM exit code
2024-09-30 20:57:12 +02:00
slaren
048de848ee
docker : fix missing binaries in full-cuda image ( #9278 )
2024-09-02 18:11:13 +02:00