Commit Graph

128 Commits

Author SHA1 Message Date
R0CKSTAR
d9e0e7c819 ci : fix musa docker build (#16306)
Signed-off-by: Xiaodong Ye <yeahdongcn@gmail.com>
2025-09-28 16:38:15 +02:00
R0CKSTAR
a86a580a66 musa: upgrade musa sdk to 4.3.0 (#16240)
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-09-26 02:56:38 +02:00
Aaron Teo
5fb557653b devops: fix s390x docker release failure (#16231) 2025-09-25 11:36:30 +08:00
Aaron Teo
4b9f4cb0f8 devops: add s390x containers (#15915)
* devops: add s390x dockerfile

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: add missing ninja

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: move s390x docker into cpu docker

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: rework s390x docker

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: copy more tools

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: add server build step

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: remove apt clean steps as distroless misses it

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: remove apt commands from distroless

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: fix shared libs in distroless

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: use correct libs path

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: fix shared libs

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: add collector stage

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: fix missing stage ref

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: fix permission issue

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: fix unknown model loading failures

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: attempt at fixing model loading failure

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: fix missing ggml shared object

failure to load model

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: remove move shared objects

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: move libggml-cpu and blas into bin

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: finalise hardened server stage

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: add cli target

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: fix typos

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: fix missing shared libraries in base

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: update debian target

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: formalise llama.cpp loc

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* Revert "devops: formalise llama.cpp loc"

This reverts commit 0a7664af84.

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: formalise llama.cpp loc

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
(cherry picked from commit 0a7664af84)
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: attempt at fixing missing dir

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: attempt at making it cache the build

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: fix copying process

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: make build dir an argument

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* Revert "devops: make build dir an argument"

This reverts commit 438698976b.

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: add build stage for gguf-py

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: move gguf-py installation into build stage

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: break system packages?

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: add rust compiler installer

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: fix rustc not found

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: remove cache mount to allow rustc to persist

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: move rustc installation to another layer

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: move gguf-py installation to full stage, fix copying

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: remove rustc installation in build

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: disable full target for now

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: attempting static build

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: merge s390x dockerfile into cpu for now

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: switch to gcc image for build step

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: remove build essentials

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: install openblas into base target

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: go back to s390x dockerfile

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: remove libggml and libblas

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: add full target

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: add break system packages

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: add libjpeg

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: add missing cmake dep

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: finalise docker images for s390x

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: add custom openblas patch

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: use libopenblas-dev instead of libopenblas-openmp-dev

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* devops: add s390x docker build

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

---------

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-09-23 13:59:34 +08:00
Diego Devesa
dc381aa9a6 docker : enable rocWMMA in ROCm images, add gfx1151 (#15997) 2025-09-15 23:38:52 +02:00
Adam
0fa154e350 rocm.Dockerfile: added gfx1200,gfx1201 architectures to support AMD Radeon RX 9000 series (#15994)
* rocm.Dockerfile: added gfx1200,gfx1201 architectures to support  AMD Radeon RX 9000 series

https://rocm.docs.amd.com/projects/install-on-linux/en/docs-6.4.1/reference/system-requirements.html#rdna-os
states the Radeon RX 9000 series is supported support from Ubuntu 24.04.2, and the dockerfile is using 24.04 which is ROCm 6.4.

This fixed the `ROCm error: invalid device function` I was getting when trying to use the rocm container.
2025-09-14 20:43:54 +02:00
R0CKSTAR
b55f06e1aa vulkan.Dockerfile: install vulkan SDK using tarball (#15282)
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-08-23 08:58:57 +02:00
Dobri Danchev
618575c582 Fix broken build: require updated pip to support --break-system-packages (#15357)
* Revert "devops : fix compile bug when the BASE_CUDA_DEV_CONTAINER is based on Ubuntu 24.04 (#15005)"

This reverts commit e4e915912c.

* devops: Allow pip to modify externally-managed python environment (system installation)

- Updated pip install commands to include the --break-system-packages
  flag, ensuring compatibility when working with system-managed Python
  environments (PEP 668).

- Note: The --break-system-packages option was introduced in 2023.
  Ensure pip is updated to a recent version before using this flag.

fixes [#15004](https://github.com/danchev/llama.cpp/issues/15004)
2025-08-18 12:50:48 +02:00
simevo
e4e915912c devops : fix compile bug when the BASE_CUDA_DEV_CONTAINER is based on Ubuntu 24.04 (#15005)
fixes #15004

Co-authored-by: Paolo Greppi <paolo.greppi@libpf.com>
2025-08-14 18:45:27 +03:00
Christian Kastner
646944cfa8 docker : Enable GGML_CPU_ALL_VARIANTS for ARM (#15267) 2025-08-14 16:22:58 +02:00
Ali Tariq
648ebcdb73 ci : Added CI with RISC-V RVV1.0 Hardware (#14439)
* Changed the CI file to hw

* Changed the CI file to hw

* Added to sudoers for apt

* Removed the clone command and used checkout

* Added libcurl

* Added gcc-14

* Checking gcc --version

* added gcc-14 symlink

* added CC and C++ variables

* Added the gguf weight

* Changed the weights path

* Added system specification

* Removed white spaces

* ci: Replace Jenkins riscv native build Cloud-V pipeline with GitHub Actions workflow

Removed the legacy .devops/cloud-v-pipeline Jenkins CI configuration and introduced .github/workflows/build-riscv-native.yml for native RISC-V builds using GitHub Actions.

* removed trailing whitespaces

---------

Co-authored-by: Akif Ejaz <akifejaz40@gmail.com>
2025-08-13 13:14:44 +03:00
diannao
2860d479b4 docker : add cann build pipline (#14591)
* docker: add cann build pipline

* docker: add cann build pipline

* docker: fix cann devops

* cann : fix multi card hccl

* Update ggml/src/ggml-cann/ggml-cann.cpp

Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>

* Update ggml-cann.cpp

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
2025-08-01 10:02:34 +08:00
deepsek
66906cd82a HIP: Enable Matrix cores for MMQ Kernels, Enable stream-K for CDNA 3 (#14624)
This commit adds support for MFMA instructions to MMQ. CDNA1/GFX908 CDNA2/GFX90a and CDNA3/GFX942 are supported by the MFMA-enabled code path added by this commit. The code path and stream-k is only enabled on CDNA3 for now as it fails to outperform blas in all cases on the other devices.
Blas is currently only consistently outperformed on CDNA3 due to issues in the amd-provided blas libraries.
This commit also improves the awareness of MMQ towards different warp sizes and as a side effect improves the performance of all quant formats besides q4_0 and q4_1, which regress slightly, on GCN gpus.
2025-07-27 00:28:14 +02:00
R0CKSTAR
3f4fc97f1d musa: upgrade musa sdk to rc4.2.0 (#14498)
* musa: apply mublas API changes

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* musa: update musa version to 4.2.0

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* musa: restore MUSA graph settings in CMakeLists.txt

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* musa: disable mudnnMemcpyAsync by default

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* musa: switch back to non-mudnn images

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* minor changes

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* musa: restore rc in docker image tag

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

---------

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-07-24 20:05:37 +01:00
Wroclaw
760b4484e3 nix : use optionalAttrs for env mkDerivation attrset argument (#14726) 2025-07-17 15:18:16 -07:00
Vedran Miletić
e9b6350e61 scripts : make the shell scripts cross-platform (#14341) 2025-06-30 10:17:18 +02:00
Svetlozar Georgiev
40643edb86 sycl: fix docker image (#14144) 2025-06-13 18:32:56 +02:00
R0CKSTAR
33983057d0 musa: Upgrade MUSA SDK version to rc4.0.1 and use mudnn::Unary::IDENTITY op to accelerate D2D memory copy (#13647)
* musa: fix build warning (unused parameter)

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* musa: upgrade MUSA SDK version to rc4.0.1

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* musa: use mudnn::Unary::IDENTITY op to accelerate D2D memory copy

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* Update ggml/src/ggml-cuda/cpy.cu

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

* musa: remove MUDNN_CHECK_GEN and use CUDA_CHECK_GEN instead in MUDNN_CHECK

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

---------

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
2025-05-21 09:58:49 +08:00
Alberto Cabrera Pérez
f71f40a284 ci : upgraded oneAPI version in SYCL workflows and dockerfile (#13532) 2025-05-19 11:46:09 +01:00
Xuan-Son Nguyen
da84c04d8f docker : do not build tests (#13204)
* docker : do not build tests

* include "ggml-cpu.h"
2025-04-30 10:44:07 +02:00
Rudi Servo
b0091ecc1e docker : added all CPU to GPU images (#12749) 2025-04-10 01:17:12 +02:00
Chenguang Li
6e1c4cebdb CANN: Support Opt CONV_TRANSPOSE_1D and ELU (#12786)
* [CANN] Support ELU and CONV_TRANSPOSE_1D

* [CANN]Modification review comments

* [CANN]Modification review comments

* [CANN]name adjustment

* [CANN]remove lambda used in template

* [CANN]Use std::func instead of template

* [CANN]Modify the code according to the review comments

---------

Signed-off-by: noemotiovon <noemotiovon@gmail.com>
2025-04-09 14:04:14 +08:00
Xuan-Son Nguyen
bd3f59f812 cmake : enable curl by default (#12761)
* cmake : enable curl by default

* no curl if no examples

* fix build

* fix build-linux-cross

* add windows-setup-curl

* fix

* shell

* fix path

* fix windows-latest-cmake*

* run: include_directories

* LLAMA_RUN_EXTRA_LIBS

* sycl: no llama_curl

* no test-arg-parser on windows

* clarification

* try riscv64 / arm64

* windows: include libcurl inside release binary

* add msg

* fix mac / ios / android build

* will this fix xcode?

* try clearing the cache

* add bunch of licenses

* revert clear cache

* fix xcode

* fix xcode (2)

* fix typo
2025-04-07 13:35:19 +02:00
Georgi Gerganov
68ff663a04 repo : update links to new url (#11886)
* repo : update links to new url

ggml-ci

* cont : more urls

ggml-ci
2025-02-15 16:40:57 +02:00
Georgi Gerganov
dbc2ec59b5 docker : drop to CUDA 12.4 (#11869)
* docker : drop to CUDA 12.4

* docker : update readme [no ci]
2025-02-14 14:48:40 +02:00
R0CKSTAR
bd6e55bfd3 musa: bump MUSA SDK version to rc3.1.1 (#11822)
* musa: Update MUSA SDK version to rc3.1.1

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* musa: Remove workaround in PR #10042

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

---------

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-02-13 13:28:18 +01:00
Xuan-Son Nguyen
d0c08040b6 ci : fix build CPU arm64 (#11472)
* ci : fix build CPU arm64

* failed, trying ubuntu 22

* vulkan: ubuntu 24

* vulkan : jammy --> noble
2025-01-29 00:02:56 +01:00
Nuno
d7d1eccacc docker: allow installing pip packages system-wide (#11437)
Signed-off-by: rare-magma <rare-magma@posteo.eu>
2025-01-28 14:17:25 +00:00
Nuno
f643120bad docker: add perplexity and bench commands to full image (#11438)
Signed-off-by: rare-magma <rare-magma@posteo.eu>
2025-01-28 10:42:32 +00:00
Xuan Son Nguyen
caf773f249 docker : fix ARM build and Vulkan build (#11434)
* ci : do not fail-fast for docker

* build arm64/amd64 separatedly

* fix pip

* no fast fail

* vulkan: try jammy
2025-01-26 22:45:32 +01:00
Nuno
6f53d8a6b4 docker: add missing vulkan library to base layer and update to 24.04 (#11422)
Signed-off-by: rare-magma <rare-magma@posteo.eu>
2025-01-26 18:22:43 +01:00
Diego Devesa
6e264a905b docker : add GGML_CPU_ARM_ARCH arg to select ARM architecture to build for (#11419) 2025-01-25 17:22:41 +01:00
Diego Devesa
20a758155b docker : fix CPU ARM build (#11403)
* docker : fix CPU ARM build

* add CURL to other builds
2025-01-25 15:22:29 +01:00
Rudi Servo
7c0e285858 devops : add docker-multi-stage builds (#10832) 2024-12-22 23:22:58 +01:00
Evgeny Kurnevsky
e52aba537a nix: allow to override rocm gpu targets (#10794)
This allows to reduce compile time when you are building for a single GPU.
2024-12-14 10:17:36 -08:00
Corentin REGAL
11e07fd63b fix: graceful shutdown for Docker images (#10815) 2024-12-13 18:23:50 +01:00
Diego Devesa
59f4db1088 ggml : add predefined list of CPU backend variants to build (#10626)
* ggml : add predefined list of CPU backend variants to build

* update CPU dockerfiles
2024-12-04 14:45:40 +01:00
Diego Devesa
3420909dff ggml : automatic selection of best CPU backend (#10606)
* ggml : automatic selection of best CPU backend

* amx : minor opt

* add GGML_AVX_VNNI to enable avx-vnni, fix checks
2024-12-01 16:12:41 +01:00
R0CKSTAR
249cd93da3 mtgpu: Add MUSA_DOCKER_ARCH in Dockerfiles && update cmake and make (#10516)
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2024-11-26 17:00:41 +01:00
Xuan Son Nguyen
45abe0f74e server : replace behave with pytest (#10416)
* server : replace behave with pytest

* fix test on windows

* misc

* add more tests

* more tests

* styling

* log less, fix embd test

* added all sequential tests

* fix coding style

* fix save slot test

* add parallel completion test

* fix parallel test

* remove feature files

* update test docs

* no cache_prompt for some tests

* add test_cache_vs_nocache_prompt
2024-11-26 16:20:18 +01:00
Johannes Gäßler
75207b3a88 docker: use GGML_NATIVE=OFF (#10368) 2024-11-18 00:21:53 +01:00
Romain Biessy
57f8355b29 sycl: Update Intel docker images to use DPC++ 2025.0 (#10305) 2024-11-15 13:10:45 +02:00
Chenguang Li
231f9360d9 cann: dockerfile and doc adjustment (#10302)
Co-authored-by: noemotiovon <noemotiovon@gmail.com>
2024-11-15 15:09:35 +08:00
Diego Devesa
ae8de6d50a ggml : build backends as libraries (#10256)
* ggml : build backends as libraries

---------

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: R0CKSTAR <xiaodong.ye@mthreads.com>
2024-11-14 18:04:35 +01:00
R0CKSTAR
cf8e0a3bb9 musa: add docker image support (#9685)
* mtgpu: add docker image support

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* mtgpu: enable docker workflow

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

---------

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2024-10-10 20:10:37 +02:00
serhii-nakon
6f1d9d71f4 Fix Docker ROCM builds, use AMDGPU_TARGETS instead of GPU_TARGETS (#9641)
* Fix Docker ROCM builds, use AMDGPU_TARGETS instead of GPU_TARGETS

* Set ROCM_DOCKER_ARCH as string due it incorrectly build and cause OOM exit code
2024-09-30 20:57:12 +02:00
slaren
048de848ee docker : fix missing binaries in full-cuda image (#9278) 2024-09-02 18:11:13 +02:00
Tushar
9c1ba55733 build(nix): Package gguf-py (#5664)
* style: format with nixfmt/rfc101-style

* build(nix): Package gguf-py

* build(nix): Refactor to new scope for gguf-py

* build(nix): Exclude gguf-py from devShells

* build(nix): Refactor gguf-py derivation to take in exact deps

* build(nix): Enable pytestCheckHook and pythonImportsCheck for gguf-py

* build(python): Package python scripts with pyproject.toml

* chore: Cleanup

* dev(nix): Break up python/C devShells

* build(python): Relax pytorch version constraint

Nix has an older version

* chore: Move cmake to nativeBuildInputs for devShell

* fmt: Reconcile formatting with rebase

* style: nix fmt

* cleanup: Remove unncessary __init__.py

* chore: Suggestions from review

- Filter out non-source files from llama-scripts flake derivation
- Clean up unused closure
- Remove scripts devShell

* revert: Bad changes

* dev: Simplify devShells, restore the -extra devShell

* build(nix): Add pyyaml for gguf-py

* chore: Remove some unused bindings

* dev: Add tiktoken to -extra devShells
2024-09-02 14:21:01 +03:00
Echo Nolan
a47667cff4 nix: fix CUDA build - replace deprecated autoAddOpenGLRunpathHook
The CUDA nix build broke when we updated nixpkgs in
8cd1bcfd3f. As far as I can tell all
that happened is cudaPackages.autoAddOpenGLRunpathHook got moved to
pkgs.autoAddDriverRunpath. This commit fixes it.
2024-08-31 08:44:21 +00:00
slaren
66b039a501 docker : update CUDA images (#9213) 2024-08-28 13:20:36 +02:00