Files
llama.cpp/ci/README.md
Georgi Gerganov 28baac9c9f ci : migrate ggml ci to self-hosted runners (#16116)
* ci : migrate ggml ci to a self-hosted runners

* ci : add T4 runner

* ci : add instructions for adding self-hosted runners

* ci : disable test-backend-ops from debug builds due to slowness

* ci : add AMD V710 runner (vulkan)

* cont : add ROCM workflow

* ci : switch to qwen3 0.6b model

* cont : fix the context size
2025-09-21 16:50:45 +03:00

1.3 KiB

CI

This CI implements heavy-duty workflows that run on self-hosted runners. Typically the purpose of these workflows is to cover hardware configurations that are not available from Github-hosted runners and/or require more computational resource than normally available.

It is a good practice, before publishing changes to execute the full CI locally on your machine. For example:

mkdir tmp

# CPU-only build
bash ./ci/run.sh ./tmp/results ./tmp/mnt

# with CUDA support
GG_BUILD_CUDA=1 bash ./ci/run.sh ./tmp/results ./tmp/mnt

# with SYCL support
source /opt/intel/oneapi/setvars.sh
GG_BUILD_SYCL=1 bash ./ci/run.sh ./tmp/results ./tmp/mnt

# with MUSA support
GG_BUILD_MUSA=1 bash ./ci/run.sh ./tmp/results ./tmp/mnt

# etc.

Adding self-hosted runners

  • Add a self-hosted ggml-ci workflow to .github/workflows/build.yml with an appropriate label
  • Request a runner token from ggml-org (for example, via a comment in the PR or email)
  • Set-up a machine using the received token (docs)
  • Optionally update ci/run.sh to build and run on the target platform by gating the implementation with a GG_BUILD_... env