# CI
This CI implements heavy-duty workflows that run on self-hosted runners. Typically, the purpose of these workflows is to cover hardware configurations that are not available from GitHub-hosted runners and/or require more computational resources than are normally available.
It is good practice to execute the full CI locally on your machine before publishing changes. For example:
```bash
mkdir tmp

# CPU-only build
bash ./ci/run.sh ./tmp/results ./tmp/mnt

# with CUDA support
GG_BUILD_CUDA=1 bash ./ci/run.sh ./tmp/results ./tmp/mnt

# with SYCL support
source /opt/intel/oneapi/setvars.sh
GG_BUILD_SYCL=1 bash ./ci/run.sh ./tmp/results ./tmp/mnt

# with MUSA support
GG_BUILD_MUSA=1 bash ./ci/run.sh ./tmp/results ./tmp/mnt

# etc.
```
# Adding self-hosted runners
- Add a self-hosted `ggml-ci` workflow to [.github/workflows/build.yml](https://github.com/ggml-org/llama.cpp/blob/master/.github/workflows/build.yml) with an appropriate label
- Request a runner token from `ggml-org` (for example, via a comment in the PR or email)
- Set up a machine using the received token ([docs](https://docs.github.com/en/actions/how-tos/manage-runners/self-hosted-runners/add-runners)); see the registration sketch after this list
- Optionally update [ci/run.sh](https://github.com/ggml-org/llama.cpp/blob/master/ci/run.sh) to build and run on the target platform by gating the implementation with a `GG_BUILD_...` environment variable, as in the gating sketch below
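
Registering the machine is covered by the linked docs; as a rough sketch for a Linux host, assuming the standard GitHub Actions runner package (the release version, `<TOKEN>`, and `<LABEL>` below are placeholders):

```bash
# download and unpack the GitHub Actions runner
# (check the releases page for the current version)
mkdir actions-runner && cd actions-runner
curl -L -o actions-runner-linux-x64.tar.gz \
    https://github.com/actions/runner/releases/download/v2.321.0/actions-runner-linux-x64-2.321.0.tar.gz
tar xzf actions-runner-linux-x64.tar.gz

# register the runner with the token received from ggml-org, using the
# label that the ggml-ci workflow in build.yml selects runners by
./config.sh --url https://github.com/ggml-org --token <TOKEN> --labels <LABEL>

# start listening for jobs
./run.sh
```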
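
The `GG_BUILD_...` gating in `ci/run.sh` follows a simple pattern; a minimal sketch, where `GG_BUILD_MYBACKEND` and `-DGGML_MYBACKEND=ON` are hypothetical placeholders for the new platform:

```bash
# in ci/run.sh: append backend-specific CMake flags only when the
# corresponding GG_BUILD_... variable is set by the workflow
if [ ! -z ${GG_BUILD_MYBACKEND} ]; then
    CMAKE_EXTRA="${CMAKE_EXTRA} -DGGML_MYBACKEND=ON"
fi
```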