# CI
This CI implements heavy-duty workflows that run on self-hosted runners. The purpose of these workflows is typically to cover hardware configurations that are not available on GitHub-hosted runners and/or that require more computational resources than are normally available.
Before publishing changes, it is good practice to run the full CI locally on your machine. For example:
```bash
mkdir tmp

# CPU-only build
bash ./ci/run.sh ./tmp/results ./tmp/mnt

# with CUDA support
GG_BUILD_CUDA=1 bash ./ci/run.sh ./tmp/results ./tmp/mnt

# with SYCL support
source /opt/intel/oneapi/setvars.sh
GG_BUILD_SYCL=1 bash ./ci/run.sh ./tmp/results ./tmp/mnt

# with MUSA support
GG_BUILD_MUSA=1 bash ./ci/run.sh ./tmp/results ./tmp/mnt

# etc.
```
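The two positional arguments select where build results and logs are written (`./tmp/results` here) and where downloaded models and other data are cached between runs (`./tmp/mnt` here); see the usage comment at the top of `ci/run.sh`.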
## Adding self-hosted runners
- Add a self-hosted `ggml-ci` workflow to `.github/workflows/build.yml` with an appropriate label
- Request a runner token from `ggml-org` (for example, via a comment in the PR or via email)
- Set up a machine using the received token (see the GitHub docs on adding self-hosted runners)
- Optionally update `ci/run.sh` to build and run on the target platform by gating the implementation with a `GG_BUILD_...` env variable (see the sketch after this list)
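The gating in `ci/run.sh` for that last step can follow the same pattern as the existing backends. Below is a minimal sketch, assuming a hypothetical `GG_BUILD_MYBACKEND` flag and an illustrative CMake option; the real flag names and options live in `ci/run.sh`:

```bash
# minimal sketch of env-gated backend selection, modeled on the
# GG_BUILD_CUDA / GG_BUILD_SYCL / GG_BUILD_MUSA flags shown above;
# GG_BUILD_MYBACKEND and -DGGML_MYBACKEND=ON are hypothetical placeholders

CMAKE_EXTRA=""

if [ -n "${GG_BUILD_CUDA}" ]; then
    CMAKE_EXTRA="${CMAKE_EXTRA} -DGGML_CUDA=ON"       # real llama.cpp CMake option
fi

if [ -n "${GG_BUILD_MYBACKEND}" ]; then
    CMAKE_EXTRA="${CMAKE_EXTRA} -DGGML_MYBACKEND=ON"  # placeholder for your platform
fi

# the build then picks up whichever backends were enabled
cmake -B build -DCMAKE_BUILD_TYPE=Release ${CMAKE_EXTRA}
cmake --build build
```

Invoking the full CI with the new flag then mirrors the examples above, e.g. `GG_BUILD_MYBACKEND=1 bash ./ci/run.sh ./tmp/results ./tmp/mnt`.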