llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-10-27 08:21:30 +00:00

Author	SHA1	Message	Date
Neo Zhang Jianyu	2be72c2b12	SYCL: Update to oneAPI 2025.2 (#16371 ) * update oneapi to 2025.2, use deep-learning-essentials to replace base-tool * update to 2025.2 use deeplearn essi to replace base toolkit * add missed dll * add deep learning essentials * add sycl-ls --------- Co-authored-by: Zhang Jianyu <zhang.jianyu@outlook.com>	2025-10-02 10:16:25 +03:00
uvos	c8dedc9999	CI: reenable cdna in rocm docker builds (#16376 )	2025-10-01 23:32:39 +02:00
uvos	1fe4e38cc2	ci: Properly install rocwmma for hip builds (#16305 ) * CI: Properly install rocwmma for hip builds on windows we now windows install rocwmma from ubuntu pacakges * CI: update linux rocm docker build to use rocm 7.0	2025-10-01 20:18:03 +02:00
R0CKSTAR	d9e0e7c819	ci : fix musa docker build (#16306 ) Signed-off-by: Xiaodong Ye <yeahdongcn@gmail.com>	2025-09-28 16:38:15 +02:00
R0CKSTAR	a86a580a66	musa: upgrade musa sdk to 4.3.0 (#16240 ) Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>	2025-09-26 02:56:38 +02:00
Aaron Teo	5fb557653b	devops: fix s390x docker release failure (#16231 )	2025-09-25 11:36:30 +08:00
Aaron Teo	4b9f4cb0f8	devops: add s390x containers (#15915 ) * devops: add s390x dockerfile Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: add missing ninja Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: move s390x docker into cpu docker Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: rework s390x docker Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: copy more tools Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: add server build step Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: remove apt clean steps as distroless misses it Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: remove apt commands from distroless Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: fix shared libs in distroless Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: use correct libs path Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: fix shared libs Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: add collector stage Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: fix missing stage ref Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: fix permission issue Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: fix unknown model loading failures Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: attempt at fixing model loading failure Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: fix missing ggml shared object failure to load model Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: remove move shared objects Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: move libggml-cpu and blas into bin Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: finalise hardened server stage Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: add cli target Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: fix typos Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: fix missing shared libraries in base Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: update debian target Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: formalise llama.cpp loc Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * Revert "devops: formalise llama.cpp loc" This reverts commit `0a7664af84`. Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: formalise llama.cpp loc Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> (cherry picked from commit `0a7664af84`) Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: attempt at fixing missing dir Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: attempt at making it cache the build Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: fix copying process Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: make build dir an argument Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * Revert "devops: make build dir an argument" This reverts commit `438698976b`. Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: add build stage for gguf-py Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: move gguf-py installation into build stage Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: break system packages? Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: add rust compiler installer Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: fix rustc not found Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: remove cache mount to allow rustc to persist Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: move rustc installation to another layer Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: move gguf-py installation to full stage, fix copying Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: remove rustc installation in build Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: disable full target for now Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: attempting static build Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: merge s390x dockerfile into cpu for now Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: switch to gcc image for build step Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: remove build essentials Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: install openblas into base target Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: go back to s390x dockerfile Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: remove libggml and libblas Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: add full target Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: add break system packages Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: add libjpeg Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: add missing cmake dep Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: finalise docker images for s390x Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: add custom openblas patch Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: use libopenblas-dev instead of libopenblas-openmp-dev Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * devops: add s390x docker build Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> --------- Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-09-23 13:59:34 +08:00
Diego Devesa	dc381aa9a6	docker : enable rocWMMA in ROCm images, add gfx1151 (#15997 )	2025-09-15 23:38:52 +02:00
Adam	0fa154e350	rocm.Dockerfile: added gfx1200,gfx1201 architectures to support AMD Radeon RX 9000 series (#15994 ) * rocm.Dockerfile: added gfx1200,gfx1201 architectures to support AMD Radeon RX 9000 series https://rocm.docs.amd.com/projects/install-on-linux/en/docs-6.4.1/reference/system-requirements.html#rdna-os states the Radeon RX 9000 series is supported support from Ubuntu 24.04.2, and the dockerfile is using 24.04 which is ROCm 6.4. This fixed the `ROCm error: invalid device function` I was getting when trying to use the rocm container.	2025-09-14 20:43:54 +02:00
R0CKSTAR	b55f06e1aa	vulkan.Dockerfile: install vulkan SDK using tarball (#15282 ) Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>	2025-08-23 08:58:57 +02:00
Dobri Danchev	618575c582	Fix broken build: require updated pip to support --break-system-packages (#15357 ) * Revert "devops : fix compile bug when the BASE_CUDA_DEV_CONTAINER is based on Ubuntu 24.04 (#15005)" This reverts commit `e4e915912c`. * devops: Allow pip to modify externally-managed python environment (system installation) - Updated pip install commands to include the --break-system-packages flag, ensuring compatibility when working with system-managed Python environments (PEP 668). - Note: The --break-system-packages option was introduced in 2023. Ensure pip is updated to a recent version before using this flag. fixes [#15004](https://github.com/danchev/llama.cpp/issues/15004)	2025-08-18 12:50:48 +02:00
simevo	e4e915912c	devops : fix compile bug when the BASE_CUDA_DEV_CONTAINER is based on Ubuntu 24.04 (#15005 ) fixes #15004 Co-authored-by: Paolo Greppi <paolo.greppi@libpf.com>	2025-08-14 18:45:27 +03:00
Christian Kastner	646944cfa8	docker : Enable GGML_CPU_ALL_VARIANTS for ARM (#15267 )	2025-08-14 16:22:58 +02:00
Ali Tariq	648ebcdb73	ci : Added CI with RISC-V RVV1.0 Hardware (#14439 ) * Changed the CI file to hw * Changed the CI file to hw * Added to sudoers for apt * Removed the clone command and used checkout * Added libcurl * Added gcc-14 * Checking gcc --version * added gcc-14 symlink * added CC and C++ variables * Added the gguf weight * Changed the weights path * Added system specification * Removed white spaces * ci: Replace Jenkins riscv native build Cloud-V pipeline with GitHub Actions workflow Removed the legacy .devops/cloud-v-pipeline Jenkins CI configuration and introduced .github/workflows/build-riscv-native.yml for native RISC-V builds using GitHub Actions. * removed trailing whitespaces --------- Co-authored-by: Akif Ejaz <akifejaz40@gmail.com>	2025-08-13 13:14:44 +03:00
diannao	2860d479b4	docker : add cann build pipline (#14591 ) * docker: add cann build pipline * docker: add cann build pipline * docker: fix cann devops * cann : fix multi card hccl * Update ggml/src/ggml-cann/ggml-cann.cpp Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com> * Update ggml-cann.cpp --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>	2025-08-01 10:02:34 +08:00
deepsek	66906cd82a	HIP: Enable Matrix cores for MMQ Kernels, Enable stream-K for CDNA 3 (#14624 ) This commit adds support for MFMA instructions to MMQ. CDNA1/GFX908 CDNA2/GFX90a and CDNA3/GFX942 are supported by the MFMA-enabled code path added by this commit. The code path and stream-k is only enabled on CDNA3 for now as it fails to outperform blas in all cases on the other devices. Blas is currently only consistently outperformed on CDNA3 due to issues in the amd-provided blas libraries. This commit also improves the awareness of MMQ towards different warp sizes and as a side effect improves the performance of all quant formats besides q4_0 and q4_1, which regress slightly, on GCN gpus.	2025-07-27 00:28:14 +02:00
R0CKSTAR	3f4fc97f1d	musa: upgrade musa sdk to rc4.2.0 (#14498 ) * musa: apply mublas API changes Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> * musa: update musa version to 4.2.0 Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> * musa: restore MUSA graph settings in CMakeLists.txt Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> * musa: disable mudnnMemcpyAsync by default Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> * musa: switch back to non-mudnn images Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> * minor changes Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> * musa: restore rc in docker image tag Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> --------- Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>	2025-07-24 20:05:37 +01:00
Wroclaw	760b4484e3	nix : use optionalAttrs for env mkDerivation attrset argument (#14726 )	2025-07-17 15:18:16 -07:00
Vedran Miletić	e9b6350e61	scripts : make the shell scripts cross-platform (#14341 )	2025-06-30 10:17:18 +02:00
Svetlozar Georgiev	40643edb86	sycl: fix docker image (#14144 )	2025-06-13 18:32:56 +02:00
R0CKSTAR	33983057d0	musa: Upgrade MUSA SDK version to rc4.0.1 and use mudnn::Unary::IDENTITY op to accelerate D2D memory copy (#13647 ) * musa: fix build warning (unused parameter) Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> * musa: upgrade MUSA SDK version to rc4.0.1 Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> * musa: use mudnn::Unary::IDENTITY op to accelerate D2D memory copy Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> * Update ggml/src/ggml-cuda/cpy.cu Co-authored-by: Johannes Gäßler <johannesg@5d6.de> * musa: remove MUDNN_CHECK_GEN and use CUDA_CHECK_GEN instead in MUDNN_CHECK Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> --------- Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> Co-authored-by: Johannes Gäßler <johannesg@5d6.de>	2025-05-21 09:58:49 +08:00
Alberto Cabrera Pérez	f71f40a284	ci : upgraded oneAPI version in SYCL workflows and dockerfile (#13532 )	2025-05-19 11:46:09 +01:00
Xuan-Son Nguyen	da84c04d8f	docker : do not build tests (#13204 ) * docker : do not build tests * include "ggml-cpu.h"	2025-04-30 10:44:07 +02:00
Rudi Servo	b0091ecc1e	docker : added all CPU to GPU images (#12749 )	2025-04-10 01:17:12 +02:00
Chenguang Li	6e1c4cebdb	CANN: Support Opt CONV_TRANSPOSE_1D and ELU (#12786 ) * [CANN] Support ELU and CONV_TRANSPOSE_1D * [CANN]Modification review comments * [CANN]Modification review comments * [CANN]name adjustment * [CANN]remove lambda used in template * [CANN]Use std::func instead of template * [CANN]Modify the code according to the review comments --------- Signed-off-by: noemotiovon <noemotiovon@gmail.com>	2025-04-09 14:04:14 +08:00
Xuan-Son Nguyen	bd3f59f812	cmake : enable curl by default (#12761 ) * cmake : enable curl by default * no curl if no examples * fix build * fix build-linux-cross * add windows-setup-curl * fix * shell * fix path * fix windows-latest-cmake* * run: include_directories * LLAMA_RUN_EXTRA_LIBS * sycl: no llama_curl * no test-arg-parser on windows * clarification * try riscv64 / arm64 * windows: include libcurl inside release binary * add msg * fix mac / ios / android build * will this fix xcode? * try clearing the cache * add bunch of licenses * revert clear cache * fix xcode * fix xcode (2) * fix typo	2025-04-07 13:35:19 +02:00
Georgi Gerganov	68ff663a04	repo : update links to new url (#11886 ) * repo : update links to new url ggml-ci * cont : more urls ggml-ci	2025-02-15 16:40:57 +02:00
Georgi Gerganov	dbc2ec59b5	docker : drop to CUDA 12.4 (#11869 ) * docker : drop to CUDA 12.4 * docker : update readme [no ci]	2025-02-14 14:48:40 +02:00
R0CKSTAR	bd6e55bfd3	musa: bump MUSA SDK version to rc3.1.1 (#11822 ) * musa: Update MUSA SDK version to rc3.1.1 Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> * musa: Remove workaround in PR #10042 Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> --------- Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>	2025-02-13 13:28:18 +01:00
Xuan-Son Nguyen	d0c08040b6	ci : fix build CPU arm64 (#11472 ) * ci : fix build CPU arm64 * failed, trying ubuntu 22 * vulkan: ubuntu 24 * vulkan : jammy --> noble	2025-01-29 00:02:56 +01:00
Nuno	d7d1eccacc	docker: allow installing pip packages system-wide (#11437 ) Signed-off-by: rare-magma <rare-magma@posteo.eu>	2025-01-28 14:17:25 +00:00
Nuno	f643120bad	docker: add perplexity and bench commands to full image (#11438 ) Signed-off-by: rare-magma <rare-magma@posteo.eu>	2025-01-28 10:42:32 +00:00
Xuan Son Nguyen	caf773f249	docker : fix ARM build and Vulkan build (#11434 ) * ci : do not fail-fast for docker * build arm64/amd64 separatedly * fix pip * no fast fail * vulkan: try jammy	2025-01-26 22:45:32 +01:00
Nuno	6f53d8a6b4	docker: add missing vulkan library to base layer and update to 24.04 (#11422 ) Signed-off-by: rare-magma <rare-magma@posteo.eu>	2025-01-26 18:22:43 +01:00
Diego Devesa	6e264a905b	docker : add GGML_CPU_ARM_ARCH arg to select ARM architecture to build for (#11419 )	2025-01-25 17:22:41 +01:00
Diego Devesa	20a758155b	docker : fix CPU ARM build (#11403 ) * docker : fix CPU ARM build * add CURL to other builds	2025-01-25 15:22:29 +01:00
Rudi Servo	7c0e285858	devops : add docker-multi-stage builds (#10832 )	2024-12-22 23:22:58 +01:00
Evgeny Kurnevsky	e52aba537a	nix: allow to override rocm gpu targets (#10794 ) This allows to reduce compile time when you are building for a single GPU.	2024-12-14 10:17:36 -08:00
Corentin REGAL	11e07fd63b	fix: graceful shutdown for Docker images (#10815 )	2024-12-13 18:23:50 +01:00
Diego Devesa	59f4db1088	ggml : add predefined list of CPU backend variants to build (#10626 ) * ggml : add predefined list of CPU backend variants to build * update CPU dockerfiles	2024-12-04 14:45:40 +01:00
Diego Devesa	3420909dff	ggml : automatic selection of best CPU backend (#10606 ) * ggml : automatic selection of best CPU backend * amx : minor opt * add GGML_AVX_VNNI to enable avx-vnni, fix checks	2024-12-01 16:12:41 +01:00
R0CKSTAR	249cd93da3	mtgpu: Add MUSA_DOCKER_ARCH in Dockerfiles && update cmake and make (#10516 ) Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>	2024-11-26 17:00:41 +01:00
Xuan Son Nguyen	45abe0f74e	server : replace behave with pytest (#10416 ) * server : replace behave with pytest * fix test on windows * misc * add more tests * more tests * styling * log less, fix embd test * added all sequential tests * fix coding style * fix save slot test * add parallel completion test * fix parallel test * remove feature files * update test docs * no cache_prompt for some tests * add test_cache_vs_nocache_prompt	2024-11-26 16:20:18 +01:00
Johannes Gäßler	75207b3a88	docker: use GGML_NATIVE=OFF (#10368 )	2024-11-18 00:21:53 +01:00
Romain Biessy	57f8355b29	sycl: Update Intel docker images to use DPC++ 2025.0 (#10305 )	2024-11-15 13:10:45 +02:00
Chenguang Li	231f9360d9	cann: dockerfile and doc adjustment (#10302 ) Co-authored-by: noemotiovon <noemotiovon@gmail.com>	2024-11-15 15:09:35 +08:00
Diego Devesa	ae8de6d50a	ggml : build backends as libraries (#10256 ) * ggml : build backends as libraries --------- Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> Co-authored-by: R0CKSTAR <xiaodong.ye@mthreads.com>	2024-11-14 18:04:35 +01:00
R0CKSTAR	cf8e0a3bb9	musa: add docker image support (#9685 ) * mtgpu: add docker image support Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> * mtgpu: enable docker workflow Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> --------- Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>	2024-10-10 20:10:37 +02:00
serhii-nakon	6f1d9d71f4	Fix Docker ROCM builds, use AMDGPU_TARGETS instead of GPU_TARGETS (#9641 ) * Fix Docker ROCM builds, use AMDGPU_TARGETS instead of GPU_TARGETS * Set ROCM_DOCKER_ARCH as string due it incorrectly build and cause OOM exit code	2024-09-30 20:57:12 +02:00
slaren	048de848ee	docker : fix missing binaries in full-cuda image (#9278 )	2024-09-02 18:11:13 +02:00

131 Commits