Mirror of https://github.com/ggml-org/llama.cpp.git, synced 2025-10-30 08:42:00 +00:00

	HIP: rocWMMA documentation and enabling in workflow builds (#12179)
* Enable rocWMMA for Windows CI build
* Enable for Ubuntu
* GGML_HIP_ROCWMMA_FATTN documentation work
.github/workflows/build.yml (16 changes, vendored)
							| @@ -467,6 +467,7 @@ jobs: | ||||
|         run: | | ||||
|           cmake -B build -S . \ | ||||
|             -DCMAKE_HIP_COMPILER="$(hipconfig -l)/clang" \ | ||||
|             -DGGML_HIP_ROCWMMA_FATTN=ON \ | ||||
|             -DGGML_HIP=ON | ||||
|           cmake --build build --config Release -j $(nproc) | ||||
|  | ||||
| @@ -476,6 +477,7 @@ jobs: | ||||
|           cmake -B build2 -S . \ | ||||
|             -DCMAKE_C_COMPILER=hipcc \ | ||||
|             -DCMAKE_CXX_COMPILER=hipcc \ | ||||
|             -DGGML_HIP_ROCWMMA_FATTN=ON \ | ||||
|             -DGGML_HIP=ON | ||||
|           cmake --build build2 --config Release -j $(nproc) | ||||
|  | ||||
| @@ -1202,6 +1204,11 @@ jobs: | ||||
|         id: checkout | ||||
|         uses: actions/checkout@v4 | ||||
|  | ||||
|       - name: Clone rocWMMA repository | ||||
|         id: clone_rocwmma | ||||
|         run: | | ||||
|           git clone https://github.com/rocm/rocwmma --branch rocm-6.2.4 --depth 1 | ||||
|  | ||||
|       - name: Install | ||||
|         id: depends | ||||
|         run: | | ||||
| @@ -1231,8 +1238,10 @@ jobs: | ||||
|           cmake -G "Unix Makefiles" -B build -S . ` | ||||
|             -DCMAKE_C_COMPILER="${env:HIP_PATH}\bin\clang.exe" ` | ||||
|             -DCMAKE_CXX_COMPILER="${env:HIP_PATH}\bin\clang++.exe" ` | ||||
|             -DCMAKE_CXX_FLAGS="-Irocwmma/library/include/" ` | ||||
|             -DCMAKE_BUILD_TYPE=Release ` | ||||
|             -DGGML_HIP=ON ` | ||||
|             -DGGML_HIP_ROCWMMA_FATTN=ON ` | ||||
|             -DGGML_RPC=ON | ||||
|           cmake --build build -j ${env:NUMBER_OF_PROCESSORS} | ||||
|  | ||||
| @@ -1251,6 +1260,11 @@ jobs: | ||||
|         with: | ||||
|             fetch-depth: 0 | ||||
|  | ||||
|       - name: Clone rocWMMA repository | ||||
|         id: clone_rocwmma | ||||
|         run: | | ||||
|           git clone https://github.com/rocm/rocwmma --branch rocm-6.2.4 --depth 1 | ||||
|  | ||||
|       - name: ccache | ||||
|         uses: hendrikmuhs/ccache-action@v1.2.16 | ||||
|         with: | ||||
| @@ -1280,8 +1294,10 @@ jobs: | ||||
|           cmake -G "Unix Makefiles" -B build -S . ` | ||||
|             -DCMAKE_C_COMPILER="${env:HIP_PATH}\bin\clang.exe" ` | ||||
|             -DCMAKE_CXX_COMPILER="${env:HIP_PATH}\bin\clang++.exe" ` | ||||
|             -DCMAKE_CXX_FLAGS="-Irocwmma/library/include/" ` | ||||
|             -DCMAKE_BUILD_TYPE=Release ` | ||||
|             -DAMDGPU_TARGETS=${{ matrix.gpu_target }} ` | ||||
|             -DGGML_HIP_ROCWMMA_FATTN=ON ` | ||||
|             -DGGML_HIP=ON ` | ||||
|             -DGGML_RPC=ON | ||||
|           cmake --build build -j ${env:NUMBER_OF_PROCESSORS} | ||||
|   | ||||
| @@ -235,6 +235,12 @@ You can download it from your Linux distro's package manager or from here: [ROCm | ||||
|   On Linux it is also possible to use unified memory architecture (UMA) to share main memory between the CPU and integrated GPU by setting `-DGGML_HIP_UMA=ON`. | ||||
|   However, this hurts performance for non-integrated GPUs (but enables working with integrated GPUs). | ||||
|  | ||||
|   To enhance flash attention performance on RDNA3+ or CDNA architectures, you can utilize the rocWMMA library by enabling the `-DGGML_HIP_ROCWMMA_FATTN=ON` option. This requires rocWMMA headers to be installed on the build system. | ||||
|  | ||||
|   The rocWMMA library is included by default when installing the ROCm SDK using the `rocm` meta package provided by AMD. Alternatively, if you are not using the meta package, you can install the library using the `rocwmma-dev` or `rocwmma-devel` package, depending on your system's package manager. | ||||
|  | ||||
|   You can also install the library manually: clone it from the official [GitHub repository](https://github.com/ROCm/rocWMMA), check out the corresponding version tag (e.g. `rocm-6.2.4`), and set `-DCMAKE_CXX_FLAGS="-I<path/to/rocwmma>/library/include/"` in CMake. This also works under Windows, although it is not officially supported by AMD. | ||||
|  | ||||
|   Note that if you get the following error: | ||||
|   ``` | ||||
|   clang: error: cannot find ROCm device library; provide its path via '--rocm-path' or '--rocm-device-lib-path', or pass '-nogpulib' to build without ROCm device library | ||||
|   | ||||
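
The manual-install route described in the documentation hunk can be sketched as a small shell script. This is only an illustration under stated assumptions: `ROCWMMA_DIR`, `ROCWMMA_INCLUDE_FLAG`, and `CONFIGURE_CMD` are hypothetical variable names introduced here, the checkout location is a guess, and the script prints the configure command for review instead of running it. The CMake options and the `rocm-6.2.4` tag are the ones that appear in the diff above.

```shell
#!/bin/sh
# Hypothetical location of a manual rocWMMA checkout; adjust to your system.
# One way to obtain it (needs network access, mirrors the CI step in the diff):
#   git clone https://github.com/rocm/rocwmma --branch rocm-6.2.4 --depth 1
ROCWMMA_DIR="${ROCWMMA_DIR:-$PWD/rocwmma}"

# Include path that the documentation says to pass through CMAKE_CXX_FLAGS.
ROCWMMA_INCLUDE_FLAG="-I${ROCWMMA_DIR}/library/include/"

# Assemble the configure command as text so it can be inspected before use.
CONFIGURE_CMD="cmake -B build -S . -DGGML_HIP=ON -DGGML_HIP_ROCWMMA_FATTN=ON -DCMAKE_CXX_FLAGS=\"${ROCWMMA_INCLUDE_FLAG}\""

# Dry run: print the command rather than executing it.
echo "$CONFIGURE_CMD"
```

If the printed command looks right for your setup, run it from the llama.cpp source directory and then build with `cmake --build build --config Release`. When rocWMMA comes from the `rocm` meta package or a `rocwmma-dev` / `rocwmma-devel` package, the extra `-DCMAKE_CXX_FLAGS` include path should not be needed.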
	 David Huang