mirror of https://github.com/ggml-org/llama.cpp.git
commit 6bbc598a63
* use hipBLAS based on cuBLAS
* Update Makefile for the CUDA kernels
* Expand arch list and make it overridable
* Fix multi-GPU on multiple AMD architectures with rocblas_initialize() (#5)
* add hipBLAS to README
* new build arg LLAMA_CUDA_MMQ_Y
* fix half2 decomposition
* Add intrinsics polyfills for AMD
* AMD assembly optimized __dp4a
* Allow overriding CC_TURING
* use "ROCm" instead of "CUDA"
* ignore all build dirs
* Add Dockerfiles
* fix llama-bench
* fix -nommq help for non-CUDA/HIP

---------

Co-authored-by: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Co-authored-by: ardfork <134447697+ardfork@users.noreply.github.com>
Co-authored-by: funnbot <22226942+funnbot@users.noreply.github.com>
Co-authored-by: Engininja2 <139037756+Engininja2@users.noreply.github.com>
Co-authored-by: Kerfuffle <44031344+KerfuffleV2@users.noreply.github.com>
Co-authored-by: jammm <2500920+jammm@users.noreply.github.com>
Co-authored-by: jdecourval <7315817+jdecourval@users.noreply.github.com>
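Two items in the list above are worth illustrating. First, the intrinsics polyfills and the __dp4a bullet: CUDA's __dp4a() (a 4-way int8 dot product with accumulate) is not provided by HIP on every AMD target, so a port along these lines typically defines it by hand. The sketch below is a minimal illustration of that pattern, not the commit's exact code; the guarded gfx target list and the use of __builtin_amdgcn_sdot4 are assumptions about which architectures expose a native dot-product instruction.

#include <hip/hip_runtime.h>  // __device__ / __forceinline__ qualifiers under HIP

#if defined(GGML_USE_HIPBLAS)
// Four packed int8 lanes, matching CUDA's view of a 32-bit __dp4a operand.
typedef int8_t int8x4_t __attribute__((ext_vector_type(4)));

// Polyfill for CUDA's __dp4a(a, b, c): per-byte int8 products summed into c.
static __device__ __forceinline__ int __dp4a(const int a, const int b, int c) {
#if defined(__gfx906__) || defined(__gfx908__) || defined(__gfx90a__) || defined(__gfx1030__)
    // Assumption: these targets have a native sdot4 instruction; the list is illustrative.
    return __builtin_amdgcn_sdot4(a, b, c, false);
#else
    // Portable fallback: unpack the bytes and multiply-accumulate by hand.
    const int8x4_t va = reinterpret_cast<const int8x4_t &>(a);
    const int8x4_t vb = reinterpret_cast<const int8x4_t &>(b);
    return c + va[0] * vb[0] + va[1] * vb[1] + va[2] * vb[2] + va[3] * vb[3];
#endif
}
#endif // defined(GGML_USE_HIPBLAS)

Second, the multi-GPU fix: rocBLAS loads its GEMM kernels lazily, and on a machine mixing several AMD architectures that lazy load can go wrong when multiple device threads trigger it concurrently. Calling rocblas_initialize() once up front forces the load eagerly. A minimal sketch of that pattern follows; the helper name and header path are assumptions, not the commit's code.

#include <hip/hip_runtime.h>
#include <rocblas.h>  // rocblas_initialize(); newer ROCm installs use <rocblas/rocblas.h>

// Hypothetical init helper: force rocBLAS to load its kernels for all visible
// devices before any worker thread issues a GEMM.
static void init_rocblas_once(void) {
    rocblas_initialize();           // eager, blocking kernel load
    (void)hipDeviceSynchronize();   // make sure loading has finished before use
}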
18 lines · 121 B · Plaintext
*.o
*.a
.cache/
.vs/
.vscode/
.DS_Store

build*/

models/*

/main
/quantize

arm_neon.h
compile_commands.json
Dockerfile