	doc: fix outdated default value of batch size (#6336)
* doc: fix outdated default value of batch size
* doc: add doc for ubatch-size
@@ -296,7 +296,9 @@ These options help improve the performance and memory usage of the LLaMA models.
 
 ### Batch Size
 
--   `-b N, --batch-size N`: Set the batch size for prompt processing (default: 512). This large batch size benefits users who have BLAS installed and enabled it during the build. If you don't have BLAS enabled ("BLAS=0"), you can use a smaller number, such as 8, to see the prompt progress as it's evaluated in some situations.
+-   `-b N, --batch-size N`: Set the batch size for prompt processing (default: `2048`). This large batch size benefits users who have BLAS installed and enabled it during the build. If you don't have BLAS enabled ("BLAS=0"), you can use a smaller number, such as 8, to see the prompt progress as it's evaluated in some situations.
+
+-   `-ub N`, `--ubatch-size N`: physical maximum batch size. This is for pipeline parallelization. Default: `512`.
 
 ### Prompt Caching
 
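As a quick illustration of how the two options documented above fit together (not part of the commit itself): a minimal sketch, assuming a llama.cpp build from this era where the CLI binary is `./main`; the model path and prompt below are placeholders.

```sh
# Sketch, assuming the `./main` example binary and a quantized model at the
# placeholder path below.
#
# -b  / --batch-size  : logical batch size, i.e. how many prompt tokens are
#                       submitted per decode call (default now 2048)
# -ub / --ubatch-size : physical maximum batch size actually run through the
#                       model at once; relevant for pipeline parallelization
#                       (default 512)
./main -m models/7B/ggml-model-q4_0.gguf \
       -p "Building a website can be done in 10 simple steps:" \
       -b 2048 -ub 512
```

Since `-ub` caps the physical batch, a logical batch larger than the ubatch size is split into micro-batches internally, so in practice `-b` is kept greater than or equal to `-ub`.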
Ting Sun