mirror of https://github.com/ggml-org/llama.cpp.git, synced 2025-10-28 08:31:25 +00:00

	llama : llama_perf + option to disable timings during decode (#9355)
* llama : llama_perf + option to disable timings during decode

ggml-ci

* common : add llama_arg

* Update src/llama.cpp

Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>

* perf : separate functions in the API

ggml-ci

* perf : safer pointer handling + naming update

ggml-ci

* minor : better local var name

* perf : abort on invalid sampler pointer

ggml-ci

---------

Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
@@ -292,7 +292,7 @@ int main(int argc, char ** argv) {
     }
 
     LOG_TEE("\n");
-    llama_perf_print(ctx, LLAMA_PERF_TYPE_CONTEXT);
+    llama_perf_context_print(ctx);
 
     // clean up
     llama_batch_free(query_batch);
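
The hunk above swaps a single tagged entry point for a context-specific function, in line with the "perf : separate functions in the API" and "abort on invalid sampler pointer" items in the commit message. A minimal sketch of that pattern follows. Every `mock_*` and `MOCK_*` name is a hypothetical stand-in invented for illustration, not part of llama.cpp, and the struct fields and output strings are assumptions rather than the library's real timing data.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-ins for illustration only; not the real llama.cpp types.
struct mock_context { double t_eval_ms;   int n_eval;   };
struct mock_sampler { double t_sample_ms; int n_sample; };

// Old shape: one entry point, an untyped pointer, and a tag describing it.
enum mock_perf_type { MOCK_PERF_TYPE_CONTEXT, MOCK_PERF_TYPE_SAMPLER };

// New shape: one function per object type, so passing the wrong
// kind of pointer is a compile error instead of a runtime surprise.
std::string mock_perf_context_format(const mock_context * ctx) {
    assert(ctx != nullptr); // fail loudly on an invalid pointer
    char buf[128];
    std::snprintf(buf, sizeof(buf), "eval time = %.2f ms / %d tokens",
                  ctx->t_eval_ms, ctx->n_eval);
    return buf;
}

std::string mock_perf_sampler_format(const mock_sampler * smpl) {
    assert(smpl != nullptr); // fail loudly on an invalid pointer
    char buf[128];
    std::snprintf(buf, sizeof(buf), "sampling time = %.2f ms / %d runs",
                  smpl->t_sample_ms, smpl->n_sample);
    return buf;
}

// Old-style dispatcher, kept for comparison: the caller has to supply a tag
// that matches the pointer, and nothing checks that it does.
std::string mock_perf_format(const void * p, mock_perf_type type) {
    switch (type) {
        case MOCK_PERF_TYPE_CONTEXT:
            return mock_perf_context_format(static_cast<const mock_context *>(p));
        case MOCK_PERF_TYPE_SAMPLER:
            return mock_perf_sampler_format(static_cast<const mock_sampler *>(p));
    }
    return "";
}
```

With the split functions, a call site reads `mock_perf_context_format(&ctx)` and needs no type tag, which mirrors the shape of the one-line change in the hunk.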
	 Georgi Gerganov