22f281aa16
Changes:
- Move each example into its own function. This makes the code much
  easier to read and understand.
- Make it easy to run a single test by commenting out the other
  function calls in main().
- Make the output easy to parse by indenting each example's output.
- Add a shebang and the +x bit to make it clear the script is
  executable.
- Make the host configurable via --host, defaulting to 127.0.0.1:8080.
- Look the registered tool up in the tools list instead of hardcoding
  the returned values. This makes the code more copy-pastable.
- Add error checking, so the program exits 1 if the LLM didn't return
  the expected values. This makes it easy to check for correctness.
  (A sketch of the resulting structure follows this list.)
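For illustration, a minimal hypothetical sketch of that structure. The
names (get_weather, example_weather, run_tool) are made up, it assumes
llama-server's OpenAI-compatible /v1/chat/completions endpoint and the
third-party requests package, and the real script differs in details
such as how it encodes prompts:

    #!/usr/bin/env python3
    import argparse
    import json
    import sys

    import requests  # assumption: third-party HTTP client

    HOST = "127.0.0.1:8080"  # overridden by --host

    # Each tool is registered once: the schema advertised to the model
    # plus the local python function implementing it.
    TOOLS = {
        "get_weather": {
            "schema": {
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "description": "Get the current weather in a city",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            },
            "impl": lambda city: f"sunny in {city}",
        },
    }

    def chat(messages):
        # Send the conversation plus every registered tool schema.
        r = requests.post(
            f"http://{HOST}/v1/chat/completions",
            json={
                "messages": messages,
                "tools": [t["schema"] for t in TOOLS.values()],
            },
        )
        r.raise_for_status()
        return r.json()["choices"][0]["message"]

    def run_tool(tool_call):
        # Look the tool up in the registry instead of hardcoding the
        # returned value, so this stays copy-pastable.
        name = tool_call["function"]["name"]
        if name not in TOOLS:
            print(f"  unexpected tool {name!r}")
            sys.exit(1)  # the LLM didn't return the expected values
        args = json.loads(tool_call["function"]["arguments"])
        return TOOLS[name]["impl"](**args)

    def example_weather():
        msg = chat([{"role": "user",
                     "content": "What is the weather in Paris?"}])
        for tc in msg.get("tool_calls") or []:
            print(f"  {tc['function']['name']} -> {run_tool(tc)}")

    def main():
        global HOST
        parser = argparse.ArgumentParser()
        parser.add_argument("--host", default=HOST,
                            help="llama-server host:port")
        HOST = parser.parse_args().host
        # Comment out calls here to run only one example.
        example_weather()

    if __name__ == "__main__":
        main()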
Testing:
- Tested with Mistral-7B-Instruct-v0.3 and Meta-Llama-3-8B-Instruct,
  each in F16 and Q5_K_M.
  - I did not observe a single failure with Mistral-7B-Instruct-v0.3.
  - Llama-3 failed about a third of the time in example_concurrent
    (sketched below): it returned only one tool call instead of 3,
    even in F16.
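For context, example_concurrent is roughly of this shape. This is a
hypothetical reconstruction reusing chat() and run_tool() from the
sketch above; the actual prompt and wording differ:

    def example_concurrent():
        # One prompt that should produce three parallel tool calls.
        msg = chat([{
            "role": "user",
            "content": "What is the weather in Paris, Tokyo and Oslo?",
        }])
        calls = msg.get("tool_calls") or []
        if len(calls) != 3:
            # The Llama-3 failure mode: one call instead of 3.
            print(f"  expected 3 tool calls, got {len(calls)}")
            sys.exit(1)
        for tc in calls:
            print(f"  {tc['function']['name']} -> {run_tool(tc)}")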
Potential follow-ups:
- Fix the prompt encoding. It is not fixed in this change;
  surprisingly, it mostly works even though the prompt encoding is not
  optimized for the model.
- Add chained answer and response.

Test-only change.