mirror of https://github.com/ggml-org/llama.cpp.git
synced 2025-11-03 09:22:01 +00:00
	Updating a few lingering doc references for rename of main to llama-cli
@@ -733,7 +733,7 @@ Here is an example of a few-shot interaction, invoked with the command
 ./llama-cli -m ./models/13B/ggml-model-q4_0.gguf -n 256 --repeat_penalty 1.0 --color -i -r "User:" -f prompts/chat-with-bob.txt
 ```
 
-Note the use of `--color` to distinguish between user input and generated text. Other parameters are explained in more detail in the [README](examples/main/README.md) for the `main` example program.
+Note the use of `--color` to distinguish between user input and generated text. Other parameters are explained in more detail in the [README](examples/main/README.md) for the `llama-cli` example program.
 
 
@@ -958,7 +958,7 @@ docker run --gpus all -v /path/to/models:/models local/llama.cpp:server-cuda -m
 
 ### Docs
 
-- [main](./examples/main/README.md)
+- [main (cli)](./examples/main/README.md)
 - [server](./examples/server/README.md)
 - [jeopardy](./examples/jeopardy/README.md)
 - [BLIS](./docs/BLIS.md)
 
@@ -64,7 +64,7 @@ llama-cli.exe -m models\7B\ggml-model.bin --ignore-eos -n -1
 
 ## Common Options
 
-In this section, we cover the most commonly used options for running the `main` program with the LLaMA models:
+In this section, we cover the most commonly used options for running the `llama-cli` program with the LLaMA models:
 
 -   `-m FNAME, --model FNAME`: Specify the path to the LLaMA model file (e.g., `models/7B/ggml-model.gguf`; inferred from `--model-url` if set).
 -   `-mu MODEL_URL --model-url MODEL_URL`: Specify a remote http url to download the file (e.g https://huggingface.co/ggml-org/models/resolve/main/phi-2/ggml-model-q4_0.gguf).
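The two model-selection flags in the hunk above work together: `-mu` fetches the file and `-m` names where it lands. A minimal invocation sketch, using the URL from the option's own example; the local path and prompt are illustrative placeholders, not files shipped with the repo:

```shell
# Download the model on first run and cache it at the path given by -m
# (path and prompt below are placeholders; the URL is the README's example).
./llama-cli \
  -mu https://huggingface.co/ggml-org/models/resolve/main/phi-2/ggml-model-q4_0.gguf \
  -m models/phi-2/ggml-model-q4_0.gguf \
  -n 128 --prompt "Hello"
```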
@@ -74,7 +74,7 @@ In this section, we cover the most commonly used options for running the `main`
 
 ## Input Prompts
 
-The `main` program provides several ways to interact with the LLaMA models using input prompts:
+The `llama-cli` program provides several ways to interact with the LLaMA models using input prompts:
 
 -   `--prompt PROMPT`: Provide a prompt directly as a command-line option.
 -   `--file FNAME`: Provide a file containing a prompt or multiple prompts.
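The two prompt options listed above are interchangeable for a single prompt; a sketch of both forms, where the model path and prompt text are placeholders:

```shell
# Inline prompt (model path is a placeholder; adjust to your setup):
./llama-cli -m models/7B/ggml-model-q4_0.gguf --prompt "Once upon a time"

# Equivalent prompt read from a file:
printf 'Once upon a time' > prompt.txt
./llama-cli -m models/7B/ggml-model-q4_0.gguf --file prompt.txt
```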
@@ -82,7 +82,7 @@ The `main` program provides several ways to interact with the LLaMA models using
 
 ## Interaction
 
-The `main` program offers a seamless way to interact with LLaMA models, allowing users to engage in real-time conversations or provide instructions for specific tasks. The interactive mode can be triggered using various options, including `--interactive` and `--interactive-first`.
+The `llama-cli` program offers a seamless way to interact with LLaMA models, allowing users to engage in real-time conversations or provide instructions for specific tasks. The interactive mode can be triggered using various options, including `--interactive` and `--interactive-first`.
 
 In interactive mode, users can participate in text generation by injecting their input during the process. Users can press `Ctrl+C` at any time to interject and type their input, followed by pressing `Return` to submit it to the LLaMA model. To submit additional lines without finalizing input, users can end the current line with a backslash (`\`) and continue typing.
 
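The interactive options described in the hunk above can be combined with a reverse prompt so generation pauses and returns control whenever the model emits it. A hypothetical invocation; the model path is a placeholder:

```shell
# --interactive-first waits for your input before generating anything;
# -r makes "User:" a reverse prompt that hands control back to you.
./llama-cli -m models/7B/ggml-model-q4_0.gguf --interactive-first -r "User:"
```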