readme : update hot topics

Author: Georgi Gerganov
```diff
@@ -10,13 +10,9 @@
 Inference of [LLaMA](https://arxiv.org/abs/2302.13971) model in pure C/C++
 
 ### Hot topics
-- ‼️ BPE tokenizer update: existing Falcon and Starcoder `.gguf` models will need to be reconverted: [#3252](https://github.com/ggerganov/llama.cpp/pull/3252)
-- ‼️ Breaking change: `rope_freq_base` and `rope_freq_scale` must be set to zero to use the model default values: [#3401](https://github.com/ggerganov/llama.cpp/pull/3401)
-- Parallel decoding + continuous batching support added: [#3228](https://github.com/ggerganov/llama.cpp/pull/3228) \
-  **Devs should become familiar with the new API**
-- Local Falcon 180B inference on Mac Studio
 
-  https://github.com/ggerganov/llama.cpp/assets/1991296/98abd4e8-7077-464c-ae89-aebabca7757e
+- LLaVA support: https://github.com/ggerganov/llama.cpp/pull/3436
+- ‼️ BPE tokenizer update: existing Falcon and Starcoder `.gguf` models will need to be reconverted: [#3252](https://github.com/ggerganov/llama.cpp/pull/3252)
 
 ----
 
```
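Two of the removed bullets describe changes that developers may still need to act on, so a short note on each follows. First, the #3401 breaking change: leaving `rope_freq_base` and `rope_freq_scale` at zero tells llama.cpp to use the RoPE values stored in the model file instead of overriding them. A minimal sketch, assuming the `llama_context_params` field names from `llama.h` of this period; the `make_ctx` helper is hypothetical, for illustration only:

```c
// Sketch of the #3401 convention: zero means "use the model's own RoPE values".
// Field names follow llama.h around this commit; make_ctx is a hypothetical
// helper, not code from the PR.
#include "llama.h"

struct llama_context * make_ctx(struct llama_model * model) {
    struct llama_context_params cparams = llama_context_default_params();
    cparams.rope_freq_base  = 0.0f; // 0 => read the RoPE base frequency from the GGUF metadata
    cparams.rope_freq_scale = 0.0f; // 0 => read the RoPE scale factor from the GGUF metadata
    return llama_new_context_with_model(model, cparams);
}
```

Before this change, callers passed explicit defaults to get standard behavior, which could silently override models that ship non-default RoPE settings.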
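Second, the removed warning that devs should become familiar with the new API refers to the `llama_batch`/`llama_decode` interface from #3228, which supersedes the older `llama_eval` call and enables the continuous batching mentioned in the same bullet. A minimal decode sketch, reconstructed from the public `llama.h` of this period; this API was still moving, so verify every signature against the header in your checkout:

```c
// Minimal decode loop against the batch API introduced in #3228. Signatures
// are reconstructed from llama.h around this commit and may have changed
// since; treat this as illustrative, not a pinned API reference.
#include "llama.h"

#include <stdio.h>
#include <string.h>

int main(int argc, char ** argv) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s <model.gguf>\n", argv[0]);
        return 1;
    }

    llama_backend_init(false); // false = no NUMA optimizations

    struct llama_model *   model = llama_load_model_from_file(argv[1], llama_model_default_params());
    struct llama_context * ctx   = llama_new_context_with_model(model, llama_context_default_params());

    // tokenize a short prompt (a fixed-size buffer keeps the sketch simple)
    const char * prompt = "Hello, world";
    llama_token  tokens[64];
    const int n_tokens = llama_tokenize(model, prompt, (int) strlen(prompt), tokens, 64, true);
    if (n_tokens <= 0) {
        fprintf(stderr, "tokenization failed\n");
        return 1;
    }

    // llama_batch_get_one wraps an existing token array as a single-sequence
    // batch - the compatibility path; llama_batch_init is the general one
    struct llama_batch batch = llama_batch_get_one(tokens, n_tokens, 0, 0);

    if (llama_decode(ctx, batch) != 0) {
        fprintf(stderr, "llama_decode failed\n");
        return 1;
    }

    // logits for the last token in the batch; sampling would go here
    const float * logits = llama_get_logits_ith(ctx, n_tokens - 1);
    (void) logits;

    llama_free(ctx);
    llama_free_model(model);
    llama_backend_free();
    return 0;
}
```

`llama_batch_get_one` is only the single-sequence compatibility wrapper; the general path builds a `llama_batch` with a position and sequence id per token, which is how several independent sequences get packed into one `llama_decode` call.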