mirror of
https://github.com/ggml-org/llama.cpp.git
synced 2025-10-27 08:21:30 +00:00
Move convert.py to examples/convert-legacy-llama.py (#7430)
* Move convert.py to examples/convert-no-torch.py * Fix CI, scripts, readme files * convert-no-torch -> convert-legacy-llama * Move vocab thing to vocab.py * Fix convert-no-torch -> convert-legacy-llama * Fix lost convert.py in ci/run.sh * Fix imports * Fix gguf not imported correctly * Fix flake8 complaints * Fix check-requirements.sh * Get rid of ADDED_TOKENS_FILE, FAST_TOKENIZER_FILE * Review fixes
This commit is contained in:
@@ -704,7 +704,8 @@ Building the program with BLAS support may lead to some performance improvements
|
||||
|
||||
To obtain the official LLaMA 2 weights please see the <a href="#obtaining-and-using-the-facebook-llama-2-model">Obtaining and using the Facebook LLaMA 2 model</a> section. There is also a large selection of pre-quantized `gguf` models available on Hugging Face.
|
||||
|
||||
Note: `convert.py` does not support LLaMA 3, you can use `convert-hf-to-gguf.py` with LLaMA 3 downloaded from Hugging Face.
|
||||
Note: `convert.py` has been moved to `examples/convert-legacy-llama.py` and shouldn't be used for anything other than `Llama/Llama2/Mistral` models and their derievatives.
|
||||
It does not support LLaMA 3, you can use `convert-hf-to-gguf.py` with LLaMA 3 downloaded from Hugging Face.
|
||||
|
||||
```bash
|
||||
# obtain the official LLaMA model weights and place them in ./models
|
||||
@@ -721,10 +722,10 @@ ls ./models
|
||||
python3 -m pip install -r requirements.txt
|
||||
|
||||
# convert the model to ggml FP16 format
|
||||
python3 convert.py models/mymodel/
|
||||
python3 convert-hf-to-gguf.py models/mymodel/
|
||||
|
||||
# [Optional] for models using BPE tokenizers
|
||||
python convert.py models/mymodel/ --vocab-type bpe
|
||||
python convert-hf-to-gguf.py models/mymodel/ --vocab-type bpe
|
||||
|
||||
# quantize the model to 4-bits (using Q4_K_M method)
|
||||
./quantize ./models/mymodel/ggml-model-f16.gguf ./models/mymodel/ggml-model-Q4_K_M.gguf Q4_K_M
|
||||
|
||||
Reference in New Issue
Block a user