llama.cpp/convert-pth-to-ggml.py
Ronsor 956dfda8ad Use tokenizer.vocab_size() instead of hardcoding 32000 in convert-pth-to-ggml.py (#142)
Special tokens or other new tokens can be added to the tokenizer, so it is best not to assume the vocabulary has exactly 32000 tokens.
2023-03-15 21:37:50 +02:00

5.3 KiB
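
The change described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: `StubTokenizer` and `write_vocab` are hypothetical names, and the record layout (an int32 length followed by UTF-8 bytes) is an assumption made for the example. The stub only mirrors the `vocab_size()` and `id_to_piece()` method names of a SentencePiece tokenizer so that the snippet runs without a model file.

```python
import struct
from io import BytesIO


class StubTokenizer:
    """Hypothetical stand-in for a real tokenizer (not part of the script)."""

    def __init__(self, pieces):
        self._pieces = list(pieces)

    def vocab_size(self):
        # Number of tokens the tokenizer actually knows about.
        return len(self._pieces)

    def id_to_piece(self, token_id):
        # Text of the token with the given id.
        return self._pieces[token_id]


def write_vocab(tokenizer, fout):
    """Write each token as <int32 length><utf-8 bytes> (assumed layout).

    Returns the number of tokens written.
    """
    # Ask the tokenizer for its size instead of hardcoding 32000,
    # so added special tokens are not silently dropped.
    n_vocab = tokenizer.vocab_size()
    for token_id in range(n_vocab):
        text = tokenizer.id_to_piece(token_id).encode("utf-8")
        fout.write(struct.pack("i", len(text)))
        fout.write(text)
    return n_vocab


# A five-token vocabulary, including one extra token added after training.
tokenizer = StubTokenizer(["<unk>", "<s>", "</s>", "hello", "<extra_token>"])
buffer = BytesIO()
count = write_vocab(tokenizer, buffer)
```

With a hardcoded count, the loop would either stop before the added token or run past the end of a smaller vocabulary; querying the tokenizer keeps the written vocabulary and its reported size in agreement.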