Commit Graph

1263 Commits

Author SHA1 Message Date
Georgi Gerganov
811f653f95 py : cosmetics 2023-08-21 20:40:08 +03:00
goerch
49c25cce19 tests : use new tokenizer type API (#2692)
* Merge tokenizer fixes into the gguf branch.

* Add test vocabularies

* Adapt convert-new.py (and fix a clang-cl compiler error on windows)

* Improved tokenizer test

But does it work on MacOS?

* Improve token type support

- Added @klosax code to convert.py
- Improved token type support in vocabulary

* Exclude platform dependent tests

* More sentencepiece compatibility by eliminating magic numbers

* Restored accidentally removed comment

* Improve commentary

* Use token type API in test-tokenizer-1.cpp
2023-08-21 20:11:14 +03:00
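The commits above replace hard-coded token-type "magic numbers" with a named token-type API. A minimal Python sketch of the idea: the enum values follow SentencePiece's piece-type convention (which GGUF's vocab metadata mirrors), while `classify` is a hypothetical illustrative helper, not code from the repository:

```python
from enum import IntEnum

# Token types as used by SentencePiece and mirrored in GGUF vocab metadata.
class TokenType(IntEnum):
    NORMAL = 1
    UNKNOWN = 2
    CONTROL = 3
    USER_DEFINED = 4
    UNUSED = 5
    BYTE = 6

def classify(piece: str) -> TokenType:
    """Hypothetical helper: assign a token type by inspecting the vocab piece,
    instead of comparing against raw integer constants."""
    if piece == "<unk>":
        return TokenType.UNKNOWN
    if piece in ("<s>", "</s>"):
        return TokenType.CONTROL
    # SentencePiece byte-fallback tokens look like "<0x0A>"
    if piece.startswith("<0x") and piece.endswith(">"):
        return TokenType.BYTE
    return TokenType.NORMAL
```

Eliminating the magic numbers this way also lets the tokenizer tests assert on names (`TokenType.BYTE`) rather than bare integers.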
Georgi Gerganov
0b53b8b08d llama : add API for token type
ggml-ci
2023-08-21 19:35:31 +03:00
goerch
8d177eddeb llama : improve token type support (#2668)
* Merge tokenizer fixes into the gguf branch.

* Add test vocabularies

* Adapt convert-new.py (and fix a clang-cl compiler error on windows)

* Improved tokenizer test

But does it work on MacOS?

* Improve token type support

- Added @klosax code to convert.py
- Improved token type support in vocabulary

* Exclude platform dependent tests

* More sentencepiece compatibility by eliminating magic numbers

* Restored accidentally removed comment
2023-08-21 18:56:02 +03:00
Kerfuffle
e06cbcee73 gguf : add Python script to convert GGMLv3 LLaMA models to GGUF (#2682)
* First pass at converting GGMLv3 LLaMA models to GGUF

* Cleanups, better output during conversion

* Fix vocab space conversion logic

* More vocab conversion fixes

* Add description to converted GGUF files

* Improve help text, expand warning

* Allow specifying name and description for output GGUF

* Allow overriding vocab and hyperparams from original model metadata

* Use correct params override var name

* Fix wrong type size for Q8_K

Better handling of original style metadata

* Set default value for gguf add_tensor raw_shape KW arg
2023-08-21 17:45:52 +03:00
Georgi Gerganov
6490ff7198 py : fix whitespace 2023-08-21 16:42:27 +03:00
Georgi Gerganov
1e7a0092dd Merge branch 'master' into gguf
ggml-ci
2023-08-21 16:28:30 +03:00
klosax
7a7d1ba68a convert-llama-hf-to-gguf.py : rope scale fix 2023-08-21 14:12:02 +02:00
klosax
9070e330ab convert-llama-7b-pth-to-gguf.py : rope scale fix 2023-08-21 14:11:22 +02:00
klosax
c082b9fa0b llama.cpp : use rope scale kv 2023-08-21 13:30:03 +02:00
klosax
dc1f051013 convert-llama-7b-pth-to-gguf.py : rope scale and added tokens 2023-08-21 13:27:53 +02:00
klosax
5f6ff387ca convert-llama-hf-to-gguf.py : rope scale and added tokens 2023-08-21 13:25:14 +02:00
klosax
6a69a693cb gguf.py : fix rope scale kv 2023-08-21 13:23:10 +02:00
Shouzheng Liu
dadbed99e6 metal : fix synchronization in new matrix multiplication kernel (#2686) 2023-08-21 13:59:29 +03:00
Kawrakow
cb1c0727bd HellaSwag: split token evaluation into batches if needed (#2681)
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2023-08-21 11:11:31 +03:00
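The HellaSwag change above amounts to chunking token evaluation so a long sequence never exceeds the batch size. A minimal sketch of that splitting step (names are illustrative, not taken from the repository):

```python
def split_into_batches(tokens: list[int], n_batch: int) -> list[list[int]]:
    """Split a token sequence into consecutive chunks of at most n_batch
    tokens, so each chunk can be evaluated in a single model call."""
    if n_batch <= 0:
        raise ValueError("n_batch must be positive")
    return [tokens[i:i + n_batch] for i in range(0, len(tokens), n_batch)]
```

Each chunk is then evaluated in order, with the running position offset carried across calls so the model sees one continuous sequence.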
klosax
c818c405e0 convert-llama-hf-to-gguf.py : fix attn_q permute 2023-08-21 04:42:09 +02:00
klosax
58bde5c5c1 Delete convert-permute-debug.py 2023-08-21 04:35:06 +02:00
klosax
287db51015 Delete convert-permute-debug-master.py 2023-08-21 04:34:39 +02:00
klosax
d5c8fcfd8a convert.py : 70b model working (change attn_q permute) 2023-08-21 04:33:33 +02:00
klosax
7de7cb4bd8 convert-permute-debug.py : change permute type of attn_q 2023-08-21 04:06:59 +02:00
klosax
4f92488dd6 convert-permute-debug-master.py : permute debug for master 2023-08-21 03:44:16 +02:00
klosax
5a02b9625a convert-permute-debug.py : permute debug print 2023-08-21 03:24:29 +02:00
slaren
9e232f0234 ggml : move all type info to ggml_type_traits (#2663) 2023-08-20 22:17:53 +02:00
klosax
f838faa874 convert-llama-7b-pth-to-gguf.py : special tokens 2023-08-20 16:56:48 +02:00
klosax
76b46627e2 convert-llama-hf-to-gguf.py : special tokens 2023-08-20 16:54:42 +02:00
Kawrakow
5e9ff54a67 More efficient Hellaswag implementation (#2677)
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2023-08-20 16:44:46 +03:00
klosax
28b8c265eb cmpnct_gpt2bpe.hpp : cleanup 2023-08-19 18:26:51 +02:00
klosax
c0a1269b7f Update examples/server/README.md
Co-authored-by: slaren <slarengh@gmail.com>
2023-08-19 15:27:37 +02:00
klosax
6a2e520095 cmpnct_gpt2bpe.hpp : remove non-general stuff 2023-08-19 13:19:02 +02:00
klosax
8945d47f52 gptneox-main.cpp : fixes 2023-08-19 12:09:24 +02:00
klosax
781bf2481f falcon-main.cpp : fixes 2023-08-19 12:08:17 +02:00
klosax
dadf098b5a cmpnct_gpt2bpe.hpp : fixes 2023-08-19 12:06:22 +02:00
klosax
b3a7a2b486 convert-falcon-hf-to-gguf.py : add tensor data layout 2023-08-19 12:05:11 +02:00
klosax
2c8055b65b convert-falcon-hf-to-gguf.py : update ref 2023-08-19 01:08:39 +02:00
klosax
1d80eea574 falcon-main.cpp : fix for falcon 40b 2023-08-19 01:03:37 +02:00
klosax
bd5a57901b gguf.py : fix for falcon 40b 2023-08-19 01:01:52 +02:00
klosax
281d6d1105 convert-llama-hf-to-gguf.py : remove extra kv 2023-08-19 00:32:56 +02:00
klosax
593b04fdcd convert-llama-7b-pth-to-gguf.py : remove extra kv 2023-08-19 00:32:27 +02:00
klosax
c0e4ca630b convert-gptneox-hf-to-gguf.py : remove extra kv 2023-08-19 00:31:56 +02:00
klosax
16ab9ba3b3 convert-falcon-hf-to-gguf.py : remove extra kv 2023-08-19 00:31:28 +02:00
klosax
d5e976c12b falcon-main.cpp : falcon inference example 2023-08-19 00:02:18 +02:00
Georgi Gerganov
1f0bccb279 server : better default prompt (#2646) 2023-08-19 05:45:36 +08:00
Jhen-Jie Hong
f63564adfa server : update xxd usage for older versions compatibility (#2649)
* server : update xxd usage for older versions compatibility

* remove unused $func
2023-08-19 05:41:32 +08:00
Adrian
2d8b76a110 Add link to clojure bindings to Readme. (#2659) 2023-08-18 21:39:22 +02:00
klosax
fb7c883cd3 convert-falcon-hf-to-gguf.py : falcon HF --> gguf conversion, not tested 2023-08-18 20:14:01 +02:00
Georgi Gerganov
25b8a8922d llama : introduce enum llama_vocab_type + remove hardcoded string constants 2023-08-18 18:46:38 +03:00
Georgi Gerganov
7af633aec3 readme : incoming BREAKING CHANGE 2023-08-18 17:48:31 +03:00
Georgi Gerganov
a4ad2bf35c llama : fix MPI build
ggml-ci
2023-08-18 17:34:27 +03:00
Georgi Gerganov
5d2656d670 llama : avoid hardcoded special tokens 2023-08-18 17:29:20 +03:00
Georgi Gerganov
035d511457 llama : minor API updates 2023-08-18 17:10:20 +03:00