Georgi Gerganov
e235b267a2  py : switch to snake_case (#8305)
						* py : switch to snake_case
ggml-ci
* cont
ggml-ci
* cont
ggml-ci
* cont : fix link
* gguf-py : use snake_case in scripts entrypoint export
* py : rename requirements for convert_legacy_llama.py
Needed for scripts/check-requirements.sh
---------
Co-authored-by: Francis Couture-Harpin <git@compilade.net>

2024-07-05 07:53:33 +03:00

ditsuke
821922916f  fix: Update script paths in CI scripts
2024-07-04 15:39:13 +00:00

Clint Herron
07a3fc0608  Removes multiple newlines at the end of files that is breaking the editorconfig step of CI. (#8258)
2024-07-02 12:18:10 -04:00

Georgi Gerganov
c70d117c37  scripts : fix filename sync
2024-06-26 23:25:22 +03:00

Georgi Gerganov
f2d48fffde  sync : ggml
2024-06-26 19:39:19 +03:00

Georgi Gerganov
f3f65429c4  llama : reorganize source code + improve CMake (#8006)
						* scripts : update sync [no ci]
* files : relocate [no ci]
* ci : disable kompute build [no ci]
* cmake : fixes [no ci]
* server : fix mingw build
ggml-ci
* cmake : minor [no ci]
* cmake : link math library [no ci]
* cmake : build normal ggml library (not object library) [no ci]
* cmake : fix kompute build
ggml-ci
* make,cmake : fix LLAMA_CUDA + replace GGML_CDEF_PRIVATE
ggml-ci
* move public backend headers to the public include directory (#8122 )
* move public backend headers to the public include directory
* nix test
* spm : fix metal header
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com >
* scripts : fix sync paths [no ci]
* scripts : sync ggml-blas.h [no ci]
---------
Co-authored-by: slaren <slarengh@gmail.com>

2024-06-26 18:33:02 +03:00

jaime-m-p
37bef89433  tokenizer : BPE fixes (#7530)
						* Random test: add_bos_token, add_eos_token
* Random test: add BPE models for testing
* Custom regex split fails with codepoint 0
* Fix falcon punctuation regex
* Refactor llm_tokenizer_bpe: move code to constructor
* Move 'add_special_bos/eos' logic to llm_tokenizer_bpe
* Move tokenizer flags to vocab structure.
* Default values for special_add_bos/eos
* Build vocab.special_tokens_cache using vocab token types
* Generalize 'jina-v2' per token attributes
* Fix unicode whitespaces (deepseek-coder, deepseek-llm)
* Skip missing byte tokens (falcon)
* Better unicode data generation
* Replace char32_t with uint32_t

2024-06-18 18:40:52 +02:00

Georgi Gerganov
5326bcceeb  ggml : sync
2024-06-18 09:50:45 +03:00

Olivier Chafik
1c641e6aac  build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)
						* `main`/`server`: rename to `llama` / `llama-server` for consistency w/ homebrew
* server: update refs -> llama-server
gitignore llama-server
* server: simplify nix package
* main: update refs -> llama
fix examples/main ref
* main/server: fix targets
* update more names
* Update build.yml
* rm accidentally checked in bins
* update straggling refs
* Update .gitignore
* Update server-llm.sh
* main: target name -> llama-cli
* Prefix all example bins w/ llama-
* fix main refs
* rename {main->llama}-cmake-pkg binary
* prefix more cmake targets w/ llama-
* add/fix gbnf-validator subfolder to cmake
* sort cmake example subdirs
* rm bin files
* fix llama-lookup-* Makefile rules
* gitignore /llama-*
* rename Dockerfiles
* rename llama|main -> llama-cli; consistent RPM bin prefixes
* fix some missing -cli suffixes
* rename dockerfile w/ llama-cli
* rename(make): llama-baby-llama
* update dockerfile refs
* more llama-cli(.exe)
* fix test-eval-callback
* rename: llama-cli-cmake-pkg(.exe)
* address gbnf-validator unused fread warning (switched to C++ / ifstream)
* add two missing llama- prefixes
* Updating docs for eval-callback binary to use new `llama-` prefix.
* Updating a few lingering doc references for rename of main to llama-cli
* Updating `run-with-preset.py` to use new binary names.
Updating docs around `perplexity` binary rename.
* Updating documentation references for lookup-merge and export-lora
* Updating two small `main` references missed earlier in the finetune docs.
* Update apps.nix
* update grammar/README.md w/ new llama-* names
* update llama-rpc-server bin name + doc
* Revert "update llama-rpc-server bin name + doc"
This reverts commit e474ef1df4 [...] <hanclinto@gmail.com>

2024-06-13 00:41:52 +01:00

Georgi Gerganov
1442677f92  common : refactor cli arg parsing (#7675)
						* common : gpt_params_parse do not print usage
* common : rework usage print (wip)
* common : valign
* common : rework print_usage
* infill : remove cfg support
* common : reorder args
* server : deduplicate parameters
ggml-ci
* common : add missing header
ggml-ci
* common : remote --random-prompt usages
ggml-ci
* examples : migrate to gpt_params
ggml-ci
* batched-bench : migrate to gpt_params
* retrieval : migrate to gpt_params
* common : change defaults for escape and n_ctx
* common : remove chatml and instruct params
ggml-ci
* common : passkey use gpt_params

2024-06-04 21:23:39 +03:00

Georgi Gerganov
554c247caf  ggml : remove OpenCL (#7735)
ggml-ci

2024-06-04 21:23:20 +03:00

slaren
adc9ff3841  llama-bench : allow using a different printer for stderr with -oe (#7722)
compare-commits.sh : hide stdout, use -oe to print markdown

2024-06-04 14:32:42 +02:00

Johannes Gäßler
c8047d538f  scripts: update compare_llama_bench.py [no ci] (#7673)
2024-05-31 16:26:21 +02:00

Galunid
9c4c9cc83f  Move convert.py to examples/convert-legacy-llama.py (#7430)
						* Move convert.py to examples/convert-no-torch.py
* Fix CI, scripts, readme files
* convert-no-torch -> convert-legacy-llama
* Move vocab thing to vocab.py
* Fix convert-no-torch -> convert-legacy-llama
* Fix lost convert.py in ci/run.sh
* Fix imports
* Fix gguf not imported correctly
* Fix flake8 complaints
* Fix check-requirements.sh
* Get rid of ADDED_TOKENS_FILE, FAST_TOKENIZER_FILE
* Review fixes

2024-05-30 21:40:00 +10:00

Georgi Gerganov
00281b7be3  scripts : remove mpi remnants
2024-05-29 14:31:18 +03:00

Georgi Gerganov
2ab977282b  sync : ggml
2024-05-29 14:29:52 +03:00

slaren
d359f30921  llama : remove MPI backend (#7395)
2024-05-20 01:17:03 +02:00

jaime-m-p
b43272afa2  Unicode codepoint flags for custom regexs (#7245)
						* Replace CODEPOINT_TYPE_* with codepoint_flags
* Update and bugfix brute force random test
* Deterministic brute force random test
* Unicode normalization NFD
* Get rid of BOM

2024-05-18 01:09:13 +02:00
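
For context on the codepoint-flag change above: each codepoint is classified once into flags (letter, number, punctuation, whitespace) that the custom regex split can test cheaply. The sketch below illustrates the idea in Python with the standard unicodedata module; it is not the generated C++ tables that llama.cpp actually ships.

```python
# Illustrative only: map a codepoint to coarse category flags, the same idea
# as the codepoint_flags that replace the CODEPOINT_TYPE_* constants.
import unicodedata

def codepoint_flags(cp: int) -> dict[str, bool]:
    cat = unicodedata.category(chr(cp))  # e.g. 'Lu', 'Nd', 'Po', 'Zs'
    return {
        "is_letter": cat.startswith("L"),
        "is_number": cat.startswith("N"),
        "is_punctuation": cat.startswith("P"),
        "is_whitespace": chr(cp).isspace(),
    }

print(codepoint_flags(ord("A")))  # letter
print(codepoint_flags(ord("7")))  # number
print(codepoint_flags(ord("!")))  # punctuation
```
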
Brian
51e9d02599  Added a single test function script and fix debug-test.sh to be more robust (#7279)
						* run-single-test.sh: added a single test function script and fix debug-test.sh to be more robust
* debug-test.sh: combined execute and gdb test mode via -g flag
* debug-test.sh: refactor
* debug-test: refactor for clarity
* debug-test.sh: comment style changes
* debug-test.sh: fix gdb

2024-05-17 22:40:14 +10:00

Georgi Gerganov
29499bb593  sync : ggml
2024-05-15 13:23:41 +03:00

Georgi Gerganov
9f773486ab  script : sync ggml-rpc
2024-05-14 19:14:38 +03:00

Georgi Gerganov
a5e3fde857  sync : ggml
ggml-ci

2024-05-14 19:08:09 +03:00

Georgi Gerganov
7bd4ffb780  metal : fix warnings (skipme) (#0)
2024-05-11 21:38:13 +03:00

Georgi Gerganov
1622ac023f  sync : ggml
2024-05-11 21:35:05 +03:00

Josh Ramer
fed0108491  Scripting & documenting debugging one test without anything else in the loop. (#7096)
						* A little documentation that shares my quick tips for working in the repository.
* Update startup-testing-debugging.md
* script that shows a menu of tests to pick from & run the debugger on
* debug-test.sh: Refactor CLI help message
* debug-test.sh: documentation update
* debug-test.sh: CLI Help output corrections
* debug-test.sh: minor doc fix
---------
authored-by: Josh Ramer <ubuntu@ip-172-31-32-53.ec2.internal >
Assisted-by: brian khuu <mofosyne@gmail.com>

2024-05-12 03:26:35 +10:00

Georgi Gerganov
fae9d234b6  sync : ggml
ggml-ci

2024-05-11 15:38:34 +03:00

slaren
e849648888  llama-bench : add pp+tg test type (#7199)
2024-05-10 18:03:54 +02:00

jaime-m-p
43248e5594  llama3 custom regex split (#6965)
						* merged the changes from deepseeker models to main branch
* Moved regex patterns to unicode.cpp and updated unicode.h
* Moved header files
* Resolved issues
* added and refactored unicode_regex_split and related functions
* Updated/merged the deepseek coder pr
* Refactored code
* Adding unicode regex mappings
* Adding unicode regex function
* Added needed functionality, testing remains
* Fixed issues
* Fixed issue with gpt2 regex custom preprocessor
* unicode : fix? unicode_wstring_to_utf8
* lint : fix whitespaces
* tests : add tokenizer tests for numbers
* unicode : remove redundant headers
* tests : remove and rename tokenizer test scripts
* tests : add sample usage
* gguf-py : reader prints warnings on duplicate keys
* llama : towards llama3 tokenization support (wip)
* unicode : shot in the dark to fix tests on Windows
* unicode : first try custom implementations
* convert : add "tokenizer.ggml.pre" GGUF KV (wip)
* llama : use new pre-tokenizer type
* convert : fix pre-tokenizer type writing
* lint : fix
* make : add test-tokenizer-0-llama-v3
* wip
* models : add llama v3 vocab file
* llama : adapt punctuation regex + add llama 3 regex
* minor
* unicode : set bomb
* unicode : set bomb
* unicode : always use std::wregex
* unicode : support \p{N}, \p{L} and \p{P} natively
* unicode : try fix windows
* unicode : category support via std::regex
* unicode : clean-up
* unicode : simplify
* llama3 custom regex split
* convert : add convert-hf-to-gguf-update.py
ggml-ci
* lint : update
* convert : add falcon
ggml-ci
* unicode : normalize signatures
* lint : fix
* lint : fix
* convert : remove unused functions
* convert : add comments
* convert : exercise contractions
ggml-ci
* Using char32_t for codepoints
* lint : fix
* already exists unicode_tolower()
* Typing
* Restore BOM
* cmake : refactor test targets
* tests : refactor vocab tests
ggml-ci
* tests : add more vocabs and tests
ggml-ci
* unicode : cleanup
* scripts : ignore new update script in check-requirements.sh
* Fix merge
* models : add phi-3, mpt, gpt-2, starcoder
* tests : disable obsolete
ggml-ci
* tests : use faster bpe test
ggml-ci
* llama : more prominent warning for old BPE models
* tests : disable test-tokenizer-1-bpe due to slowness
ggml-ci
* Move unused variable value
* GPT2 custom regex split
* Add alternative regex for custom aplit llama3
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com >
* Style
* Add bruteforce random tests for token encoding
* wip: fixing unicode codepoint ranges
* Fix merge
* Unicode tables: separator, lowercase, uppercase and whitespace
* llama3 custom regex split: fix \s
* Restore BOM
* Style
* wip: generate NDF table
* Ignore special tokens for testing
* Clean gen-unicode-data.py
* Refactor random tokenizer test
* lint : fix
* tests : add fail test for llama-bpe
---------
Co-authored-by: Jaggzh <jaggz.h@gmail.com >
Co-authored-by: Kazim Abrar Mahi <kazimabrarmahi135@gmail.com >
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com >
Co-authored-by: jaime-m-p <>

2024-05-09 23:30:44 +10:00

Brian
acdce3cdef  compare-llama-bench.py: add missing basicConfig (#7138)
						* compare-llama-bench.py: add missing basicConfig
* compare-llama-bench.py: Add line break between error message and print_help()
* Add regular print() markdown table

2024-05-08 10:54:39 +02:00

Brian
6fbd432211  py : logging and flake8 suppression refactoring (#7081)
						Set one as executable and add basicConfig()
to another. Also added noqa tag to test scripts.

2024-05-05 08:07:48 +03:00

Georgi Gerganov
92139b90af  tests : add test-tokenizer-0.sh + fix some tokenizers (#7036)
						* tests : add test-tokenizer-0.sh
* unicode : add all unicode number ranges
* starcoder : fix pre-tokenizer
* tests : add test that fails with DeepSeek tokenizers
* falcon : fix regex
* unicode : regenerate unicode tables
* refact : add tokenizer model
* lint : fix
* tests : disable failing tests
ggml-ci
* refact : add tests files
ggml-ci
* convert : print -> logging
ggml-ci
* lint : fix
* unicode : digit -> number
* phi-3 : update

2024-05-04 08:32:32 +03:00

Brian
a2ac89d6ef  convert.py : add python logging instead of print() (#6511)
						* convert.py: add python logging instead of print()
* convert.py: verbose flag takes priority over dump flag log suppression
* convert.py: named instance logging
* convert.py: use explicit logger id string
* convert.py: convert extra print() to named logger
* convert.py: sys.stderr.write --> logger.error
* *.py: Convert all python scripts to use logging module
* requirements.txt: remove extra line
* flake8: update flake8 ignore and exclude to match ci settings
* gh-actions: add flake8-no-print to flake8 lint step
* pre-commit: add flake8-no-print to flake8 and also update pre-commit version
* convert-hf-to-gguf.py: print() to logger conversion
* *.py: logging basiconfig refactor to use conditional expression
* *.py: removed commented out logging
* fixup! *.py: logging basiconfig refactor to use conditional expression
* constant.py: logger.error then exit should be a raise exception instead
* *.py: Convert logger error and sys.exit() into a raise exception (for atypical error)
* gguf-convert-endian.py: refactor convert_byteorder() to use tqdm progressbar
* verify-checksum-model.py: This is the result of the program, it should be printed to stdout.
* compare-llama-bench.py: add blank line for readability during missing repo response
* reader.py: read_gguf_file() use print() over logging
* convert.py: warning goes to stderr and won't hurt the dump output
* gguf-dump.py: dump_metadata() should print to stdout
* convert-hf-to-gguf.py: print --> logger.debug or ValueError()
* verify-checksum-models.py: use print() for printing table
* *.py: refactor logging.basicConfig()
* gguf-py/gguf/*.py: use __name__ as logger name
Since they will be imported and not run directly.
* python-lint.yml: use .flake8 file instead
* constants.py: logger no longer required
* convert-hf-to-gguf.py: add additional logging
* convert-hf-to-gguf.py: print() --> logger
* *.py: fix flake8 warnings
* revert changes to convert-hf-to-gguf.py for get_name()
* convert-hf-to-gguf-update.py: use triple quoted f-string instead
* *.py: accidentally corrected the wrong line
* *.py: add compilade warning suggestions and style fixes

2024-05-03 22:36:41 +03:00
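
The print-to-logging migration described above boils down to a small pattern: a module-level logger (named with __name__ when the file is imported rather than run), a logging.basicConfig call whose level is picked with a conditional expression so the verbose flag wins over the log suppression used for dump-style output, and print() kept only for output that is the actual result of the program. A hedged sketch with invented flag and module names, not the real convert.py code:

```python
import logging
import sys

# Script run directly: short logger name; imported gguf-py modules use __name__.
logger = logging.getLogger("convert")

def convert(verbose: bool = False, dump_only: bool = False) -> None:
    # Conditional-expression basicConfig: verbose takes priority over the
    # quieter level used when only dumping metadata.
    logging.basicConfig(
        level=logging.DEBUG if verbose else (logging.ERROR if dump_only else logging.INFO)
    )
    logger.info("starting conversion")              # was: print(...)
    logger.debug("extra detail shown only with --verbose")
    if dump_only:
        print("metadata dump is program output, so it stays on stdout")

if __name__ == "__main__":
    convert(verbose="--verbose" in sys.argv, dump_only="--dump" in sys.argv)
```
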
Georgi Gerganov
f4ab2a4147  llama : fix BPE pre-tokenization (#6920)
						* merged the changes from deepseeker models to main branch
* Moved regex patterns to unicode.cpp and updated unicode.h
* Moved header files
* Resolved issues
* added and refactored unicode_regex_split and related functions
* Updated/merged the deepseek coder pr
* Refactored code
* Adding unicode regex mappings
* Adding unicode regex function
* Added needed functionality, testing remains
* Fixed issues
* Fixed issue with gpt2 regex custom preprocessor
* unicode : fix? unicode_wstring_to_utf8
* lint : fix whitespaces
* tests : add tokenizer tests for numbers
* unicode : remove redundant headers
* tests : remove and rename tokenizer test scripts
* tests : add sample usage
* gguf-py : reader prints warnings on duplicate keys
* llama : towards llama3 tokenization support (wip)
* unicode : shot in the dark to fix tests on Windows
* unicode : first try custom implementations
* convert : add "tokenizer.ggml.pre" GGUF KV (wip)
* llama : use new pre-tokenizer type
* convert : fix pre-tokenizer type writing
* lint : fix
* make : add test-tokenizer-0-llama-v3
* wip
* models : add llama v3 vocab file
* llama : adapt punctuation regex + add llama 3 regex
* minor
* unicode : set bomb
* unicode : set bomb
* unicode : always use std::wregex
* unicode : support \p{N}, \p{L} and \p{P} natively
* unicode : try fix windows
* unicode : category support via std::regex
* unicode : clean-up
* unicode : simplify
* convert : add convert-hf-to-gguf-update.py
ggml-ci
* lint : update
* convert : add falcon
ggml-ci
* unicode : normalize signatures
* lint : fix
* lint : fix
* convert : remove unused functions
* convert : add comments
* convert : exercise contractions
ggml-ci
* lint : fix
* cmake : refactor test targets
* tests : refactor vocab tests
ggml-ci
* tests : add more vocabs and tests
ggml-ci
* unicode : cleanup
* scripts : ignore new update script in check-requirements.sh
* models : add phi-3, mpt, gpt-2, starcoder
* tests : disable obsolete
ggml-ci
* tests : use faster bpe test
ggml-ci
* llama : more prominent warning for old BPE models
* tests : disable test-tokenizer-1-bpe due to slowness
ggml-ci
---------
Co-authored-by: Jaggzh <jaggz.h@gmail.com >
Co-authored-by: Kazim Abrar Mahi <kazimabrarmahi135@gmail.com>

2024-04-29 16:58:41 +03:00
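
To make the pre-tokenization fix above concrete: before any BPE merges run, the text is split by a model-specific regex (hence the native \p{L}/\p{N}/\p{P} support and the new "tokenizer.ggml.pre" GGUF key), and getting that split wrong changes the resulting tokens. The sketch below uses Python's third-party regex module and the published GPT-2 pattern purely as an illustration; it is not the llama.cpp implementation, nor necessarily the pattern any given model uses.

```python
import regex  # third-party module with \p{...} Unicode category support

# Published GPT-2 pre-tokenizer pattern, shown here only as an example.
GPT2_PRE = r"""'s|'t|'re|'ve|'m|'ll|'d| ?\p{L}+| ?\p{N}+| ?[^\s\p{L}\p{N}]+|\s+(?!\S)|\s+"""

def pre_tokenize(text: str) -> list[str]:
    # Each chunk is later encoded independently by the BPE merges.
    return regex.findall(GPT2_PRE, text)

print(pre_tokenize("Hello world, 123 tokens!"))
# ['Hello', ' world', ',', ' 123', ' tokens', '!']
```
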
Olivier Chafik
5cf5e7d490  build: generate hex dump of server assets during build (#6661)
						* `build`: generate hex dumps of server assets on the fly
* build: workaround lack of -n on gnu xxd
* build: don't use xxd in cmake
* build: don't call xxd from build.zig
* build: more idiomatic hexing
* build: don't use xxd in Makefile (od hackery instead)
* build: avoid exceeding max cmd line limit in makefile hex dump
* build: hex dump assets at cmake build time (not config time)

2024-04-21 18:48:53 +01:00
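
The build change above embeds the server's web assets by hex-dumping them into C arrays at build time (via CMake rules and od rather than xxd). As a rough illustration of what such a generation step produces, here is a small Python sketch; the file and symbol names are made up.

```python
from pathlib import Path

def dump_asset_as_c_array(path: str, symbol: str) -> str:
    # Turn an asset into a C array + length so it can be compiled into a binary.
    data = Path(path).read_bytes()
    hex_bytes = ", ".join(f"0x{b:02x}" for b in data)
    return (
        f"unsigned char {symbol}[] = {{ {hex_bytes} }};\n"
        f"unsigned int {symbol}_len = {len(data)};\n"
    )

if __name__ == "__main__":
    Path("index.html").write_text("<h1>hi</h1>")
    print(dump_asset_as_c_array("index.html", "index_html"))
```
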
slaren
0d56246f4b  ggml : group all experts in a single ggml_mul_mat_id (#6505)
						* ggml : group all experts in a single ggml_mul_mat_id
cuda : improve mmid row copy
* cuda : fix bin bcast with non-cont src0
* test-backend-ops : only run all mul mat tests for base types
* llama : disable moe offloading with SYCL
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

2024-04-18 15:18:48 +02:00
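
Conceptually, the change above evaluates the per-expert matrix multiplications of a MoE layer as one indexed, batched operation instead of many small ones. The NumPy sketch below only illustrates that idea of gathering the routed expert weights and multiplying once; it is not ggml code and the shapes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n_expert, n_tok, d_in, d_out = 4, 8, 16, 32
w = rng.standard_normal((n_expert, d_out, d_in))   # per-expert weights
x = rng.standard_normal((n_tok, d_in))             # token activations
ids = rng.integers(0, n_expert, size=n_tok)        # routed expert per token

# Naive: one small matrix-vector product per token/expert pair.
y_naive = np.stack([w[ids[t]] @ x[t] for t in range(n_tok)])

# Grouped: gather the selected expert matrices and do a single batched matmul.
y_grouped = np.einsum("tij,tj->ti", w[ids], x)

assert np.allclose(y_naive, y_grouped)
```
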
Pierrick Hymbert
4bd0f93e4a  model: support arch DbrxForCausalLM (#6515)
						* model: dbrx convert to gguf
#6344 
* llama: support dbrx
#6344 
* doc: dbrx: add the model as supported
* scripts: get-wikitext-2 add unzip
* llama: increase maximum experts allowed
* llama: factorize moe graph implementation between grok, mixtral and dbrx
---------
Co-authored-by: Megha Agarwal <16129366+megha95@users.noreply.github.com>

2024-04-13 11:33:52 +02:00

Daniel Bevenius
f4183afe6a  scripts : add --outdir option to hf.sh (#6600)
						* scripts : add --outdir option to hf.sh
This commit adds an option to the hf.sh script that allows the user to
specify an output directory for the downloaded file.
The motivation for this changes is that examples that use the hf.sh
script to download models from huggingface can now specify the output
directory, perhaps to the `models` directory to keep them in one place
and not clutter the root directory.
Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com >
* squash! scripts : add --outdir option to hf.sh
Fix format of the --outdir option in the usage message.
Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com >
---------
Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>

2024-04-11 16:22:47 +03:00

Georgi Gerganov
c4a3a4ff47  sync : ggml
2024-04-09 20:29:06 +03:00

Georgi Gerganov
e11a8999b5  license : update copyright notice + add AUTHORS (#6405)
						* license : add AUTHORS
* authors : update
* scipts : add LICENSE and gen-authors.sh to sync

2024-04-09 09:23:19 +03:00

Georgi Gerganov
c37247796b  sync : ggml
2024-04-07 17:05:51 +03:00

Georgi Gerganov
43e8995e75  scripts : sync ggml-cuda folder
2024-04-07 16:08:12 +03:00

Georgi Gerganov
54ea0698fb  sync : ggml
2024-04-06 18:27:46 +03:00

Johannes Gäßler
33a5244806  compare-llama-bench.py: fix long hexsha args (#6424)
2024-04-01 13:30:43 +02:00

Georgi Gerganov
d48ccf3ad4  sync : ggml (#6351)
						* sync : ggml
ggml-ci
* cuda : move GGML_CUDA_DMMV constants to dmmv.cuh
---------
Co-authored-by: slaren <slarengh@gmail.com>

2024-03-29 17:45:46 +02:00

slaren
280345968d  cuda : rename build flag to LLAMA_CUDA (#6299)
2024-03-26 01:16:01 +01:00

Johannes Gäßler
50ccaf5eac  lookup: complement data from context with general text statistics (#5479)
						* lookup: evaluation tools, use corpus/previous gens
* fixup! lookup: evaluation tools, use corpus/previous gens
* fixup! lookup: evaluation tools, use corpus/previous gens
* fixup! lookup: evaluation tools, use corpus/previous gens
* fixup! lookup: evaluation tools, use corpus/previous gens

2024-03-23 01:24:36 +01:00
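
The lookup-decoding work above drafts candidate tokens from n-gram statistics gathered from the context, a text corpus, or previous generations, and the model then verifies the draft. Below is a rough, self-contained Python illustration of that idea; the function names are invented and this is not the llama.cpp implementation.

```python
from collections import Counter, defaultdict

def build_ngram_table(tokens: list[int], n: int = 2) -> dict[tuple, Counter]:
    # Count which token follows each n-gram in previously seen text.
    stats: dict[tuple, Counter] = defaultdict(Counter)
    for i in range(len(tokens) - n):
        key = tuple(tokens[i : i + n])
        stats[key][tokens[i + n]] += 1
    return stats

def propose_draft(stats: dict[tuple, Counter], recent: list[int], n: int = 2, k: int = 3) -> list[int]:
    # Greedily extend the most recent n-gram with its most frequent continuation.
    draft, context = [], list(recent)
    for _ in range(k):
        key = tuple(context[-n:])
        if key not in stats:
            break
        nxt = stats[key].most_common(1)[0][0]
        draft.append(nxt)
        context.append(nxt)
    return draft

history = [1, 2, 3, 4, 1, 2, 3, 5, 1, 2, 3, 4]
stats = build_ngram_table(history)
print(propose_draft(stats, recent=[1, 2]))  # -> [3, 4, 1]
```
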
Georgi Gerganov
b838b53ad6  sync : ggml
2024-03-10 20:10:46 +02:00

Georgi Gerganov
8a3012a4ad  ggml : add ggml-common.h to deduplicate shared code (#5940)
						* ggml : add ggml-common.h to shared code
ggml-ci
* scripts : update sync scripts
* sycl : reuse quantum tables
ggml-ci
* ggml : minor
* ggml : minor
* sycl : try to fix build

2024-03-09 12:47:57 +02:00

slaren
652ca2bded  compare-llama-bench.py : remove mul_mat_q (#5892)
2024-03-05 22:27:29 +01:00

Georgi Gerganov
efd8533ef8  sync : ggml
ggml-ci

2024-03-04 20:54:23 +02:00