llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-10-31 08:51:55 +00:00

Pierrick Hymbert f482bb2e49 common: llama_load_model_from_url split support (#6192)

* llama: llama_split_prefix fix strncpy does not include string termination
common: llama_load_model_from_url:
 - fix header name case sensitive
 - support downloading additional split in parallel
 - hide password in url

* common: EOL EOF

* common: remove redundant LLAMA_CURL_MAX_PATH_LENGTH definition

* common: change max url max length

* common: minor comment

* server: support HF URL options

* llama: llama_model_loader fix log

* common: use a constant for max url length

* common: clean up curl if file cannot be loaded in gguf

* server: tests: add split tests, and HF options params

* common: move llama_download_hide_password_in_url inside llama_download_file as a lambda

* server: tests: enable back Release test on PR

* spacing

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* spacing

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* spacing

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

2024-03-23 18:07:00 +01:00

CMakeLists.txt    gguf-split: split and merge gguf per batch of tensors (#6135)    2024-03-19 12:05:44 +01:00
gguf-split.cpp    common: llama_load_model_from_url split support (#6192)    2024-03-23 18:07:00 +01:00
README.md         gguf-split: split and merge gguf per batch of tensors (#6135)    2024-03-19 12:05:44 +01:00

README.md

GGUF split Example

CLI to split / merge GGUF files.

Command line options:

--split: split a GGUF file into multiple GGUF files (default operation).
--split-max-tensors: maximum number of tensors in each split (default: 128).
--merge: merge multiple GGUF files into a single GGUF file.
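
To illustrate how a per-split tensor limit turns into a set of output files, here is a minimal planning sketch. It is not the tool's actual implementation: `split_count`, `split_name`, `plan_splits` and `SplitPlan` are hypothetical helpers, the `<prefix>-00001-of-00003.gguf` naming pattern is an assumption, and metadata handling and file I/O are ignored entirely. Only the default limit of 128 tensors per split comes from the options above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Number of split files needed when each split holds at most max_tensors tensors.
// The default of 128 mirrors the documented --split-max-tensors default.
int split_count(int n_tensors, int max_tensors = 128) {
    assert(n_tensors >= 0 && max_tensors > 0);
    return (n_tensors + max_tensors - 1) / max_tensors; // ceiling division
}

// ASSUMPTION: output files are named <prefix>-<index>-of-<total>.gguf with a
// 1-based, zero-padded index. This naming scheme is illustrative only.
std::string split_name(const std::string & prefix, int split_no, int n_split) {
    char buf[512];
    std::snprintf(buf, sizeof(buf), "%s-%05d-of-%05d.gguf",
                  prefix.c_str(), split_no + 1, n_split);
    return std::string(buf);
}

// One planned split: its file name and the half-open tensor index range [first, last).
struct SplitPlan {
    std::string name;
    int first;
    int last;
};

// Assign consecutive batches of tensors to splits, max_tensors per split.
std::vector<SplitPlan> plan_splits(const std::string & prefix, int n_tensors,
                                   int max_tensors = 128) {
    const int n_split = split_count(n_tensors, max_tensors);
    std::vector<SplitPlan> plan;
    for (int i = 0; i < n_split; ++i) {
        const int first = i * max_tensors;
        const int last  = std::min(n_tensors, first + max_tensors);
        plan.push_back({split_name(prefix, i, n_split), first, last});
    }
    return plan;
}
```

Under these assumptions, a model with 291 tensors and the default limit would be planned as three splits holding 128, 128 and 35 tensors. Merging would be the inverse: read the splits in index order and concatenate their tensors into one file.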