* Add files via upload
* fix unit test
* fix crashes for --reasoning-format=none
* Patch buggy official MiniMax-M2 chat template
* add upstream minja fix: https://github.com/ochafik/minja/pull/7
* Fix <think> token not generated
* add test copied from https://github.com/ggml-org/llama.cpp/pull/16946
* cleanup
* Hopes to fix the compilation error on CI
* Delete chat template patching since it’s fixed by upstream Minja
* Remove undeeded Minimax-M2 template patch
https://github.com/ochafik/minja/pull/7#issuecomment-3480356100
* Add proper handling of optional parameters with test
merged tests from: 23d4bb75c4
* Fix making all tool parameters optional
* Move xml tool parser to separate file
* cleanup & add tests for GLM4.5
* add streaming tests & enhancement & cleanups
Add streaming test for both GLM 4.5 and minimax-m2.
Cleanup for preserved_tokens.
Cleanup for grammar rule name.
Enhance the parser's stability.
* cleanup & add support for Kimi-K2 Qwen3-Coder Apriel-1.5 Xiaomi-MiMo
* apply suggestions from reviewers
* fix a misuse for data.grammar_lazy
* fix grammar when tool have no argument
* Fix `no triggers set for lazy grammar!` for GLM4.5/4.6. Insert additional stops for Kimi-K2
* update chat.cpp
* fix grammar for GLM 4.5/4.6
* Try fix Jinja template for GLM
* Try fix GLM-4.6.jinja
* Update common/chat-parser-xml-toolcall.cpp
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update tests/test-chat.cpp
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* improve chat template for GLM, rename Kimi-K2 template to Kimi-K2-Thinking
* Improve Kimi-K2 chat template
* Fix unit test
* Fix "Invalid tool call arguments passed" in a rare case.
In a rare case, the model may emit a raw string that begins with a valid JSON string. This commit adds unit tests to cover that scenario and fixes the regression introduced during the Kimi-K2 adaptation.
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* common: move download functions to download.(cpp|h)
* rm unused includes
* minor cleanup
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* common: introduce http.h for httplib-based client
This change moves cpp-httplib based URL parsing and client setup into
a new header `common/http.h`, and integrates it in `arg.cpp` and `run.cpp`.
It is an iteration towards removing libcurl, while intentionally
minimizing changes to existing code to guarantee the same behavior when
`LLAMA_CURL` is used.
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* tools : add missing WIN32_LEAN_AND_MEAN
Signed-off-by: Adrien Gallouët <adrien@gallouet.fr>
---------
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
Signed-off-by: Adrien Gallouët <adrien@gallouet.fr>
Introduce a new `LLAMA_OPENSSL` option, enabled by default.
This preserves the previous default (libcurl first, OpenSSL as fallback),
while allowing OpenSSL to be disabled if desired.
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* vendor : update httplib
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* common : use cpp-httplib as a cURL alternative for downloads
The existing cURL implementation is intentionally left untouched to
prevent any regressions and to allow for safe, side-by-side testing by
toggling the `LLAMA_CURL` CMake option.
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* ggml : Bump to Windows 10
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
---------
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* cmake: Simplify build-info.cpp generation
The rebuild of build-info.cpp still gets triggered when .git/index gets
changes.
* cmake: generate build-info.cpp in build dir
* setup windows linking for llguidance; thanks @phil-scott-78
* add build instructions for windows and update script link
* change VS Community link from DE to EN
* whitespace fix
* initial porting of previous LLG patch
* update for new APIs
* build: integrate llguidance as an external project
* use '%llguidance' as marker to enable llg lark syntax
* add some docs
* clarify docs
* code style fixes
* remove llguidance.h from .gitignore
* fix tests when llg is enabled
* pass vocab not model to llama_sampler_init_llg()
* copy test-grammar-integration.cpp to test-llguidance.cpp
* clang fmt
* fix ref-count bug
* build and run test
* gbnf -> lark syntax
* conditionally include llguidance test based on LLAMA_LLGUIDANCE flag
* rename llguidance test file to test-grammar-llguidance.cpp
* add gh action for llg test
* align tests with LLG grammar syntax and JSON Schema spec
* llama_tokenizer() in fact requires valid utf8
* update llg
* format file
* add $LLGUIDANCE_LOG_LEVEL support
* fix whitespace
* fix warning
* include <cmath> for INFINITY
* add final newline
* fail llama_sampler_init_llg() at runtime
* Link gbnf_to_lark.py script; fix links; refer to llg docs for lexemes
* simplify #includes
* improve doc string for LLAMA_LLGUIDANCE
* typo in merge
* bump llguidance to 0.6.12
- Add `struct llama_sampler` and `struct llama_sampler_i`
- Add `llama_sampler_` API
- Add `llama_sampler_chain_` API for chaining multiple samplers
- Remove `LLAMA_API_INTERNAL`
- Add `llama_perf_` API and remove old `llama_print_timings` and `llama_reset_timings`
* cmake : fix joining of REAL_GIT_DIR
* fix includes with help from include-what-you-use
* make : remove unneeded deps and add test-rope target
* fix C includes in C++ source files
* Revert "fix includes with help from include-what-you-use"
This reverts commit 635e9fadfd.
* wip llava python bindings compatibility
* add external llava API
* add base64 in-prompt image support
* wip refactor image loading
* refactor image load out of llava init
* cleanup
* further cleanup; move llava-cli into its own file and rename
* move base64.hpp into common/
* collapse clip and llava libraries
* move llava into its own subdir
* wip
* fix bug where base64 string was not removed from the prompt
* get libllava to output in the right place
* expose llava methods in libllama.dylib
* cleanup memory usage around clip_image_*
* cleanup and refactor *again*
* update headerdoc
* build with cmake, not tested (WIP)
* Editorconfig
* Editorconfig
* Build with make
* Build with make
* Fix cyclical depts on Windows
* attempt to fix build on Windows
* attempt to fix build on Windows
* Upd TODOs
* attempt to fix build on Windows+CUDA
* Revert changes in cmake
* Fix according to review comments
* Support building as a shared library
* address review comments
---------
Co-authored-by: M. Yusuf Sarıgöz <yusufsarigoz@gmail.com>
Co-authored-by: Jared Van Bortel <jared@nomic.ai>
* cmake : fix build when .git does not exist
* cmake : simplify BUILD_INFO target
* cmake : add missing dependencies on BUILD_INFO
* build : link against build info instead of compiling against it
* zig : make build info a .cpp source instead of a header
Co-authored-by: Matheus C. França <matheus-catarino@hotmail.com>
* cmake : revert change to CMP0115
---------
Co-authored-by: Matheus C. França <matheus-catarino@hotmail.com>
* Fix mirostat state when using multiple sequences
* Fix mirostat by completely refactoring sampling!
* Try to fix zig build.
* Export function to fetch/create default sampler states
Code formatting cleanups and add some comments
Silence a warning about id not being used when logging is disabled
* Apply some renaming suggestions.
Fix comments that were out of sync with the pull.
* Use more consistant naming convention for sampling contexts