llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-21 12:16:57 +00:00

Author	SHA1	Message	Date
hksdpc255	1920345c3b	common : Generalized XML-style tool-call parsing with streaming support (GLM 4.5/4.6 + MiniMax M2 + SeedOSS + Kimi-K2 + Qwen3-Coder + Apriel-1.5 + Xiaomi-MiMo) (#16932 ) * Add files via upload * fix unit test * fix crashes for --reasoning-format=none * Patch buggy official MiniMax-M2 chat template * add upstream minja fix: https://github.com/ochafik/minja/pull/7 * Fix <think> token not generated * add test copied from https://github.com/ggml-org/llama.cpp/pull/16946 * cleanup * Hopes to fix the compilation error on CI * Delete chat template patching since it’s fixed by upstream Minja * Remove undeeded Minimax-M2 template patch https://github.com/ochafik/minja/pull/7#issuecomment-3480356100 * Add proper handling of optional parameters with test merged tests from: `23d4bb75c4` * Fix making all tool parameters optional * Move xml tool parser to separate file * cleanup & add tests for GLM4.5 * add streaming tests & enhancement & cleanups Add streaming test for both GLM 4.5 and minimax-m2. Cleanup for preserved_tokens. Cleanup for grammar rule name. Enhance the parser's stability. * cleanup & add support for Kimi-K2 Qwen3-Coder Apriel-1.5 Xiaomi-MiMo * apply suggestions from reviewers * fix a misuse for data.grammar_lazy * fix grammar when tool have no argument * Fix `no triggers set for lazy grammar!` for GLM4.5/4.6. Insert additional stops for Kimi-K2 * update chat.cpp * fix grammar for GLM 4.5/4.6 * Try fix Jinja template for GLM * Try fix GLM-4.6.jinja * Update common/chat-parser-xml-toolcall.cpp Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> * Update tests/test-chat.cpp Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> * improve chat template for GLM, rename Kimi-K2 template to Kimi-K2-Thinking * Improve Kimi-K2 chat template * Fix unit test * Fix "Invalid tool call arguments passed" in a rare case. In a rare case, the model may emit a raw string that begins with a valid JSON string. This commit adds unit tests to cover that scenario and fixes the regression introduced during the Kimi-K2 adaptation. --------- Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>	2025-11-18 18:54:15 +01:00
Adrien Gallouët	52cf111b31	cmake : cleanup (#17199 )	2025-11-12 14:48:30 +02:00
Adrien Gallouët	78010a0d52	cmake : move OpenSSL linking to vendor/cpp-httplib (#17177 ) * cmake : move OpenSSL linking to vendor/cpp-httplib Signed-off-by: Adrien Gallouët <angt@huggingface.co> * bring back httplib 0.27.0 * add -DLLAMA_HTTPLIB * update cmake config for visionos --------- Signed-off-by: Adrien Gallouët <angt@huggingface.co> Co-authored-by: Xuan Son Nguyen <son@huggingface.co>	2025-11-12 12:32:50 +01:00
Xuan-Son Nguyen	1d45b4228f	vendor: split httplib to cpp/h files (#17150 ) * vendor: split httplib to cpp/h files * move defines * include httplib if curl is not used * add TODO * fix build ios * fix build visionos instead	2025-11-11 13:32:58 +01:00
Xuan-Son Nguyen	5c9a18e674	common: move download functions to download.(cpp\|h) (#17059 ) * common: move download functions to download.(cpp\|h) * rm unused includes * minor cleanup --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2025-11-07 11:23:34 +01:00
Adrien Gallouët	4201deae9c	common: introduce http.h for httplib-based client (#16373 ) * common: introduce http.h for httplib-based client This change moves cpp-httplib based URL parsing and client setup into a new header `common/http.h`, and integrates it in `arg.cpp` and `run.cpp`. It is an iteration towards removing libcurl, while intentionally minimizing changes to existing code to guarantee the same behavior when `LLAMA_CURL` is used. Signed-off-by: Adrien Gallouët <angt@huggingface.co> * tools : add missing WIN32_LEAN_AND_MEAN Signed-off-by: Adrien Gallouët <adrien@gallouet.fr> --------- Signed-off-by: Adrien Gallouët <angt@huggingface.co> Signed-off-by: Adrien Gallouët <adrien@gallouet.fr>	2025-10-01 20:22:18 +03:00
Adrien Gallouët	234e2ff8ed	server : remove old LLAMA_SERVER_SSL (#16290 ) Signed-off-by: Adrien Gallouët <angt@huggingface.co>	2025-09-27 19:17:08 +03:00
Adrien Gallouët	ace6a54565	build : add LLAMA_OPENSSL option (#16287 ) Introduce a new `LLAMA_OPENSSL` option, enabled by default. This preserves the previous default (libcurl first, OpenSSL as fallback), while allowing OpenSSL to be disabled if desired. Signed-off-by: Adrien Gallouët <angt@huggingface.co>	2025-09-27 12:12:46 +03:00
Adrien Gallouët	b995a10760	common : use cpp-httplib as a cURL alternative for downloads (#16185 ) * vendor : update httplib Signed-off-by: Adrien Gallouët <angt@huggingface.co> * common : use cpp-httplib as a cURL alternative for downloads The existing cURL implementation is intentionally left untouched to prevent any regressions and to allow for safe, side-by-side testing by toggling the `LLAMA_CURL` CMake option. Signed-off-by: Adrien Gallouët <angt@huggingface.co> * ggml : Bump to Windows 10 Signed-off-by: Adrien Gallouët <angt@huggingface.co> --------- Signed-off-by: Adrien Gallouët <angt@huggingface.co>	2025-09-26 14:12:19 +03:00
Eric Zhang	a457551332	cmake : do not search for curl libraries by ourselves (#14613 ) * cmake : do not search for curl libraries by ourselves * run : do not search for curl libraries by ourselves	2025-07-10 15:29:05 +03:00
Eric Zhang	f9a867f592	cmake : bump llguidance version to v1.0.1 (#14609 )	2025-07-10 08:19:37 +03:00
Eric Zhang	ac44eb6c80	cmake : llguidance build parser library only (#14608 )	2025-07-10 08:19:13 +03:00
Christian Kastner	09cf2c7c65	cmake : Improve build-info.cpp generation (#14156 ) * cmake: Simplify build-info.cpp generation The rebuild of build-info.cpp still gets triggered when .git/index gets changes. * cmake: generate build-info.cpp in build dir	2025-06-13 09:51:34 +03:00
Sigbjørn Skjæret	d4e0d95cf5	chore : clean up relative source dir paths (#14128 )	2025-06-11 19:04:23 +02:00
Georgi Gerganov	53f925074d	sync : vendor (#13901 ) * sync : vendor ggml-ci * cont : fix httplib version ggml-ci * cont : fix lint * cont : fix lint * vendor : move to common folder /vendor ggml-ci * cont : fix lint * cont : move httplib to /vendor + use json_fwd.hpp ggml-ci * cont : fix server build ggml-ci * cont : add missing headers ggml-ci * cont : header clean-up ggml-ci	2025-05-30 16:25:45 +03:00
Olivier Chafik	f5cd27b71d	`server`: streaming of tool calls and thoughts when `--jinja` is on (#12379 ) * add common_json w/ support for truncated json healing * add common_chat_msg_diff * partial common_chat_parse * refactor parser w/ optionals * server: wire chat diffs in stream mode * fix trigger of thinking models (must happen after thoughts are closed) * fix functionary v3.2 raw python! * rename: common_chat_syntax (now contains format) * rm common_regex.at_start * don't return empty <think></think> * accommodate yet another deepseek r1 distill fantasy syntax (`<｜tool▁calls｜>`) * fix QwQ 32B tool call parsing after thoughts (hermes2) * better logs for grammar triggers * consume spaces after parse_json_tool_calls * fix required tool calls w/ thinking models that have pre-opened thinking tags * fix thinking model's initial trigger + test qwq's template * run most test_tool_call tests in stream + non-stream modes * make functionary v3.2 parsing more strict (differentiate first match from others) * send final diff from server, to close off raw python arguments * support partial content streaming in Generic mode * tool-call: allow content prelude before hermes2 tool calls (for Qwen2.5) * Update function-calling.md * Update tool_bench.py * chat-parser: remove input from exception (llm output may contain PII) --------- Co-authored-by: ochafik <ochafik@google.com> Co-authored-by: Olivier Chafik <ochafik@users.noreply.github.com>	2025-05-25 01:48:08 +01:00
Z	3e0be1cace	llguidance : official v0.7.20 release (no actual changes) [noci] (#13594 )	2025-05-16 22:56:28 +02:00
Olivier Chafik	3198405e98	`common`: add partial regex support (#12808 ) * move string_find_partial_stop & string_ends_with to common * add common_regex (supports partial matches) Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * Update common/regex-partial.cpp Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * Update common/regex-partial.cpp Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * Update common/regex-partial.h Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * partial regex: add missing iterator end checks * string utils: use string_views * direct throw to avoid ggml.h include * regex-partial: replace missed ggml_asserts --------- Co-authored-by: ochafik <ochafik@google.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2025-05-14 19:50:57 +01:00
Helton Reis	7c28a74e07	chore(llguidance): use tagged version that does not break the build (#13413 )	2025-05-09 23:15:39 +03:00
Diego Devesa	15e03282bb	ci : limit write permission to only the release step + fixes (#13392 ) * ci : limit write permission to only the release step * fix win cuda file name * fix license file copy on multi-config generators	2025-05-08 23:45:22 +02:00
Diego Devesa	d7a14c42a1	build : fix build info on windows (#13239 ) * build : fix build info on windows * fix cuda host compiler msg	2025-05-01 21:48:08 +02:00
Xuan-Son Nguyen	bd3f59f812	cmake : enable curl by default (#12761 ) * cmake : enable curl by default * no curl if no examples * fix build * fix build-linux-cross * add windows-setup-curl * fix * shell * fix path * fix windows-latest-cmake* * run: include_directories * LLAMA_RUN_EXTRA_LIBS * sycl: no llama_curl * no test-arg-parser on windows * clarification * try riscv64 / arm64 * windows: include libcurl inside release binary * add msg * fix mac / ios / android build * will this fix xcode? * try clearing the cache * add bunch of licenses * revert clear cache * fix xcode * fix xcode (2) * fix typo	2025-04-07 13:35:19 +02:00
Michał Moskal	2447ad8a98	upgrade to llguidance 0.7.10 (#12576 )	2025-03-26 11:06:09 -07:00
Olivier Chafik	63e489c025	tool-call: refactor common chat / tool-call api (+ tests / fixes) (#11900 ) * tool-call refactoring: moved common_chat_* to chat.h, common_chat_templates_init return a unique_ptr to opaque type * addressed clang-tidy lints in [test-]chat.* * rm minja deps from util & common & move it to common/minja/ * add name & tool_call_id to common_chat_msg * add common_chat_tool * added json <-> tools, msgs conversions to chat.h * fix double bos/eos jinja avoidance hack (was preventing inner bos/eos tokens) * fix deepseek r1 slow test (no longer <think> opening w/ new template) * allow empty tools w/ auto + grammar * fix & test server grammar & json_schema params w/ & w/o --jinja	2025-02-18 18:03:23 +00:00
Michał Moskal	89daa2564f	llguidance build fixes for Windows (#11664 ) * setup windows linking for llguidance; thanks @phil-scott-78 * add build instructions for windows and update script link * change VS Community link from DE to EN * whitespace fix	2025-02-14 12:46:08 -08:00
Michał Moskal	ff227703d6	sampling : support for llguidance grammars (#10224 ) * initial porting of previous LLG patch * update for new APIs * build: integrate llguidance as an external project * use '%llguidance' as marker to enable llg lark syntax * add some docs * clarify docs * code style fixes * remove llguidance.h from .gitignore * fix tests when llg is enabled * pass vocab not model to llama_sampler_init_llg() * copy test-grammar-integration.cpp to test-llguidance.cpp * clang fmt * fix ref-count bug * build and run test * gbnf -> lark syntax * conditionally include llguidance test based on LLAMA_LLGUIDANCE flag * rename llguidance test file to test-grammar-llguidance.cpp * add gh action for llg test * align tests with LLG grammar syntax and JSON Schema spec * llama_tokenizer() in fact requires valid utf8 * update llg * format file * add $LLGUIDANCE_LOG_LEVEL support * fix whitespace * fix warning * include <cmath> for INFINITY * add final newline * fail llama_sampler_init_llg() at runtime * Link gbnf_to_lark.py script; fix links; refer to llg docs for lexemes * simplify #includes * improve doc string for LLAMA_LLGUIDANCE * typo in merge * bump llguidance to 0.6.12	2025-02-02 09:55:32 +02:00
Olivier Chafik	8b576b6c55	Tool call support (generic + native for Llama, Functionary, Hermes, Mistral, Firefunction, DeepSeek) w/ lazy grammars (#9639 ) --------- Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> Co-authored-by: Xuan Son Nguyen <son@huggingface.co>	2025-01-30 19:13:58 +00:00
Olivier Chafik	6171c9d258	Add Jinja template support (#11016 ) * Copy minja from `58f0ca6dd7` * Add --jinja and --chat-template-file flags * Add missing <optional> include * Avoid print in get_hf_chat_template.py * No designated initializers yet * Try and work around msvc++ non-macro max resolution quirk * Update test_chat_completion.py * Wire LLM_KV_TOKENIZER_CHAT_TEMPLATE_N in llama_model_chat_template * Refactor test-chat-template * Test templates w/ minja * Fix deprecation * Add --jinja to llama-run * Update common_chat_format_example to use minja template wrapper * Test chat_template in e2e test * Update utils.py * Update test_chat_completion.py * Update run.cpp * Update arg.cpp * Refactor common_chat_* functions to accept minja template + use_jinja option * Attempt to fix linkage of LLAMA_CHATML_TEMPLATE * Revert LLAMA_CHATML_TEMPLATE refactor * Normalize newlines in test-chat-templates for windows tests * Forward decl minja::chat_template to avoid eager json dep * Flush stdout in chat template before potential crash * Fix copy elision warning * Rm unused optional include * Add missing optional include to server.cpp * Disable jinja test that has a cryptic windows failure * minja: fix vigogne (https://github.com/google/minja/pull/22) * Apply suggestions from code review Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * Finish suggested renamings * Move chat_templates inside server_context + remove mutex * Update --chat-template-file w/ recent change to --chat-template * Refactor chat template validation * Guard against missing eos/bos tokens (null token otherwise throws in llama_vocab::impl::token_get_attr) * Warn against missing eos / bos tokens when jinja template references them * rename: common_chat_template[s] * reinstate assert on chat_templates.template_default * Update minja to `b8437df626` * Update minja to https://github.com/google/minja/pull/25 * Update minja from https://github.com/google/minja/pull/27 * rm unused optional header --------- Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2025-01-21 13:18:51 +00:00
Eric Curtin	c27ac678dd	Opt class for positional argument handling (#10508 ) Added support for positional arguments `model` and `prompt`. Added functionality to download via strings like: llama-run llama3 llama-run ollama://granite-code llama-run ollama://granite-code:8b llama-run hf://QuantFactory/SmolLM-135M-GGUF/SmolLM-135M.Q2_K.gguf llama-run huggingface://bartowski/SmolLM-1.7B-Instruct-v0.2-GGUF/SmolLM-1.7B-Instruct-v0.2-IQ3_M.gguf llama-run https://example.com/some-file1.gguf llama-run some-file2.gguf llama-run file://some-file3.gguf Signed-off-by: Eric Curtin <ecurtin@redhat.com>	2024-12-13 19:34:25 +01:00
Diego Devesa	7cc2d2c889	ggml : move AMX to the CPU backend (#10570 ) * ggml : move AMX to the CPU backend --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2024-11-29 21:54:58 +01:00
Georgi Gerganov	ab96610b1e	cmake : enable warnings in llama (#10474 ) * cmake : enable warnings in llama ggml-ci * cmake : add llama_get_flags and respect LLAMA_FATAL_WARNINGS * cmake : get_flags -> ggml_get_flags * speculative-simple : fix warnings * cmake : reuse ggml_get_flags ggml-ci * speculative-simple : fix compile warning ggml-ci	2024-11-26 14:18:08 +02:00
Georgi Gerganov	d9d54e498d	speculative : refactor and add a simpler example (#10362 ) * speculative : refactor and add a simpler example ggml-ci * speculative : clean-up and add comments and TODOs [no ci] * speculative : manage context in common_speculative ggml-ci * speculative : simplify ggml-ci * speculative : simplify (cont) ggml-ci * speculative : add --draft-min CLI arg * speculative : minor fixup * make : build fixes * speculative : do not redraft previous drafts ggml-ci * speculative : fix the draft sampling ggml-ci * speculative : fix compile warning * common : refactor args ggml-ci * common : change defaults [no ci] * common : final touches ggml-ci	2024-11-25 09:58:41 +02:00
Diego Devesa	9f40989351	ggml : move CPU backend to a separate file (#10144 )	2024-11-03 19:34:08 +01:00
Georgi Gerganov	6262d13e0b	common : reimplement logging (#9418 ) https://github.com/ggerganov/llama.cpp/pull/9418	2024-09-15 20:46:12 +03:00
Xuan Son Nguyen	bfe76d4a17	common : move arg parser code to `arg.cpp` (#9388 ) * common : move arg parser to arg.cpp * better categorize args * add cmake * missing climits * missing cstdarg * common : more explicit includes * fix build * refactor gpt_params_parse * update server readme * fix test --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2024-09-09 23:36:09 +02:00
Georgi Gerganov	df270ef745	llama : refactor sampling v2 (#9294 ) - Add `struct llama_sampler` and `struct llama_sampler_i` - Add `llama_sampler_` API - Add `llama_sampler_chain_` API for chaining multiple samplers - Remove `LLAMA_API_INTERNAL` - Add `llama_perf_` API and remove old `llama_print_timings` and `llama_reset_timings`	2024-09-07 15:16:19 +03:00
Georgi Gerganov	f3f65429c4	llama : reorganize source code + improve CMake (#8006 ) * scripts : update sync [no ci] * files : relocate [no ci] * ci : disable kompute build [no ci] * cmake : fixes [no ci] * server : fix mingw build ggml-ci * cmake : minor [no ci] * cmake : link math library [no ci] * cmake : build normal ggml library (not object library) [no ci] * cmake : fix kompute build ggml-ci * make,cmake : fix LLAMA_CUDA + replace GGML_CDEF_PRIVATE ggml-ci * move public backend headers to the public include directory (#8122) * move public backend headers to the public include directory * nix test * spm : fix metal header --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * scripts : fix sync paths [no ci] * scripts : sync ggml-blas.h [no ci] --------- Co-authored-by: slaren <slarengh@gmail.com>	2024-06-26 18:33:02 +03:00
intelmatt	27615f5ab2	cmake : fix BUILD_SHARED_LIBS=ON build (#7784 ) common depends on pthreads in Linux	2024-06-07 15:15:07 +03:00
Olivier Chafik	7593639ce3	`main`: add --json-schema / -j flag (#6659 ) * main: add --json-schema / -j * json: move json-schema-to-grammar to common lib * json: fix zig build	2024-04-15 18:35:21 +01:00
Johannes Gäßler	50ccaf5eac	lookup: complement data from context with general text statistics (#5479 ) * lookup: evaluation tools, use corpus/previous gens * fixup! lookup: evaluation tools, use corpus/previous gens * fixup! lookup: evaluation tools, use corpus/previous gens * fixup! lookup: evaluation tools, use corpus/previous gens * fixup! lookup: evaluation tools, use corpus/previous gens	2024-03-23 01:24:36 +01:00
Olivier Chafik	5b7b0ac8df	json-schema-to-grammar improvements (+ added to server) (#5978 ) * json: fix arrays (disallow `[,1]`) * json: support tuple types (`[number, string]`) * json: support additionalProperties (`{[k: string]: [string,number][]}`) * json: support required / optional properties * json: add support for pattern * json: resolve $ref (and support https schema urls) * json: fix $ref resolution * join: support union types (mostly for nullable types I think) * json: support allOf + nested anyOf * json: support any (`{}` or `{type: object}`) * json: fix merge * json: temp fix for escapes * json: spaces in output and unrestricted output spaces * json: add typings * json:fix typo * Create ts-type-to-grammar.sh * json: fix _format_literal (json.dumps already escapes quotes) * json: merge lit sequences and handle negatives {"type": "string", "pattern": "^({\"question\": \"[^\"]+\", \"response\": \"[^\"]+\"}\\n)+$"} * json: handle pattern repetitions * Update json-schema-to-grammar.mjs * Create regex-to-grammar.py * json: extract repeated regexp patterns to subrule * Update json-schema-to-grammar.py * Update json-schema-to-grammar.py * Update json-schema-to-grammar.py * json: handle schema from pydantic Optional fields * Update json-schema-to-grammar.py * Update json-schema-to-grammar.py * Update ts-type-to-grammar.sh * Update ts-type-to-grammar.sh * json: simplify nullable fields handling * json: accept duplicate identical rules * json: revert space to 1 at most * json: reuse regexp pattern subrules * json: handle uuid string format * json: fix literal escapes * json: add --allow-fetch * json: simplify range escapes * json: support negative ranges in patterns * Delete commit.txt * json: custom regex parser, adds dot support & JS-portable * json: rm trailing spaces * Update json-schema-to-grammar.mjs * json: updated server & chat `( cd examples/server && ./deps.sh )` * json: port fixes from mjs to python * Update ts-type-to-grammar.sh * json: support prefixItems alongside array items * json: add date format + fix uuid * json: add date, time, date-time formats * json: preserve order of props from TS defs * json: port schema converter to C++, wire in ./server * json: nits * Update json-schema-to-grammar.cpp * Update json-schema-to-grammar.cpp * Update json-schema-to-grammar.cpp * json: fix mjs implementation + align outputs * Update json-schema-to-grammar.mjs.hpp * json: test C++, JS & Python versions * json: nits + regen deps * json: cleanup test * json: revert from c++17 to 11 * json: nit fixes * json: dirty include for test * json: fix zig build * json: pass static command to std::system in tests (fixed temp files) * json: fix top-level $refs * json: don't use c++20 designated initializers * nit * json: basic support for reserved names `{number:{number:{root:number}}}` * Revamp test cmake to allow args (WORKING_DIRECTORY needed for JSON test) * json: re-ran server deps.sh * json: simplify test * json: support mix of additional props & required/optional * json: add tests for some expected failures * json: fix type=const in c++, add failure expectations for non-str const&enum * json: test (& simplify output of) empty schema * json: check parsing in test + fix value & string refs * json: add server tests for OAI JSON response_format * json: test/fix top-level anyOf * json: improve grammar parsing failures * json: test/fix additional props corner cases * json: fix string patterns (was missing quotes) * json: ws nit * json: fix json handling in server when there's no response_format * json: catch schema conversion errors in server * json: don't complain about unknown format type in server if unset * json: cleaner build of test * json: create examples/json-schema-pydantic-example.py * json: fix date pattern * json: move json.hpp & json-schema-to-grammar.{cpp,h} to common * json: indent 4 spaces * json: fix naming of top-level c++ function (+ drop unused one) * json: avoid using namespace std * json: fix zig build * Update server.feature * json: iostream -> fprintf * json: space before & refs for consistency * json: nits	2024-03-21 11:50:43 +00:00
Pierrick Hymbert	d01b3c4c32	common: llama_load_model_from_url using --model-url (#6098 ) * common: llama_load_model_from_url with libcurl dependency Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2024-03-17 19:12:37 +01:00
Dane Madsen	fe52be11e3	cmake : handle cases where git index is not found in .git (#5844 ) * Update CMakeLists.txt * Update CMakeLists.txt	2024-03-04 20:26:55 +02:00
Cuong Trinh Manh	97bbca6e85	cmake : fix ld warning duplicate libraries libllama.a (#4671 ) * fix "ld: warning: ignoring duplicate libraries: '../libllama.a'" * fix warning in example.	2023-12-29 16:39:15 +02:00
Jared Van Bortel	15f5d96037	build : fix build info generation and cleanup Makefile (#3920 ) * cmake : fix joining of REAL_GIT_DIR * fix includes with help from include-what-you-use * make : remove unneeded deps and add test-rope target * fix C includes in C++ source files * Revert "fix includes with help from include-what-you-use" This reverts commit `635e9fadfd`.	2023-12-01 00:23:08 +02:00
bandoti	b38a16dfcf	cmake : fix issue with version info not getting baked into LlamaConfig.cmake (#3970 ) * Split CPP generation from build-info query * Remove blank lines * Add BUILD_SHARED_LIBS option	2023-11-27 21:25:42 +02:00
Damian Stewart	381efbf480	llava : expose as a shared library for downstream projects (#3613 ) * wip llava python bindings compatibility * add external llava API * add base64 in-prompt image support * wip refactor image loading * refactor image load out of llava init * cleanup * further cleanup; move llava-cli into its own file and rename * move base64.hpp into common/ * collapse clip and llava libraries * move llava into its own subdir * wip * fix bug where base64 string was not removed from the prompt * get libllava to output in the right place * expose llava methods in libllama.dylib * cleanup memory usage around clip_image_* * cleanup and refactor again * update headerdoc * build with cmake, not tested (WIP) * Editorconfig * Editorconfig * Build with make * Build with make * Fix cyclical depts on Windows * attempt to fix build on Windows * attempt to fix build on Windows * Upd TODOs * attempt to fix build on Windows+CUDA * Revert changes in cmake * Fix according to review comments * Support building as a shared library * address review comments --------- Co-authored-by: M. Yusuf Sarıgöz <yusufsarigoz@gmail.com> Co-authored-by: Jared Van Bortel <jared@nomic.ai>	2023-11-07 00:36:23 +03:00
Andrei	51b2fc11f7	cmake : fix relative path to git submodule index (#3915 )	2023-11-02 21:40:31 +02:00
cebtenzzre	b12fa0d1c1	build : link against build info instead of compiling against it (#3879 ) * cmake : fix build when .git does not exist * cmake : simplify BUILD_INFO target * cmake : add missing dependencies on BUILD_INFO * build : link against build info instead of compiling against it * zig : make build info a .cpp source instead of a header Co-authored-by: Matheus C. França <matheus-catarino@hotmail.com> * cmake : revert change to CMP0115 --------- Co-authored-by: Matheus C. França <matheus-catarino@hotmail.com>	2023-11-02 08:50:16 +02:00
Kerfuffle	70c29da118	common : fix mirostat state when using multiple sequences (#3543 ) * Fix mirostat state when using multiple sequences * Fix mirostat by completely refactoring sampling! * Try to fix zig build. * Export function to fetch/create default sampler states Code formatting cleanups and add some comments Silence a warning about id not being used when logging is disabled * Apply some renaming suggestions. Fix comments that were out of sync with the pull. * Use more consistant naming convention for sampling contexts	2023-10-11 22:35:46 +03:00

1 2

52 Commits