This commit addresses an inconsistency during inference by adding a new member to the `templates_params` struct that indicates whether the chat is in inference mode. The gpt-oss specific function `common_chat_params_init_gpt_oss` can then check this flag together with the `add_generation_prompt` flag to decide whether to replace the `<|return|>` token with the `<|end|>` token in the prompt.

The motivation for this change is to ensure that the formatted prompt of past messages in `common_chat_format_single` matches the output of the formatted new message. The issue is that the gpt-oss template returns different end tags: `<|return|>` when `add_generation_prompt` is false, and `<|end|>` when `add_generation_prompt` is true. This causes the substring operation to start at an incorrect position, so tokenization begins with `tart|>` instead of `<|start|>`.

Resolves: https://github.com/ggml-org/llama.cpp/issues/15417