mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-10-27 08:21:30 +00:00

Pascal 9b9201f65a webui: introduce OpenAI-compatible model selector in JSON payload (#16562)

* webui: introduce OpenAI-compatible model selector in JSON payload

* webui: restore OpenAI-Compatible model source of truth and unify metadata capture

This change re-establishes a single, reliable source of truth for the active model,
fully aligned with the OpenAI-Compat API behavior.

It introduces a unified metadata flow that captures the model field from both
streaming and non-streaming responses, wiring a new onModel callback through ChatService.
The model name is now resolved directly from the API payload rather than relying on
server /props or UI assumptions.

ChatStore records and persists the resolved model for each assistant message during
streaming, ensuring consistency across the UI and database
Type definitions for API and settings were also extended to include model metadata
and the onModel callback, completing the alignment with OpenAI-Compat semantics
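
As an illustration of the metadata flow described above, the following is a minimal sketch of reading the `model` field from both a non-streaming JSON body and a stream of `data:` lines, then reporting it through a callback. It is not the actual ChatService code: the names `ChatCompletionLike`, `extractModel`, `captureFromResponse`, and `captureFromStream` are hypothetical, and firing the callback only once per stream is an assumption.

```typescript
// Hypothetical shape: only the `model` field of an OpenAI-compatible payload is used here.
interface ChatCompletionLike {
  model?: unknown;
}

type OnModel = (model: string) => void;

// Parse JSON defensively; malformed or partial chunks yield undefined.
function parsePayload(text: string): ChatCompletionLike | undefined {
  try {
    const parsed: unknown = JSON.parse(text);
    return typeof parsed === "object" && parsed !== null
      ? (parsed as ChatCompletionLike)
      : undefined;
  } catch {
    return undefined;
  }
}

// Return a non-empty, trimmed model name, or undefined if the payload has none.
function extractModel(payload: ChatCompletionLike | undefined): string | undefined {
  const value = payload?.model;
  return typeof value === "string" && value.trim() !== "" ? value.trim() : undefined;
}

// Non-streaming: the whole response body is one JSON object.
function captureFromResponse(body: string, onModel: OnModel): void {
  const model = extractModel(parsePayload(body));
  if (model !== undefined) onModel(model);
}

// Streaming: each event line looks like `data: {...}` and the stream ends with `data: [DONE]`.
// Assumption: the callback fires once, on the first chunk that carries a model name.
function captureFromStream(lines: readonly string[], onModel: OnModel): void {
  let reported = false;
  for (const line of lines) {
    if (reported) break;
    if (!line.startsWith("data:")) continue;
    const data = line.slice("data:".length).trim();
    if (data === "" || data === "[DONE]") continue;
    const model = extractModel(parsePayload(data));
    if (model !== undefined) {
      reported = true;
      onModel(model);
    }
  }
}
```

Both paths funnel through the same `extractModel` helper, which is one way to keep a single rule for what counts as a valid model name.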

* webui: address review feedback from allozaur

* webui: move model selector into ChatForm (idea by @allozaur)

* webui: make model selector more subtle and integrated into ChatForm

* webui: replaced the Flowbite selector with a native Svelte dropdown

* webui: add developer setting to toggle the chat model selector

* webui: address review feedback from allozaur

Normalized streamed model names during chat updates
by trimming input and removing directory components before saving
or persisting them, so the conversation UI shows only the filename

Forced model names within the chat form selector dropdown to render as
a single-line, truncated entry with a tooltip revealing the full name
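
The normalization described above (trim, then drop directory components) could look roughly like the sketch below. This is an illustration rather than the repository's `normalizeModelName` implementation; in particular, treating backslashes as path separators and returning the trimmed input when no path segment remains are assumptions.

```typescript
// Sketch only: trims the input and keeps the last path segment (the filename).
// Assumption: both "/" and "\" count as directory separators.
function normalizeModelName(raw: string): string {
  const trimmed = raw.trim();
  if (trimmed === "") return "";

  // Split on runs of separators and discard empty segments (leading or trailing slashes).
  const segments = trimmed.split(/[\\/]+/).filter((segment) => segment.length > 0);
  const last = segments[segments.length - 1];

  // Assumption: if nothing but separators was given, fall back to the trimmed input.
  return last !== undefined ? last.trim() : trimmed;
}

console.log(normalizeModelName("  /models/llama-3-8b.gguf  "));
console.log(normalizeModelName("plain-model-name"));
```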

* webui: toggle displayed model source for legacy vs OpenAI-Compat modes

When the selector is disabled, it falls back to the active server model name from /props

When the model selector is enabled, the displayed model comes from the message metadata
(the one explicitly selected and sent in the request)
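
The two modes above amount to a small selection rule, sketched here for illustration. The function and field names (`resolveDisplayedModel`, `selectorEnabled`, `messageModel`, `propsModel`) are hypothetical, and returning `null` when the chosen source has no usable name is an assumption; the actual ChatMessageAssistant logic may differ.

```typescript
// Hypothetical input: where each candidate model name comes from.
interface DisplayedModelInput {
  selectorEnabled: boolean;      // developer setting that toggles the chat model selector
  messageModel?: string | null;  // model recorded in the assistant message metadata
  propsModel?: string | null;    // active server model name reported by /props
}

// Selector enabled: show the model stored with the message.
// Selector disabled: fall back to the server model from /props.
function resolveDisplayedModel(input: DisplayedModelInput): string | null {
  const clean = (value: string | null | undefined): string | null => {
    const trimmed = value?.trim() ?? "";
    return trimmed === "" ? null : trimmed;
  };
  return input.selectorEnabled ? clean(input.messageModel) : clean(input.propsModel);
}
```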

* Update tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormActions.svelte

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* Update tools/server/webui/src/lib/constants/localstorage-keys.ts

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* Update tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormModelSelector.svelte

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* Update tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessageAssistant.svelte

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* Update tools/server/webui/src/lib/services/chat.ts

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* Update tools/server/webui/src/lib/services/chat.ts

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* webui: refactor model selector and persistence helpers

- Replace inline portal and event listeners with proper Svelte bindings
- Introduce 'persisted' store helper for localStorage sync without runes
- Extract 'normalizeModelName' utils + Vitest coverage
- Simplify ChatFormModelSelector structure and cleanup logic

Replaced the persisted store helper's use of '$state/$effect' runes with
a plain TS implementation to prevent orphaned effect runtime errors
outside component context
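
For illustration, a rune-free persisted store can be written as a plain subscribe/set object, as in the sketch below. This is not the helper from the commit: the `StorageLike` interface, the injected storage argument, and the `memoryStorage` stand-in are assumptions made so the example runs outside a browser. In the web UI the storage would presumably be `localStorage`.

```typescript
// Minimal subset of the Web Storage API used by this sketch.
interface StorageLike {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

interface PersistedStore<T> {
  subscribe(run: (value: T) => void): () => void;
  set(value: T): void;
  get(): T;
}

// Plain TypeScript store: no $state/$effect, so it is safe to create outside a component.
function persisted<T>(key: string, initial: T, storage: StorageLike): PersistedStore<T> {
  let current = initial;
  try {
    const raw = storage.getItem(key);
    if (raw !== null) current = JSON.parse(raw) as T;
  } catch {
    // Unreadable or corrupt entry: keep the initial value.
  }

  const subscribers = new Set<(value: T) => void>();

  return {
    subscribe(run) {
      subscribers.add(run);
      run(current); // emit the current value immediately, as Svelte stores do
      return () => {
        subscribers.delete(run);
      };
    },
    set(value) {
      current = value;
      try {
        storage.setItem(key, JSON.stringify(value));
      } catch {
        // Storage full or unavailable: the in-memory value is still updated.
      }
      subscribers.forEach((run) => run(value));
    },
    get() {
      return current;
    }
  };
}

// In-memory stand-in for localStorage, used here so the sketch runs anywhere.
function memoryStorage(): StorageLike {
  const entries = new Map<string, string>();
  return {
    getItem: (key) => entries.get(key) ?? null,
    setItem: (key, value) => {
      entries.set(key, value);
    }
  };
}
```

Because the object follows the `subscribe` contract, a Svelte component can still read it with the `$store` syntax, while module-level code can call `get()` and `set()` directly.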

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* webui: document normalizeModelName usage with inline examples

* Update tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormModelSelector.svelte

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* Update tools/server/webui/src/lib/stores/models.svelte.ts

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* Update tools/server/webui/src/lib/stores/models.svelte.ts

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* webui: extract ModelOption type into dedicated models.d.ts

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* webui: refine ChatMessageAssistant displayedModel source logic

* webui: stabilize dropdown, simplify model extraction, and init assistant model field

* chore: update webui static build

* Update tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessageAssistant.svelte

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* chore: npm format, update webui static build

* webui: align sidebar trigger position, remove z-index glitch

* chore: update webui build output

---------

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

2025-10-22 16:58:23 +02:00

| Name | Last commit | Date |
| --- | --- | --- |
| .storybook | SvelteKit-based WebUI (#14839) | 2025-09-17 19:29:13 +02:00 |
| e2e | SvelteKit-based WebUI (#14839) | 2025-09-17 19:29:13 +02:00 |
| scripts | webui: Remove running llama-server within WebUI dev.sh script (#16363) | 2025-10-01 08:40:26 +03:00 |
| src | webui: introduce OpenAI-compatible model selector in JSON payload (#16562) | 2025-10-22 16:58:23 +02:00 |
| static | SvelteKit-based WebUI (#14839) | 2025-09-17 19:29:13 +02:00 |
| .gitignore | SvelteKit-based WebUI (#14839) | 2025-09-17 19:29:13 +02:00 |
| .npmrc | SvelteKit-based WebUI (#14839) | 2025-09-17 19:29:13 +02:00 |
| .prettierignore | SvelteKit-based WebUI (#14839) | 2025-09-17 19:29:13 +02:00 |
| .prettierrc | SvelteKit-based WebUI (#14839) | 2025-09-17 19:29:13 +02:00 |
| components.json | SvelteKit-based WebUI (#14839) | 2025-09-17 19:29:13 +02:00 |
| eslint.config.js | SvelteKit-based WebUI (#14839) | 2025-09-17 19:29:13 +02:00 |
| package-lock.json | Enable per-conversation loading states to allow having parallel conversations (#16327) | 2025-10-20 12:41:13 +02:00 |
| package.json | Enable per-conversation loading states to allow having parallel conversations (#16327) | 2025-10-20 12:41:13 +02:00 |
| playwright.config.ts | Enable per-conversation loading states to allow having parallel conversations (#16327) | 2025-10-20 12:41:13 +02:00 |
| README.md | SvelteKit-based WebUI (#14839) | 2025-09-17 19:29:13 +02:00 |
| svelte.config.js | Enable per-conversation loading states to allow having parallel conversations (#16327) | 2025-10-20 12:41:13 +02:00 |
| tsconfig.json | SvelteKit-based WebUI (#14839) | 2025-09-17 19:29:13 +02:00 |
| vite.config.ts | Enable per-conversation loading states to allow having parallel conversations (#16327) | 2025-10-20 12:41:13 +02:00 |
| vitest-setup-client.ts | SvelteKit-based WebUI (#14839) | 2025-09-17 19:29:13 +02:00 |

README.md

llama.cpp Web UI

A modern, feature-rich web interface for llama.cpp built with SvelteKit. This UI provides an intuitive chat interface with advanced file handling, conversation management, and comprehensive model interaction capabilities.

Features

Modern Chat Interface - Clean, responsive design with dark/light mode
File Attachments - Support for images, text files, PDFs, and audio with rich previews and drag-and-drop support
Conversation Management - Create, edit, branch, and search conversations
Advanced Markdown - Code highlighting, math formulas (KaTeX), and content blocks
Reasoning Content - Support for models with thinking blocks
Keyboard Shortcuts - Keyboard navigation (Shift+Ctrl/Cmd+O for new chat, Shift+Ctrl/Cmd+E to edit a conversation, Shift+Ctrl/Cmd+D to delete a conversation, Ctrl/Cmd+K for search, Ctrl/Cmd+V for paste, Ctrl/Cmd+B to open or collapse the sidebar)
Request Tracking - Monitor processing with slots endpoint integration
UI Testing - Storybook component library with automated tests

Development

Install dependencies:

npm install

Start the development server + Storybook:

npm run dev

This starts the SvelteKit dev server together with Storybook, which runs on port 6006.

Building

Create a production build:

npm run build

The build outputs static files to the ../public directory for deployment with the llama.cpp server.

Testing

Run the test suite:

# E2E tests
npm run test:e2e

# Unit tests
npm run test:unit

# UI tests
npm run test:ui

# All tests
npm run test

Architecture

Framework: SvelteKit with Svelte 5 runes
Components: ShadCN UI + bits-ui design system
Database: IndexedDB with Dexie for local storage
Build: Static adapter for deployment with llama.cpp server
Testing: Playwright (E2E) + Vitest (unit) + Storybook (components)