Added code for backend glue

2025-10-13 19:20:24 -04:00
parent 692b069b5b
commit 29a451ab58
25 changed files with 1063 additions and 0 deletions
--- a/backend/design.md
+++ b/backend/design.md
@@ -0,0 +1,98 @@
+# MIND - Modular Inference & Node Database - Server-Side Design
+
+## High-Level Overview
+
+### Inference Engine - `llama.cpp`
+A modified version of `llama.cpp` whose completion API accepts extra
+fields that request the use of the on-disk kv-cache.  Its response
+also tells the client where the new kv-cache blocks are located.
+
+### Database - MySQL
+This will store user information, conversation histories, and the
+index of the kv-cache chunks stored on disk.
+
+### Backend Server - Go Layer
+This will provide the APIs used by the frontend.  It will talk to the
+inference engine so that the engine can load the correct kv-cache
+chunks into memory or reconstruct a conversation from cache, and it
+will manage the life cycle of the caches stored on disk.
+
+It will also handle authentication (add-on feature).
+
+### CLI Interface - Go/Python
+This will provide a simple interface to access all the features
+provided by the backend, for ease of testing and prototyping.
+
+## Supported APIs For the Backend
+Note that all APIs will need to encode the owner of the node.
+
+### `POST /conversations`
+This will start a new conversation tree.  The Go backend should
+handle the node creation.
+
+### `GET /conversations`
+This will return all the root nodes of the conversation trees, to
+provide context for the user to switch conversation trees.
+
+### `GET /tree`
+This will return the DAG under a root, optionally limited to a
+specified depth from the root or a reverse depth measured from the
+leaves.  This gives the user the context needed to switch between
+branches on a given tree.
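+
+A minimal sketch of the depth-limited traversal, assuming a
+hypothetical in-memory `Node` type with child pointers (the real
+backend would read the DAG from MySQL).  `Subtree` and `maxDepth` are
+illustrative names, and the reverse-depth variant is not covered.
+
```go
package main

// Node is a hypothetical in-memory view of one conversation node.
type Node struct {
	ID       int64
	Children []*Node
}

// Subtree returns the IDs reachable from root within maxDepth edges,
// in breadth-first order. A negative maxDepth means "no limit".
func Subtree(root *Node, maxDepth int) []int64 {
	if root == nil {
		return nil
	}
	type item struct {
		n     *Node
		depth int
	}
	seen := map[int64]bool{root.ID: true}
	queue := []item{{root, 0}}
	var ids []int64
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		ids = append(ids, cur.n.ID)
		if maxDepth >= 0 && cur.depth == maxDepth {
			continue // do not descend past the requested depth
		}
		for _, c := range cur.n.Children {
			// In a DAG a node may be reachable through several
			// parents, so visit each ID only once.
			if !seen[c.ID] {
				seen[c.ID] = true
				queue = append(queue, item{c, cur.depth + 1})
			}
		}
	}
	return ids
}
```
+
+Calling `Subtree(root, 1)` would list the root and its direct
+children only, which may be enough for a branch picker in the
+frontend.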
+
+### `POST /branches`
+Creates a new fork from a given commit.
+
+### `GET /branches`
+Lists the branches related to the current branch.  The maximum number
+of branch-off points to list can also be specified.
+
+### `POST /graft`
+Attach a range of nodes from another conversation.
+
+### `POST /detach`
+Detaches a branch into a new conversation.
+
+### `GET /linearize`
+Reconstruct a linear conversation history from a branch or node.
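+
+One way to picture linearization, as a sketch only: it assumes each
+node records a single parent ID (0 for a root), which may not hold
+once grafting gives a node several parents.  The `Node` fields and
+the `Linearize` function are hypothetical names.
+
```go
package main

// Node is a hypothetical record; ParentID is 0 for a root node.
type Node struct {
	ID, ParentID int64
	Role, Text   string
}

// Linearize walks parent links from leafID up to the root and returns
// the nodes in root-to-leaf order. It reports false if a node is
// missing or if the parent links form a cycle.
func Linearize(nodes map[int64]Node, leafID int64) ([]Node, bool) {
	var path []Node
	seen := make(map[int64]bool)
	for id := leafID; id != 0; {
		n, found := nodes[id]
		if !found || seen[id] {
			return nil, false
		}
		seen[id] = true
		path = append(path, n)
		id = n.ParentID
	}
	// The walk collected leaf-to-root, so reverse in place.
	for i, j := 0, len(path)-1; i < j; i, j = i+1, j-1 {
		path[i], path[j] = path[j], path[i]
	}
	return path, true
}
```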
+
+### `POST /completion`
+Trigger inference from the last node of a branch, creating two new
+nodes, one for the prompt and one for the answer.
+
+Note that this endpoint talks to the Go backend, which is responsible
+for bookkeeping the kv-cache on disk, so the frontend doesn't need to
+worry about it.
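+
+The two-node bookkeeping can be sketched as follows.  Everything
+here is an assumption for illustration: the in-memory `Tree`, the
+role strings, and the `infer` callback that stands in for the call to
+the inference engine.  The sketch runs inference first so that a
+failed call leaves the tree unchanged, which is one possible policy
+rather than a requirement of the design.
+
```go
package main

import "errors"

// Node is a hypothetical record; ParentID is 0 for a root node.
type Node struct {
	ID, ParentID int64
	Role, Text   string
}

// Tree is a hypothetical in-memory stand-in for the node table.
type Tree struct {
	nextID int64
	Nodes  map[int64]Node
}

func NewTree() *Tree { return &Tree{nextID: 1, Nodes: map[int64]Node{}} }

func (t *Tree) add(parentID int64, role, text string) Node {
	n := Node{ID: t.nextID, ParentID: parentID, Role: role, Text: text}
	t.Nodes[n.ID] = n
	t.nextID++
	return n
}

// Complete appends a prompt node under leafID and an answer node under
// the prompt. A leafID of 0 starts from an empty history.
func (t *Tree) Complete(leafID int64, prompt string,
	infer func(string) (string, error)) (Node, Node, error) {
	if _, ok := t.Nodes[leafID]; !ok && leafID != 0 {
		return Node{}, Node{}, errors.New("unknown leaf node")
	}
	answer, err := infer(prompt)
	if err != nil {
		return Node{}, Node{}, err // nothing was added to the tree
	}
	p := t.add(leafID, "user", prompt)
	a := t.add(p.ID, "assistant", answer)
	return p, a, nil
}
```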
+
+### `POST /login`
+Logs in as a given user.
+
+### `POST /logout`
+Logs out a user.
+
+### `GET /me`
+Get the current user.
+
+## Database-Specific
+The database should keep track of reachability and the backend should
+automatically remove orphaned nodes and caches.
+
+It should also keep track of the DAG formed by the prompts and
+answers, including its different root nodes.
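+
+Orphan detection by reachability can be sketched as a mark phase
+from the root nodes followed by a sweep over all stored IDs.  This is
+an in-memory illustration under assumed inputs (`roots`, `children`,
+`all`); the actual backend might express the same idea as SQL queries
+instead.
+
```go
package main

import "sort"

// Orphans returns, in ascending order, the IDs in all that cannot be
// reached from any root by following child links.
func Orphans(roots []int64, children map[int64][]int64, all []int64) []int64 {
	reachable := make(map[int64]bool)
	stack := append([]int64(nil), roots...)
	for len(stack) > 0 {
		id := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if reachable[id] {
			continue // already marked via another parent
		}
		reachable[id] = true
		stack = append(stack, children[id]...)
	}
	var orphans []int64
	for _, id := range all {
		if !reachable[id] {
			orphans = append(orphans, id)
		}
	}
	sort.Slice(orphans, func(i, j int) bool { return orphans[i] < orphans[j] })
	return orphans
}
```
+
+The returned IDs would be the candidates for deleting both the node
+rows and any kv-cache chunks indexed under them.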
+
+## Cache
+For a single user, the kv-cache on disk should only concern the
+working node, that is, the chain formed by it and all of its parent
+nodes.
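+
+One reading of this rule is that cache chunks for nodes off the
+working node's ancestor chain are candidates for eviction.  The
+sketch below follows that reading; the `parent` map (node ID to
+parent ID, with roots absent or mapped to 0) and the function name
+are assumptions, not part of the design.
+
```go
package main

// Evictable returns the cached node IDs that are neither the working
// node nor one of its ancestors, preserving the order of cached.
func Evictable(cached []int64, parent map[int64]int64, working int64) []int64 {
	keep := make(map[int64]bool)
	// Walk up the parent chain; a missing map entry yields 0 and
	// ends the walk, and the keep check guards against cycles.
	for id := working; id != 0 && !keep[id]; id = parent[id] {
		keep[id] = true
	}
	var out []int64
	for _, id := range cached {
		if !keep[id] {
			out = append(out, id)
		}
	}
	return out
}
```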
+
+## Multiple users
+
+### Authentication
+Authentication and multi-user switching are JWT-based.  All API calls
+except `POST /login` require a token.
+
+A default token will be provided during the early stages.
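+
+A standard-library-only sketch of the token check, assuming HS256
+signing with a shared secret and a `Bearer` token in the
+`Authorization` header (neither is specified in this design).  It
+verifies the signature only; a real deployment would also validate
+the header's `alg` and claims such as `exp`, most likely through an
+established JWT library.  All function names are hypothetical.
+
```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"net/http"
	"strings"
)

var enc = base64.RawURLEncoding // JWT segments are unpadded base64url

// sign returns the HS256 signature segment for "header.payload".
func sign(secret []byte, signingInput string) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(signingInput))
	return enc.EncodeToString(mac.Sum(nil))
}

// Issue builds a compact token around an already-serialized JSON payload.
func Issue(secret []byte, payloadJSON string) string {
	input := enc.EncodeToString([]byte(`{"alg":"HS256","typ":"JWT"}`)) +
		"." + enc.EncodeToString([]byte(payloadJSON))
	return input + "." + sign(secret, input)
}

// Valid checks the token shape and its signature, nothing else.
func Valid(secret []byte, token string) bool {
	if strings.Count(token, ".") != 2 {
		return false
	}
	i := strings.LastIndex(token, ".")
	want := sign(secret, token[:i])
	return hmac.Equal([]byte(want), []byte(token[i+1:]))
}

// RequireToken rejects requests without a valid token, except POST /login.
func RequireToken(secret []byte, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method == http.MethodPost && r.URL.Path == "/login" {
			next.ServeHTTP(w, r)
			return
		}
		auth := r.Header.Get("Authorization")
		if !strings.HasPrefix(auth, "Bearer ") ||
			!Valid(secret, strings.TrimPrefix(auth, "Bearer ")) {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}
```
+
+Wrapping the whole mux, for example
+`http.ListenAndServe(addr, RequireToken(secret, mux))`, would apply
+the check to every endpoint in one place.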
+
+### Queuing
+The Go layer should also be responsible for keeping track of the
+availability of the `llama.cpp` services and for queuing prompts when
+multiple users are active.
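+
+A sketch of one possible queuing scheme, assuming one worker
+goroutine per available `llama.cpp` instance and a shared channel as
+the prompt queue.  `Job`, `StartWorkers`, and the `infer` callback
+are hypothetical names; health checks and per-user fairness are left
+out.
+
```go
package main

import "sync"

// Job is a hypothetical queued prompt. Reply should be buffered (or
// have a waiting receiver) so a worker never blocks on delivery.
type Job struct {
	User   string
	Prompt string
	Reply  chan string
}

// StartWorkers launches n workers that drain queue until it is closed.
// Each worker handles one job at a time, so at most n inference calls
// run concurrently while further jobs wait in the channel.
func StartWorkers(n int, queue <-chan Job, infer func(string) string) *sync.WaitGroup {
	wg := &sync.WaitGroup{}
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for job := range queue {
				job.Reply <- infer(job.Prompt)
			}
		}()
	}
	return wg
}
```
+
+A handler for `POST /completion` would push a `Job` onto the queue
+and wait on its `Reply` channel; closing the queue and calling
+`Wait` on the returned group gives a simple shutdown path.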