From c22496493b08f8fd71298172382996554dc12504 Mon Sep 17 00:00:00 2001
From: Peisong Xiao
Date: Mon, 20 Oct 2025 20:15:26 -0400
Subject: [PATCH] added README with instructions on setting up the backend

---
 backend/README.md | 72 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 72 insertions(+)
 create mode 100644 backend/README.md

diff --git a/backend/README.md b/backend/README.md
new file mode 100644
index 0000000..c1fe7af
--- /dev/null
+++ b/backend/README.md
@@ -0,0 +1,72 @@
+# MIND Backend
+
+## Setup
+The steps below set up the backend for local testing: the `go`
+orchestration layer on `localhost:8080` and a `llama.cpp`
+inference server on `localhost:8081`.
+
+### Building `llama.cpp`
+In `$REPO/third_party/llama.cpp`, run `make` to build.
+
+### Running `llama.cpp`
+#### Getting a `GGUF` format model
+Run `./backend/get-qwen3-1.7b.sh` to download the Qwen 3 1.7B model
+from HuggingFace.
+
+#### Running the inference server
+Run `./llama-server -m <path-to-model.gguf> --port 8081` to start the
+inference server at `localhost:8081`.
+
+### Running the backend layer
+Run `go run main.go`. This will run the backend layer at
+`localhost:8080`.
+
+## A simple CLI client
+A simple CLI-based client can be found under `backend/cli.py`; it
+connects to the backend layer at `localhost:8080`.
+
+Use the `\help` command to view the available operations.
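+
+A minimal invocation, assuming `python3` is on your `PATH` and that
+`cli.py` takes no required arguments:
+
+```sh
+cd backend
+python3 cli.py
+```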
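+
+## Putting it together
+The full sequence, run from the repository root, looks roughly like
+the sketch below. The exact paths are assumptions: where `make`
+places the `llama-server` binary and where the download script puts
+the `GGUF` file may differ in your checkout.
+
+```sh
+# Build the in-tree llama.cpp inference server.
+make -C third_party/llama.cpp
+
+# Download the Qwen 3 1.7B model in GGUF format.
+./backend/get-qwen3-1.7b.sh
+
+# Start the inference server on port 8081 (substitute the real model path).
+./third_party/llama.cpp/llama-server -m <path-to-model.gguf> --port 8081 &
+
+# Start the go orchestration layer on port 8080.
+(cd backend && go run main.go)
+```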
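+
+## Checking the inference server
+Once `llama-server` is up, it can be sanity-checked directly before
+involving the backend. Recent `llama.cpp` builds expose a `/health`
+endpoint and an OpenAI-compatible chat completions endpoint; older
+builds may use different routes.
+
+```sh
+# Returns a small JSON status object once the model has loaded.
+curl http://localhost:8081/health
+
+# Send one chat request through the OpenAI-compatible API.
+curl http://localhost:8081/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{"messages": [{"role": "user", "content": "Say hello."}]}'
+```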