From c22496493b08f8fd71298172382996554dc12504 Mon Sep 17 00:00:00 2001
From: Peisong Xiao
Date: Mon, 20 Oct 2025 20:15:26 -0400
Subject: [PATCH] added README with instructions on setting up the backend

---
 backend/README.md | 72 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 72 insertions(+)
 create mode 100644 backend/README.md

diff --git a/backend/README.md b/backend/README.md
new file mode 100644
index 0000000..c1fe7af
--- /dev/null
+++ b/backend/README.md
@@ -0,0 +1,72 @@
+# MIND Backend
+
+## Setup
+The steps below set up the backend for local testing: the `go`
+orchestration layer on `localhost:8080` and a `llama.cpp`
+inference server on `localhost:8081`.
+
+### Building `llama.cpp`
+In `$REPO/third_party/llama.cpp`, run `make` to build.
+
+### Running `llama.cpp`
+#### Getting a `GGUF` format model
+Run `./backend/get-qwen3-1.7b.sh` to download the Qwen 3 1.7B model
+from HuggingFace.
+
+#### Running the inference server
+Run `./llama-server -m <path-to-model.gguf> --port 8081` to start the
+inference server at `localhost:8081`.
+
+### Running the backend layer
+Run `go run main.go`. This will run the backend layer at
+`localhost:8080`.
+
+## A simple CLI client
+A simple CLI-based client can be found under `backend/cli.py`; it
+connects to the backend layer at `localhost:8080`.
+
+Use the `\help` command to view the available operations.
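+
+A minimal invocation, assuming `python3` is on your `PATH` and that
+`cli.py` takes no required arguments:
+
+```sh
+cd backend
+python3 cli.py
+```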
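+
+## Putting it together
+The full sequence, run from the repository root, looks roughly like
+the sketch below. The exact paths are assumptions: where `make`
+places the `llama-server` binary and where the download script puts
+the `GGUF` file may differ in your checkout.
+
+```sh
+# Build the in-tree llama.cpp inference server.
+make -C third_party/llama.cpp
+
+# Download the Qwen 3 1.7B model in GGUF format.
+./backend/get-qwen3-1.7b.sh
+
+# Start the inference server on port 8081 (substitute the real model path).
+./third_party/llama.cpp/llama-server -m <path-to-model.gguf> --port 8081 &
+
+# Start the go orchestration layer on port 8080.
+(cd backend && go run main.go)
+```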
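+
+## Checking the inference server
+Once `llama-server` is up, it can be sanity-checked directly before
+involving the backend. Recent `llama.cpp` builds expose a `/health`
+endpoint and an OpenAI-compatible chat completions endpoint; older
+builds may use different routes.
+
+```sh
+# Returns a small JSON status object once the model has loaded.
+curl http://localhost:8081/health
+
+# Send one chat request through the OpenAI-compatible API.
+curl http://localhost:8081/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{"messages": [{"role": "user", "content": "Say hello."}]}'
+```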