added README with instructions on setting up the backend

2025-10-20 20:15:26 -04:00
parent b0c6cfbf62
commit c22496493b

backend/README.md Normal file

@@ -0,0 +1,26 @@
# MIND Backend
## Setup
The steps below set up the backend for local testing: the `go`
orchestration layer on `localhost:8080` and a `llama.cpp` inference
server on `localhost:8081`.
### Building `llama.cpp`
In `$REPO/third_party/llama.cpp` run `make` to build.
### Running `llama.cpp`
#### Getting a `GGUF` format model
Run `./backend/get-qwen3-1.7b.sh` to download the Qwen 3 1.7B model
from Hugging Face.
#### Running the inference server
Run `./llama-server -m <path-to-gguf-model> --port 8081` to run the
inference server at `localhost:8081`.
### Running the backend layer
Run `go run main.go` to start the backend layer at
`localhost:8080`.
## A simple CLI client
A simple CLI client is available at `backend/cli.py`; it connects
to the backend layer at `localhost:8080`.
Use the `\help` command in the client to list the available operations.