added README with instructions on setting up the backend
backend/README.md
# MIND Backend

## Setup

The steps below set up the backend for local testing: the `go` orchestration layer on `localhost:8080` and a `llama.cpp` inference server on `localhost:8081`.

### Building `llama.cpp`

In `$REPO/third_party/llama.cpp`, run `make` to build.

### Running `llama.cpp`

#### Getting a `GGUF` format model

Run `./backend/get-qwen3-1.7b.sh` to download the Qwen 3 1.7B model from Hugging Face.
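
An interrupted download can leave a truncated or HTML error file behind. GGUF files begin with the four ASCII bytes `GGUF`, so a quick sanity check is possible before starting the server. The helper below is a minimal sketch and not part of this repo; the name `is_gguf` is hypothetical, and the check only inspects the header, so it cannot prove the file is complete.

```python
import sys

GGUF_MAGIC = b"GGUF"  # the first four bytes of a GGUF file


def is_gguf(path):
    """Return True if the file at `path` starts with the GGUF magic bytes."""
    with open(path, "rb") as f:
        return f.read(4) == GGUF_MAGIC


if __name__ == "__main__":
    # Usage: python check_gguf.py <path-to-gguf-model>
    for arg in sys.argv[1:]:
        try:
            verdict = "looks like GGUF" if is_gguf(arg) else "not a GGUF file"
        except OSError as exc:
            verdict = f"could not be read ({exc})"
        print(f"{arg}: {verdict}")
```

If the check fails, re-running the download script is probably the simplest fix.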

#### Running the inference server

Run `./llama-server -m <path-to-gguf-model> --port 8081` to start the inference server at `localhost:8081`.
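
To confirm the inference server is up before starting the backend layer, you can poll its health endpoint. The sketch below assumes the vendored `llama.cpp` build exposes `GET /health` (recent `llama-server` versions do, answering 200 once the model is loaded); the function name `server_healthy` is hypothetical and not part of this repo.

```python
import urllib.error
import urllib.request


def server_healthy(base_url="http://localhost:8081", timeout=2.0):
    """Return True if GET <base_url>/health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an error status (for example
        # while the model is still loading) all count as "not ready".
        return False


if __name__ == "__main__":
    print("llama-server ready" if server_healthy() else "llama-server not ready")
```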

### Running the backend layer

Run `go run main.go` to start the backend layer at `localhost:8080`.
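
When scripting the setup, it can help to wait until both services accept connections before launching a client. The backend's HTTP routes are not documented here, so this sketch only checks at the TCP level; `wait_for_port` is a hypothetical helper, not something this repo provides.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Poll until a TCP connection to host:port succeeds or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("localhost", 8081)` followed by `wait_for_port("localhost", 8080)` returns `True` for each service once it is listening. An open port only shows that a process is bound there, not that it is the expected one.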

## A simple CLI client

A simple CLI client is available at `backend/cli.py`; it connects to the backend layer at `localhost:8080`.

Use the `\help` command to list the available operations.