Architecture
How GNO is built — Bun, SQLite, local models, and the shared retrieval core.
GNO is a Bun-compiled TypeScript application with a SQLite-backed index, local model inference via node-llama-cpp, and a shared retrieval core that every surface (CLI, Web UI, SDK, REST API, MCP) plugs into.
Runtime
- Bun — instant cold starts, single binary distribution, TypeScript-first runtime
- SQLite + sqlite-vec — BM25 FTS plus vector similarity via a SQLite extension. One database file per GNO installation.
- node-llama-cpp — local LLM inference for embedding, reranking, and answer generation. GGUF model format.
Storage layout
- ~/.config/gno/ — config files (index.yml, presets)
- ~/.local/share/gno/ — SQLite database, model cache, asset cache
- ~/.cache/gno/ — temporary artifacts, rerank scratchpad
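As a rough illustration of the layout, the three directories can be derived from a home directory like this. The function and field names are hypothetical and not part of GNO's API, and the sketch assumes the default locations listed here with no environment overrides.

```typescript
// Sketch only: the directories mirror the storage layout described in this
// section; gnoDirs and GnoDirs are hypothetical names, not GNO's API.
interface GnoDirs {
  config: string; // index.yml, presets
  data: string; // SQLite database, model cache, asset cache
  cache: string; // temporary artifacts, rerank scratchpad
}

function gnoDirs(home: string): GnoDirs {
  // Drop trailing slashes so "/home/u/" and "/home/u" behave the same.
  const base = home.replace(/\/+$/, "");
  return {
    config: `${base}/.config/gno`,
    data: `${base}/.local/share/gno`,
    cache: `${base}/.cache/gno`,
  };
}
```

In a real script the argument would typically be the value returned by `os.homedir()` from `node:os`.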
Surfaces
Every GNO surface speaks to the same shared retrieval core:
- CLI — src/cli/*, bundled as the gno binary
- Web UI — gno serve launches a Bun HTTP server with the browser workspace and the REST API
- SDK — package-root importable client, same core under the hood
- REST API — exposed by gno serve, 35+ endpoints
- MCP server — stdio transport, read/write tools, graph navigation, and resources for any MCP-compatible client
- Desktop — a native window wrapping the web workspace
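The "one core, many surfaces" idea can be sketched as thin adapters around a single interface. Everything below is an assumption made for illustration: the `RetrievalCore` interface, the `Hit` shape, and both adapter functions are hypothetical, and a real core would most likely be asynchronous.

```typescript
// Hypothetical shapes; none of these names come from GNO's codebase.
interface Hit {
  docId: string;
  score: number;
  snippet: string;
}

interface RetrievalCore {
  search(query: string, limit: number): Hit[];
}

// CLI-style adapter: renders hits as plain text lines.
function cliSearch(core: RetrievalCore, query: string): string {
  return core
    .search(query, 5)
    .map((h, i) => `${i + 1}. ${h.docId} (${h.score.toFixed(2)}) ${h.snippet}`)
    .join("\n");
}

// REST-style adapter: returns the same hits as a JSON body.
function restSearch(
  core: RetrievalCore,
  query: string,
): { status: number; body: string } {
  return { status: 200, body: JSON.stringify({ hits: core.search(query, 5) }) };
}

// In-memory stand-in for the core, used only to exercise the adapters.
const fakeCore: RetrievalCore = {
  search: (query: string, limit: number): Hit[] =>
    [{ docId: "notes/a.md", score: 0.9, snippet: `match for ${query}` }].slice(
      0,
      limit,
    ),
};
```

Because both adapters receive the core as an argument, ranking behaviour changes in one place and every surface picks it up.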
Data flow
- Ingestion — file walker reads sources, parsers extract text + frontmatter, chunker splits into retrievable units
- Embedding — chunk text is encoded to dense vectors via the chosen embedding model
- Indexing — chunks land in SQLite FTS and sqlite-vec tables; docs, links, and tags land in relational tables
- Retrieval — query enters the pipeline, hits BM25 + vector, merges, reranks, returns
- Answer — for ask, the top matches go to the local LLM with a citation-preserving prompt
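The retrieval step merges a BM25 ranking with a vector ranking before reranking. The text does not say which merge strategy GNO uses, so the sketch below shows reciprocal rank fusion (RRF) purely as one common option. The function name, the chunk ids, and the constant `k = 60` are assumptions.

```typescript
// Reciprocal rank fusion: each list contributes 1 / (k + rank) per document.
// This is an illustrative choice, not a confirmed detail of GNO's pipeline.
function reciprocalRankFusion(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first; ties fall back to id order for determinism.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .map(([id]) => id);
}

// Hypothetical chunk ids, best match first in each list.
const bm25Hits = ["a", "b", "c"];
const vectorHits = ["b", "d", "a"];
const fused = reciprocalRankFusion([bm25Hits, vectorHits]);
// fused is ["b", "a", "d", "c"]: chunks found by both retrievers rise to the top.
```

RRF only needs rank positions, so it avoids having to normalise BM25 scores against vector distances, which live on different scales.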