Getting Started
Configure GNO via index.yml, manage collections, and pick the right model preset.
GNO is driven by a YAML config file and a set of CLI verbs. Most users never touch the file directly; they manage collections and presets via the CLI or the Web UI. When you do need to edit it, here is where everything lives.
~/.config/gno/config/index.yml
%APPDATA%\gno\config\index.yml
Run gno doctor to see the resolved paths for your machine.
Each collection is a named source of documents. Collections have their own include/exclude rules and can optionally override model presets.
# Add a collection
gno collection add ~/notes --name notes
# Add with a glob pattern
gno collection add ~/code --name code --pattern "**/*.{ts,md}"
# List collections
gno collection list
# Remove a collection
gno collection remove notes
Add an exclude array to a collection to skip files and directories:
collections:
  notes:
    path: ~/notes
    exclude:
      - node_modules
      - .git
      - "**/*.tmp"
GNO ships four built-in presets. Switch with one command; the first pull downloads the model files and caches them locally.
gno models use balanced
gno models pull
gno models list
Override embed, rerank, or answer models for a single collection. This is useful when your code collection wants a code-specific embedding model while your prose collection sticks with the general-purpose model.
collections:
  code:
    path: ~/code
    models:
      embed: nomic-ai/nomic-embed-code-v1.5
Point GNO at an OpenAI-compatible server (Ollama, LM Studio, vLLM), either on this machine or on another machine on your LAN.
models:
  answer:
    uri: http://localhost:11434/v1#llama3.1:8b
GNO uses node-llama-cpp for local GGUF models. The default path uses prebuilt backends only; source builds are opt-in so normal indexing does not unexpectedly require local compiler toolchains.
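Returning to the remote uri value shown above: it appears to pack two things into one string, the server's base URL before the # and the model name after it. That reading is inferred from the example rather than documented behaviour. A minimal shell sketch, with illustrative variable names, that splits the value the same way so the endpoint can be checked by hand before GNO is pointed at it:

```shell
# Assumption: the part before '#' is the OpenAI-compatible base URL and
# the part after '#' is the model name, as the config example suggests.
uri='http://localhost:11434/v1#llama3.1:8b'

base_url="${uri%%#*}"   # drop everything from the first '#' onward
model="${uri#*#}"       # drop everything up to and including the first '#'

echo "base URL: $base_url"
echo "model:    $model"

# Optional manual check (needs a running server): OpenAI-compatible
# servers list their available models under /models.
# curl -s "$base_url/models"
```

If the model name does not show up in the server's model list, the server itself is the first place to look.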
GNO_LLAMA_GPU: choose auto, true, false, cuda, vulkan, or metal. NODE_LLAMA_CPP_GPU remains a compatibility alias when this is unset.
GNO_LLAMA_BUILD: backend build mode. Default: never. Set autoAttempt only when you intentionally want node-llama-cpp to try a local source build.
GNO_LLAMA_INIT_TIMEOUT_MS: local backend initialization timeout. Default: 30000.
GNO_EMBED_CONTEXTS: override CPU embedding context count, clamped from 1 to 4. CPU-only runs choose a small adaptive pool automatically: one context on low-memory Windows machines, otherwise at most two contexts unless you explicitly override it.
GNO_EMBED_THREADS: override CPU threads per embedding context.
GNO_EMBED_CONTEXT_SIZE: override native embedding context size. Minimum: 128.
GNO_NO_AUTO_DOWNLOAD: disable automatic model downloads; explicit gno models pull still works.
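As an illustration, the variables above might be combined like this for a CPU-only machine. The specific values are examples picked from the documented ranges, not recommendations, and the value 1 for GNO_NO_AUTO_DOWNLOAD is an assumption: the text does not say which values switch it on.

```shell
# Force CPU inference and keep the default "prebuilt backends only" behaviour.
export GNO_LLAMA_GPU=false
export GNO_LLAMA_BUILD=never

# Give a slower machine more time to initialise the backend (default: 30000 ms).
export GNO_LLAMA_INIT_TIMEOUT_MS=60000

# Embedding pool: two contexts (allowed range 1 to 4), four threads each,
# and a context size above the documented minimum of 128.
export GNO_EMBED_CONTEXTS=2
export GNO_EMBED_THREADS=4
export GNO_EMBED_CONTEXT_SIZE=512

# Assumption: a non-empty value such as 1 disables automatic downloads.
export GNO_NO_AUTO_DOWNLOAD=1

# Show the GNO_* settings now in effect for this shell session.
env | grep '^GNO_' | sort
```

With automatic downloads disabled, model files still have to be fetched once with an explicit gno models pull.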