Configuration

Configure GNO via index.yml, manage collections, and pick the right model preset.

GNO is driven by a YAML config file and a set of CLI verbs. Most users never touch the file directly — they manage collections and presets via the CLI or the Web UI. But when you need it, here’s where everything lives.

Config file location

macOS / Linux: ~/.config/gno/config/index.yml
Windows: %APPDATA%\gno\config\index.yml

Run gno doctor to see the resolved paths for your machine.
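As a quick cross-check, the macOS / Linux default listed above can be tested from a POSIX shell. This is a minimal sketch that only mirrors the path documented on this page; the helper name gno_default_config is made up for illustration, it does not account for any overrides GNO may support, and gno doctor remains the authoritative source for the resolved path.

```shell
# Hypothetical helper (not part of GNO): prints the documented
# default config path for macOS / Linux.
gno_default_config() {
  printf '%s\n' "$HOME/.config/gno/config/index.yml"
}

config_file="$(gno_default_config)"

# Report whether a config file exists at the default location.
if [ -f "$config_file" ]; then
  echo "found: $config_file"
else
  echo "not found: $config_file"
fi
```

If the file is reported as missing but GNO works, the resolved path on your machine probably differs from the default, which is what gno doctor will show.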

Collections

Each collection is a named source of documents. Collections have their own include/exclude rules and can optionally override model presets.

# Add a collection
gno collection add ~/notes --name notes

# Add with a glob pattern
gno collection add ~/code --name code --pattern "**/*.{ts,md}"

# List collections
gno collection list

# Remove a collection
gno collection remove notes

Exclusions

Add an exclude array to a collection to skip files and directories:

collections:
  notes:
    path: ~/notes
    exclude:
      - node_modules
      - .git
      - "**/*.tmp"

Model presets

GNO ships four built-in presets. Switch with one command; the first pull downloads the model files and caches them locally.

slim-tuned (~1GB) — fine-tuned local expansion model on top of the slim stack. Default for new installs.
slim — leaner models, faster on modest hardware.
balanced (~2GB) — larger models, better recall.
quality (~2.5GB) — highest quality on every stage.

# Switch to the balanced preset
gno models use balanced
# Download and cache the model files
gno models pull
# List models
gno models list

Per-collection model overrides

Override embed, rerank, or answer models for a single collection — useful when your code collection wants a code-specific embedding while your prose collection sticks with the general-purpose model.

collections:
  code:
    path: ~/code
    models:
      embed: nomic-ai/nomic-embed-code-v1.5

Remote model servers

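Point GNO at an OpenAI-compatible server (Ollama, LM Studio, vLLM), either on the same machine or on another machine on your LAN. The example below uses localhost; for a server on another machine, replace localhost with that machine's hostname or IP address.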

models:
  answer:
    uri: http://localhost:11434/v1#llama3.1:8b
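The uri value above packs two things into one string. The sketch below splits it in a POSIX shell, assuming (from the shape of the example, not from a documented rule) that the part before # is the server's base URL and the part after # is the model name. It can help when checking that a server is reachable before pointing GNO at it.

```shell
# The URI from the example above.
uri="http://localhost:11434/v1#llama3.1:8b"

# Assumption: everything before the first '#' is the server base URL,
# and everything after it is the model name.
base_url="${uri%%\#*}"
model="${uri#*\#}"

echo "base URL: $base_url"
echo "model:    $model"

# With the server running, an OpenAI-compatible endpoint lists its models at:
#   curl -s "$base_url/models"
```

If the model name printed here does not appear in the server's model list, fix the fragment after # first.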

Local model runtime

GNO uses node-llama-cpp for local GGUF models. The default path uses prebuilt backends only; source builds are opt-in so normal indexing does not unexpectedly require local compiler toolchains.

GNO_LLAMA_GPU — choose auto, true, false, cuda, vulkan, or metal. NODE_LLAMA_CPP_GPU remains a compatibility alias when this is unset.
GNO_LLAMA_BUILD — backend build mode. Default: never. Set it to autoAttempt only when you intentionally want node-llama-cpp to try a local source build.
GNO_LLAMA_INIT_TIMEOUT_MS — local backend initialization timeout. Default: 30000.
GNO_EMBED_CONTEXTS — override the CPU embedding context count; values are clamped to the range 1 to 4. Without an override, CPU-only runs choose a small adaptive pool automatically: one context on low-memory Windows machines, otherwise at most two contexts.
GNO_EMBED_THREADS — override CPU threads per embedding context.
GNO_EMBED_CONTEXT_SIZE — override native embedding context size. Minimum: 128.
GNO_NO_AUTO_DOWNLOAD — disable automatic model downloads; explicit gno models pull still works.
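To show how these variables combine, here is a sketch of a shell setup for a CPU-only machine with automatic downloads turned off. The variable names and the values for GNO_LLAMA_GPU and GNO_LLAMA_BUILD come from the list above; the timeout of 60000 is an arbitrary illustration, and the value 1 for GNO_NO_AUTO_DOWNLOAD is an assumption, since this page does not say which values enable it.

```shell
# Force CPU inference and keep to prebuilt backends (never is the documented default).
export GNO_LLAMA_GPU=false
export GNO_LLAMA_BUILD=never

# Illustrative value: give slow machines 60 s to initialize (default: 30000 ms).
export GNO_LLAMA_INIT_TIMEOUT_MS=60000

# Use two CPU embedding contexts; values outside 1 to 4 are clamped.
export GNO_EMBED_CONTEXTS=2

# Assumption: "1" is accepted as an "enabled" value for this switch.
export GNO_NO_AUTO_DOWNLOAD=1

# Automatic downloads are now off, so fetch models explicitly (only if gno is installed).
if command -v gno >/dev/null 2>&1; then
  gno models pull
fi
```

Exporting the variables in a shell profile makes them apply to every gno invocation; setting them inline on a single command keeps the change local to that run.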