
Benchmarks

Measure before you switch

Benchmark retrieval quality from the public CLI against local fixtures before changing models, modes, or defaults. GNO's research loops are built around measurable wins, not vibes.

Use cases
  • Evaluating a new embedding or reranker before shipping
  • Comparing candidates against a stable baseline
  • Catching retrieval regressions with a fixture your team owns

What it gives you

  • Public `gno bench <fixture>` command for repeatable checks
  • BM25, vector, and hybrid mode comparisons from the same fixture
  • Recall@K, precision, nDCG, MRR, and latency metrics
  • Per-query misses and hit lists for debugging regressions
  • JSON output for CI, model experiments, and release notes
  • Example fixture schema for team-owned benchmark corpora
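To help interpret the metric names listed above, here is a minimal Python sketch of their textbook definitions for a single query, assuming binary relevance (each document is either relevant or not) and a ranked list without duplicates. The function names are hypothetical and chosen for illustration. GNO's own implementation is not shown on this page and may differ, for example in how it handles graded relevance or ties.

```python
import math
from typing import Iterable, Sequence, Set, Tuple


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / len(relevant)


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top k results that are relevant."""
    if k <= 0:
        return 0.0
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / k


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none is returned."""
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(
    queries: Iterable[Tuple[Sequence[str], Set[str]]]
) -> float:
    """MRR: the average reciprocal rank over (ranked, relevant) pairs."""
    scores = [reciprocal_rank(ranked, relevant) for ranked, relevant in queries]
    return sum(scores) / len(scores) if scores else 0.0


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Binary-relevance nDCG: DCG of this ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc in enumerate(ranked[:k], start=1)
        if doc in relevant
    )
    # The ideal ranking places every relevant document first.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


if __name__ == "__main__":
    # Made-up document IDs, purely for illustration.
    ranked = ["a", "b", "c", "d"]
    relevant = {"b", "d", "e"}
    print("recall@3   ", recall_at_k(ranked, relevant, 3))
    print("precision@3", precision_at_k(ranked, relevant, 3))
    print("RR         ", reciprocal_rank(ranked, relevant))
    print("nDCG@3     ", ndcg_at_k(ranked, relevant, 3))
```

Recall@K rewards finding every relevant document, precision penalizes irrelevant results in the top K, and MRR and nDCG are sensitive to how high the relevant results rank. Comparing several of them side by side shows whether a candidate wins by finding more documents or by ordering them better.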

Try it yourself

Representative commands and entry points. Full reference lives in the documentation.

gno bench docs/examples/bench-fixture.json
gno bench fixture.json --modes bm25,hybrid
gno bench fixture.json --json
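One way to use the JSON output in CI is to compare a candidate run against a stored baseline and fail the build on any regression. The Python sketch below shows only the comparison step. It assumes you have already reduced each `gno bench fixture.json --json` result to a flat mapping of metric name to score. The key names (`recall_at_5`, `mrr`), the `find_regressions` helper, and the tolerance value are all hypothetical, since the actual JSON schema is not shown on this page.

```python
from typing import Dict, List


def find_regressions(
    baseline: Dict[str, float],
    candidate: Dict[str, float],
    tolerance: float = 0.01,
) -> List[str]:
    """Return the metrics where the candidate falls below the baseline.

    Assumes higher is better for every metric passed in, so latency
    should be left out or checked separately with the comparison reversed.
    """
    regressions = []
    for name, base_value in baseline.items():
        if name not in candidate:
            # A metric missing from the candidate run counts as a regression.
            regressions.append(name)
        elif candidate[name] < base_value - tolerance:
            regressions.append(name)
    return regressions


if __name__ == "__main__":
    # Hypothetical metric names and made-up scores, not real GNO output.
    baseline = {"recall_at_5": 0.80, "mrr": 0.62}
    candidate = {"recall_at_5": 0.78, "mrr": 0.63}

    failed = find_regressions(baseline, candidate)
    if failed:
        raise SystemExit(f"Retrieval regression in: {', '.join(failed)}")
```

The tolerance keeps small run-to-run differences from failing the build. Pick a value that matches the noise you observe on your own fixture.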
