Search Architecture
Цей контент ще не доступний вашою мовою.
Vault search is a multi-layer pipeline that combines full-text search, sparse TF-IDF scoring, optional dense vector embeddings, and seven weighted relevance signals.
For a quick overview, see Under the Hood.
The pipeline
Section titled “The pipeline”A search query flows through four layers:
Query (search_intelligent op) → Facade (mode selection, filters) → Brain (scoring, hybrid ranking) → VaultManager (federated tier search) → SQLite FTS5 + optional vector recallEach layer adds precision. FTS5 provides broad recall, the brain provides relevance ranking, and the vault manager federates across multiple knowledge sources.
Two-pass retrieval
Section titled “Two-pass retrieval”Search supports two modes to keep context lean:
| Mode | Returns | Use case |
|---|---|---|
scan | Lightweight results: title, score, snippet, token estimate | Browsing, triage |
full | Complete entries with score breakdowns | Deep reads, planning |
The recommended workflow is to scan first, pick the top 2-4 results by score, then load only those entries. This avoids flooding context with entries you don’t need.
"Search for authentication patterns" → scan (10 lightweight results)"Load entries auth-jwt-001, auth-002" → full (2 complete entries)Layer 1: SQLite FTS5
Section titled “Layer 1: SQLite FTS5”The vault stores entries in a SQLite database with an FTS5 virtual table:
CREATE VIRTUAL TABLE entries_fts USING fts5( id, title, description, context, tags, content='entries', content_rowid='rowid', tokenize='porter unicode61');Porter stemming reduces words to their root form (authentication and authenticating both match). Unicode normalization handles accented characters and non-Latin scripts.
Query transformation
Section titled “Query transformation”Your natural-language query gets transformed before hitting FTS5:
- Split on whitespace and punctuation (hyphens, underscores, dots, slashes)
- Lowercased
- Non-alphanumeric characters stripped from each term
- Terms shorter than 2 characters dropped
- Joined with
ORfor broad matching
Note: stop words are not removed at the FTS5 query level. Stop word removal only happens in the TF-IDF tokenizer used by the brain’s semantic scoring layer (see Semantic scoring below).
A query like “how does JWT validation work” becomes how OR does OR jwt OR validation OR work.
BM25 ranking
Section titled “BM25 ranking”FTS5 results are ranked using BM25 with per-field weights:
| Field | Weight | Why |
|---|---|---|
| title | 10.0 | Title matches are the strongest signal |
| id | 5.0 | Entry IDs often contain meaningful slugs |
| description | 3.0 | The main content body |
| tags | 2.0 | Tag matches indicate topical relevance |
| context | 1.0 | Context is supplementary |
If BM25 is unavailable (older SQLite builds), search falls back to the default FTS5 rank function.
Layer 2: VaultManager (federation)
Section titled “Layer 2: VaultManager (federation)”The vault manager searches across multiple tiers, which are separate SQLite databases with different scopes:
| Tier | Scope | Priority weight |
|---|---|---|
agent | Knowledge specific to this agent | 1.0 (highest) |
project | Shared across agents in a project | 0.8 |
team | Shared across team members | 0.6 |
Each tier is searched independently. Results are then:
- Weighted by tier priority
- Deduplicated. If the same entry exists in multiple tiers, the highest-priority version wins
- Merged into a single result set, sorted by weighted score
This means an agent-level pattern always outranks the same pattern at team level.
Layer 3: Brain scoring
Section titled “Layer 3: Brain scoring”FTS5 gives us broad recall. The brain turns it into precise relevance ranking.
Over-fetching
Section titled “Over-fetching”The brain requests 3x the desired limit (or 30 results, whichever is larger) from FTS5. This provides headroom. Many entries that rank well in FTS5 may score poorly on other signals. Over-fetching ensures the final top-N are the best matches across all factors.
Seven scoring signals
Section titled “Seven scoring signals”Every result is scored across seven factors:
| Signal | Weight (FTS only) | Weight (hybrid) | How it works |
|---|---|---|---|
| Semantic | 0.35 | 0.20 | TF-IDF cosine similarity between query and entry |
| Vector | 0.00 | 0.15 | Dense embedding cosine similarity |
| Severity | 0.10 | 0.10 | critical = 1.0, warning = 0.7, suggestion = 0.4 |
| Temporal decay | 0.10 | 0.10 | Exponential decay with 365-day half-life |
| Tag overlap | 0.10 | 0.10 | Jaccard similarity between query tags and entry tags |
| Domain match | 0.10 | 0.10 | 1.0 if the query domain matches the entry domain, 0.0 otherwise |
| Graph proximity | 0.15 | 0.15 | Link-graph distance between query-relevant entries and this entry |
The total score is the weighted sum of all factors.
Semantic scoring (TF-IDF)
Section titled “Semantic scoring (TF-IDF)”The brain maintains a TF-IDF vocabulary across all vault entries:
- Vocabulary building: tokenizes every entry (title + description + context + tags), computes IDF for each term:
IDF = log((docCount + 1) / (df + 1)) + 1 - Query vector: tokenizes the query using the same rules, computes a TF-IDF vector
- Entry vectors: each entry gets a TF-IDF vector using the same vocabulary
- Cosine similarity: the dot product of query and entry vectors, normalized by their magnitudes
This is why rare terms are more valuable than common ones. “JWT” carries more weight than “the” because it appears in fewer entries.
Temporal decay
Section titled “Temporal decay”Entries lose relevance over time:
Without a validity window:
decay = exp(-ln(2) * age / halfLife)Half-life is 365 days. An entry scores 1.0 when new, 0.5 after one year, 0.25 after two years.
With a validity window (entries that have valid_from/valid_until dates):
decayZone = totalWindow * 0.25if remaining > decayZone: decay = 1.0 (first 75% of the window)else: decay = remaining / decayZone (linear ramp-down in last 25%)In other words, an entry stays at full strength for the first 75% of its validity window, then linearly drops to zero over the final 25%.
Adaptive weights
Section titled “Adaptive weights”After 30+ feedback entries (excluding failed actions, which are system errors), the brain adjusts its scoring weights:
acceptedcounts as 1.0 positive,modifiedcounts as 0.5 positive (user adjusted but didn’t dismiss),dismissedcounts as 0.0- Accept rate =
(accepted + modified * 0.5) / totalFeedback - High accept rate (>50%): increase semantic weight (up to +0.15)
- Low accept rate (<50%): decrease semantic weight (down to -0.15)
- Other weights (severity, temporalDecay, tagOverlap, domainMatch, graphProximity) scale proportionally to maintain a sum of 1.0
This means the search system learns from your usage. If you consistently prefer tag-matched results over text-matched ones, the weights shift accordingly.
Small corpus guard
Section titled “Small corpus guard”If the vault has fewer than 50 FTS results and filtering to the requested limit would discard more than 50% of results, the brain returns all results instead. This prevents over-aggressive filtering on small knowledge bases where every entry matters.
Layer 4: Vector search (optional)
Section titled “Layer 4: Vector search (optional)”When an embedding provider is configured, search adds a dense vector recall phase:
How it works
Section titled “How it works”- Query embedding: the query is embedded into a dense vector (e.g., 1536 dimensions for OpenAI models)
- Cosine search: brute-force similarity computation against all stored entry vectors
- Candidate merging: entries found by vector search but missed by FTS5 are added to the candidate pool
- Score integration: vector similarity becomes a scoring signal (0.15 weight in hybrid mode)
Vector storage
Section titled “Vector storage”Dense vectors are stored as binary float32 blobs in an entry_vectors table with entry_id TEXT PRIMARY KEY. Each entry stores exactly one vector. If you switch embedding models, the old vector is overwritten via upsert.
Performance
Section titled “Performance”For vaults under 100K entries, brute-force cosine search completes in approximately 50ms. No approximate nearest-neighbor index is needed at this scale.
Embedding pipeline
Section titled “Embedding pipeline”New entries are automatically embedded in batches of 100. Only the title, description, and context fields are embedded (not full entry content). Entries that already have vectors for the active model are skipped.
Configuring embedding providers
Section titled “Configuring embedding providers”Vector search is optional and off by default. To enable it, you need two things: an embedding block in your agent.yaml, and the embedding-enabled feature flag turned on.
Supported providers
Section titled “Supported providers”| Provider | Default model | Dimensions | Default batch size | API key source |
|---|---|---|---|---|
openai | text-embedding-3-small | 1536 | 2048 | OpenAI key pool (same keys your LLM uses) |
voyage | voyage-3.5 | 1024 | 128 | VOYAGE_API_KEY env var or explicit apiKey |
More providers (like Ollama for local embeddings) can be added, but these two ship today.
agent.yaml configuration
Section titled “agent.yaml configuration”Add an embedding block at the top level of your agent.yaml:
embedding: provider: openai model: text-embedding-3-smallOr for Voyage AI:
embedding: provider: voyage model: voyage-3.5The full set of options:
| Field | Required | Description |
|---|---|---|
provider | Yes | openai or voyage |
model | Yes | Model identifier for the provider |
apiKey | No | Explicit API key (otherwise uses key pool or env var) |
baseUrl | No | Override the API endpoint (useful for proxies or self-hosted instances) |
batchSize | No | Max texts per API call (defaults vary by provider) |
inputType | No | document or query, relevant for Voyage which optimizes differently for each |
Environment variables
Section titled “Environment variables”For OpenAI, the provider uses the same key pool as your LLM calls, so no extra env var is needed if you already have OPENAI_API_KEY configured.
For Voyage, set VOYAGE_API_KEY in your environment. The provider checks this automatically when no explicit apiKey is set in the config.
How embedding works at runtime
Section titled “How embedding works at runtime”When both the config and feature flag are present, the engine initializes the embedding provider at startup. New vault entries get embedded automatically in batches of 100 (only title, description, and context fields). Entries that already have a vector for the active model are skipped.
If you switch models, old vectors are overwritten via upsert the next time the pipeline runs. If the provider fails to initialize (bad key, network issues), the engine logs a warning and continues without embeddings. The brain’s vector weight stays at 0.0, and search falls back to pure FTS5 + TF-IDF scoring.
Both providers handle rate limiting with automatic retry (up to 3 attempts) and key rotation on 429 responses when multiple keys are available in the pool.
Result format
Section titled “Result format”Full mode
Section titled “Full mode”{ entry: IntelligenceEntry, // Complete entry object score: number, // Total weighted score breakdown: { // Per-factor scores semantic: number, vector: number, severity: number, temporalDecay: number, tagOverlap: number, domainMatch: number, graphProximity: number, total: number }}Scan mode
Section titled “Scan mode”{ id: string, title: string, score: number, type: string, domain: string, severity: string, // 'critical' | 'warning' | 'suggestion' tags: string[], snippet: string, // First 120 chars of description tokenEstimate: number // Rough token count (chars / 4)}Memory search
Section titled “Memory search”Memory entries (session history, captured context) are searched separately using the same FTS5 engine. Memory results are merged with vault results and sorted by score. Memory entries use a fixed score of 0.5 as a baseline.
Key files
Section titled “Key files”| File | Role |
|---|---|
packages/core/src/runtime/capture-ops.ts | search_intelligent facade op |
packages/core/src/brain/brain.ts | Scoring, TF-IDF, hybrid ranking |
packages/core/src/vault/vault-manager.ts | Federated tier search |
packages/core/src/vault/vault-entries.ts | FTS5 queries, vector cosine search |
packages/core/src/persistence/sqlite-provider.ts | SQLite FTS5 implementation |
packages/core/src/vault/vault-schema.ts | FTS5 table schema |
packages/core/src/embeddings/pipeline.ts | Batch embedding pipeline |
Next: Security & Privacy - understand where your data lives and who can access it.