
Query Router

When you ask a question, the Query Router decides the best way to answer it. It classifies your question, routes it to the appropriate memory systems (semantic, graph, or both), and merges the results into a comprehensive answer with citations.

How Query Routing Works

Step 1: Query Decomposition

Complex questions are decomposed into focused parallel sub-queries:

Simple questions → Single internal query, no decomposition

  • "What was discussed about auth?"
  • "Who decided to use JWT?"

Complex questions → 2-4 internal + 0-2 external sub-queries

  • "What auth method did we decide on and how does it compare to best practices?"
    • Internal: "authentication decision JWT"
    • Internal: "OAuth implementation alice"
    • External: "JWT vs OAuth best practices 2025"

Decomposition enables parallel execution across all sub-queries.
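
As a rough illustration of the fan-out described above, the sketch below models a decomposed question and runs every sub-query concurrently. `QueryPlan`, `run_sub_query`, and `run_plan` are hypothetical names, not part of Beever Atlas, and the retrieval call is a stub that only echoes its input.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class QueryPlan:
    """Hypothetical container for the decomposer's output."""
    internal: list[str]
    external: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Limits from the text: a simple question is one internal query,
        # a complex one has 2-4 internal and 0-2 external sub-queries.
        if not 1 <= len(self.internal) <= 4:
            raise ValueError("expected 1-4 internal sub-queries")
        if len(self.external) > 2:
            raise ValueError("expected at most 2 external sub-queries")


async def run_sub_query(kind: str, text: str) -> dict:
    # Stand-in for a real retrieval call; it only echoes its input.
    await asyncio.sleep(0)
    return {"kind": kind, "query": text}


async def run_plan(plan: QueryPlan) -> list[dict]:
    # Fan out every sub-query at once; gather keeps the input order.
    tasks = [run_sub_query("internal", q) for q in plan.internal]
    tasks += [run_sub_query("external", q) for q in plan.external]
    return list(await asyncio.gather(*tasks))
```

Calling `asyncio.run(run_plan(plan))` with the three sub-queries from the example above returns three results, in the order the sub-queries were listed.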

Step 2: Query Understanding

An LLM classifies each sub-query for routing:

Classification:

  • route: semantic | graph | both
  • semantic_depth: overview | topic | detail
  • entities: Named entities mentioned
  • topics: Topic areas referenced
  • temporal_scope: recent | any | historical
  • confidence: 0.0-1.0

Cost: ~$0.001 per query (Gemini Flash Lite)
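
One way to picture the classification is as a small validated record. The sketch below assumes the classifier's reply has already been decoded from JSON into a dict. The field names and allowed values come from the list above, while `Classification` and `parse_classification` are illustrative names rather than the project's actual types.

```python
from dataclasses import dataclass

ROUTES = {"semantic", "graph", "both"}
DEPTHS = {"overview", "topic", "detail"}
SCOPES = {"recent", "any", "historical"}


@dataclass(frozen=True)
class Classification:
    route: str
    semantic_depth: str
    temporal_scope: str
    confidence: float
    entities: tuple[str, ...] = ()
    topics: tuple[str, ...] = ()


def parse_classification(raw: dict) -> Classification:
    """Validate a classifier response that was already decoded from JSON."""
    result = Classification(
        route=raw["route"],
        semantic_depth=raw["semantic_depth"],
        temporal_scope=raw["temporal_scope"],
        confidence=float(raw["confidence"]),
        entities=tuple(raw.get("entities", [])),
        topics=tuple(raw.get("topics", [])),
    )
    # Reject anything outside the documented value sets.
    if result.route not in ROUTES:
        raise ValueError(f"unknown route: {result.route!r}")
    if result.semantic_depth not in DEPTHS:
        raise ValueError(f"unknown semantic_depth: {result.semantic_depth!r}")
    if result.temporal_scope not in SCOPES:
        raise ValueError(f"unknown temporal_scope: {result.temporal_scope!r}")
    if not 0.0 <= result.confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    return result
```

Validating at this boundary means a malformed LLM reply fails loudly before it can influence routing.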

Step 3: Routing Strategy

The router selects the optimal strategy based on classification:

Semantic-Only Queries

Route when: Looking for facts, discussions, topics, documents

Examples:

  • "What was discussed about auth?" → Semantic search
  • "Find deployment docs" → Cross-modal search (find PDFs)
  • "Show me the overview" → Tier 0 cached summary (FREE)
  • "Tell me about deployment" → Tier 1 topic cluster (FREE)

How it works:

  • Route to Weaviate based on semantic_depth
  • overview → Tier 0 summary
  • topic → Tier 1 clusters → Tier 2 atomics
  • detail → Tier 2 atomics directly
  • Hybrid BM25+vector search for optimal relevance

Cost: $0.001 (FREE for Tier 0/1)

Latency: < 200ms (FREE reads: < 50ms)
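
The depth-to-tier mapping above can be sketched as a lookup table. The tier labels and the `plan_tiers` helper are made up for illustration; only the mapping itself comes from the list above.

```python
# Ordered tiers to consult for each semantic_depth value.
TIER_PLAN = {
    "overview": ["tier0_summary"],
    "topic": ["tier1_clusters", "tier2_atomics"],
    "detail": ["tier2_atomics"],
}


def plan_tiers(semantic_depth: str) -> list[str]:
    """Return the tiers to read, in order, for a classified sub-query."""
    try:
        # Copy the list so callers cannot mutate the shared table.
        return list(TIER_PLAN[semantic_depth])
    except KeyError:
        raise ValueError(f"unknown semantic_depth: {semantic_depth!r}") from None
```

For example, `plan_tiers("topic")` yields the Tier 1 clusters first and the Tier 2 atomics second.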

Graph-Only Queries

Route when: Looking for entity relationships, people, decisions, temporal changes

Examples:

  • "Who decided to use JWT?" → Person → Decision traversal
  • "What is Alice working on?" → Person → Project traversal
  • "How did the auth approach evolve?" → Decision temporal chain
  • "What blocks the migration?" → Project → BLOCKED_BY traversal

How it works:

  1. Resolve entities from query to Neo4j nodes
  2. Graph traversal (1-2 hops)
  3. Follow episodic edges → Weaviate IDs
  4. Fetch full memories from Weaviate
  5. Combine graph structure + memory content

Cost: $0.005

Latency: ~500ms
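
To make the traversal steps concrete, here is a minimal sketch that uses an in-memory dictionary in place of Neo4j. The edge data, memory IDs, and function names are all invented for illustration. A real deployment would presumably issue graph queries instead.

```python
from collections import deque

# Toy graph standing in for Neo4j: node -> [(relationship, neighbor)].
EDGES = {
    "JWT": [("DECIDED_BY", "Alice"),
            ("BLOCKED_BY", "missing refresh token rotation")],
    "Alice": [("WORKS_ON", "auth migration")],
}

# Toy episodic edges: node -> IDs of memories stored in Weaviate.
EPISODIC = {
    "JWT": ["mem-101"],
    "Alice": ["mem-101", "mem-207"],
}


def traverse(start: str, max_hops: int = 2) -> dict[str, int]:
    """Breadth-first walk; returns each reached node with its hop count."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand past the hop limit
        for _relationship, neighbor in EDGES.get(node, []):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops


def memory_ids(nodes) -> list[str]:
    """Collect memory IDs for the given nodes, first occurrence wins."""
    ordered: list[str] = []
    for node in nodes:
        for memory_id in EPISODIC.get(node, []):
            if memory_id not in ordered:
                ordered.append(memory_id)
    return ordered
```

The IDs returned by `memory_ids(traverse("JWT"))` are what step 4 would then fetch from Weaviate.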

Both (Parallel) Queries

Route when: Could benefit from both fact retrieval AND relationship context

Examples:

  • "Tell me about the JWT migration" → Needs facts + decisions + people
  • "What happened with auth last week?" → Temporal + factual
  • Low-confidence classifications → Use both for coverage

How it works:

  1. Execute semantic search and graph traversal in parallel
  2. Merge results:
    • Deduplicate by weaviate_id
    • Boost cross-validated results (mentioned in both)
    • Apply temporal decay
    • Quality-score weighted ranking
  3. Generate response from merged context

Cost: $0.006

Latency: ~500ms (parallel execution)
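
The rule that low-confidence classifications fall back to both systems might look like the sketch below. The `backends_for` helper and the 0.5 threshold are assumptions, since the page does not state the actual cutoff.

```python
def backends_for(route: str, confidence: float,
                 min_confidence: float = 0.5) -> list[str]:
    """Return the memory systems to query for one classified sub-query.

    min_confidence is an illustrative threshold, not a documented value.
    """
    if route not in {"semantic", "graph", "both"}:
        raise ValueError(f"unknown route: {route!r}")
    # An uncertain classification is widened to both systems for coverage.
    if route == "both" or confidence < min_confidence:
        return ["semantic", "graph"]
    return [route]
```

With this rule, a `graph` classification at confidence 0.8 queries only the graph, while a `semantic` classification at 0.3 queries both.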

Routing Decision Table

| Query Pattern | Route | Why | Cost | Latency |
| --- | --- | --- | --- | --- |
| "What was discussed about auth?" | Semantic | Factual lookup | $0.001 | < 200ms |
| "Show me the overview" | Semantic (Tier 0) | Cached summary | FREE | < 50ms |
| "Tell me about deployment" | Semantic (Tier 1) | Topic cluster | FREE | < 50ms |
| "Find the architecture diagram" | Semantic (cross-modal) | Image search | $0.001 | < 200ms |
| "Who decided to use JWT?" | Graph | Person→Decision traversal | $0.005 | ~500ms |
| "What is Alice working on?" | Graph | Person→Project traversal | $0.005 | ~500ms |
| "How did the auth approach evolve?" | Graph (temporal) | Decision→SUPERSEDES chain | $0.005 | ~500ms |
| "What blocks the migration?" | Graph | Project→BLOCKED_BY traversal | $0.005 | ~500ms |
| "Tell me about the JWT migration" | Both (parallel) | Facts + relationships | $0.006 | ~500ms |
| "What happened with auth last week?" | Both (parallel) | Temporal + factual | $0.006 | ~500ms |

Step 4: Result Merging

When both memory systems are used, results are merged:

Deduplication

  • Remove duplicates by weaviate_id
  • Same fact may appear in both systems

Cross-Validation Boost

  • Facts mentioned in both systems score higher
  • Indicates strong relevance and confidence

Temporal Decay

  • Older facts gradually score lower
  • Exemptions for high-importance facts
  • Slower decay for decisions/architecture

Quality-Weighted Ranking

  • Higher-quality facts score higher
  • Quality score assigned at extraction
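
The four merge rules above can be combined into a single score. The sketch below is one plausible way to do it, not the actual Beever Atlas formula: the multiplicative scoring, the 1.5x boost, the 90-day half-life, and the 4x slower decay for decisions and architecture are all assumed values, and `Hit`, `decay`, and `merge` are hypothetical names.

```python
from dataclasses import dataclass

SLOW_DECAY_KINDS = {"decision", "architecture"}


@dataclass
class Hit:
    weaviate_id: str
    relevance: float      # retrieval score from the originating system
    quality: float        # quality score assigned at extraction
    age_days: float
    source: str           # "semantic" or "graph"
    kind: str = "fact"
    exempt: bool = False  # high-importance facts skip decay


def decay(hit: Hit, half_life_days: float = 90.0) -> float:
    """Exponential decay; the half-life values are illustrative."""
    if hit.exempt:
        return 1.0
    factor = 4.0 if hit.kind in SLOW_DECAY_KINDS else 1.0
    return 0.5 ** (hit.age_days / (half_life_days * factor))


def merge(hits: list[Hit], boost: float = 1.5) -> list[tuple[str, float]]:
    """Deduplicate by weaviate_id, score, and rank best first."""
    best: dict[str, Hit] = {}
    sources: dict[str, set[str]] = {}
    for hit in hits:
        sources.setdefault(hit.weaviate_id, set()).add(hit.source)
        current = best.get(hit.weaviate_id)
        if current is None or hit.relevance > current.relevance:
            best[hit.weaviate_id] = hit
    ranked = []
    for weaviate_id, hit in best.items():
        score = hit.relevance * hit.quality * decay(hit)
        if len(sources[weaviate_id]) > 1:
            score *= boost  # found by both systems
        ranked.append((weaviate_id, score))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)
```

A multiplicative score keeps each factor independent, so a fact found by both systems can still rank low if it is old and of poor quality.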

Step 5: Response Generation

The final response is generated with:

Citations

  • Every fact linked to source message
  • Slack/Discord/Teams message URL
  • Timestamp and author

Confidence Scoring

  • Response confidence based on:
    • Query understanding confidence
    • Result relevance scores
    • Cross-validation presence

Fallback Handling

  • No results: Suggest query refinement
  • Low confidence: Indicate uncertainty
  • Graph timeout: Fall back to semantic-only
  • External search failure: Return internal-only results
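
The graph-timeout and no-results fallbacks could be sketched with `asyncio.wait_for`, as below. The retriever callables, the 2-second default timeout, and the note strings are assumptions made for this example.

```python
import asyncio
from typing import Awaitable, Callable

# A retriever is any zero-argument callable returning an awaitable of IDs.
Retriever = Callable[[], Awaitable[list[str]]]


async def retrieve_with_fallback(semantic: Retriever, graph: Retriever,
                                 graph_timeout: float = 2.0) -> dict:
    """Run both retrievers concurrently; drop the graph leg on timeout."""
    notes: list[str] = []
    # Start the semantic search first so it runs while we await the graph.
    semantic_task = asyncio.ensure_future(semantic())
    try:
        graph_hits = await asyncio.wait_for(graph(), timeout=graph_timeout)
    except asyncio.TimeoutError:
        graph_hits = []
        notes.append("graph timeout: fell back to semantic-only")
    semantic_hits = await semantic_task
    hits = semantic_hits + graph_hits
    if not hits:
        notes.append("no results: suggest refining the query")
    return {"hits": hits, "notes": notes}
```

The returned `notes` give the response generator a place to surface the fallback to the user rather than failing silently.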

External Search Integration

For questions requiring web knowledge (best practices, documentation, industry comparisons), the router adds external sub-queries:

Examples:

  • "How does our JWT implementation compare to OWASP guidelines?"
  • "What are the best practices for token rotation?"
  • "Show me Django REST framework documentation"

How it works:

  1. Query decomposer generates external_queries
  2. Execute via Tavily API in parallel with internal queries
  3. Merge external results with internal knowledge
  4. Generate response comparing internal vs external

Cost: 1 Tavily credit per external query (1,000 free/month)

Config:

TAVILY_API_KEY=required
ENABLE_EXTERNAL_SEARCH=true
TAVILY_SEARCH_DEPTH=basic  # or advanced (2 credits)
TAVILY_MAX_RESULTS=5
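
A loader for these settings might look like the sketch below. The variable names match the config above, but `ExternalSearchConfig`, `load_config`, `credits_per_query`, and the fallback defaults are assumptions. The application's real loader may behave differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ExternalSearchConfig:
    api_key: str
    enabled: bool
    search_depth: str
    max_results: int


def load_config(env: Optional[Mapping[str, str]] = None) -> ExternalSearchConfig:
    """Read external-search settings; defaults here are illustrative."""
    env = os.environ if env is None else env
    enabled = env.get("ENABLE_EXTERNAL_SEARCH", "false").strip().lower() == "true"
    api_key = env.get("TAVILY_API_KEY", "")
    if enabled and not api_key:
        raise ValueError("TAVILY_API_KEY is required when external search is enabled")
    depth = env.get("TAVILY_SEARCH_DEPTH", "basic").strip().lower()
    if depth not in {"basic", "advanced"}:
        raise ValueError(f"unsupported TAVILY_SEARCH_DEPTH: {depth!r}")
    max_results = int(env.get("TAVILY_MAX_RESULTS", "5"))
    return ExternalSearchConfig(api_key, enabled, depth, max_results)


def credits_per_query(config: ExternalSearchConfig) -> int:
    # Per the config comment above: basic costs 1 credit, advanced costs 2.
    return 2 if config.search_depth == "advanced" else 1
```

Passing a plain dict instead of relying on `os.environ` keeps the loader easy to test.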

Query Examples

Example 1: Simple Semantic Query

Question: "What was discussed about authentication?"

  1. Decomposition: Single internal query (simple)
  2. Understanding: route=semantic, semantic_depth=topic, confidence=0.9
  3. Routing: Tier 1 topic clusters → Tier 2 atomics
  4. Retrieval: 15 facts from "authentication" cluster
  5. Response: "The team discussed JWT vs OAuth, decided on JWT with RS256, Alice is implementing next sprint..."

Example 2: Graph Traversal Query

Question: "Who decided to use JWT and what blocks the implementation?"

  1. Decomposition: Single internal query (simple)
  2. Understanding: route=graph, entities=["JWT"], confidence=0.8
  3. Routing: Graph traversal
  4. Traversal:
    • "JWT" decision node
    • Follow DECIDED_BY → Person (Alice)
    • Follow BLOCKED_BY → Constraint (missing refresh token rotation)
  5. Enrichment: Fetch decision text from Weaviate
  6. Response: "Alice decided on JWT (citation). The implementation is blocked by missing refresh token rotation (citation)."

Example 3: Complex Parallel Query

Question: "Tell me about the authentication migration and how it compares to best practices"

  1. Decomposition:
    • Internal: "authentication migration JWT"
    • Internal: "OAuth implementation alice"
    • External: "JWT vs OAuth best practices 2025"
  2. Understanding: route=both, confidence=0.7
  3. Routing: Parallel execution
    • Semantic: Find migration facts
    • Graph: Traverse decision relationships
    • External: Search Tavily for best practices
  4. Merge: Combine internal knowledge + external comparison
  5. Response: "Our migration from OAuth to JWT (internal)... Compared to OWASP guidelines, we're following best practices for X but should improve Y (external)..."

Cost Optimization

The router is designed for cost efficiency:

| Strategy | Cost | When Used |
| --- | --- | --- |
| Tier 0 cached | FREE | Overview queries |
| Tier 1 cached | FREE | Topic queries |
| Tier 2 search | $0.001 | Specific facts |
| Graph traversal | $0.005 | Relationship queries |
| Both parallel | $0.006 | Complex queries |
| LLM synthesis | $0.02 | Response generation |
| Average | ~$0.01 | Typical query |

80% of queries are FREE or cost at most $0.001 in retrieval (served from Tier 0/1 or Tier 2 alone).
