How the Search Engine Works
This blog features a sophisticated semantic search system that goes beyond simple keyword matching. The search engine uses vector embeddings, intelligent ranking, and advanced query parsing to help readers find relevant content quickly and accurately.
This is part 8 of the Blog Architecture Deep Dive series. Start with Blog Architecture - Overview if you haven't read it yet.
Overview
The search system combines multiple techniques to provide accurate, fast search results:
- Vector Embeddings: Converts content into mathematical vectors for semantic similarity
- Intelligent Ranking: Boosts results based on where matches occur (title, excerpt, category)
- Query Parsing: Supports quoted phrases, excluded words, and boolean operators
- Real-time Search: Debounced search with keyboard navigation
- Result Highlighting: Visual feedback showing where matches were found
Architecture
Build-Time Embedding Generation
Search embeddings are generated during the build process (`npm run generate-embeddings`). This ensures fast search performance at runtime—no API calls needed for each search.
The embedding generation process:
1. Content Extraction: Reads all published markdown files
2. Searchable Text Creation: Combines title, excerpt, and categories into searchable text
3. Vector Generation: Creates vector representations using one of two methods:
   - Simple Feature Vectors (default): Zero-cost, text-based vectors using word frequency
   - OpenAI Embeddings (optional): Semantic embeddings using OpenAI's API for better understanding
4. Caching: Saves embeddings to `lib/embeddings.json` for fast loading
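As a rough illustration of the default method, a word-frequency feature vector can be built from a shared vocabulary. This is a hypothetical sketch, not the actual code in `scripts/generate-embeddings.js`; the function names and tokenization are illustrative assumptions:

```typescript
// Build a shared vocabulary (feature list) from all documents' searchable text.
function buildFeatures(texts: string[]): string[] {
  const vocab = new Set<string>();
  for (const text of texts) {
    for (const word of text.toLowerCase().match(/[a-z]+/g) ?? []) {
      vocab.add(word);
    }
  }
  return [...vocab].sort();
}

// Represent one document as word counts over the shared feature list.
function toVector(text: string, features: string[]): number[] {
  const counts = new Map<string, number>();
  for (const word of text.toLowerCase().match(/[a-z]+/g) ?? []) {
    counts.set(word, (counts.get(word) ?? 0) + 1);
  }
  return features.map((f) => counts.get(f) ?? 0);
}
```

Because every document is projected onto the same feature list, the resulting vectors are directly comparable with cosine similarity at query time.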
Embedding Data Structure
Each embedding contains:
{
slug: string; // URL slug for the post
title: string; // Post title
excerpt: string; // Post excerpt
categories: string[]; // Post categories
searchableText: string; // Combined searchable text
embedding?: number[]; // OpenAI embedding vector (if available)
vector?: number[]; // Simple feature vector
features?: string[]; // Feature names for simple vectors
}
Search Process
1. Query Parsing
The search engine parses queries to extract:
- Quoted Phrases: Exact phrase matches using `"quantum computing"`
- Search Words: Individual words (stop words filtered out)
- Excluded Words: Words prefixed with `-` to exclude results
Example queries:
- `"quantum computing"` - Finds the exact phrase
- `fusion -nuclear` - Finds fusion but excludes nuclear
- `planetary science mars` - Finds posts matching all words
Interactive Demo: Query Parsing
Type different queries to see how they're parsed. Try:
- `"exact phrase"` - Creates a phrase match
- `word1 word2 -exclude` - Extracts words and exclusions
- `"multiple phrases" and words -excluded` - Combines all features
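The parsing behavior described above can be sketched as follows. This is an illustrative sketch, not the actual parser in `lib/vector-search.ts` (stop-word filtering is omitted here for brevity):

```typescript
interface ParsedQuery {
  phrases: string[];   // quoted phrases for exact matching
  words: string[];     // individual search words
  excluded: string[];  // words prefixed with "-"
}

function parseQuery(query: string): ParsedQuery {
  const phrases: string[] = [];
  // Pull out quoted phrases first so their words aren't re-tokenized.
  const rest = query.replace(/"([^"]+)"/g, (_match, phrase: string) => {
    phrases.push(phrase.toLowerCase());
    return " ";
  });
  const words: string[] = [];
  const excluded: string[] = [];
  for (const token of rest.toLowerCase().split(/\s+/).filter(Boolean)) {
    if (token.startsWith("-") && token.length > 1) excluded.push(token.slice(1));
    else words.push(token);
  }
  return { phrases, words, excluded };
}
```

For example, `parseQuery('"quantum computing" fusion -nuclear')` yields one phrase (`quantum computing`), one word (`fusion`), and one exclusion (`nuclear`).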
2. Vector Similarity Calculation
For each post, the system calculates similarity using cosine similarity:
cosineSimilarity(queryVector, postVector) =
dotProduct(queryVector, postVector) /
(norm(queryVector) * norm(postVector))
This measures the angle between vectors—smaller angles mean more similar content.
Interactive Demo: Cosine Similarity
Try changing the query and document text to see how the similarity score changes. The visualization shows:
- Word frequency vectors for both query and document
- The calculated cosine similarity score
- The angle between vectors (smaller angle = higher similarity)
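The cosine similarity formula above translates directly into code. A minimal implementation (with a guard for zero vectors, which the formula leaves undefined):

```typescript
// Cosine similarity between two equal-length vectors:
// dot(a, b) / (norm(a) * norm(b))
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vector: no similarity
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Identical directions score 1, orthogonal vectors score 0, so the result is a natural relevance score before any boosting is applied.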
3. Intelligent Ranking
The base similarity score is boosted based on where matches occur:
- Title Match: +0.5 for exact title match, +0.3 for partial
- Exact Phrase Match: +0.4 in title, +0.3 in excerpt, +0.2 in content
- Word Matches: +0.2 per word in title, +0.15 in categories
- All Words Match: +0.2 bonus when all query words are found
This ensures that posts with matches in titles or categories rank higher than those with matches only in content.
Interactive Demo: Ranking & Boosting
Try different search queries to see how documents are re-ranked based on:
- Where matches occur (title vs excerpt vs category)
- How many query words match
- The base similarity score
Documents with matches in titles rank higher, even if their base similarity score is lower.
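The boost constants come from the list above; the surrounding code is an illustrative sketch using a simplified `Post` shape, not the actual ranking code in `lib/vector-search.ts`:

```typescript
interface Post {
  title: string;
  excerpt: string;
  categories: string[];
}

function boostedScore(baseScore: number, post: Post, words: string[]): number {
  let score = baseScore;
  const title = post.title.toLowerCase();
  const excerpt = post.excerpt.toLowerCase();
  const query = words.join(" ");

  if (title === query) score += 0.5;            // exact title match
  else if (title.includes(query)) score += 0.3; // partial title match

  let matched = 0;
  for (const word of words) {
    if (title.includes(word)) score += 0.2;     // per-word title boost
    if (post.categories.some((c) => c.toLowerCase().includes(word))) score += 0.15;
    if (title.includes(word) || excerpt.includes(word)) matched++;
  }
  if (matched === words.length) score += 0.2;   // all query words found
  return score;
}
```

With this scheme, a post titled "Quantum Computing Basics" easily outranks a post that merely mentions the query words in its body, even if the body post has a higher base similarity.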
4. Filtering
Results can be filtered by:
- Category: Filter to specific categories
- Minimum Score: Set a relevance threshold
- Excluded Words: Automatically filter out posts containing excluded terms
Search API
The search API (`/api/search`) accepts:
POST /api/search
{
query: string; // Search query
category?: string; // Optional category filter
limit?: number; // Max results (default: 5)
minScore?: number; // Minimum relevance score
}
Returns:
{
results: Array<{
slug: string;
title: string;
excerpt: string;
score: number;
matchType: 'title' | 'excerpt' | 'category' | 'content';
}>;
query: string;
totalResults: number;
}
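A client call against this endpoint might look like the following. The helper name and return shape are illustrative, not part of the blog's actual code:

```typescript
interface SearchRequest {
  query: string;
  category?: string;
  limit?: number;
  minScore?: number;
}

// Build the fetch options for a POST to /api/search.
function buildSearchRequest(body: SearchRequest): {
  method: string;
  headers: Record<string, string>;
  body: string;
} {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  };
}

// Usage (e.g. in a client component):
// const res = await fetch("/api/search", buildSearchRequest({ query: "fusion -nuclear", limit: 5 }));
// const { results, totalResults } = await res.json();
```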
User Interface Features
Real-Time Search
The search input uses debouncing (300ms) to avoid excessive API calls while typing. Results update automatically as you type.
Keyboard Navigation
- Arrow Down/Up: Navigate through results
- Enter: Open selected result
- Escape: Close search and clear query
Visual Feedback
- Highlighting: Search terms are highlighted in results using `<mark>` tags
- Match Type Indicators: Shows whether the match occurred in the title, excerpt, or category
- Selected State: A visual outline marks the currently selected result
Search Tips
When no results are found, the UI displays helpful tips:
- Using quotes for exact phrases
- Excluding words with minus
- Trying broader search terms
Embedding Options
Simple Feature Vectors (Default)
Pros:
- Zero cost (no API calls)
- Works offline
- Fast generation
- Good for keyword-based searches
Cons:
- Less semantic understanding
- May miss synonyms
- Limited context awareness
OpenAI Embeddings (Optional)
Pros:
- Better semantic understanding
- Handles synonyms and related concepts
- More context-aware
- Better for natural language queries
Cons:
- Requires API key
- Has costs (though minimal with `text-embedding-3-small`)
- Requires network access during build
To enable OpenAI embeddings, set the `OPENAI_API_KEY` environment variable during the build.
Performance Optimizations
- Build-Time Generation: Embeddings created once at build time, not per-request
- Cached Loading: Embeddings loaded from JSON file (fast file I/O)
- Debounced Queries: Reduces API calls while typing
- Limited Results: Default limit of 5-8 results keeps response small
- Early Filtering: Filters applied before expensive similarity calculations
Example Searches
Exact Phrase
"quantum computing"
Finds posts containing the exact phrase "quantum computing".
Excluded Terms
fusion -nuclear
Finds posts about fusion but excludes those mentioning nuclear.
Category Search
planetary science
Finds posts matching "planetary science" with higher ranking for posts in that category.
Natural Language
articles about 3D graphics
Uses semantic understanding to find posts about 3D graphics, even if they don't contain those exact words.
Future Enhancements
Potential improvements for the search system:
- Full-Text Search: Search within full article content, not just metadata
- Fuzzy Matching: Handle typos and variations
- Search History: Remember recent searches
- Search Analytics: Track popular searches
- Autocomplete/Suggestions: Show suggestions as user types
- Date Range Filtering: Filter by publication date
- Series Filtering: Filter to specific series
- Advanced Boolean Operators: Support AND, OR, NOT operators
Technical Implementation
Key Files
- `lib/vector-search.ts`: Core search logic with ranking and query parsing
- `app/api/search/route.ts`: Search API endpoint
- `components/core/HeaderSearch.tsx`: Search UI component
- `scripts/generate-embeddings.js`: Build-time embedding generation
Search Flow
User types query
↓
Debounce (300ms)
↓
Parse query (phrases, words, exclusions)
↓
Generate query vector
↓
Calculate cosine similarity for all posts
↓
Apply boosting (title, category, phrases)
↓
Filter by category/exclusions
↓
Sort by score
↓
Return top N results
↓
Highlight matches in UI
Conclusion
The search engine combines vector similarity, intelligent ranking, and advanced query parsing to provide fast, accurate search results. By generating embeddings at build time, the system achieves excellent performance without requiring external APIs at runtime (though OpenAI embeddings are available as an optional enhancement).
The system is designed to be:
- Fast: Cached embeddings, debounced queries, limited results
- Accurate: Semantic understanding with intelligent ranking
- User-Friendly: Keyboard navigation, highlighting, helpful tips
- Flexible: Supports multiple embedding methods and query types
For more details on the blog architecture, see the other articles in the Blog Architecture Deep Dive - Series Index series.