How the Search Engine Works
This blog features a sophisticated semantic search system that goes beyond simple keyword matching. The search engine uses vector embeddings, intelligent ranking, and advanced query parsing to help readers find relevant content quickly and accurately.
This is part 8 of the Blog Architecture Deep Dive series. Start with Blog Architecture - Overview if you haven't read it yet.
Overview
The search system combines multiple techniques to provide accurate, fast search results:
- Vector Embeddings: Converts content into mathematical vectors for semantic similarity
- Intelligent Ranking: Boosts results based on where matches occur (title, excerpt, category)
- Query Parsing: Supports quoted phrases, excluded words, and boolean operators
- Real-time Search: Debounced search with keyboard navigation
- Result Highlighting: Visual feedback showing where matches were found
Architecture
Build-Time Embedding Generation
Search embeddings are generated during the build process (`npm run generate-embeddings`). This ensures fast search performance at runtime—no API calls needed for each search.
The embedding generation process:
1. Content Extraction: Reads all published markdown files
2. Searchable Text Creation: Combines title, excerpt, and categories into searchable text
3. Vector Generation: Creates vector representations using one of two methods:
   - Simple Feature Vectors (default): Zero-cost, text-based vectors using word frequency
   - OpenAI Embeddings (optional): Semantic embeddings using OpenAI's API for better understanding
4. Caching: Saves embeddings to `lib/embeddings.json` for fast loading
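As a rough illustration of the default method, a word-frequency feature vector can be built from a shared vocabulary. This is a hypothetical sketch, not the actual code in `scripts/generate-embeddings.js`; the function names and tokenization are illustrative assumptions:

```typescript
// Build a shared vocabulary (feature list) from all documents' searchable text.
function buildFeatures(texts: string[]): string[] {
  const vocab = new Set<string>();
  for (const text of texts) {
    for (const word of text.toLowerCase().match(/[a-z]+/g) ?? []) {
      vocab.add(word);
    }
  }
  return [...vocab].sort();
}

// Represent one document as word counts over the shared feature list.
function toVector(text: string, features: string[]): number[] {
  const counts = new Map<string, number>();
  for (const word of text.toLowerCase().match(/[a-z]+/g) ?? []) {
    counts.set(word, (counts.get(word) ?? 0) + 1);
  }
  return features.map((f) => counts.get(f) ?? 0);
}
```

Because every document is projected onto the same feature list, the resulting vectors are directly comparable with cosine similarity at query time.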
Embedding Data Structure
Each embedding contains:
{
slug: string; // URL slug for the post
title: string; // Post title
excerpt: string; // Post excerpt
categories: string[]; // Post categories
searchableText: string; // Combined searchable text
embedding?: number[]; // OpenAI embedding vector (if available)
vector?: number[]; // Simple feature vector
features?: string[]; // Feature names for simple vectors
}
Search Process
1. Query Parsing
The search engine parses queries to extract:
- Quoted Phrases: Exact phrase matches using `"quantum computing"`
- Search Words: Individual words (stop words filtered out)
- Excluded Words: Words prefixed with `-` to exclude results
Example queries:
- `"quantum computing"` - Finds the exact phrase
- `fusion -nuclear` - Finds fusion but excludes nuclear
- `planetary science mars` - Finds posts matching all words
Interactive Demo: Query Parsing
Type different queries to see how they're parsed. Try:
- `"exact phrase"` - Creates a phrase match
- `word1 word2 -exclude` - Extracts words and exclusions
- `"multiple phrases" and words -excluded` - Combines all features
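The parsing behavior described above can be sketched as follows. This is an illustrative sketch, not the actual parser in `lib/vector-search.ts` (stop-word filtering is omitted here for brevity):

```typescript
interface ParsedQuery {
  phrases: string[];   // quoted phrases for exact matching
  words: string[];     // individual search words
  excluded: string[];  // words prefixed with "-"
}

function parseQuery(query: string): ParsedQuery {
  const phrases: string[] = [];
  // Pull out quoted phrases first so their words aren't re-tokenized.
  const rest = query.replace(/"([^"]+)"/g, (_match, phrase: string) => {
    phrases.push(phrase.toLowerCase());
    return " ";
  });
  const words: string[] = [];
  const excluded: string[] = [];
  for (const token of rest.toLowerCase().split(/\s+/).filter(Boolean)) {
    if (token.startsWith("-") && token.length > 1) excluded.push(token.slice(1));
    else words.push(token);
  }
  return { phrases, words, excluded };
}
```

For example, `parseQuery('"quantum computing" fusion -nuclear')` yields one phrase (`quantum computing`), one word (`fusion`), and one exclusion (`nuclear`).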
2. Vector Similarity Calculation
For each post, the system calculates similarity using cosine similarity:
cosineSimilarity(queryVector, postVector) =
dotProduct(queryVector, postVector) /
(norm(queryVector) * norm(postVector))
This measures the angle between vectors—smaller angles mean more similar content.
Interactive Demo: Cosine Similarity
Try changing the query and document text to see how the similarity score changes. The visualization shows:
- Word frequency vectors for both query and document
- The calculated cosine similarity score
- The angle between vectors (smaller angle = higher similarity)
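The cosine similarity formula above translates directly into code. A minimal implementation (with a guard for zero vectors, which the formula leaves undefined):

```typescript
// Cosine similarity between two equal-length vectors:
// dot(a, b) / (norm(a) * norm(b))
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vector: no similarity
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Identical directions score 1, orthogonal vectors score 0, so the result is a natural relevance score before any boosting is applied.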
3. Intelligent Ranking
The base similarity score is boosted based on where matches occur:
- Title Match: +0.5 for exact title match, +0.3 for partial
- Exact Phrase Match: +0.4 in title, +0.3 in excerpt, +0.2 in content
- Word Matches: +0.2 per word in title, +0.15 in categories
- All Words Match: +0.2 bonus when all query words are found
This ensures that posts with matches in titles or categories rank higher than those with matches only in content.
Interactive Demo: Ranking & Boosting
Try different search queries to see how documents are re-ranked based on:
- Where matches occur (title vs excerpt vs category)
- How many query words match
- The base similarity score
Documents with matches in titles rank higher, even if their base similarity score is lower.
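The boost constants come from the list above; the surrounding code is an illustrative sketch using a simplified `Post` shape, not the actual ranking code in `lib/vector-search.ts`:

```typescript
interface Post {
  title: string;
  excerpt: string;
  categories: string[];
}

function boostedScore(baseScore: number, post: Post, words: string[]): number {
  let score = baseScore;
  const title = post.title.toLowerCase();
  const excerpt = post.excerpt.toLowerCase();
  const query = words.join(" ");

  if (title === query) score += 0.5;            // exact title match
  else if (title.includes(query)) score += 0.3; // partial title match

  let matched = 0;
  for (const word of words) {
    if (title.includes(word)) score += 0.2;     // per-word title boost
    if (post.categories.some((c) => c.toLowerCase().includes(word))) score += 0.15;
    if (title.includes(word) || excerpt.includes(word)) matched++;
  }
  if (matched === words.length) score += 0.2;   // all query words found
  return score;
}
```

With this scheme, a post titled "Quantum Computing Basics" easily outranks a post that merely mentions the query words in its body, even if the body post has a higher base similarity.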
4. Filtering
Results can be filtered by:
- Category: Filter to specific categories
- Minimum Score: Set a relevance threshold
- Excluded Words: Automatically filter out posts containing excluded terms
Search API
The search API (`/api/search`) accepts:
POST /api/search
{
query: string; // Search query
category?: string; // Optional category filter
limit?: number; // Max results (default: 5)
minScore?: number; // Minimum relevance score
}
Returns:
{
results: Array<{
slug: string;
title: string;
excerpt: string;
score: number;
matchType: 'title' | 'excerpt' | 'category' | 'content';
}>;
query: string;
totalResults: number;
}
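A client call against this endpoint might look like the following. The helper name and return shape are illustrative, not part of the blog's actual code:

```typescript
interface SearchRequest {
  query: string;
  category?: string;
  limit?: number;
  minScore?: number;
}

// Build the fetch options for a POST to /api/search.
function buildSearchRequest(body: SearchRequest): {
  method: string;
  headers: Record<string, string>;
  body: string;
} {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  };
}

// Usage (e.g. in a client component):
// const res = await fetch("/api/search", buildSearchRequest({ query: "fusion -nuclear", limit: 5 }));
// const { results, totalResults } = await res.json();
```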
User Interface Features
Real-Time Search
The search input uses debouncing (300ms) to avoid excessive API calls while typing. Results update automatically as you type.
Keyboard Navigation
- Arrow Down/Up: Navigate through results
- Enter: Open selected result
- Escape: Close search and clear query
Visual Feedback
- Highlighting: Search terms are highlighted in results using `<mark>` tags
- Match Type Indicators: Shows whether the match occurred in the title, excerpt, or category
- Selected State: A visual outline marks the currently selected result
Search Tips
When no results are found, the UI displays helpful tips:
- Using quotes for exact phrases
- Excluding words with minus
- Trying broader search terms
Embedding Options
Simple Feature Vectors (Default)
Pros:
- Zero cost (no API calls)
- Works offline
- Fast generation
- Good for keyword-based searches
Cons:
- Less semantic understanding
- May miss synonyms
- Limited context awareness
OpenAI Embeddings (Optional)
Pros:
- Better semantic understanding
- Handles synonyms and related concepts
- More context-aware
- Better for natural language queries
Cons:
- Requires API key
- Has costs (though minimal with `text-embedding-3-small`)
- Requires network access during build
To enable OpenAI embeddings, set the `OPENAI_API_KEY` environment variable during the build.
Performance Optimizations
- Build-Time Generation: Embeddings created once at build time, not per-request
- Cached Loading: Embeddings loaded from JSON file (fast file I/O)
- Debounced Queries: Reduces API calls while typing
- Limited Results: Default limit of 5-8 results keeps response small
- Early Filtering: Filters applied before expensive similarity calculations
Example Searches
Exact Phrase
"quantum computing"
Finds posts containing the exact phrase "quantum computing".
Excluded Terms
fusion -nuclear
Finds posts about fusion but excludes those mentioning nuclear.
Category Search
planetary science
Finds posts matching "planetary science" with higher ranking for posts in that category.
Natural Language
articles about 3D graphics
Uses semantic understanding to find posts about 3D graphics, even if they don't contain those exact words.
Future Enhancements
Potential improvements for the search system:
- Full-Text Search: Search within full article content, not just metadata
- Fuzzy Matching: Handle typos and variations
- Search History: Remember recent searches
- Search Analytics: Track popular searches
- Autocomplete/Suggestions: Show suggestions as user types
- Date Range Filtering: Filter by publication date
- Series Filtering: Filter to specific series
- Advanced Boolean Operators: Support AND, OR, NOT operators
Technical Implementation
Key Files
- `lib/vector-search.ts`: Core search logic with ranking and query parsing
- `app/api/search/route.ts`: Search API endpoint
- `components/core/HeaderSearch.tsx`: Search UI component
- `scripts/generate-embeddings.js`: Build-time embedding generation
Search Flow
User types query
↓
Debounce (300ms)
↓
Parse query (phrases, words, exclusions)
↓
Generate query vector
↓
Calculate cosine similarity for all posts
↓
Apply boosting (title, category, phrases)
↓
Filter by category/exclusions
↓
Sort by score
↓
Return top N results
↓
Highlight matches in UI
Conclusion
The search engine combines vector similarity, intelligent ranking, and advanced query parsing to provide fast, accurate search results. By generating embeddings at build time, the system achieves excellent performance without requiring external APIs at runtime (though OpenAI embeddings are available as an optional enhancement).
The system is designed to be:
- Fast: Cached embeddings, debounced queries, limited results
- Accurate: Semantic understanding with intelligent ranking
- User-Friendly: Keyboard navigation, highlighting, helpful tips
- Flexible: Supports multiple embedding methods and query types
For more details on the blog architecture, see the other articles in the Blog Architecture Deep Dive - Series Index series.