Blog Architecture
As a side project while getting started with this blog, I worked through An adventure with 3js 3d Backgrounds in 3js. I wanted something unique. It's not quite what I intended when I set out to make it, but it seems like a good start!
For a comprehensive exploration of the blog architecture, see the Blog Architecture Deep Dive - Series Index series. For more about the blog's design philosophy and navigation features, see Blog Design and Navigating Blog Content.
This blog is built on a unique architecture that seamlessly bridges Obsidian markdown files with a Next.js website. The system automatically discovers, processes, and publishes content from an Obsidian vault, maintaining the authoring experience of Obsidian while delivering a modern web experience.
Philosophy
The core philosophy is separation of concerns: write in Obsidian, publish automatically. There's no separate CMS, no database, no manual publishing workflow. You write markdown files in Obsidian, set published: published in the frontmatter, and they appear on the website. The system handles everything else.
Architecture Overview
The architecture consists of several key components:
- File Discovery System: Recursively scans the Obsidian vault for markdown files
- Content Processing Pipeline: Extracts metadata, processes markdown, resolves links
- Obsidian Link Resolution: Converts [[wiki links]] to HTML links
- Static Site Generation: Pre-renders all pages at build time
- Development/Production Modes: Different file access strategies for each environment
File Discovery
The system uses a recursive directory scanner that:
- Recursively traverses the entire Obsidian vault directory structure
- Finds all .md files at any depth
- Handles symlinks safely by tracking visited paths to prevent infinite loops
- Skips hidden files (like .obsidian, .git) automatically
- Respects depth limits (max 100 levels) to prevent stack overflow
export function getAllMarkdownFiles(
  dirPath: string,
  arrayOfFiles: string[] = [],
  visitedPaths: Set<string> = new Set(),
  depth: number = 0,
  maxDepth: number = 100
): string[]
The scanner uses fs.realpathSync() to resolve symlinks and tracks visited real paths to prevent circular references, which is critical for complex Obsidian vault structures.
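A minimal sketch of how such a scanner might look, assuming Node's built-in fs and path modules. The name collectMarkdownFiles and the choice to skip unreadable entries silently are illustrative assumptions, not the blog's actual code.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical sketch of a symlink-safe recursive scanner.
export function collectMarkdownFiles(
  dirPath: string,
  found: string[] = [],
  visited: Set<string> = new Set(),
  depth = 0,
  maxDepth = 100
): string[] {
  if (depth > maxDepth) return found;

  // Resolve symlinks so the same real directory is never walked twice.
  let realDir: string;
  try {
    realDir = fs.realpathSync(dirPath);
  } catch {
    return found; // unreadable path or broken symlink
  }
  if (visited.has(realDir)) return found;
  visited.add(realDir);

  for (const entry of fs.readdirSync(realDir, { withFileTypes: true })) {
    if (entry.name.startsWith(".")) continue; // .obsidian, .git, ...
    const fullPath = path.join(dirPath, entry.name);
    let stats: fs.Stats;
    try {
      stats = fs.statSync(fullPath); // statSync follows symlinks
    } catch {
      continue; // dangling symlink
    }
    if (stats.isDirectory()) {
      collectMarkdownFiles(fullPath, found, visited, depth + 1, maxDepth);
    } else if (stats.isFile() && entry.name.toLowerCase().endsWith(".md")) {
      found.push(fullPath);
    }
  }
  return found;
}
```

Keying the visited set on real paths rather than the paths as written is what lets a symlink that points back at the vault root be skipped instead of recursed into.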
Publishing Detection
Not all markdown files should be published. The system uses frontmatter metadata:
To mark a post as published:
- Set published: published in frontmatter
To exclude a post:
- Set published: draft in frontmatter, or omit the published field
This allows for a simple workflow: write drafts normally, add published: published to frontmatter when ready to publish.
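The check itself can be sketched in a few lines. The parseFrontmatter helper below is a hypothetical stand-in that only understands flat key: value lines, not full YAML, so treat it as an illustration of the rule rather than the parser the blog uses.

```typescript
type Frontmatter = Record<string, string>;

// Hypothetical helper: reads flat `key: value` lines between the --- fences.
export function parseFrontmatter(markdown: string): Frontmatter {
  const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown);
  const data: Frontmatter = {};
  if (!match) return data;
  for (const line of match[1].split(/\r?\n/)) {
    const sep = line.indexOf(":");
    if (sep === -1) continue;
    data[line.slice(0, sep).trim()] = line.slice(sep + 1).trim();
  }
  return data;
}

// A post is published only when the field is present and equals "published".
export function isPublished(markdown: string): boolean {
  return parseFrontmatter(markdown).published === "published";
}
```

Because the comparison is against one exact value, a draft and a file with no frontmatter at all are treated the same way: neither is published.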
Metadata Extraction
The system extracts rich metadata from each post:
Title Extraction (Priority Order)
- title: field in frontmatter
- First # H1 heading in content
- Filename (with spaces/underscores converted)
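The priority order above could be implemented roughly as follows. The function name extractTitle is hypothetical, and turning underscores into spaces is one reading of "converted" that the post does not spell out.

```typescript
// Hypothetical sketch of the title fallback chain.
export function extractTitle(
  frontmatterTitle: string | undefined,
  body: string,
  filePath: string
): string {
  // 1. Explicit title from frontmatter wins.
  if (frontmatterTitle && frontmatterTitle.trim()) return frontmatterTitle.trim();

  // 2. First level-one heading in the content.
  const h1 = /^#[ \t]+(.+)$/m.exec(body);
  if (h1) return h1[1].trim();

  // 3. Filename without extension (assumption: underscores become spaces).
  const base = filePath.split("/").pop() ?? filePath;
  return base.replace(/\.md$/i, "").replace(/_+/g, " ").trim();
}
```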
Date Extraction
- date: field in frontmatter (ISO format: YYYY-MM-DD)
- File creation time (for simple posts without frontmatter)
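A small sketch of the date fallback, under the assumption that the caller supplies the file's creation time (for example from fs.statSync(filePath).birthtime). The name extractDate and the choice to interpret the date as UTC midnight are illustrative.

```typescript
// Hypothetical sketch: prefer a valid YYYY-MM-DD frontmatter date,
// otherwise fall back to the file creation time passed in by the caller.
export function extractDate(
  frontmatterDate: string | undefined,
  fileCreatedAt: Date
): Date {
  const raw = frontmatterDate?.trim();
  if (raw && /^\d{4}-\d{2}-\d{2}$/.test(raw)) {
    const parsed = new Date(`${raw}T00:00:00Z`); // assumption: treat as UTC
    if (!Number.isNaN(parsed.getTime())) return parsed;
  }
  return fileCreatedAt;
}
```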
Categories
Extracted from multiple sources:
- categories: field (array or comma-separated)
- category: field (single category)
- tags: field (all tags except "published")
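Merging the three sources might look like the sketch below. The CategoryFields shape, the extractCategories name, and the de-duplication step are assumptions made for illustration.

```typescript
// Hypothetical shape of the relevant frontmatter fields.
interface CategoryFields {
  categories?: string[] | string;
  category?: string;
  tags?: string[] | string;
}

// Accepts either an array or a comma-separated string.
function toList(value: string[] | string | undefined): string[] {
  if (value === undefined) return [];
  const items = Array.isArray(value) ? value : value.split(",");
  return items.map((item) => item.trim()).filter((item) => item.length > 0);
}

export function extractCategories(fm: CategoryFields): string[] {
  const merged = [
    ...toList(fm.categories),
    ...(fm.category && fm.category.trim() ? [fm.category.trim()] : []),
    // Tags count as categories, except the publishing marker itself.
    ...toList(fm.tags).filter((tag) => tag.toLowerCase() !== "published"),
  ];
  return Array.from(new Set(merged)); // assumption: duplicates are dropped
}
```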
Excerpt Generation
Automatically generated from content:
- Removes frontmatter
- Strips markdown formatting
- Converts Obsidian links to plain text
- Truncates to 200 characters
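The four steps above can be approximated with a chain of regular expressions. This is a rough sketch: generateExcerpt is a hypothetical name, and the markdown stripping is deliberately crude (it would also remove underscores inside ordinary words).

```typescript
// Hypothetical sketch of excerpt generation.
export function generateExcerpt(markdown: string, maxLength = 200): string {
  const text = markdown
    .replace(/^---\r?\n[\s\S]*?\r?\n---\r?\n?/, "") // drop frontmatter
    .replace(/\[\[[^\]|]+\|([^\]]+)\]\]/g, "$1") // [[target|display]] → display
    .replace(/\[\[([^\]]+)\]\]/g, "$1") // [[target]] → target
    .replace(/!?\[([^\]]*)\]\([^)]*\)/g, "$1") // [text](url) → text
    .replace(/^#{1,6}\s+/gm, "") // heading markers
    .replace(/[*_`~]/g, "") // emphasis and inline-code markers
    .replace(/\s+/g, " ") // collapse whitespace
    .trim();
  return text.slice(0, maxLength);
}
```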
Slug Generation
Slugs are generated from the file's relative path within the vault, preserving directory structure:
// File: Science/Planetary Science & Space.md
// Slug: Science/Planetary_Science_Space
const slug = relativePath
  .replace(/\s+/g, '_') // Spaces → underscores
  .replace(/[&]/g, '_') // & → underscore
  .replace(/[^a-z0-9_\/-]/gi, '_') // Special chars → underscore (keep / and -)
  .replace(/_+/g, '_') // Multiple underscores → single
  .replace(/\/+/g, '/') // Multiple slashes → single
  .replace(/^[-_\/]+|[-_\/]+$/g, ''); // Trim leading/trailing
This means:
- Directory structure is preserved in URLs: /blog/Science/Planetary_Science_Space
- Custom URLs can be set via url: in frontmatter
- Special characters are replaced with underscores, so slugs are URL-safe
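Wrapping the replacement chain in a function makes it easy to trace what happens to a given path. In this sketch the toSlug name, the customUrl parameter (standing in for the url: frontmatter field), and the step that strips the .md extension first are assumptions.

```typescript
// Hypothetical wrapper around the slug rules described above.
export function toSlug(relativePath: string, customUrl?: string): string {
  // A url: value in frontmatter overrides the generated slug.
  if (customUrl && customUrl.trim()) return customUrl.trim();

  return relativePath
    .replace(/\.md$/i, "") // assumption: extension is removed first
    .replace(/\s+/g, "_") // spaces → underscores
    .replace(/[&]/g, "_") // & → underscore
    .replace(/[^a-z0-9_\/-]/gi, "_") // other special chars → underscore
    .replace(/_+/g, "_") // collapse runs of underscores
    .replace(/\/+/g, "/") // collapse runs of slashes
    .replace(/^[-_\/]+|[-_\/]+$/g, ""); // trim leading/trailing separators
}

console.log(toSlug("Science/Planetary Science & Space.md"));
// prints "Science/Planetary_Science_Space"
```

Note that " & " first becomes three underscores and is then collapsed to one by the underscore-collapsing step, which is why the final slug contains a single underscore between Science and Space.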
Obsidian Link Resolution
One of the most complex parts of the system is converting Obsidian's [[wiki links]] syntax to HTML links. The system handles:
Link Formats
- [[Link Text]] → Simple link
- [[Link Text|Display Text]] → Link with custom display text
- [[folder/file]] → Path-based links
- [[file name]] → Filename-based links
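All four formats share the same bracket syntax, so a single pattern can pull out the target and the optional display text. The WikiLink type and parseWikiLinks function below are hypothetical names for a minimal sketch.

```typescript
interface WikiLink {
  target: string; // what the link points at (name or path)
  display: string; // text shown to the reader
}

// Hypothetical sketch: find every [[target]] or [[target|display]] in a string.
export function parseWikiLinks(markdown: string): WikiLink[] {
  const links: WikiLink[] = [];
  const pattern = /\[\[([^\]|]+)(?:\|([^\]]+))?\]\]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    const target = match[1].trim();
    // Without an explicit display part, the target doubles as the label.
    links.push({ target, display: (match[2] ?? target).trim() });
  }
  return links;
}
```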
Matching Strategy
The link resolver uses a three-tier matching strategy:
1. Path-based matching (for links with /):
  - Matches against relativePath (preserves directory structure)
  - Case-insensitive
  - Handles URL encoding (+ → space)
2. Filename matching:
  - Matches against fileName (basename without extension)
  - Also matches last segment of relativePath
  - Handles variations in spacing/capitalization
3. Fuzzy matching:
  - Normalizes both link and filename (removes spaces, special chars)
  - Handles edge cases like "Moon forming Impact" vs "Moon Forming Impact"
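The three tiers can be sketched as a sequence of lookups that returns as soon as one succeeds. The Post shape and resolveLink name are hypothetical, and letting a path-style link fall through to filename matching on its last segment is an assumption about behavior the post does not describe.

```typescript
// Hypothetical post record with the fields the tiers refer to.
interface Post {
  relativePath: string; // e.g. "Science/Moon Forming Impact.md"
  fileName: string; // e.g. "Moon Forming Impact"
  slug: string;
}

const stripExt = (p: string): string => p.replace(/\.md$/i, "");
const fuzzyKey = (s: string): string => s.toLowerCase().replace(/[^a-z0-9]/g, "");

export function resolveLink(link: string, posts: Post[]): Post | undefined {
  // "+" is treated as an encoded space; comparison is case-insensitive.
  const wanted = stripExt(link.replace(/\+/g, " ").trim()).toLowerCase();

  // Tier 1: exact path match for links that contain a slash.
  if (wanted.includes("/")) {
    const byPath = posts.find((p) => stripExt(p.relativePath).toLowerCase() === wanted);
    if (byPath) return byPath;
  }

  // Tier 2: filename, or last segment of the relative path.
  const lastSegment = wanted.split("/").pop() ?? wanted;
  const byName = posts.find(
    (p) =>
      p.fileName.toLowerCase() === lastSegment ||
      stripExt(p.relativePath).toLowerCase().split("/").pop() === lastSegment
  );
  if (byName) return byName;

  // Tier 3: compare with spaces and punctuation removed.
  const key = fuzzyKey(lastSegment);
  return posts.find((p) => fuzzyKey(p.fileName) === key);
}
```

Ordering the tiers from strictest to loosest keeps the fuzzy comparison from shadowing an exact match when two posts have similar names.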
Link Processing Pipeline
Links are processed in multiple stages:
- Pre-HTML conversion: Process links in markdown before remark converts to HTML
- Post-HTML conversion: Catch any escaped links (from code blocks, etc.)
- HTML entity handling: Process HTML-escaped brackets ([[link]])
This multi-pass approach ensures links are resolved even in edge cases.
Link States
Links can be in two states:
- Active: Points to an existing published post → <a href="/blog/slug">
- Disabled: Points to a non-existent post → <span class="obsidian-link disabled"> (styled differently, not clickable)
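Rendering the two states comes down to one branch on whether a slug was found. In this sketch renderWikiLink and escapeHtml are hypothetical helpers, and the data-link and title attributes on the disabled span are assumptions beyond the class name given above.

```typescript
// Escapes text for safe use in HTML content and attribute values.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical sketch: slug is undefined when no published post matched.
export function renderWikiLink(
  target: string,
  display: string,
  slug: string | undefined
): string {
  if (slug !== undefined) {
    return `<a href="/blog/${slug}">${escapeHtml(display)}</a>`;
  }
  const safeTarget = escapeHtml(target);
  return (
    `<span class="obsidian-link disabled" data-link="${safeTarget}" ` +
    `title="Link to: ${safeTarget} (not yet created)">${escapeHtml(display)}</span>`
  );
}
```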
Static Site Generation
The blog uses Next.js 14's App Router with static generation:
Route Structure
- Home page: / - Lists all posts, shows "Home" post if it exists
- Category pages: /?category=CategoryName - Filters posts by category
- Post pages: /blog/[...slug] - Catch-all route for posts (handles nested paths)
generateStaticParams
Next.js pre-generates all post pages at build time:
export async function generateStaticParams() {
  const posts = await getAllBlogPosts();
  return posts.map((post) => ({
    slug: post.slug.split('/'), // Split for catch-all route
  }));
}
This ensures:
- All pages are pre-rendered (fast, SEO-friendly)
- 404s are handled for non-existent posts
- URL encoding is handled correctly (/ → %2F)
Catch-All Route Handling
The [...slug] route handles nested paths:
// URL: /blog/Science/Planetary_Science
// params.slug = ['Science', 'Planetary_Science']
// slugString = 'Science/Planetary_Science'
Each segment is URL-decoded to handle special characters properly.
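Turning the segment array back into a slug string is a decode-and-join. The helper name slugFromParams is hypothetical; it only illustrates the step and is not taken from the blog's route file.

```typescript
// Hypothetical helper: rebuild the slug string from catch-all route segments.
export function slugFromParams(segments: string[]): string {
  // decodeURIComponent throws URIError on malformed input such as "100%",
  // so a real route handler may want to guard this call.
  return segments.map((segment) => decodeURIComponent(segment)).join("/");
}
```

The result is the inverse of the split('/') used when generating static params, which is what lets the page look the post up by its original slug.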
Development vs Production Modes
The system operates differently in development and production:
Development Mode
- Reads directly from source Obsidian vault (~/Documents/Blog/Blog)
- Live updates: Changes to markdown files appear immediately (no rebuild needed)
- Fast iteration: Edit in Obsidian, see changes in browser
Production Mode
- Reads from copied content (blog-content/ directory)
- Build-time copy: npm run build copies vault content to blog-content/
- Deployment-ready: All content is colocated with the app
This dual-mode approach provides the best of both worlds: fast development and reliable deployment.
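One way to express the switch is a single function that returns the content directory for the current environment. This sketch assumes NODE_ENV is what distinguishes the two modes, which the post does not state; getContentDir and expandHome are hypothetical names.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Expands a leading "~" to the current user's home directory.
export function expandHome(p: string): string {
  return p === "~" || p.startsWith("~/") ? path.join(os.homedir(), p.slice(1)) : p;
}

// Hypothetical sketch: production reads the copied content,
// everything else reads the live vault.
export function getContentDir(
  nodeEnv: string | undefined = process.env.NODE_ENV
): string {
  return nodeEnv === "production"
    ? path.join(process.cwd(), "blog-content")
    : expandHome("~/Documents/Blog/Blog");
}
```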
Build Process
The build process (npm run build) runs two steps:
- Copy blog content: npm run copy-blog script
  - Recursively copies entire vault structure
  - Preserves directory structure
  - Handles ~ path expansion
  - Removes old blog-content/ before copying
- Next.js build: next build
  - Generates static pages for all posts
  - Pre-renders home page with all posts
  - Creates optimized production bundle
The copy script handles edge cases:
- Creates destination directory if missing
- Removes existing content before copying
- Provides clear error messages for missing source
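A copy step with these properties can be sketched with Node's built-in fs.rmSync and fs.cpSync. The function name copyBlogContent and the error wording are assumptions, and the caller is assumed to have expanded any ~ in the source path already.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical sketch of the copy step run before next build.
export function copyBlogContent(sourceDir: string, destDir: string): void {
  if (!fs.existsSync(sourceDir)) {
    throw new Error(`Blog source directory not found: ${path.resolve(sourceDir)}`);
  }
  // Remove the previous copy so deleted posts do not linger.
  fs.rmSync(destDir, { recursive: true, force: true });
  // Recreate the destination, then copy the whole tree into it.
  fs.mkdirSync(destDir, { recursive: true });
  fs.cpSync(sourceDir, destDir, { recursive: true });
}
```

Deleting before copying trades a little build time for the guarantee that blog-content/ always mirrors the vault exactly.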
Linked Posts Feature
The system automatically extracts and displays "related content":
- Link extraction: Scans rendered HTML for /blog/ links
- Post resolution: Converts slugs back to BlogPost objects
- Deduplication: Removes duplicate links
- Lazy loading: Uses Intersection Observer for infinite scroll
This creates a natural "related posts" section based on what the author actually links to in their content.
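The extraction, resolution, and deduplication steps can be sketched together (lazy loading is a browser concern and is left out). The BlogPost fields shown and the extractLinkedPosts name are assumptions; a regular expression is used here for brevity where a real HTML parser would be more robust.

```typescript
// Hypothetical minimal post shape.
interface BlogPost {
  slug: string;
  title: string;
}

export function extractLinkedPosts(html: string, posts: BlogPost[]): BlogPost[] {
  const bySlug = new Map<string, BlogPost>();
  for (const post of posts) bySlug.set(post.slug, post);

  const seen = new Set<string>();
  const linked: BlogPost[] = [];
  // Capture the slug part of href="/blog/...", ignoring any #fragment or ?query.
  const pattern = /href="\/blog\/([^"#?]+)[^"]*"/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    const slug = match[1];
    if (seen.has(slug)) continue; // deduplicate repeated links
    seen.add(slug);
    const post = bySlug.get(slug);
    if (post) linked.push(post); // skip links to unknown posts
  }
  return linked;
}
```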
URL Encoding and Decoding
The system carefully handles URL encoding:
- Encoding: Slugs are encodeURIComponent()'d when creating links
  - Spaces → %20
  - / → %2F
  - & → %26
- Decoding: URLs are decodeURIComponent()'d when resolving
  - Handles malformed URLs gracefully (falls back to original)
- Normalization: Multiple encoding strategies are tried when matching
This ensures links work correctly even with special characters in filenames.
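The decoding fallback and the idea of trying several encodings can be illustrated as follows. Both safeDecode and matchCandidates are hypothetical names, and the particular set of variants tried is an assumption, since the post does not list them.

```typescript
// decodeURIComponent throws on malformed input (e.g. a lone "%"),
// so fall back to the original string in that case.
export function safeDecode(value: string): string {
  try {
    return decodeURIComponent(value);
  } catch {
    return value;
  }
}

// Hypothetical sketch: the forms of a link worth comparing against known slugs.
export function matchCandidates(raw: string): string[] {
  const decoded = safeDecode(raw);
  const variants = [
    raw, // as written
    decoded, // percent-decoded
    decoded.replace(/\+/g, " "), // "+" read as a space
    encodeURIComponent(decoded), // canonical encoded form
  ];
  return Array.from(new Set(variants)); // drop duplicates, keep order
}

console.log(encodeURIComponent("a b/c&d")); // prints "a%20b%2Fc%26d"
```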
Error Handling
The system includes comprehensive error handling:
- Missing vault: Clear error message with setup instructions
- Permission errors: Helpful macOS-specific guidance (Full Disk Access)
- Circular symlinks: Detected and skipped with warnings
- Invalid frontmatter: Gracefully falls back to defaults
- Broken links: Rendered as disabled spans (not errors)
Performance Considerations
Several optimizations ensure fast performance:
- Static generation: All pages pre-rendered at build time
- Parallel processing: Promise.all() for concurrent file reading
- Efficient matching: Early returns in link resolution
- Caching: Next.js caches static pages aggressively
- Lazy components: LinkedPosts uses Intersection Observer
Design Decisions
Why Obsidian?
- Familiar authoring: Write in the tool you already use
- Wiki links: Natural linking between posts
- No lock-in: Markdown files are portable
- Rich features: Tags, frontmatter, directory structure
Why Next.js?
- Static generation: Fast, SEO-friendly pages
- TypeScript: Type safety for complex link resolution
- App Router: Modern routing with catch-all support
- Deployment: Works on Vercel, Netlify, etc.
Why Not a Headless CMS?
- Simplicity: No database, no API, no admin panel
- Version control: Markdown files are git-friendly
- Portability: Content isn't locked into a platform
- Cost: No hosting fees for a CMS
Future Enhancements
Potential improvements:
- Incremental builds: Only rebuild changed posts
- Search: Full-text search across posts
- RSS feed: Automatic feed generation
- Image optimization: Next.js Image component integration
- Syntax highlighting: Code block highlighting
- Backlinks: Show posts that link to current post
Conclusion
This architecture provides a seamless bridge between Obsidian's authoring experience and modern web publishing. By leveraging Next.js's static generation and carefully handling Obsidian's unique features (wiki links, directory structure), the system creates a blog that feels native to both worlds.
The result is a blog that:
- Requires no separate CMS or database
- Maintains a familiar authoring workflow
- Automatically handles linking and related content
- Generates fast, SEO-friendly static pages
- Works in both development and production seamlessly
It's a system that gets out of your way so you can focus on writing.
