Daniel Gray

Thoughts, Notes, Ideas, Projects

Blog Architecture

As a side project while getting started with this blog, I worked through an adventure with 3D backgrounds in 3js. I wanted something unique. It is not quite what I intended when I set out to make it, but it seems like a good start!

For a comprehensive exploration of the blog architecture, see the Blog Architecture Deep Dive - Series Index. For more about the blog's design philosophy and navigation features, see Blog Design and Navigating Blog Content.

This blog is built on a unique architecture that seamlessly bridges Obsidian markdown files with a Next.js website. The system automatically discovers, processes, and publishes content from an Obsidian vault, maintaining the authoring experience of Obsidian while delivering a modern web experience.

Philosophy

The core philosophy is separation of concerns: write in Obsidian, publish automatically. There's no separate CMS, no database, no manual publishing workflow. You write markdown files in Obsidian, set published: published in the frontmatter, and they appear on the website. The system handles everything else.

Architecture Overview

The architecture consists of several key components:

  1. File Discovery System: Recursively scans the Obsidian vault for markdown files

  2. Content Processing Pipeline: Extracts metadata, processes markdown, resolves links

  3. Obsidian Link Resolution: Converts [[wiki links]] to HTML links

  4. Static Site Generation: Pre-renders all pages at build time

  5. Development/Production Modes: Different file access strategies for each environment

File Discovery

The system uses a recursive directory scanner that:

  • Recursively traverses the entire Obsidian vault directory structure

  • Finds all .md files at any depth

  • Handles symlinks safely by tracking visited paths to prevent infinite loops

  • Skips hidden files (like .obsidian, .git) automatically

  • Respects depth limits (max 100 levels) to prevent stack overflow


  

    export function getAllMarkdownFiles(
      dirPath: string,
      arrayOfFiles: string[] = [],
      visitedPaths: Set<string> = new Set(),
      depth: number = 0,
      maxDepth: number = 100
    ): string[]

  

The scanner uses fs.realpathSync() to resolve symlinks and tracks the real paths it has already visited. This prevents circular references, which matters for Obsidian vaults that link folders into one another.
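A minimal sketch of how such a scanner could be written, using the signature shown above. The body is an assumption for illustration, not the blog's actual implementation; it only relies on Node's standard fs and path modules.

```typescript
import * as fs from "fs";
import * as path from "path";

// Sketch only: the signature matches the article, the body is an assumed implementation.
export function getAllMarkdownFiles(
  dirPath: string,
  arrayOfFiles: string[] = [],
  visitedPaths: Set<string> = new Set(),
  depth: number = 0,
  maxDepth: number = 100
): string[] {
  // Stop descending once the depth limit is exceeded.
  if (depth > maxDepth) return arrayOfFiles;

  // Resolve symlinks so the same real directory is never scanned twice.
  let realDir: string;
  try {
    realDir = fs.realpathSync(dirPath);
  } catch {
    return arrayOfFiles; // broken symlink or unreadable directory
  }
  if (visitedPaths.has(realDir)) return arrayOfFiles;
  visitedPaths.add(realDir);

  for (const name of fs.readdirSync(realDir)) {
    // Skip hidden entries such as .obsidian and .git.
    if (name.startsWith(".")) continue;

    const fullPath = path.join(dirPath, name);
    let stats: fs.Stats;
    try {
      stats = fs.statSync(fullPath); // statSync follows symlinks
    } catch {
      continue; // dangling symlink
    }

    if (stats.isDirectory()) {
      getAllMarkdownFiles(fullPath, arrayOfFiles, visitedPaths, depth + 1, maxDepth);
    } else if (stats.isFile() && name.toLowerCase().endsWith(".md")) {
      arrayOfFiles.push(fullPath);
    }
  }
  return arrayOfFiles;
}
```

Because visited directories are keyed by their real path, a symlink that points back at an ancestor folder is entered once, recognized, and skipped.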

Publishing Detection

Not all markdown files should be published. The system uses frontmatter metadata:

To mark a post as published:

  • Set published: published in frontmatter

To keep a post unpublished:

  • Set published: draft in frontmatter, or omit the published field

This allows for a simple workflow: write drafts normally, add published: published to frontmatter when ready to publish.
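The check itself can be sketched in a few lines. The parser below is a deliberately tiny, hypothetical stand-in that only understands flat key: value lines; a real pipeline would more likely use a full YAML frontmatter parser. The function names are assumptions for illustration.

```typescript
type Frontmatter = Record<string, string>;

// Hypothetical minimal parser: reads flat `key: value` lines between the --- markers.
function parseFrontmatter(markdown: string): Frontmatter {
  const data: Frontmatter = {};
  const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown);
  if (!match) return data;
  for (const line of match[1].split(/\r?\n/)) {
    const idx = line.indexOf(":");
    if (idx === -1) continue;
    data[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return data;
}

// A post is published only when the field is exactly "published".
// A "draft" value or a missing field both leave the post unpublished.
function isPublished(markdown: string): boolean {
  return parseFrontmatter(markdown).published === "published";
}
```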

Metadata Extraction

The system extracts rich metadata from each post:

Title Extraction (Priority Order)

  1. title: field in frontmatter

  2. First # H1 heading in content

  3. Filename (with spaces/underscores converted)
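The priority order above can be sketched as a simple fallback chain. The function name and the exact filename cleanup (underscores become spaces) are assumptions for illustration.

```typescript
// Sketch of the title fallback chain: frontmatter, then first H1, then filename.
function extractTitle(
  frontmatter: { title?: string },
  content: string,
  fileName: string
): string {
  // 1. Explicit title in frontmatter wins.
  if (frontmatter.title && frontmatter.title.trim() !== "") {
    return frontmatter.title.trim();
  }
  // 2. First level-one heading ("# Title"); "## Subheading" does not match.
  const h1 = /^#\s+(.+)$/m.exec(content);
  if (h1) return h1[1].trim();
  // 3. Fall back to the filename without its extension (assumed cleanup rule).
  return fileName.replace(/\.md$/i, "").replace(/_+/g, " ").trim();
}
```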

Date Extraction

  1. date: field in frontmatter (ISO format: YYYY-MM-DD)

  2. File creation time (for simple posts without frontmatter)
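As a rough sketch, the date fallback might look like this. The function name is hypothetical, and the creation time is passed in as a Date (for example from fs.statSync(file).birthtime) so the logic stays independent of the file system.

```typescript
// Sketch: prefer an ISO date from frontmatter, otherwise use the file's creation time.
function extractDate(frontmatterDate: string | undefined, fileCreated: Date): string {
  const candidate = frontmatterDate?.trim();
  if (candidate && /^\d{4}-\d{2}-\d{2}$/.test(candidate)) {
    return candidate;
  }
  // toISOString() is always UTC; keep only the YYYY-MM-DD part.
  return fileCreated.toISOString().slice(0, 10);
}
```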

Categories

Extracted from multiple sources:

  • categories: field (array or comma-separated)

  • category: field (single category)

  • tags: field (all tags except "published")
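A sketch of how the three sources could be merged. The interface and function names are hypothetical, and removing duplicates across sources is an assumption the article does not state.

```typescript
interface CategoryFields {
  categories?: string[] | string;
  category?: string;
  tags?: string[] | string;
}

// Accepts either an array or a comma-separated string.
function toList(value: string[] | string | undefined): string[] {
  if (value === undefined) return [];
  const items = Array.isArray(value) ? value : value.split(",");
  return items.map((item) => item.trim()).filter((item) => item.length > 0);
}

function extractCategories(fm: CategoryFields): string[] {
  const merged = [
    ...toList(fm.categories),
    ...toList(fm.category),
    // "published" is a workflow marker, not a category.
    ...toList(fm.tags).filter((tag) => tag !== "published"),
  ];
  // Assumed behavior: drop duplicates while keeping first-seen order.
  return Array.from(new Set(merged));
}
```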

Excerpt Generation

Automatically generated from content:

  • Removes frontmatter

  • Strips markdown formatting

  • Converts Obsidian links to plain text

  • Truncates to 200 characters
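The four steps above can be sketched as a chain of replacements. This is an approximation under stated assumptions: the regular expressions cover common markdown only, and the function name is hypothetical.

```typescript
// Sketch of excerpt generation: strip frontmatter and formatting, then truncate.
function generateExcerpt(markdown: string, maxLength: number = 200): string {
  const text = markdown
    .replace(/^---\r?\n[\s\S]*?\r?\n---\r?\n?/, "") // remove frontmatter
    .replace(/\[\[([^\]|]+)\|([^\]]+)\]\]/g, "$2") // [[target|display]] → display
    .replace(/\[\[([^\]]+)\]\]/g, "$1") // [[target]] → target
    .replace(/!?\[([^\]]*)\]\([^)]*\)/g, "$1") // [text](url) and images → text
    .replace(/^#{1,6}\s+/gm, "") // heading markers
    .replace(/[*_`>]/g, "") // emphasis, inline code, blockquote markers
    .replace(/\s+/g, " ") // collapse whitespace
    .trim();
  return text.slice(0, maxLength);
}
```

Note that stripping every underscore is a blunt rule; it also removes underscores inside ordinary words, which is usually acceptable for a short teaser.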

Slug Generation

Slugs are generated from the file's relative path within the vault, preserving directory structure:


  

    // File: Science/Planetary Science & Space.md
    // Slug: Science/Planetary_Science_Space

    const slug = relativePath
      .replace(/\s+/g, '_') // Spaces → underscores
      .replace(/[&]/g, '_') // & → underscore
      .replace(/[^a-z0-9_\/-]/gi, '_') // Special chars → underscore (keep / and -)
      .replace(/_+/g, '_') // Multiple underscores → single
      .replace(/\/+/g, '/') // Multiple slashes → single
      .replace(/^[-_\/]+|[-_\/]+$/g, ''); // Trim leading/trailing

  

This means:

  • Directory structure is preserved in URLs: /blog/Science/Planetary_Science_Space

  • Custom URLs can be set via url: in frontmatter

  • Special characters are replaced with underscores, so slugs stay URL-safe
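To see what the replacement chain produces, it can be wrapped in a function and run on a few paths. The wrapper name, the .md stripping step, and the way a url: override is applied are assumptions for illustration; the replacements themselves are the ones listed above. Run as written, the chain collapses the three underscores produced by " & " into one.

```typescript
// Sketch: the article's replacement chain wrapped in a hypothetical helper.
function resolveSlug(relativePath: string, customUrl?: string): string {
  // Assumption: a `url:` value from frontmatter is used as-is.
  if (customUrl && customUrl.trim() !== "") return customUrl.trim();

  return relativePath
    .replace(/\.md$/i, "") // assumed: extension is dropped first
    .replace(/\s+/g, "_") // spaces → underscores
    .replace(/[&]/g, "_") // & → underscore
    .replace(/[^a-z0-9_\/-]/gi, "_") // other special chars → underscore
    .replace(/_+/g, "_") // collapse runs of underscores
    .replace(/\/+/g, "/") // collapse runs of slashes
    .replace(/^[-_\/]+|[-_\/]+$/g, ""); // trim leading/trailing separators
}

console.log(resolveSlug("Science/Planetary Science & Space.md"));
// Science/Planetary_Science_Space
console.log(resolveSlug("Projects/My First Post!.md"));
// Projects/My_First_Post
```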

Obsidian Link Resolution

One of the most complex parts of the system is converting Obsidian's [[wiki links]] syntax to HTML links. The system handles:

Link Formats

  • [[Link Text]] → Simple link

  • [[Link Text|Display Text]] → Link with custom display text

  • [[folder/file]] → Path-based links

  • [[file name]] → Filename-based links

Matching Strategy

The link resolver uses a three-tier matching strategy:

1. Path-based matching (for links with /):

  • Matches against relativePath (preserves directory structure)

  • Case-insensitive

  • Handles URL encoding (+ → space)

2. Filename matching:

  • Matches against fileName (basename without extension)

  • Also matches last segment of relativePath

  • Handles variations in spacing/capitalization

3. Fuzzy matching:

  • Normalizes both link and filename (removes spaces, special chars)

  • Handles edge cases like "Moon forming Impact" vs "Moon Forming Impact"
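The three tiers can be sketched as successive lookups that return as soon as one succeeds. The PostRef shape and function names are hypothetical, and relativePath is assumed to be stored without the .md extension.

```typescript
interface PostRef {
  relativePath: string; // e.g. "Science/Moon Forming Impact" (assumed: no extension)
  fileName: string; // e.g. "Moon Forming Impact"
  slug: string;
}

// Fuzzy key: lowercase with everything except letters and digits removed.
const normalizeKey = (value: string): string =>
  value.toLowerCase().replace(/[^a-z0-9]/g, "");

function resolveLink(target: string, posts: PostRef[]): PostRef | undefined {
  const link = target.replace(/\+/g, " ").replace(/\.md$/i, "").trim().toLowerCase();

  // Tier 1: path-based match, only attempted for links containing "/".
  if (link.indexOf("/") !== -1) {
    const byPath = posts.find((p) => p.relativePath.toLowerCase() === link);
    if (byPath) return byPath;
  }

  // Tier 2: match the file name or the last segment of the relative path.
  const lastSegment = link.split("/").pop() ?? link;
  const byName = posts.find(
    (p) =>
      p.fileName.toLowerCase() === lastSegment ||
      (p.relativePath.split("/").pop() ?? "").toLowerCase() === lastSegment
  );
  if (byName) return byName;

  // Tier 3: fuzzy match that ignores spacing and punctuation.
  const fuzzy = normalizeKey(lastSegment);
  return posts.find((p) => normalizeKey(p.fileName) === fuzzy);
}
```

Ordering the tiers from strictest to loosest means an exact path always beats a lookalike filename elsewhere in the vault.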

Link Processing Pipeline

Links are processed in multiple stages:

  1. Pre-HTML conversion: Process links in markdown before remark converts to HTML

  2. Post-HTML conversion: Catch any escaped links (from code blocks, etc.)

  3. HTML entity handling: Process HTML-escaped brackets (&#91;&#91;link&#93;&#93;)

This multi-pass approach ensures links are resolved even in edge cases.

Link States

Links can be in two states:

  • Active: Points to an existing published post → <a href="/blog/slug">

  • Disabled: Points to non-existent post → <span class="obsidian-link disabled"> (styled differently, not clickable)
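A sketch of the rendering step, covering both link states and the HTML-entity pass. The function names and the lookup map are hypothetical; the disabled span markup follows the form this blog emits for unresolved links. Target lookup is reduced to a lowercase map here to keep the example short.

```typescript
const escapeHtml = (value: string): string =>
  value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");

// slugByTarget maps a lowercased link target to the slug of a published post.
function renderWikiLinks(text: string, slugByTarget: Map<string, string>): string {
  return text.replace(
    /\[\[([^\]|]+)(?:\|([^\]]+))?\]\]/g,
    (_match: string, target: string, display?: string) => {
      const label = escapeHtml((display ?? target).trim());
      const slug = slugByTarget.get(target.trim().toLowerCase());
      if (slug !== undefined) {
        // Active: encode each path segment but keep the "/" separators.
        const href = "/blog/" + slug.split("/").map((s) => encodeURIComponent(s)).join("/");
        return `<a href="${href}">${label}</a>`;
      }
      // Disabled: styled differently and not clickable.
      const name = escapeHtml(target.trim());
      return `<span class="obsidian-link disabled" data-link="${name}" title="Link to: ${name} (not yet created)">${label}</span>`;
    }
  );
}

// Later pass: brackets that were HTML-escaped during conversion are restored first.
function renderEscapedWikiLinks(html: string, slugByTarget: Map<string, string>): string {
  const restored = html.replace(/&#91;/g, "[").replace(/&#93;/g, "]");
  return renderWikiLinks(restored, slugByTarget);
}
```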

Static Site Generation

The blog uses Next.js 14's App Router with static generation:

Route Structure

  • Home page: / - Lists all posts, shows "Home" post if exists

  • Category pages: /?category=CategoryName - Filters posts by category

  • Post pages: /blog/[...slug] - Catch-all route for posts (handles nested paths)

generateStaticParams

Next.js pre-generates all post pages at build time:


  

    export async function generateStaticParams() {
      const posts = await getAllBlogPosts();
      return posts.map((post) => ({
        slug: post.slug.split('/'), // Split for catch-all route
      }));
    }

  

This ensures:

  • All pages are pre-rendered (fast, SEO-friendly)

  • 404s are handled for non-existent posts

  • URL encoding is handled correctly (/ → %2F)

Catch-All Route Handling

The [...slug] route handles nested paths:


  

    // URL: /blog/Science/Planetary_Science
    // params.slug = ['Science', 'Planetary_Science']
    // slugString = 'Science/Planetary_Science'

  

Each segment is URL-decoded to handle special characters properly.
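A minimal sketch of that decoding step. The helper names are hypothetical; decodeURIComponent throws a URIError on malformed input, so each segment falls back to its raw value.

```typescript
// Decode one path segment, keeping the raw text if it is not valid percent-encoding.
function safeDecode(segment: string): string {
  try {
    return decodeURIComponent(segment);
  } catch {
    return segment;
  }
}

// Rebuild the slug string from the catch-all route's params.slug array.
function slugFromParams(slug: string[]): string {
  return slug.map((segment) => safeDecode(segment)).join("/");
}

console.log(slugFromParams(["Science", "Planetary_Science"]));
// Science/Planetary_Science
```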

Development vs Production Modes

The system operates differently in development and production:

Development Mode

  • Reads directly from source Obsidian vault (~/Documents/Blog/Blog)

  • Live updates: Changes to markdown files appear immediately (no rebuild needed)

  • Fast iteration: Edit in Obsidian, see changes in browser

Production Mode

  • Reads from copied content (blog-content/ directory)

  • Build-time copy: npm run build copies vault content to blog-content/

  • Deployment-ready: All content is colocated with the app

This dual-mode approach provides the best of both worlds: fast development and reliable deployment.
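The mode switch can be sketched as a single function. Keying it off NODE_ENV is an assumption, as are the function names; the two paths are the ones named above.

```typescript
import * as os from "os";
import * as path from "path";

// Expand a leading "~" to the current user's home directory.
function expandHome(p: string): string {
  return p === "~" || p.startsWith("~/") ? path.join(os.homedir(), p.slice(1)) : p;
}

// Assumption: production is detected via NODE_ENV; anything else reads the live vault.
function getContentDir(nodeEnv: string | undefined, projectRoot: string): string {
  return nodeEnv === "production"
    ? path.join(projectRoot, "blog-content")
    : expandHome("~/Documents/Blog/Blog");
}
```

A caller would typically pass process.env.NODE_ENV and process.cwd().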

Build Process

The build process (npm run build) runs two steps:

  1. Copy blog content: npm run copy-blog script

    • Recursively copies entire vault structure

    • Preserves directory structure

    • Handles ~ path expansion

    • Removes old blog-content/ before copying

  2. Next.js build: next build

    • Generates static pages for all posts

    • Pre-renders home page with all posts

    • Creates optimized production bundle

The copy script handles edge cases:

  • Creates destination directory if missing

  • Removes existing content before copying

  • Provides clear error messages for missing source
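A sketch of what the copy step might look like, assuming a Node version that provides fs.cpSync (16.7 or later). The function name and error message are hypothetical, and the source path is assumed to have had any leading ~ expanded already.

```typescript
import * as fs from "fs";

// Sketch of a copy-blog step: replace destDir with a fresh copy of sourceDir.
function copyBlogContent(sourceDir: string, destDir: string): void {
  if (!fs.existsSync(sourceDir)) {
    throw new Error(`Blog source not found: ${sourceDir}. Check the vault path.`);
  }
  // Remove stale output first so deleted posts do not linger.
  fs.rmSync(destDir, { recursive: true, force: true });
  // Recreate the destination, then copy the whole tree into it.
  fs.mkdirSync(destDir, { recursive: true });
  fs.cpSync(sourceDir, destDir, { recursive: true });
}
```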

Linked Posts Feature

The system automatically extracts and displays "related content":

  1. Link extraction: Scans rendered HTML for /blog/ links

  2. Post resolution: Converts slugs back to BlogPost objects

  3. Deduplication: Removes duplicate links

  4. Lazy loading: Uses Intersection Observer for infinite scroll

This creates a natural "related posts" section based on what the author actually links to in their content.
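The first three steps can be sketched with a regular expression and a lookup map. The BlogPost shape shown here is a reduced, hypothetical version, and a production implementation might prefer an HTML parser over a regex. The lazy-loading step is browser-side and is not shown.

```typescript
interface BlogPost {
  slug: string;
  title: string;
}

// Steps 1 and 3: collect unique slugs from /blog/ links, in first-seen order.
function extractLinkedSlugs(html: string): string[] {
  const seen = new Set<string>();
  const pattern = /href="\/blog\/([^"#?]+)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    seen.add(match[1]); // a Set silently ignores repeats
  }
  return Array.from(seen);
}

// Step 2: turn slugs back into posts, dropping any that are not published.
function getLinkedPosts(html: string, allPosts: BlogPost[]): BlogPost[] {
  const bySlug = new Map(allPosts.map((p) => [p.slug, p] as [string, BlogPost]));
  return extractLinkedSlugs(html)
    .map((slug) => bySlug.get(slug))
    .filter((p): p is BlogPost => p !== undefined);
}
```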

URL Encoding and Decoding

The system carefully handles URL encoding:

  • Encoding: Slugs are passed through encodeURIComponent() when creating links

    • Spaces → %20

    • / → %2F

    • & → %26

  • Decoding: URLs are passed through decodeURIComponent() when resolving

    • Handles malformed URLs gracefully (falls back to original)

  • Normalization: Multiple encoding strategies are tried when matching

This ensures links work correctly even with special characters in filenames.
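A small sketch of the encoding side and of trying several forms when matching. The function names are hypothetical, and the particular set of candidates (raw, percent-decoded, plus signs as spaces) is an assumption about what "multiple strategies" could mean.

```typescript
// Encoding: encodeURIComponent escapes spaces, "/" and "&".
function encodeSlug(slug: string): string {
  return encodeURIComponent(slug);
}

// Normalization: produce the distinct forms of an incoming value to try when matching.
function matchCandidates(raw: string): string[] {
  const candidates = new Set<string>([raw]);
  try {
    candidates.add(decodeURIComponent(raw));
  } catch {
    // Malformed percent-encoding: keep the original value only.
  }
  candidates.add(raw.replace(/\+/g, " "));
  return Array.from(candidates);
}

console.log(encodeSlug("Science/Planetary Science & Space"));
// Science%2FPlanetary%20Science%20%26%20Space
```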

Error Handling

The system includes comprehensive error handling:

  • Missing vault: Clear error message with setup instructions

  • Permission errors: Helpful macOS-specific guidance (Full Disk Access)

  • Circular symlinks: Detected and skipped with warnings

  • Invalid frontmatter: Gracefully falls back to defaults

  • Broken links: Rendered as disabled spans (not errors)

Performance Considerations

Several optimizations ensure fast performance:

  1. Static generation: All pages pre-rendered at build time

  2. Parallel processing: Promise.all() for concurrent file reading

  3. Efficient matching: Early returns in link resolution

  4. Caching: Next.js caches static pages aggressively

  5. Lazy components: LinkedPosts uses Intersection Observer

Design Decisions

Why Obsidian?

  • Familiar authoring: Write in the tool you already use

  • Wiki links: Natural linking between posts

  • No lock-in: Markdown files are portable

  • Rich features: Tags, frontmatter, directory structure

Why Next.js?

  • Static generation: Fast, SEO-friendly pages

  • TypeScript: Type safety for complex link resolution

  • App Router: Modern routing with catch-all support

  • Deployment: Works on Vercel, Netlify, etc.

Why Not a Headless CMS?

  • Simplicity: No database, no API, no admin panel

  • Version control: Markdown files are git-friendly

  • Portability: Content isn't locked into a platform

  • Cost: No hosting fees for a CMS

Future Enhancements

Potential improvements:

  • Incremental builds: Only rebuild changed posts

  • Search: Full-text search across posts

  • RSS feed: Automatic feed generation

  • Image optimization: Next.js Image component integration

  • Syntax highlighting: Code block highlighting

  • Backlinks: Show posts that link to current post

Conclusion

This architecture provides a seamless bridge between Obsidian's authoring experience and modern web publishing. By leveraging Next.js's static generation and carefully handling Obsidian's unique features (wiki links, directory structure), the system creates a blog that feels native to both worlds.

The result is a blog that:

  • Requires no separate CMS or database

  • Maintains a familiar authoring workflow

  • Automatically handles linking and related content

  • Generates fast, SEO-friendly static pages

  • Works in both development and production seamlessly

It's a system that gets out of your way so you can focus on writing.
