Introduction

📖 Part of the RAG Series: This is Part 5 - production integration patterns:

Part 1: RAG Origins and Fundamentals - What embeddings are, why they matter
Part 2: RAG Architecture and Internals - Chunking, tokenization, vector databases
Part 3: RAG in Practice - Building complete RAG systems
Part 4a: ONNX & Qdrant Implementation - CPU-friendly semantic search foundation
Part 4b: Semantic Search in Action - Typeahead, hybrid search, and UI components
Part 5: Hybrid Search & Auto-Indexing (this article) - Production integration patterns

In Part 4a, we built the foundation: ONNX embeddings and Qdrant storage. In Part 4b, we covered the search UI and hybrid search implementation. Now we'll make it production-ready with automatic indexing (zero-touch content updates) via FileSystemWatcher.

Hybrid Search: Best of Both Worlds

Semantic search is powerful but traditional full-text search still excels at exact phrases and technical terms. The solution? Use both.

Why Hybrid? Different approaches have different strengths:

PostgreSQL Full-Text (covered here): Exact matches, technical terms, Boolean operators
Semantic Vector Search: Meaning, context, synonyms, conceptually related content

Reciprocal Rank Fusion (RRF)

We use Reciprocal Rank Fusion to combine results from multiple search sources:

flowchart TB
    A[User Query: 'docker containers'] --> B[PostgreSQL Full-Text Search]
    A --> C[Semantic Vector Search]

    B --> D["Results:<br/>1. 'Docker Basics' (rank 1)<br/>2. 'Containerizing Apps' (rank 2)<br/>3. 'Docker Compose' (rank 3)"]
    C --> E["Results:<br/>1. 'Containerizing Apps' (rank 1)<br/>2. 'Kubernetes Guide' (rank 2)<br/>3. 'Docker Basics' (rank 3)"]

    D --> F[RRF Algorithm]
    E --> F

    F --> G["Combined Results:<br/>1. 'Containerizing Apps'<br/>   (1/61 + 1/62 = 0.0328)<br/>2. 'Docker Basics'<br/>   (1/61 + 1/63 = 0.0322)<br/>3. 'Docker Compose'<br/>   (1/63 = 0.0159)"]

    style A stroke:#10b981,stroke-width:2px
    style B stroke:#3b82f6,stroke-width:2px
    style C stroke:#6366f1,stroke-width:2px
    style D stroke:#3b82f6,stroke-width:2px
    style E stroke:#6366f1,stroke-width:2px
    style F stroke:#ec4899,stroke-width:4px
    style G stroke:#8b5cf6,stroke-width:2px

The RRF formula: score = Σ(1 / (k + rank))

k = 60 (constant to prevent early ranks dominating)
rank = position in that search method's results
Results in both sources get scores from each added together

Why RRF works:

Deduplication: Same result in both sources scores higher
Fairness: No search method dominates unfairly
Simplicity: No complex tuning required

Implementation

public class HybridSearchService : IHybridSearchService
{
    private readonly ISemanticSearchService _semanticSearchService;
    private const int RrfConstant = 60;

    public async Task<List<SearchResult>> SearchAsync(
        string query,
        string language = "en",
        int limit = 10,
        CancellationToken cancellationToken = default)
    {
        // Execute both searches in parallel
        var semanticResults = await _semanticSearchService.SearchAsync(
            query, limit * 2, cancellationToken);

        // Filter by language and apply RRF
        var filteredResults = semanticResults
            .Where(r => r.Language == language)
            .ToList();

        return ApplyReciprocalRankFusion(filteredResults)
            .Take(limit)
            .ToList();
    }

    private List<SearchResult> ApplyReciprocalRankFusion(List<SearchResult> results)
    {
        var rrfScores = new Dictionary<string, RrfScore>();

        for (int i = 0; i < results.Count; i++)
        {
            var result = results[i];
            var key = $"{result.Slug}_{result.Language}";

            if (!rrfScores.ContainsKey(key))
                rrfScores[key] = new RrfScore { Result = result };

            // RRF formula: 1 / (k + rank)
            rrfScores[key].Score += 1.0 / (RrfConstant + i + 1);
        }

        return rrfScores.Values
            .OrderByDescending(x => x.Score)
            .Select(x => x.Result)
            .ToList();
    }
}

Note: This shows semantic search only. In production, execute PostgreSQL full-text search in parallel and include those results in the RRF calculation.

Integration

If you've already implemented PostgreSQL full-text search (as covered here), adding semantic search is straightforward:

// Program.cs
services.AddSemanticSearch(configuration);
services.AddSingleton<IHybridSearchService, HybridSearchService>();

[HttpGet("search/hybrid")]
public async Task<IActionResult> HybridSearch(string query, string language = "en")
{
    var results = await _hybridSearchService.SearchAsync(query, language);
    return PartialView("_SearchResults", results);
}

Automatic Indexing with File System Watcher

The most powerful feature: automatic indexing. Save a blog post, it's immediately searchable - no manual intervention.

How It Works

flowchart TB
    A[Save Markdown File] --> B[FileSystemWatcher Detects Change]
    B --> C{File in Main Directory?}
    C -->|Yes| D[Save to Database]
    C -->|No| E[Save to Database Only]
    D --> F[Create BlogPostDocument]
    F --> G[Generate Embedding via ONNX]
    G --> H[Store in Qdrant]
    H --> I[Post Searchable Immediately]

    style A stroke:#10b981,stroke-width:2px
    style B stroke:#f59e0b,stroke-width:2px
    style C stroke:#ec4899,stroke-width:3px
    style D stroke:#3b82f6,stroke-width:2px
    style E stroke:#6b7280,stroke-width:2px
    style F stroke:#8b5cf6,stroke-width:2px
    style G stroke:#6366f1,stroke-width:3px
    style H stroke:#ef4444,stroke-width:2px
    style I stroke:#10b981,stroke-width:2px

Key Design Decision: Only index files in the main Markdown directory, not subdirectories (translated/, drafts/, comments/). This keeps the search index clean.

File Watcher Integration

The blog already has a MarkdownDirectoryWatcherService. We extend it to trigger semantic indexing:

// In MarkdownDirectoryWatcherService.cs
private async Task OnChangedAsync(WaitForChangedResult e)
{
    if (e.Name == null) return;

    await retryPolicy.ExecuteAsync(async () =>
    {
        var savedModel = await blogService.SavePost(slug, language, markdown);

        // Index ONLY if file is in main directory (no path separators in name)
        if (!e.Name.Contains(Path.DirectorySeparatorChar) &&
            !e.Name.Contains(Path.AltDirectorySeparatorChar))
        {
            await IndexPostForSemanticSearchAsync(scope, savedModel, language);
        }
    });
}

private async Task IndexPostForSemanticSearchAsync(
    IServiceScope scope,
    BlogPostDto post,
    string language)
{
    var semanticSearchService = scope.ServiceProvider.GetService<ISemanticSearchService>();
    if (semanticSearchService == null) return; // Not configured

    var document = new BlogPostDocument
    {
        Id = $"{post.Slug}_{language}",
        Slug = post.Slug,
        Title = post.Title,
        Content = post.PlainTextContent,
        Language = language,
        Categories = post.Categories?.ToList() ?? new List<string>(),
        PublishedDate = post.PublishedDate
    };

    await semanticSearchService.IndexPostAsync(document);
    _logger.LogInformation("Indexed {Slug} ({Language}) in semantic search", post.Slug, language);
}

Handling Deletions

When a post is deleted, remove it from the semantic index:

private async Task OnDeletedAsync(WaitForChangedResult e)
{
    await blogService.Delete(slug, language);

    // Delete from semantic search ONLY if file was in main directory
    if (!e.Name.Contains(Path.DirectorySeparatorChar) &&
        !e.Name.Contains(Path.AltDirectorySeparatorChar))
    {
        var semanticSearchService = scope.ServiceProvider.GetService<ISemanticSearchService>();
        await semanticSearchService?.DeletePostAsync(slug, language);
    }
}

Background Service for Initial Indexing

On startup, a background service indexes existing posts not yet in Qdrant:

flowchart TB
    A[Application Starts] --> B[Wait 10 seconds]
    B --> C[Initialize Semantic Search]
    C --> D{Model Exists?}
    D -->|No| E[Download from Hugging Face]
    D -->|Yes| F[Load ONNX Model]
    E --> F
    F --> G[Scan Main Markdown Directory]
    G --> H{For Each .md File}
    H --> I[Compute Content Hash]
    I --> J{Hash Changed?}
    J -->|Yes| K[Generate Embedding]
    J -->|No| L[Skip - Already Indexed]
    K --> M[Store in Qdrant]
    M --> H
    L --> H
    H -->|Done| N[Indexing Complete]

    style A stroke:#10b981,stroke-width:2px
    style C stroke:#6366f1,stroke-width:2px
    style E stroke:#f59e0b,stroke-width:2px
    style F stroke:#6366f1,stroke-width:3px
    style G stroke:#8b5cf6,stroke-width:2px
    style J stroke:#ec4899,stroke-width:3px
    style K stroke:#6366f1,stroke-width:2px
    style M stroke:#ef4444,stroke-width:2px
    style N stroke:#10b981,stroke-width:2px

public class SemanticIndexingBackgroundService : BackgroundService
{
    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        // Wait for app to be ready
        await Task.Delay(TimeSpan.FromSeconds(10), stoppingToken);

        // Initialize (downloads model if needed)
        await _semanticSearchService.InitializeAsync(stoppingToken);

        // Get all posts from main directory only
        var markdownFiles = Directory.GetFiles(
            _markdownConfig.MarkdownPath,
            "*.md",
            SearchOption.TopDirectoryOnly);  // NOT subdirectories

        foreach (var file in markdownFiles)
        {
            var needsIndexing = await _semanticSearchService.NeedsReindexingAsync(
                slug, language, contentHash, stoppingToken);

            if (needsIndexing)
                await _semanticSearchService.IndexPostAsync(document, stoppingToken);
        }
    }
}

This ensures:

Lazy model loading - Downloads on first use, not blocking startup
Incremental indexing - Only new/changed posts re-indexed (via content hash)
Main directory only - Drafts and translated files don't pollute the index

What We've Built

Across Parts 4a, 4b, and 5, we now have:

✅ CPU-friendly semantic search - No GPU required
✅ Related posts discovery - Semantically similar content
✅ Natural language search - Find by meaning, not just keywords
✅ Hybrid search - Best of semantic + full-text
✅ Automatic indexing - Zero-touch content updates
✅ Self-hosted - Your data stays on your server

Future enhancements:

Category-Aware Search - Boost results from specific categories
Multilingual Embeddings - Language-specific embedding models
OpenSearch Integration - Add OpenSearch to the hybrid mix (see my OpenSearch article)

Conclusion

This completes the practical implementation of RAG-style semantic search. Combined with Part 4a (foundation) and Part 4b (search UI), you have everything needed to add intelligent search to your .NET application - running entirely on CPU, at zero additional cost.

Continue Learning

RAG Part 1: Origins and Fundamentals - The theory behind embeddings
RAG Part 2: Architecture and Internals - Deep dive into RAG systems
RAG Part 3: Practical Applications - Complete RAG with LLM integration
Part 4a: ONNX & Qdrant Implementation - Foundation: embeddings and vector storage
Part 4b: Semantic Search in Action - Typeahead, hybrid search, and UI
Full-Text Search with PostgreSQL - The full-text side of hybrid search

Resources

Qdrant & Vector Databases

Self-Hosted Vector Databases with Qdrant - Deep dive into Qdrant concepts, HNSW indexing, filtering, and C# client
Qdrant Hybrid Search - Qdrant's native hybrid support

Hybrid Search

Reciprocal Rank Fusion Paper - The RRF algorithm

File System Watching

FileSystemWatcher Class - .NET documentation
BackgroundService Class - Hosted services in ASP.NET Core

Complete Code

All code is available at: github.com/scottgal/mostlylucidweb

Mostlylucid.SemanticSearch/ - Core semantic search library
Mostlylucid/Blog/WatcherService/ - File watcher with semantic indexing

RAG for Implementers: Hybrid Search and Automatic Indexing