Open-source tools and packages I've built. Most are available on GitHub with pre-built releases. All run locally without cloud dependencies.
Transform documents or URLs into summaries with citation grounding. Every claim is traceable to its source. Supports PDF, DOCX, PPTX, XLSX, Markdown, HTML, and images. Runs entirely on your machine using ONNX embeddings.
Deterministic data profiling for CSV, Excel, Parquet, JSON, SQLite, and log files. Profiles 52K+ rows in about 1 second. Includes privacy-safe PII detection, drift detection, and constraint validation. Optional LLM narration answers plain-English questions such as "what drives churn?"
Index multiple documents, search with semantic similarity, and chat with your knowledge base. Uses SQLite + DuckDB for zero-config storage, ONNX for embeddings. Optional GraphRAG entity extraction.
Production-ready .NET client for privacy-focused Umami Web Analytics. Non-blocking background sender, automatic retries, bot detection, token refresh. Supports custom events, page views, and the full Data API.
Fetch and cache remote markdown content in Markdig pipelines. Multiple storage backends: PostgreSQL, SQLite, SQL Server, or in-memory.
ASP.NET Core Tag Helpers for pagination with built-in HTMX support. Drop-in paging for your views. 6,600+ downloads.
A modular framework for building composable processing pipelines. Includes 40+ packages for batching, caching, rate limiting, retry, telemetry, and data persistence, organized as Atoms (primitives), Patterns (compositions), and complete solutions.
Simple, efficient bot detection for ASP.NET Core. Identifies crawlers, scrapers, and automated traffic from user-agent strings. Useful for analytics filtering.
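As a rough illustration of user-agent-based detection, here is a minimal Python sketch. It is not the package's implementation (which targets ASP.NET Core); the pattern list and the `is_probable_bot` name are assumptions made up for this example, and a real detector would rely on a maintained signature database.

```python
import re

# Hypothetical signature list for illustration only; real detectors
# use much larger, regularly updated databases.
BOT_PATTERN = re.compile(
    r"bot|crawler|spider|scraper|curl|wget|python-requests|headless",
    re.IGNORECASE,
)


def is_probable_bot(user_agent):
    """Return True when the user-agent string looks automated."""
    # A missing or blank user agent is treated as suspicious here.
    if not user_agent or not user_agent.strip():
        return True
    return BOT_PATTERN.search(user_agent) is not None
```

A check like this is cheap enough to run on every request, which is why it suits analytics filtering; it will miss bots that spoof a browser user agent.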
Automatically generate alt text for images using local LLMs. Improves accessibility by describing images for screen readers.
Lightweight middleware for generating realistic mock API responses using local LLMs. Perfect for testing LLM integrations without hitting real endpoints. Add intelligent mock endpoints with 2 lines of code.
Windows WPF application that runs tiny LLMs locally with RAG-based conversation memory. Supports Ollama integration and direct GGUF model loading. Perfect for privacy-focused, offline AI assistance.
Enhances Mermaid diagrams with export (PNG/SVG), panning, zoom, expanding lightbox, and theme switching. Drop-in enhancement for any Mermaid-enabled site.
Minimal .NET SignalR server for managing real-time chat conversations between website visitors and administrators. In-memory storage (easily replaceable with a database), typing indicators, and read receipts.
Lightweight, embeddable chat widget for real-time communication. Add with a single script tag. Built with Alpine.js, mobile-friendly, browser notifications, conversation history.
Windows system tray application for administrators to manage chat conversations. Desktop notifications, quick replies from the tray.
Web-based multi-document RAG with GraphRAG entity extraction, knowledge graph visualization, web crawling, and conversation memory. Scales from Raspberry Pi to enterprise GPU clusters.
The full mostlylucid.net blog platform. ASP.NET Core with PostgreSQL, full-text search, multilingual support, automated translation, and comprehensive observability.
EasyNMT-based translation service for automatic blog post translation. Supports 12+ languages with local inference.
YARP-based reverse proxy gateway for routing and load balancing.
Bot detection demonstration service. An offshoot of mostlylucid.botdetection showing real-time bot identification and traffic analysis capabilities.
The core RAG pipeline used by DocSummarizer, DataSummarizer, and LucidRAG. ONNX embeddings, DuckDB vector store, segment extraction, and summarization.
GraphRAG implementation for extracting entities and relationships from documents. Builds knowledge graphs that enhance RAG with structured understanding.
Semantic search using ONNX embeddings and Qdrant vector database. Designed to run on CPU without GPU requirements.
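The ranking step behind embedding-based search can be sketched in a few lines of Python. This assumes the vectors have already been produced by an embedding model and simply ranks them by cosine similarity in memory; the `cosine_similarity` and `top_k` names are hypothetical, and a vector database such as Qdrant would replace the linear scan with an index.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed, k=3):
    """Return the ids of the k stored vectors most similar to the query.

    `indexed` maps a document id to its precomputed embedding.
    """
    scored = [
        (cosine_similarity(query_vec, vec), doc_id)
        for doc_id, vec in indexed.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]
```

For example, with `indexed = {"a": [1, 0], "b": [0, 1], "c": [1, 1]}`, the query `[1, 0.1]` ranks `"a"` first and `"c"` second.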
Build a RAG knowledge base from your markdown documents. Index, search, and query your content with semantic understanding.
Lexicon-based sentiment analysis for blog posts. Multi-dimensional analysis including sentiment, emotion (8 categories), formality, subjectivity, and readability. CPU-friendly.
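To show what "lexicon-based" means in practice, here is a minimal Python sketch of the sentiment dimension only. The word list, weights, and negation rule are invented for illustration and are far smaller than anything a real analyzer would use; the package's actual lexicon and scoring are not shown here.

```python
import re

# Tiny illustrative lexicon (hypothetical weights).
LEXICON = {
    "good": 1.0, "great": 2.0, "love": 2.0,
    "bad": -1.0, "slow": -1.0, "terrible": -2.0,
}
NEGATORS = {"not", "never", "no"}


def sentiment_score(text):
    """Average lexicon weight of matched words; 0.0 when none match."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = 0.0
    hits = 0
    for i, token in enumerate(tokens):
        if token in LEXICON:
            weight = LEXICON[token]
            # Flip polarity when the previous word is a negator.
            if i > 0 and tokens[i - 1] in NEGATORS:
                weight = -weight
            total += weight
            hits += 1
    return total / hits if hits else 0.0
```

Because scoring is a dictionary lookup per token, the approach needs no model inference, which is what keeps it CPU-friendly.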
A lightweight workflow execution engine for building step-by-step processing pipelines. Define workflows as steps with dependencies and execute them in order.
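The idea of steps with dependencies executed in order can be sketched with Python's standard-library `graphlib`. This is an illustrative sketch, not the engine's API: the `run_workflow` function and the shape of its arguments are assumptions for this example.

```python
from graphlib import TopologicalSorter


def run_workflow(steps, dependencies):
    """Run each step after the steps it depends on.

    steps: maps a step name to a callable that receives the dict of
           results produced so far.
    dependencies: maps a step name to the set of step names it needs.
    Raises graphlib.CycleError if the dependencies form a cycle.
    """
    graph = {name: set(dependencies.get(name, ())) for name in steps}
    results = {}
    # static_order() yields every step after all of its prerequisites.
    for name in TopologicalSorter(graph).static_order():
        results[name] = steps[name](results)
    return results


steps = {
    "load": lambda results: [3, 1, 2],
    "sort": lambda results: sorted(results["load"]),
    "report": lambda results: f"max={results['sort'][-1]}",
}
deps = {"sort": {"load"}, "report": {"sort"}}
results = run_workflow(steps, deps)
```

Here `load` runs first, then `sort`, then `report`, regardless of the order in which the steps were declared.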
Security-hardened web content fetching for LLM processing. Sanitizes and extracts text content from web pages safely.
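One piece of this, extracting visible text while discarding scripts, styles, and comments, can be sketched with Python's standard-library `html.parser`. The `TextExtractor` class and `extract_text` function are hypothetical names for this example; real hardening would also involve URL validation, size limits, and timeouts, none of which are shown.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping non-content elements."""

    SKIPPED = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are outside every skipped element.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def extract_text(html):
    """Return the visible text of an HTML string as one line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

HTML comments are dropped as well, because the parser's default comment handler does nothing. That matters for LLM input, since hidden comments are an easy place to plant injected instructions.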
© 2026 Scott Galloway — Unlicense — All content and source code on this site is free to use, copy, modify, and sell.