Sunday, 02 November 2025
NOTE: These are the live release notes for this NuGet package, pulled straight from GitHub, so they will update frequently.
Focus: Enhanced reliability, comprehensive testing, and improved developer experience
This release focuses on improving chunking reliability, providing comprehensive validation tooling, and streamlining configuration management. All features from v2.0 remain fully compatible.
Highlights:
PromptBuilder.cs: Array responses now reliably start with an array - the prompt explicitly instructs the model "Your FIRST character MUST be: ["
ChunkingCoordinator.cs: Array formatting added to the chunk context, so chunked arrays come back as [{...},{...}] instead of {...},{...}
appsettings.json: Removed verbose model comments, cleaner structure (model guidance moved to docs/OLLAMA_MODELS.md)
docs/OLLAMA_MODELS.md: New comprehensive reference guide (285 lines)
docs/BACKEND_API_REFERENCE.md: New management API reference (600+ lines)
LLMApi.http: Expanded validation suite, including edge cases with special characters (:, /, {, })
Code Changes:
mostlylucid.mockllmapi/Services/PromptBuilder.cs: Enhanced array formatting instructions (lines 82-94)
mostlylucid.mockllmapi/Services/ChunkingCoordinator.cs: Added array formatting to chunk context (line 447)
LLMApi/appsettings.json: Streamlined configuration
New Documentation:
docs/OLLAMA_MODELS.md: Comprehensive model configuration guide
docs/BACKEND_API_REFERENCE.md: Complete management API reference
Updated Files:
LLMApi/LLMApi.http: Expanded from 448 to 847 lines (70+ new validation tests)
Chunking at High Temperature:
If chunked array output becomes unreliable at high temperatures, disable auto-chunking per request with ?autoChunk=false. See docs/OLLAMA_MODELS.md for detailed troubleshooting.
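For example, a request that would normally be split into chunks can be forced through in a single pass (the endpoint and shape here are illustrative):
GET /api/mock/users?autoChunk=false&shape={"users":[{"id":"string","name":"string"}]}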
NO BREAKING CHANGES - Despite the major version bump, all existing code continues to work!
This is a major milestone release that transforms LLMock API into a comprehensive, production-ready mocking platform. Version 2.0 adds realistic SSE streaming modes, multi-backend load balancing, comprehensive backend selection, and extensive documentation.
Three distinct SSE streaming modes for testing different real-world API patterns:
LlmTokens Mode (Default - Backward Compatible)
{"chunk":"text","accumulated":"fulltext","done":false}CompleteObjects Mode (NEW)
{"data":{object},"index":0,"total":10,"done":false}ArrayItems Mode (NEW)
{"item":{object},"index":0,"total":100,"arrayName":"users","hasMore":true,"done":false}Configuration:
{
"MockLlmApi": {
"SseMode": "CompleteObjects" // LlmTokens | CompleteObjects | ArrayItems
}
}
Per-Request Override:
GET /api/mock/stream/users?sseMode=CompleteObjects
GET /api/mock/stream/data?sseMode=ArrayItems
GET /api/mock/stream/chat?sseMode=LlmTokens
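For a quick check from the command line, curl with buffering disabled shows the raw SSE frames (host and port here are just assumed local demo defaults):
curl -N "http://localhost:5000/api/mock/stream/users?sseMode=CompleteObjects"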
Client Example (CompleteObjects):
const eventSource = new EventSource('/api/mock/stream/users?sseMode=CompleteObjects');
eventSource.onmessage = (event) => {
const response = JSON.parse(event.data);
if (response.done) {
console.log('Complete!');
eventSource.close();
} else {
console.log(`User ${response.index + 1}/${response.total}:`, response.data);
// response.data contains the complete user object
}
};
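For comparison, a minimal client sketch for ArrayItems mode, assuming the per-item payload shown above (item, index, total, arrayName, hasMore, done) and the export-customers endpoint used later in the examples:
const arraySource = new EventSource('/api/mock/stream/export-customers?sseMode=ArrayItems');
arraySource.onmessage = (event) => {
  const response = JSON.parse(event.data);
  if (response.done) {
    console.log('Export complete');
    arraySource.close();
  } else {
    // Each message carries one array element plus its position and the array name
    console.log(`${response.arrayName} ${response.index + 1}/${response.total}:`, response.item);
  }
};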
Distribute requests across multiple LLM backends for high throughput:
Configuration:
{
"MockLlmApi": {
"Backends": [
{
"Name": "ollama-llama3",
"Provider": "ollama",
"Weight": 3,
"Enabled": true
},
{
"Name": "ollama-mistral",
"Provider": "ollama",
"Weight": 2,
"Enabled": true
},
{
"Name": "lmstudio-default",
"Provider": "lmstudio",
"Weight": 1,
"Enabled": true
}
]
}
}
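With the weights above, traffic is split roughly 3:2:1 - about half of requests go to ollama-llama3, a third to ollama-mistral and the rest to lmstudio-default (assuming selection is proportional to the configured Weight values).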
SignalR Hub with Load Balancing:
{
"HubContexts": [
{
"Name": "high-throughput-data",
"BackendNames": ["ollama-llama3", "ollama-mistral", "lmstudio-default"]
}
]
}
Features:
Per-Request Selection (Multiple Methods):
# Via query parameter
GET /api/mock/users?backend=openai-gpt4
# Via header
GET /api/mock/users
X-LLM-Backend: openai-gpt4
# SignalR hub context
{
"HubContexts": [
{
"Name": "analytics",
"BackendName": "openai-gpt4-turbo"
}
]
}
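As a client-side sketch of the header method (same endpoint and backend name as above; the fetch usage is illustrative):
// Pin one request to a specific backend via the X-LLM-Backend header
fetch('/api/mock/users', {
  headers: { 'X-LLM-Backend': 'openai-gpt4' }
})
  .then(res => res.json())
  .then(users => console.log(users));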
Multiple Providers Simultaneously:
Configuration for Mistral-Nemo with massive 128k context window:
{
"Backends": [
{
"Name": "ollama-mistral-nemo",
"Provider": "ollama",
"ModelName": "mistral-nemo",
"MaxTokens": 128000,
"Enabled": true
}
]
}
Use Cases:
Massive dataset generation (MaxItems=10000+)
SignalR Example:
{
"HubContexts": [
{
"Name": "massive-dataset-128k",
"Description": "Massive dataset generation with 128k context",
"BackendName": "ollama-mistral-nemo"
}
]
}
Interactive API documentation with Swagger UI:
/swagger
Enable in Program.cs:
var builder = WebApplication.CreateBuilder(args);
builder.Services.AddEndpointsApiExplorer();
builder.Services.AddSwaggerGen();

var app = builder.Build();
app.UseSwagger();   // serve the OpenAPI document
app.UseSwaggerUI(); // serve the interactive UI at /swagger
docs/SSE_STREAMING_MODES.md (2,500+ lines) - Complete SSE guide
LLMApi/SSE_Streaming.http - 30+ HTTP examples for SSE modes
docs/CONFIGURATION_REFERENCE.md - Added SSE modes section
docs/MULTIPLE_LLM_BACKENDS.md - Enhanced with load balancing
appsettings.Full.json - Added SSE mode examples and Mistral-Nemo
New Test Coverage:
Total Test Suite:
New Configuration Options:
{
"MockLlmApi": {
// SSE Streaming Modes (NEW)
"SseMode": "LlmTokens", // LlmTokens | CompleteObjects | ArrayItems
// Multiple LLM Backends with Load Balancing (v1.8.0)
"Backends": [
{
"Name": "backend-name",
"Provider": "ollama", // ollama | openai | lmstudio
"BaseUrl": "http://localhost:11434/v1/",
"ModelName": "llama3",
"ApiKey": null,
"MaxTokens": 8192,
"Enabled": true,
"Weight": 3, // For load balancing
"Priority": 10
}
],
// Legacy Single Backend (Still Supported)
"BaseUrl": "http://localhost:11434/v1/",
"ModelName": "llama3",
// Auto-Chunking (v1.8.0)
"EnableAutoChunking": true,
"MaxInputTokens": 4096,
"MaxOutputTokens": 2048,
"MaxItems": 1000,
// Streaming Configuration
"StreamingChunkDelayMinMs": 0,
"StreamingChunkDelayMaxMs": 0,
// Cache Configuration
"CacheSlidingExpirationMinutes": 15,
"CacheAbsoluteExpirationMinutes": 60,
"MaxCachePerKey": 5
}
}
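Note on the cache settings: the names suggest standard sliding/absolute expiration semantics, i.e. a cached response stays warm while it keeps being requested (15 minutes sliding here) but is evicted after at most 60 minutes, with up to MaxCachePerKey cached variations per key (this reading is inferred from the option names).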
Comprehensive environment variable support with full documentation:
# SSE Mode
export MockLlmApi__SseMode="CompleteObjects"
# Backend Selection
export MockLlmApi__Backends__0__Name="ollama-llama3"
export MockLlmApi__Backends__0__Provider="ollama"
export MockLlmApi__Backends__0__BaseUrl="http://localhost:11434/v1/"
export MockLlmApi__Backends__0__ModelName="llama3"
export MockLlmApi__Backends__0__Enabled="true"
export MockLlmApi__Backends__0__Weight="3"
# SignalR Hub Contexts with Backend Selection
export MockLlmApi__HubContexts__0__Name="analytics"
export MockLlmApi__HubContexts__0__BackendName="openai-gpt4-turbo"
# Or load balancing
export MockLlmApi__HubContexts__1__BackendNames__0="ollama-llama3"
export MockLlmApi__HubContexts__1__BackendNames__1="ollama-mistral"
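The double underscore is the standard ASP.NET Core separator for nested configuration keys, so each export above maps directly onto the JSON structure shown earlier - for example, MockLlmApi__Backends__0__Weight sets Weight on the first entry of the Backends array.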
Core Implementation:
mostlylucid.mockllmapi/Models/SseMode.cs - SSE mode enum
mostlylucid.mockllmapi/Services/Providers/ILlmProvider.cs - Provider interface
mostlylucid.mockllmapi/Services/Providers/OllamaProvider.cs - Ollama provider
mostlylucid.mockllmapi/Services/Providers/OpenAIProvider.cs - OpenAI provider
mostlylucid.mockllmapi/Services/Providers/LMStudioProvider.cs - LM Studio provider
mostlylucid.mockllmapi/Services/Providers/LlmProviderFactory.cs - Provider factory
mostlylucid.mockllmapi/Services/LlmBackendSelector.cs - Backend selection logic
Testing:
LLMApi.Tests/SseModeTests.cs - 22 SSE mode tests
Documentation:
docs/SSE_STREAMING_MODES.md - Complete SSE guide (2,500+ lines)
LLMApi/SSE_Streaming.http - 30+ SSE examples
Core Services:
mostlylucid.mockllmapi/LLMockApiOptions.cs - Added SseMode property, LlmBackends array
mostlylucid.mockllmapi/RequestHandlers/StreamingRequestHandler.cs - Added SSE mode routing
mostlylucid.mockllmapi/Services/LlmClient.cs - Added backend selection overloads
mostlylucid.mockllmapi/Services/MockDataBackgroundService.cs - Added SignalR backend selection
mostlylucid.mockllmapi/Models/HubContextConfig.cs - Added BackendName and BackendNames
Documentation:
README.md - Updated to v2.0, comprehensive feature list
docs/CONFIGURATION_REFERENCE.md - Added SSE modes and backend selection
docs/MULTIPLE_LLM_BACKENDS.md - Enhanced with load balancing examples
appsettings.Full.json - Added comprehensive examples
Demo Application:
LLMApi/Program.cs - Added Swagger configuration
LLMApi/Pages/_Layout.cshtml - Added Swagger UI link
Stock Market Feed (CompleteObjects):
GET /api/mock/stream/stocks?sseMode=CompleteObjects&shape={"ticker":"AAPL","price":150.25,"change":2.5}
Bulk Customer Export (ArrayItems):
GET /api/mock/stream/export-customers?sseMode=ArrayItems&shape={"customers":[{"id":"string","name":"string"}]}
AI Chat Interface (LlmTokens):
GET /api/mock/stream/chat?sseMode=LlmTokens&shape={"message":"Hello!"}
High-Throughput IoT Sensors (Load Balanced):
{
"HubContexts": [
{
"Name": "iot-sensors",
"BackendNames": ["ollama-llama3", "ollama-mistral", "lmstudio-default"]
}
]
}
Massive Dataset with 128k Context:
GET /api/mock/stream/bulk-data?sseMode=ArrayItems&backend=ollama-mistral-nemo
No Code Changes Required!
Version 2.0 is 100% backward compatible with v1.x:
// v1.x code - still works exactly the same
builder.Services.AddLLMockApi(builder.Configuration);
app.MapLLMockApi("/api/mock", includeStreaming: true);
// SSE streaming defaults to LlmTokens mode (original behavior)
// Legacy single backend config (BaseUrl/ModelName) still works
Opt-In to New Features:
// Same setup, just update appsettings.json
{
"MockLlmApi": {
"SseMode": "CompleteObjects", // Switch to realistic streaming
"Backends": [...] // Add multiple backends
}
}
NONE!
Despite the major version bump to 2.0, there are zero breaking changes:
SSE streaming still defaults to LlmTokens (original behavior)
Legacy BaseUrl/ModelName config still supported
This release represents a fundamental transformation:
v1.x: Mock API with LLM-powered generation
v2.0: Production-Ready Mock Platform
Version 2.0 positions LLMock API as a comprehensive mocking platform capable of handling production-scale testing requirements across diverse use cases.
Thank you to all users and contributors who have helped shape LLMock API into a comprehensive mocking platform. Your feedback and use cases have driven these improvements!
See previous release notes for v1.8.0 features (Multiple LLM Backend Support, Automatic Request Chunking, Enhanced Cache Configuration).
See full release history below for v1.7.x, v1.6.x, v1.5.x, and earlier versions.