Sunday, 23 November 2025
📖 Related to the RAG Series: This article provides a deep dive into Qdrant, the vector database used in:
Qdrant (pronounced "quadrant") is an open-source vector database built in Rust. This article covers core concepts, the C# client, performance tuning, and production patterns.
A vector database stores high-dimensional vectors (embeddings) and enables fast similarity search. Unlike traditional databases that find exact matches, Qdrant finds semantically similar items.
flowchart LR
A[Text: 'Docker deployment'] --> B[Embedding Model]
B --> C["Vector: [0.12, -0.34, 0.56, ...]"]
C --> D[Qdrant]
E[Query: 'container setup'] --> F[Embedding Model]
F --> G["Vector: [0.11, -0.32, 0.58, ...]"]
G --> H[Similarity Search]
D --> H
H --> I[Similar Results]
style B stroke:#6366f1,stroke-width:3px
style D stroke:#ef4444,stroke-width:3px
style F stroke:#6366f1,stroke-width:3px
style H stroke:#10b981,stroke-width:2px
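To make "similar" concrete, here is a minimal sketch of cosine similarity, the metric the examples in this article use. It reuses the truncated three-component vectors from the diagram; the `unrelated` vector and the `CosineSimilarity` helper are illustrative assumptions, not part of Qdrant's API, and Qdrant performs this comparison server-side.

```csharp
using System;

// Cosine similarity: dot(a, b) / (|a| * |b|). Values near 1.0 mean the vectors
// point in nearly the same direction. A zero-length vector yields NaN.
static double CosineSimilarity(float[] a, float[] b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same dimensionality.");

    double dot = 0, normA = 0, normB = 0;
    for (var i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Truncated vectors from the diagram (real embeddings have hundreds of dimensions).
float[] docVector = { 0.12f, -0.34f, 0.56f };   // 'Docker deployment'
float[] queryVector = { 0.11f, -0.32f, 0.58f }; // 'container setup'
float[] unrelated = { -0.50f, 0.40f, -0.10f };  // hypothetical, for contrast

Console.WriteLine(CosineSimilarity(docVector, queryVector)); // close to 1
Console.WriteLine(CosineSimilarity(docVector, unrelated));   // negative
```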
Key Qdrant features:
A collection is like a table - it holds vectors with a fixed dimensionality and distance metric.
flowchart TB
subgraph Collection["Collection: blog_posts"]
A[Vector Size: 384]
B[Distance: Cosine]
C[HNSW Index]
end
subgraph Points
D[Point 1: slug=docker-intro]
E[Point 2: slug=kubernetes-basics]
F[Point N...]
end
Collection --> Points
style A stroke:#6366f1,stroke-width:2px
style B stroke:#6366f1,stroke-width:2px
style C stroke:#f59e0b,stroke-width:2px
style D stroke:#10b981,stroke-width:2px
style E stroke:#10b981,stroke-width:2px
// Create collection - see https://qdrant.tech/documentation/concepts/collections/#create-a-collection
await client.CreateCollectionAsync(
collectionName: "blog_posts",
vectorsConfig: new VectorParams
{
Size = 384, // Must match your embedding model
Distance = Distance.Cosine // Best for text embeddings
}
);
Distance metrics (docs):
A point is a single record containing:
flowchart LR
subgraph Point
A[ID: uuid/int]
B["Vector: float[384]"]
C[Payload: JSON metadata]
end
style A stroke:#8b5cf6,stroke-width:2px
style B stroke:#f59e0b,stroke-width:2px
style C stroke:#10b981,stroke-width:2px
// Upsert points - see https://qdrant.tech/documentation/concepts/points/#upload-points
var point = new PointStruct
{
Id = new PointId { Uuid = Guid.NewGuid().ToString() },
Vectors = embedding, // float[384]
Payload =
{
["slug"] = "my-post",
["title"] = "Vector Databases",
["language"] = "en",
["categories"] = new[] { "AI", "Databases" },
["published"] = DateTimeOffset.UtcNow.ToUnixTimeSeconds()
}
};
await client.UpsertAsync("blog_posts", points: new[] { point });
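Because an upsert overwrites any existing point with the same ID, generating a fresh `Guid.NewGuid()` on every run adds a new point each time a post is re-indexed. One possible way around that, sketched here under the assumption that the slug uniquely identifies a post, is to derive the ID from the slug. `StableIdFromSlug` is a hypothetical helper, not something the client library provides.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: hash the slug into 16 bytes and use them as a GUID.
// MD5 serves only as a stable 128-bit hash here, not as a security measure.
static Guid StableIdFromSlug(string slug) =>
    new Guid(MD5.HashData(Encoding.UTF8.GetBytes(slug)));

var first = StableIdFromSlug("my-post");
var second = StableIdFromSlug("my-post");
var other = StableIdFromSlug("another-post");

Console.WriteLine(first == second); // same slug, same ID
Console.WriteLine(first == other);  // different slugs give different IDs
```

The resulting value can be passed as `new PointId { Uuid = first.ToString() }` in place of the random GUID.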
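Conceptually, filtering runs before similarity search: Qdrant evaluates the conditions during the search instead of trimming the results afterwards, so a filtered query still returns up to limit matching points.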
flowchart TB
A[Search Query] --> B{Apply Filters First}
B --> C[Language = 'en']
B --> D[Year >= 2024]
C --> E[Filtered Subset]
D --> E
E --> F[Vector Similarity Search]
F --> G[Ranked Results]
style B stroke:#ec4899,stroke-width:3px
style E stroke:#f59e0b,stroke-width:2px
style F stroke:#6366f1,stroke-width:2px
style G stroke:#10b981,stroke-width:2px
// Filter conditions - see https://qdrant.tech/documentation/concepts/filtering/#filtering-conditions
var filter = new Filter
{
Must = // AND conditions
{
new Condition { Field = new FieldCondition
{
Key = "language",
Match = new Match { Keyword = "en" }
}},
new Condition { Field = new FieldCondition
{
Key = "published",
Range = new Qdrant.Client.Grpc.Range { Gte = 1704067200 } // 2024-01-01 UTC; fully qualified because System also defines Range
}}
},
MustNot = // Exclude conditions
{
new Condition { Field = new FieldCondition
{
Key = "slug",
Match = new Match { Keyword = "draft-post" }
}}
}
};
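The `published` range above compares Unix timestamps in seconds. As a small sketch of where the `1704067200` literal comes from, and how to turn a stored value back into a date, the standard `DateTimeOffset` APIs are enough:

```csharp
using System;

// Unix seconds for midnight UTC on 1 January 2024, the Gte bound used in the filter.
var cutoff = new DateTimeOffset(2024, 1, 1, 0, 0, 0, TimeSpan.Zero).ToUnixTimeSeconds();
Console.WriteLine(cutoff); // 1704067200

// Round trip: convert a stored "published" value back into a date.
var publishedAt = DateTimeOffset.FromUnixTimeSeconds(cutoff);
Console.WriteLine(publishedAt.Year); // 2024
```

Computing the bound this way avoids hand-typed magic numbers when the cut-off date changes.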
Filter types (docs):
- Match.Keyword - Exact string match
- Match.Text - Full-text match
- Match.Keywords (the gRPC counterpart of the REST API's any) - Match any of several values
- Range - Numeric ranges (Gte, Lte, Gt, Lt)
- GeoBoundingBox / GeoRadius - Geo filtering

Install the official Qdrant.Client package (GitHub):
dotnet add package Qdrant.Client
using Qdrant.Client;
using Qdrant.Client.Grpc;
// gRPC client (recommended) - see https://qdrant.tech/documentation/interfaces/#grpc-interface
var client = new QdrantClient(
host: "localhost",
port: 6334, // gRPC port (6333 is REST)
https: false
);
// With API key - see https://qdrant.tech/documentation/guides/security/
var secureClient = new QdrantClient(
host: "your-qdrant.cloud",
port: 6334,
https: true,
apiKey: "your-api-key"
);
Always use gRPC (port 6334) for production - 3-5x faster than REST.
If you connect without TLS on .NET Core 3.x, enable unencrypted HTTP/2 before creating the client (.NET 5 and later do not need this switch):
AppContext.SetSwitch("System.Net.Http.SocketsHttpHandler.Http2UnencryptedSupport", true);
// Vector search - see https://qdrant.tech/documentation/concepts/search/
var results = await client.SearchAsync(
collectionName: "blog_posts",
vector: queryEmbedding,
limit: 10,
filter: filter,
scoreThreshold: 0.5f, // Minimum similarity
searchParams: new SearchParams
{
HnswEf = 128, // Search accuracy (higher = better recall)
Exact = false // Use approximate search
},
payloadSelector: true // Include the payload in each result
);
foreach (var result in results)
{
Console.WriteLine($"{result.Payload["title"].StringValue}: {result.Score}");
}
// Batch operations - see https://qdrant.tech/documentation/concepts/points/#batch-update
var points = documents.Select(doc => new PointStruct
{
Id = new PointId { Uuid = doc.Id },
Vectors = doc.Embedding,
Payload = { ["slug"] = doc.Slug, ["title"] = doc.Title }
}).ToList();
await client.UpsertAsync(
collectionName: "blog_posts",
points: points,
wait: true // Block until the write has been applied
);
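For large imports it can help to send points in several smaller requests rather than one huge one. The sketch below shows only the chunking step, using integers as a stand-in for `PointStruct` values; the batch size of 100 is an arbitrary assumption to tune for your vector and payload sizes.

```csharp
using System;
using System.Linq;

const int BatchSize = 100; // assumption: adjust for your payload and vector size

// Stand-in for the points to upsert; real code would hold PointStruct values.
var pointIds = Enumerable.Range(1, 250).ToList();

// Enumerable.Chunk (.NET 6+) splits the sequence into arrays of at most BatchSize items.
var batches = pointIds.Chunk(BatchSize).ToList();

foreach (var batch in batches)
{
    // Real code would await client.UpsertAsync("blog_posts", batch) here.
    Console.WriteLine($"Would upsert {batch.Length} points");
}
```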
// Delete by filter - see https://qdrant.tech/documentation/concepts/points/#delete-points
await client.DeleteAsync(
collectionName: "blog_posts",
filter: new Filter
{
Must = { new Condition { Field = new FieldCondition
{
Key = "slug",
Match = new Match { Keyword = "old-post" }
}}}
}
);
HNSW (Hierarchical Navigable Small World) is Qdrant's index algorithm.
flowchart TB
subgraph "HNSW Graph Layers"
L2[Layer 2 - Sparse]
L1[Layer 1 - Medium]
L0[Layer 0 - Dense]
end
Q[Query] --> L2
L2 --> L1
L1 --> L0
L0 --> R[Nearest Neighbors]
style L2 stroke:#8b5cf6,stroke-width:2px
style L1 stroke:#6366f1,stroke-width:2px
style L0 stroke:#3b82f6,stroke-width:2px
style Q stroke:#10b981,stroke-width:2px
style R stroke:#ef4444,stroke-width:2px
// HNSW config - see https://qdrant.tech/documentation/concepts/indexing/#hnsw-index
var hnswConfig = new HnswConfigDiff
{
M = 16, // Edges per node (16-32 recommended)
EfConstruct = 100, // Build-time accuracy (100-200)
FullScanThreshold = 10000 // Brute force threshold
};
await client.UpdateCollectionAsync(
collectionName: "blog_posts",
hnswConfig: hnswConfig
);
Search-time accuracy:
var searchParams = new SearchParams
{
HnswEf = 128 // Higher = better recall, slower (64-256)
};
Tuning guidelines:
| Use Case | M | EfConstruct | HnswEf |
|---|---|---|---|
| Fast, low recall | 8 | 64 | 32 |
| Balanced | 16 | 100 | 128 |
| High recall | 32 | 200 | 256 |
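"Recall" in the table is the fraction of the true nearest neighbours that the approximate search actually returns. A rough way to measure it for your own data is to run the same query twice, once with `Exact = true` and once with your HNSW settings, and compare the returned IDs. The helper and the sample IDs below are hypothetical; only the comparison logic is shown.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// recall@k = |approximate ∩ exact| / |exact|
static double RecallAtK(IEnumerable<string> approximate, IEnumerable<string> exact)
{
    var truth = exact.ToHashSet();
    if (truth.Count == 0) return 1.0; // nothing to find
    return (double)approximate.Distinct().Count(truth.Contains) / truth.Count;
}

// Hypothetical result IDs for one query at limit 4.
var exactIds = new[] { "docker-intro", "kubernetes-basics", "compose-tips", "helm-charts" };
var approxIds = new[] { "docker-intro", "kubernetes-basics", "compose-tips", "ci-pipelines" };

Console.WriteLine(RecallAtK(approxIds, exactIds)); // 3 of the 4 true neighbours were found
```

Averaging this over a sample of representative queries gives a more reliable figure than any single query.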
Create payload indexes for frequently filtered fields:
// Keyword index - see https://qdrant.tech/documentation/concepts/indexing/#payload-index
await client.CreatePayloadIndexAsync(
collectionName: "blog_posts",
fieldName: "language",
schemaType: PayloadSchemaType.Keyword
);
// Integer index for ranges
await client.CreatePayloadIndexAsync(
collectionName: "blog_posts",
fieldName: "published",
schemaType: PayloadSchemaType.Integer
);
Impact: 10-100x faster filtering on large collections.
Quantization reduces memory usage:
// Scalar quantization - see https://qdrant.tech/documentation/guides/quantization/#scalar-quantization
await client.UpdateCollectionAsync(
collectionName: "blog_posts",
quantizationConfig: new QuantizationConfigDiff
{
Scalar = new ScalarQuantization
{
Type = QuantizationType.Int8, // float32 -> int8
Quantile = 0.99f,
AlwaysRam = true
}
}
);
Trade-off: 4x less memory, ~2% recall loss, 1.5x faster search.
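The 4x figure follows directly from component size: a float32 takes 4 bytes and an int8 takes 1. A back-of-the-envelope sketch, assuming a hypothetical collection of one million 384-dimension vectors and counting raw vector storage only (not the HNSW graph, payloads, or other overhead):

```csharp
using System;

// Raw vector storage only; ignores the HNSW graph, payloads and other overhead.
static long VectorBytes(long count, int dims, int bytesPerComponent) =>
    count * dims * bytesPerComponent;

const long vectorCount = 1_000_000; // hypothetical collection size
const int dimensions = 384;         // matches the blog_posts collection

var float32Bytes = VectorBytes(vectorCount, dimensions, sizeof(float)); // 4 bytes per component
var int8Bytes = VectorBytes(vectorCount, dimensions, sizeof(sbyte));    // 1 byte per component

Console.WriteLine($"float32: {float32Bytes / 1_000_000} MB");
Console.WriteLine($"int8:    {int8Bytes / 1_000_000} MB");
Console.WriteLine($"ratio:   {float32Bytes / int8Bytes}x");
```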
# docker-compose.yml - see https://qdrant.tech/documentation/guides/installation/
services:
  qdrant:
    image: qdrant/qdrant:v1.12.1 # Pin version!
    ports:
      - "6333:6333" # REST
      - "6334:6334" # gRPC
    volumes:
      - qdrant_data:/qdrant/storage
    environment:
      - QDRANT__SERVICE__GRPC_PORT=6334
      - QDRANT__SERVICE__HTTP_PORT=6333
    healthcheck:
      # Assumes curl is available inside the container; verify this for your image
      test: ["CMD", "curl", "-f", "http://localhost:6333/healthz"]
      interval: 30s
      timeout: 10s
      retries: 3
volumes:
  qdrant_data:
Enable API key authentication:
environment:
- QDRANT__SERVICE__API_KEY=your-secret-key
Qdrant exposes Prometheus metrics at /metrics:
curl http://localhost:6333/metrics
Key metrics:
- qdrant_collections_vector_count - Total vectors
- qdrant_rest_responses_duration_seconds - Query latency
- qdrant_memory_usage_bytes - Memory consumption

Create backups:
# Create snapshot
curl -X POST http://localhost:6333/collections/blog_posts/snapshots
# List snapshots
curl http://localhost:6333/collections/blog_posts/snapshots
# Restore (copy snapshot to storage/collections/blog_posts/snapshots/)
Error: expected dim: 384, got 768
Your embedding model and collection must match:
- all-MiniLM-L6-v2: 384 dimensions
- nomic-embed-text: 768 dimensions
- text-embedding-3-small: 1536 dimensions

HNSW lazy-loads into memory. Warm up after startup:
await client.SearchAsync("blog_posts", new float[384], limit: 1);
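For the dimension mismatch error described earlier in this section, one option is to fail fast on the client before a point ever reaches Qdrant. This is a minimal sketch; `EnsureDimensions` is a hypothetical guard, and the 384 constant assumes the `blog_posts` collection created above.

```csharp
using System;

const int CollectionDimensions = 384; // must equal the Size used when creating the collection

// Hypothetical guard: throw a clear error before sending a point to Qdrant.
static void EnsureDimensions(float[] embedding, int expected)
{
    if (embedding.Length != expected)
        throw new ArgumentException(
            $"Embedding has {embedding.Length} dimensions, collection expects {expected}.");
}

EnsureDimensions(new float[384], CollectionDimensions); // passes silently

try
{
    EnsureDimensions(new float[768], CollectionDimensions); // e.g. output of a 768-dimension model
}
catch (ArgumentException ex)
{
    Console.WriteLine(ex.Message);
}
```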
Use Match.Keywords (the gRPC counterpart of the REST API's any match) for array fields:
new Match { Keywords = new RepeatedStrings { Strings = { "AI", "ML" } } }
All code available at: github.com/scottgal/mostlylucidweb
- Mostlylucid.SemanticSearch/Services/QdrantVectorStoreService.cs - Qdrant integration

© 2025 Scott Galloway - Unlicense - All content and source code on this site is free to use, copy, modify, and sell.