1 of 11

You Can’t Just Plug In a Vector Index

Making Vector Indexes Work in Relational Databases

Lessons from SQL Server + DiskANN

Krithika Subramanian, Microsoft

2 of 11

Next Database Imperative: Vector Search in OLTP Systems

ASPECT	CHALLENGE/NEED	IMPLICATION
Industry Need	Vector search is core to AI workloads such as semantic search, RAG, and recommendations.	Enterprises want native vector search in OLTP systems to support AI features.
Existing Indexes	Current vector indexes are not suitable for real-time transactional updates.
OLTP Challenges	ANN indexes are graph-based and expensive to update.	Low-latency, high-concurrency DML is required, which breaks graph indexes.
Engine Optimization	Relational engines favor B‑trees and columnar scans.	Random access of adjacency lists in graph indexes is inefficient for OLTP.

3 of 11

Why “Just Plugging In ANN” Doesn’t Work in OLTP Systems

Filters Break the Index (and Your Results)

Filtering either kills recall or forces brute-force scans → poor accuracy or latency
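A minimal sketch of the filtering dilemma, using made-up data. An exact scan stands in for the ANN index, and the table, the `category` column, and the roughly 1% selectivity are all assumptions chosen for illustration. Post-filtering a top-k result loses most of the k rows, while pre-filtering has to scan every row that matches the predicate.

```python
import random

def exact_top_k(rows, query, k, predicate=None):
    """Brute-force top-k by squared L2 distance, optionally filtered first."""
    candidates = [r for r in rows if predicate is None or predicate(r)]
    candidates.sort(key=lambda r: sum((a - b) ** 2 for a, b in zip(r["vec"], query)))
    return candidates[:k]

random.seed(0)
# Hypothetical table: 10,000 rows, 1% of which match the filter.
rows = [
    {
        "id": i,
        "vec": [random.random() for _ in range(8)],
        "category": "rare" if i % 100 == 0 else "common",
    }
    for i in range(10_000)
]
query = [0.5] * 8
k = 10

def is_rare(row):
    return row["category"] == "rare"

# Post-filtering: fetch the top-k first (an exact scan stands in for the
# ANN index here), then apply the predicate to those k rows.
post_filtered = [r for r in exact_top_k(rows, query, k) if is_rare(r)]

# Pre-filtering: apply the predicate first, then scan every surviving row.
pre_filtered = exact_top_k(rows, query, k, predicate=is_rare)

print(len(post_filtered), "of", k, "requested results survive post-filtering")
print(len(pre_filtered), "results from the filtered brute-force scan")
```

The post-filtered list is short because the filter is applied after the top-k cut. The pre-filtered list is complete, but only because every matching row was scanned.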

The Query Optimizer Is Blind to ANN Behavior

ANN cost depends on recall/latency knobs, not just data size
Traditional optimizers pick wrong plans because they can’t model ANN tradeoffs
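To make the optimizer problem concrete, here is a toy cost comparison. It is an illustrative model only, not the cost model of SQL Server or any other engine: the formulas, the `avg_degree` default, and the function names are assumptions. It shows that the cheaper plan depends on filter selectivity and search parameters rather than on table size alone.

```python
def scan_cost(n_rows, selectivity):
    """Distance computations for an exact scan over rows that pass the filter."""
    return n_rows * selectivity

def ann_cost(k, selectivity, avg_degree):
    """Rough distance computations for graph search followed by post-filtering.

    Assumes the candidate list must grow to about k / selectivity so that
    k results survive the filter, and that each visited node expands
    avg_degree neighbors.
    """
    list_size = k / selectivity
    return list_size * avg_degree

def choose_plan(n_rows, k, selectivity, avg_degree=64):
    """Pick the cheaper plan under the toy model above."""
    if ann_cost(k, selectivity, avg_degree) < scan_cost(n_rows, selectivity):
        return "ann"
    return "scan"

# Same table, same k: the better plan flips with selectivity.
print(choose_plan(10_000_000, k=10, selectivity=0.5))
print(choose_plan(10_000_000, k=10, selectivity=0.0001))
```

An optimizer that estimates cost from row counts alone has no input that captures the candidate list size, so it cannot see this crossover.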

Transactional DML Conflicts with ANN Maintenance

Inserts, updates, and deletes on a vector index cause write amplification: each change can touch many rows
Real systems often restrict writes when vector indexes are present → not viable for OLTP

Approximate Search vs SQL Correctness

ANN deliberately trades accuracy for speed
SQL expects deterministic, correct semantics across filters, joins, and ordering

Doesn’t Compose with Relational Query Pipelines

ANN is designed as a top‑K retrieval operator
Hard to integrate with joins, aggregations, and ordering → breaks end-to-end query execution

4 of 11

Our Solution: Design Principles That Make It Work

Why naïve ANN fails in DB systems	SQL Design Principle (Our solution)
ANN systems assume in-memory / specialized infrastructure, not general-purpose DB engines	Reuse relational infrastructure wherever possible → Represent index as internal tables + SQL operators, leveraging existing storage, logging, and recovery
ANN indexes (graphs) do not integrate cleanly with SQL query model (joins, filters, composability break)	Vector search must be a first-class system feature
ANN graph structures are not transactional and cannot support OLTP-style updates	Decouple transactional DML from expensive graph mutation
Staleness vs. performance tradeoff for search	Separate base data, staging, and index structures explicitly → Combine graph traversal + staged row scan at query time
Approximate search introduces unclear correctness guarantees	Make approximation controllable and observable → Explicit parameters + exact re-ranking
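The "graph traversal + staged row scan" and "exact re-ranking" principles can be sketched together as follows. This is a simplified illustration under stated assumptions, not the SQL Server implementation: the function name, its arguments, and the handling of deleted rows are all hypothetical.

```python
import heapq

def l2_sq(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def search(query, k, graph_candidates, staging_ids, base_table, deleted_ids):
    """Merge approximate graph candidates with staged rows, then re-rank exactly.

    graph_candidates: ids returned by the (possibly stale) graph traversal
    staging_ids:      ids inserted or updated since the last graph refresh
    base_table:       id -> full-precision vector (the source of truth)
    deleted_ids:      ids deleted since the last graph refresh (assumed design)
    """
    candidate_ids = (set(graph_candidates) | set(staging_ids)) - set(deleted_ids)
    # Exact re-ranking against the full-precision vectors in the base table.
    scored = [(l2_sq(base_table[i], query), i) for i in candidate_ids if i in base_table]
    return [i for _, i in heapq.nsmallest(k, scored)]

# Row 4 was inserted after the graph was last refreshed; row 1 was deleted.
base_table = {1: [0.0, 0.0], 2: [1.0, 1.0], 3: [5.0, 5.0], 4: [0.1, 0.1]}
result = search(
    query=[0.0, 0.0],
    k=2,
    graph_candidates=[1, 2, 3],
    staging_ids=[4],
    deleted_ids=[1],
    base_table=base_table,
)
print(result)  # [4, 2]
```

The graph can lag behind the base data, yet the query still sees the new row and never returns the deleted one, because staged and deleted ids are reconciled at query time.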

5 of 11

Why DiskANN? – Introducing Other Alternatives as Well

Core Question

Which ANN algorithm works inside a transactional, disk-backed database?

Evaluation Lens

Must work with disk + limited memory

Must scale to 10M–1B vectors

Must support incremental build / async ingestion

Must balance recall, latency, and storage

In-memory graphs are not disk-friendly and have a high memory footprint

Why DiskANN Fits?

Hybrid design → memory (quantized vectors) + disk (graph)
High recall with bounded latency
Scales to large datasets
Supports incremental graph construction
Works naturally with quantization (PQ / RaBitQ)
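A back-of-the-envelope calculation shows why the hybrid layout matters at the top of the 10M–1B range. The dimensionality (768) and the PQ code size (64 bytes per vector) are assumptions chosen for illustration, not figures from the system described here.

```python
def gib(n_bytes):
    """Convert bytes to GiB."""
    return n_bytes / 2**30

n_vectors = 1_000_000_000   # upper end of the target scale
dims = 768                  # assumed embedding dimensionality
bytes_per_float = 4         # float32

# Full-precision vectors: these stay on disk in the hybrid design.
full_bytes = n_vectors * dims * bytes_per_float

# Assumed PQ code: 64 subspaces, 1 byte each. These are kept in memory.
pq_bytes_per_vector = 64
pq_bytes = n_vectors * pq_bytes_per_vector

print(f"full precision: {gib(full_bytes):,.1f} GiB")
print(f"PQ codes:       {gib(pq_bytes):,.1f} GiB")
print(f"compression:    {full_bytes // pq_bytes}x")
```

Under these assumptions the in-memory codes are 48 times smaller than the raw vectors, which is what lets the graph and full-precision data live on disk while distance estimates run from memory.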

6 of 11

System Overview — Rebuilding ANN for SQL

DiskANN as the ANN backbone

Vector index represented as internal relational tables
Asynchronous background maintenance of the graph
Unified query execution for relational and vector operators
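One way to picture "the index as internal relational tables" is a vector table and an edge table, both keyed by node id, with the graph search expressed as repeated key lookups. The sketch below models the tables as dictionaries and uses a simplified best-first search with a bounded candidate list. The table layout, the names, and the `list_size` parameter are assumptions for illustration, not the actual SQL Server schema or the full DiskANN algorithm.

```python
import heapq

def l2_sq(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Hypothetical internal tables keyed by node id. Each dict lookup stands in
# for a clustered-index seek on the corresponding table.
vectors = {0: [0.0, 0.0], 1: [1.0, 0.0], 2: [2.0, 0.0], 3: [3.0, 0.0], 4: [4.0, 0.0]}
edges = {0: [1, 2], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

def greedy_search(query, entry_id, k, list_size):
    """Best-first graph search that keeps at most list_size candidates."""
    visited = set()
    candidates = [(l2_sq(vectors[entry_id], query), entry_id)]
    while True:
        unvisited = [c for c in candidates if c[1] not in visited]
        if not unvisited:
            break
        _, node = min(unvisited)              # closest unexpanded candidate
        visited.add(node)
        for neighbor in edges[node]:          # one seek on the edge table
            if all(neighbor != cid for _, cid in candidates):
                dist = l2_sq(vectors[neighbor], query)  # one seek on the vector table
                candidates.append((dist, neighbor))
        candidates = heapq.nsmallest(list_size, candidates)
    return [node_id for _, node_id in candidates[:k]]

print(greedy_search([3.9, 0.0], entry_id=0, k=2, list_size=3))  # [4, 3]
```

Because every step is a keyed lookup on an ordinary table, storage, logging, and recovery come from the engine rather than from a separate index format. A larger `list_size` explores more of the graph, trading latency for recall.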

7 of 11

Solving “Mutable Data Breaks ANN”

8 of 11

Solving “Stale or Wrong Results”

9 of 11

Solving “ANN Doesn’t Compose with SQL”

Filtered Search, Hybrid Search, and Streaming Search

10 of 11

Performance Benchmark – Comparative results, Benefits

11 of 11

Key Takeaways

Treating ANN as “just another index” is fundamentally incorrect. - If a data structure touches query semantics and write paths, it must be designed as a system abstraction, not a storage primitive.
Decoupling Is the Only Way to Scale Mutable ANN
Make New Capabilities First-Class in the Execution Engine - If you want composability in a DBMS, new functionality must be expressed in the query algebra, not hidden behind UDFs.
Explicit Consistency Enables Correctness and Performance
Reuse the Engine — Don’t Fight It - The most scalable designs don’t introduce new primitives — they reinterpret existing ones (tables, operators, metadata) to support new workloads.

“Making ANN work in a database is not about better search algorithms — it’s about reconciling approximation with transactions, composability, and system invariants.”