1 of 11

You Can’t Just Plug In a Vector Index

Making Vector Indexes Work in Relational Databases

Lessons from SQL Server + DiskANN

Krithika Subramanian, Microsoft

2 of 11

Next Database Imperative: Vector Search in OLTP Systems

Aspect, with its Challenge/Need and Implication

Industry Need
  • Challenge/Need: Vector search is core to AI workloads such as semantic search, RAG, and recommendations.
  • Implication: Enterprises want native vector search in OLTP systems to support AI features.

Existing Indexes
  • Current vector indexes are not suitable for real-time transactional updates.

OLTP Challenges
  • Challenge/Need: ANN indexes are graph-based and expensive to update.
  • Implication: Low-latency, high-concurrency DML is required, which breaks graph indexes.

Engine Optimization
  • Challenge/Need: Relational engines favor B‑trees and columnar scans.
  • Implication: Random access to adjacency lists in graph indexes is inefficient for OLTP.

3 of 11

Why “Just Plugging ANN” into OLTP Systems Doesn’t Work

  • Filters Break the Index (and Your Results)
    • Filtering either kills recall or forces brute-force scans → poor accuracy or latency
  • The Query Optimizer Is Blind to ANN Behavior
    • ANN cost depends on recall/latency knobs, not just data size
    • Traditional optimizers pick wrong plans because they can’t model ANN tradeoffs
  • Transactional DML Conflicts with ANN Maintenance
    • Inserts, updates, and deletes against a vector index cause write amplification across many rows
    • Real systems often restrict writes when vector indexes are present → not viable for OLTP
  • Approximate Search vs SQL Correctness
    • ANN deliberately trades accuracy for speed
    • SQL expects deterministic, correct semantics across filters, joins, and ordering
  • Doesn’t Compose with Relational Query Pipelines
    • ANN is designed as a top‑K retrieval operator
    • Hard to integrate with joins, aggregations, and ordering → breaks end-to-end query execution
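
The filter problem in the first bullet can be reproduced in a few lines. This is a toy sketch, not the behavior of any particular engine: the rows, the `tier` column, and the function names are hypothetical, and an exact sort stands in for the ANN index. Post-filtering asks the index for K candidates and then applies the predicate, so a selective predicate can leave zero rows; pre-filtering returns the right rows but only by scanning every match.

```python
def l2_sq(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(rows, query, k):
    """Exact top-k by distance; stands in for an ANN index returning k candidates."""
    return sorted(rows, key=lambda r: l2_sq(r["vec"], query))[:k]

def post_filter(rows, query, k, predicate):
    """Ask the index for k candidates first, then apply the SQL-style predicate."""
    return [r for r in nearest(rows, query, k) if predicate(r)]

def pre_filter(rows, query, k, predicate):
    """Apply the predicate first, then brute-force scan every surviving row."""
    return nearest([r for r in rows if predicate(r)], query, k)

# Hypothetical table: 1,000 rows, 1% of which satisfy the predicate.
rows = [
    {"id": i, "tier": "gold" if i % 100 == 99 else "basic", "vec": [float(i), 0.0]}
    for i in range(1000)
]
query = [0.0, 0.0]
is_gold = lambda r: r["tier"] == "gold"

post = post_filter(rows, query, 10, is_gold)
pre = pre_filter(rows, query, 10, is_gold)
print(len(post), len(pre))  # 0 10
```

Here the ten nearest rows are all `basic`, so post-filtering returns nothing even though ten `gold` rows exist.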

4 of 11

Our Solution: Design Principles That Make It Work

Why naïve ANN fails in DB systems, and the SQL design principle (our solution) for each

  • Problem: ANN systems assume in-memory / specialized infrastructure, not general-purpose DB engines
    • Principle: Reuse relational infrastructure wherever possible → Represent the index as internal tables + SQL operators, leveraging existing storage, logging, and recovery
  • Problem: ANN indexes (graphs) do not integrate cleanly with the SQL query model (joins, filters, composability break)
    • Principle: Vector search must be a first-class system feature
  • Problem: ANN graph structures are not transactional and cannot support OLTP-style updates
    • Principle: Decouple transactional DML from expensive graph mutation
  • Problem: Staleness vs. performance tradeoff for search
    • Principle: Separate base data, staging, and index structures explicitly → Combine graph traversal + staged row scan at query time
  • Problem: Approximate search introduces unclear correctness guarantees
    • Principle: Make approximation controllable and observable → Explicit parameters + exact re-ranking
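
The last two principles (combining the index with a staged row scan, then re-ranking exactly) can be sketched as follows. This is an illustration under stated assumptions, not the SQL Server implementation: the `base`, `index`, `staged_inserts`, and `staged_deletes` structures and the `n_candidates` parameter are hypothetical, and ranking on coarsely rounded vectors stands in for graph traversal.

```python
def l2_sq(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def approx_candidates(index, query, n):
    """Stand-in for graph traversal: rank by distance on coarsely rounded vectors."""
    coarse = lambda v: [round(x) for x in v]
    return sorted(index, key=lambda i: l2_sq(coarse(index[i]), query))[:n]

def search(base, index, staged_inserts, staged_deletes, query, k, n_candidates):
    # 1. Approximate candidates from the (possibly stale) index snapshot.
    candidates = set(approx_candidates(index, query, n_candidates))
    # 2. Rows written since the last index refresh are scanned exhaustively.
    candidates |= staged_inserts
    # 3. Rows deleted since the last refresh must never be returned.
    candidates -= staged_deletes
    # 4. Exact re-ranking against full-precision vectors in the base table.
    return sorted(candidates, key=lambda i: l2_sq(base[i], query))[:k]

# Index snapshot built before the latest DML.
index = {1: [0.0, 0.0], 2: [1.0, 1.0], 3: [5.0, 5.0], 4: [0.9, 1.2]}
# Base table after DML: row 2 deleted, row 5 inserted.
base = {1: [0.0, 0.0], 3: [5.0, 5.0], 4: [0.9, 1.2], 5: [1.0, 0.9]}

result = search(base, index, staged_inserts={5}, staged_deletes={2},
                query=[1.0, 1.0], k=2, n_candidates=3)
print(result)  # [5, 4]
```

The deleted row 2 is the index's best candidate but is dropped, and the freshly inserted row 5 ranks first even though the index has never seen it.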

5 of 11

Why DiskANN? Weighing It Against the Alternatives

Core Question

Which ANN algorithm works inside a transactional, disk-backed database?

Evaluation Lens

Must work with disk + limited memory

Must scale to 10M–1B vectors

Must support incremental build / async ingestion

Must balance recall, latency, and storage

In-memory graphs are not disk-friendly and have a high memory footprint

Why DiskANN Fits

  • Hybrid design → memory (quantized vectors) + disk (graph)
  • High recall with bounded latency
  • Scales to large datasets
  • Supports incremental graph construction
  • Works naturally with quantization (PQ / RaBitQ)
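
The memory/disk split in the first bullet can be illustrated with a small sketch. It makes several simplifying assumptions: plain 8-bit scalar quantization is swapped in for PQ or RaBitQ, the graph is omitted (every code is ranked), and `DiskStore` is a hypothetical in-process stand-in that only counts reads of full-precision vectors.

```python
def l2_sq(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit_range(vectors):
    """Learn a global offset and step size for 8-bit scalar quantization."""
    values = [x for v in vectors for x in v]
    lo, hi = min(values), max(values)
    return lo, ((hi - lo) / 255 or 1.0)

def encode(v, lo, scale):
    """Compress a vector to one byte per dimension."""
    return bytes(min(255, max(0, round((x - lo) / scale))) for x in v)

def decode(code, lo, scale):
    """Approximate reconstruction from the one-byte-per-dimension code."""
    return [lo + c * scale for c in code]

class DiskStore:
    """Simulated on-disk store of full-precision vectors; counts reads."""
    def __init__(self, vectors):
        self._vectors = dict(vectors)
        self.reads = 0
    def read(self, i):
        self.reads += 1
        return self._vectors[i]

def search(codes, lo, scale, disk, query, k, n_rerank):
    # Rank everything using only the in-memory compressed codes.
    approx = sorted(codes, key=lambda i: l2_sq(decode(codes[i], lo, scale), query))
    # Touch "disk" only for the shortlist, then re-rank exactly.
    shortlist = approx[:n_rerank]
    return sorted(shortlist, key=lambda i: l2_sq(disk.read(i), query))[:k]

# Deterministic toy data: 200 four-dimensional vectors.
vectors = {
    i: [(i * 37 % 101) / 100, (i * 53 % 103) / 100,
        (i * 71 % 107) / 100, (i * 17 % 109) / 100]
    for i in range(200)
}
lo, scale = fit_range(vectors.values())
codes = {i: encode(v, lo, scale) for i, v in vectors.items()}
disk = DiskStore(vectors)

top = search(codes, lo, scale, disk, query=[0.5, 0.5, 0.5, 0.5], k=5, n_rerank=20)
print(top, "disk reads:", disk.reads)
```

Only the `n_rerank` shortlisted vectors are read at full precision; everything else is decided from one byte per dimension held in memory.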

6 of 11

System Overview — Rebuilding ANN for SQL

  • DiskANN as the ANN backbone
    • Vector index represented as internal relational tables
    • Asynchronous background maintenance of the graph
    • Unified query execution for relational and vector operators
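
To make "vector index represented as internal relational tables" concrete, here is a minimal sketch using SQLite from the Python standard library. Everything in it is an assumption for illustration: the `vec_nodes` and `vec_edges` schema is hypothetical and is not SQL Server's internal layout, a brute-force k-nearest-neighbour graph stands in for DiskANN's graph construction, and the traversal is a simple best-first beam search.

```python
import heapq
import json
import sqlite3

def l2_sq(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build(conn, vectors, degree):
    """Store vectors and a brute-force kNN graph as two ordinary tables."""
    conn.execute("CREATE TABLE vec_nodes (id INTEGER PRIMARY KEY, vec TEXT NOT NULL)")
    conn.execute("CREATE TABLE vec_edges (src INTEGER NOT NULL, dst INTEGER NOT NULL, "
                 "PRIMARY KEY (src, dst))")
    conn.executemany("INSERT INTO vec_nodes VALUES (?, ?)",
                     [(i, json.dumps(v)) for i, v in vectors.items()])
    for i, v in vectors.items():
        near = sorted((j for j in vectors if j != i),
                      key=lambda j: l2_sq(v, vectors[j]))[:degree]
        conn.executemany("INSERT INTO vec_edges VALUES (?, ?)", [(i, j) for j in near])
    conn.commit()

def fetch_vec(conn, node_id):
    (text,) = conn.execute("SELECT vec FROM vec_nodes WHERE id = ?", (node_id,)).fetchone()
    return json.loads(text)

def neighbours(conn, node_id):
    return [dst for (dst,) in
            conn.execute("SELECT dst FROM vec_edges WHERE src = ?", (node_id,))]

def greedy_search(conn, query, entry, k, beam):
    """Best-first traversal where every graph access is a SQL lookup."""
    d0 = l2_sq(fetch_vec(conn, entry), query)
    visited = {entry}
    frontier = [(d0, entry)]   # min-heap of nodes not yet expanded
    best = [(d0, entry)]       # the `beam` closest nodes seen so far
    while frontier:
        d, node = heapq.heappop(frontier)
        if len(best) >= beam and d > best[-1][0]:
            break              # nothing left that could improve the beam
        for nb in neighbours(conn, node):
            if nb not in visited:
                visited.add(nb)
                dn = l2_sq(fetch_vec(conn, nb), query)
                heapq.heappush(frontier, (dn, nb))
                best.append((dn, nb))
        best = sorted(best)[:beam]
    return [node for _, node in best[:k]]

# Toy data: a 10 x 10 grid of 2-D points, id = 10 * y + x.
vectors = {i: [float(i % 10), float(i // 10)] for i in range(100)}
conn = sqlite3.connect(":memory:")
build(conn, vectors, degree=4)

hits = greedy_search(conn, query=[7.2, 6.9], entry=0, k=3, beam=8)
print(hits)  # [77, 78, 67]
```

Because the graph lives in ordinary tables, the storage, logging, and recovery of the host engine apply to it without any new machinery, which is the point of the design principle.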

7 of 11

Solving “Mutable Data Breaks ANN”

8 of 11

Solving “Stale or Wrong Results”

9 of 11

Solving “ANN Doesn’t Compose with SQL”

  • Filtered search, hybrid search, and streaming search
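
One way to read "streaming search" is as an iterator that yields rows in increasing distance order, so downstream operators (here, a filter) pull only as many rows as they need. The sketch below is an assumption about the general idea, not the system's actual operator: `stream_nearest` and `filtered_top_k` are hypothetical names, and a heap over exact distances stands in for incremental graph traversal.

```python
import heapq

def l2_sq(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def stream_nearest(rows, query):
    """Yield rows in increasing distance order, one at a time.

    A heap over exact distances stands in for an index that produces
    this order incrementally.
    """
    heap = [(l2_sq(r["vec"], query), r["id"], r) for r in rows]
    heapq.heapify(heap)
    while heap:
        yield heapq.heappop(heap)[2]

def filtered_top_k(rows, query, k, predicate):
    """Pull from the stream until k rows pass the predicate."""
    examined, out = 0, []
    for row in stream_nearest(rows, query):
        examined += 1
        if predicate(row):
            out.append(row)
            if len(out) == k:
                break
    return out, examined

# Hypothetical table: 1% of rows satisfy the predicate.
rows = [
    {"id": i, "tier": "gold" if i % 100 == 99 else "basic", "vec": [float(i)]}
    for i in range(1000)
]
matches, examined = filtered_top_k(rows, [0.0], 3, lambda r: r["tier"] == "gold")
print([r["id"] for r in matches], examined)  # [99, 199, 299] 300
```

The query returns exactly K matching rows and stops after examining 300 of the 1,000, rather than truncating at a fixed candidate count or scanning the whole table.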

10 of 11

Performance Benchmark – Comparative results, Benefits

11 of 11

Key Takeaways

    • Treating ANN as “Just Another Index” Is Fundamentally Incorrect
      • If a data structure touches query semantics and write paths, it must be designed as a system abstraction, not a storage primitive.
    • Decoupling Is the Only Way to Scale Mutable ANN
    • Make New Capabilities First-Class in the Execution Engine
      • If you want composability in a DBMS, new functionality must be expressed in the query algebra, not hidden behind UDFs.
    • Explicit Consistency Enables Correctness and Performance
    • Reuse the Engine, Don’t Fight It
      • The most scalable designs don’t introduce new primitives; they reinterpret existing ones (tables, operators, metadata) to support new workloads.

“Making ANN work in a database is not about better search algorithms — it’s about reconciling approximation with transactions, composability, and system invariants.”