1 of 41

When a database does too much: Reflecting on PostgreSQL’s hidden capabilities

Kowshiq Kattamuri

Advanced App Engineer, Accenture

2 of 41

Author | YouTuber | Tutor

    • Working in the Big Data domain with a focus on data engineering and analytics
    • Exploring intersections of AI, ML, and Quantum Computing for next-gen problem solving
    • Founder of Kowshiq’s Community – a 13K+ member platform sharing job opportunities, hackathons & events for students and freshers
    • A recent B.Tech graduate from NIT Calicut, carrying forward a strong foundation in technology

Kowshiq Kattamuri

Advanced App Engineer, Accenture

3 of 41

4 of 41

Built fully on PostgreSQL

Core of product architecture

Initially monolithic, then sharded across instances

Early scale with PostgreSQL

Millions of users supported

Still the primary database today

5 of 41

The Polyglot Problem

Complexity and Cost

PostgreSQL for core data

Redis or Memcached for caching

RabbitMQ or Kafka for job queues

Elasticsearch for search

6 of 41

01 Queue Systems

02 Key-Value Storage

03 Full-Text Search

04 Real-Time Features

7 of 41

8 of 41

Kafka

Topics, partitions, replication factor, retention policies, consumer groups, maybe schema registry

Redis

Lists/BRPOP or streams/XADD/XREAD, stream groups, persistence config (AOF/RDB), eviction policy

RabbitMQ

Exchanges, queues, bindings, dead-letter queues, delayed-retry plugin

Conventional way

Options to consider when you are processing tens of thousands of events/sec

9 of 41

Postgres as Queue
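The table definition shown on the slide isn't captured in this export; a minimal job-queue table for this pattern might look like the following (the `jobs` table and all column names are illustrative, not from the slides):

```sql
-- Hypothetical job-queue table; names are illustrative
CREATE TABLE jobs (
    id         bigserial   PRIMARY KEY,
    payload    jsonb       NOT NULL,
    status     text        NOT NULL DEFAULT 'pending',  -- pending | processing | done | failed
    attempts   int         NOT NULL DEFAULT 0,
    run_at     timestamptz NOT NULL DEFAULT now(),      -- when the job becomes eligible
    created_at timestamptz NOT NULL DEFAULT now()
);
```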

10 of 41

Postgres as Queue

Creating Indexes
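A partial index is the usual trick here: workers only ever scan runnable jobs, so index just those rows. A sketch, assuming a `jobs` table with `status` and `run_at` columns:

```sql
-- Partial index: covers only the rows workers actually poll,
-- so it stays small no matter how many jobs are done or failed
CREATE INDEX jobs_pending_idx
    ON jobs (run_at)
    WHERE status = 'pending';
```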

11 of 41

Postgres as Queue

Enqueue + wake workers (atomic + instant)
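Because NOTIFY is transactional, the row insert and the wake-up signal commit (or roll back) together, so workers never see a job that wasn't durably written. A sketch, with the channel name `job_queue` as my own placeholder:

```sql
BEGIN;
INSERT INTO jobs (payload)
VALUES ('{"type": "send_email", "user_id": 42}');
NOTIFY job_queue;   -- delivered to listeners only when this COMMITs
COMMIT;
```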

12 of 41

Postgres as Queue

Worker: claim with SKIP LOCKED (no double work), then process
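This is the core of the pattern: `FOR UPDATE SKIP LOCKED` lets many workers pull from the same table without ever grabbing the same row, and without blocking on each other. A sketch against the hypothetical `jobs` table:

```sql
-- Claim exactly one runnable job; concurrent workers skip
-- rows that are already locked instead of waiting on them
UPDATE jobs
SET    status   = 'processing',
       attempts = attempts + 1
WHERE  id = (
    SELECT id
    FROM   jobs
    WHERE  status = 'pending'
      AND  run_at <= now()
    ORDER  BY run_at
    LIMIT  1
    FOR UPDATE SKIP LOCKED
)
RETURNING id, payload;
```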

13 of 41

Postgres as Queue

Finish job or retry with backoff
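Completion and retry are plain UPDATEs, and exponential backoff falls out of the `attempts` counter. A sketch (`$1` stands for the claimed job's id):

```sql
-- Success:
UPDATE jobs SET status = 'done' WHERE id = $1;

-- Failure: re-queue with exponential backoff (1, 2, 4, 8... minutes)
UPDATE jobs
SET    status = 'pending',
       run_at = now() + interval '1 minute' * power(2, attempts)
WHERE  id = $1;
```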

14 of 41

The big process vs the Postgres way

A poor man’s perspective

    • New service to run, secure, back up
    • Complex constructs (topics, DLQs, retries)
    • Separate client libs & idempotency store
    • Extra monitoring, cost, on-call

    • One database to run & back up
    • One transaction: data + job
    • Simple SQL for claim/retry/backoff
    • Built-in LISTEN/NOTIFY for wakeups

15 of 41

16 of 41

Amazon DynamoDB

Plan partition keys and sort keys, manage provisioned vs. on-demand capacity, handle global/local secondary indexes.

Redis

Set up the server, manage persistence (RDB or AOF), configure clustering/replication for fault tolerance

Memcached

Configure the daemon, decide on memory allocation, and ensure consistent hashing

Conventional way

Options to consider when ultra-low latency and high throughput at massive scale are critical.

17 of 41

18 of 41

Postgres as Cache

Creating a lightning-fast cache table
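An UNLOGGED table is the usual choice here: it skips WAL writes, trading crash durability (acceptable for a cache) for speed. A sketch with illustrative names:

```sql
-- UNLOGGED: no WAL, much faster writes; contents are lost
-- after a crash, which is fine for a cache
CREATE UNLOGGED TABLE cache (
    key        text        PRIMARY KEY,
    value      jsonb       NOT NULL,
    expires_at timestamptz NOT NULL
);
```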

19 of 41

Postgres as Cache

Insert and fetch with SQL
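SET becomes an upsert and GET a primary-key lookup; the key and value below are my own examples:

```sql
-- SET: repeated writes to the same key simply overwrite
INSERT INTO cache (key, value, expires_at)
VALUES ('user:42', '{"name": "Ada"}', now() + interval '5 minutes')
ON CONFLICT (key) DO UPDATE
    SET value      = EXCLUDED.value,
        expires_at = EXCLUDED.expires_at;

-- GET
SELECT value FROM cache WHERE key = 'user:42';
```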

20 of 41

Postgres as Cache

Emulating TTL
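Postgres has no native TTL, so expiry is emulated in two halves: reads filter out stale rows, and a periodic job (cron, or the pg_cron extension) physically deletes them:

```sql
-- Reads treat expired rows as cache misses
SELECT value
FROM   cache
WHERE  key = 'user:42'
  AND  expires_at > now();

-- Periodic sweep reclaims the space
DELETE FROM cache WHERE expires_at <= now();
```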

21 of 41

The big process vs the Postgres way

A poor man’s perspective

    • New service to run, secure, back up
    • Very fast, but adds infra complexity
    • Limited querying
    • Separate persistence & durability strategies

    • One database to run & back up
    • Key + value stored in standard tables or JSONB
    • ACID transactions out-of-the-box
    • Rich querying (indexes, filters, joins)

22 of 41

23 of 41

Apache Solr

Dedicated search server with configs, schema tuning, and monitoring overhead.

Algolia

Managed service but adds cost, vendor lock-in, and data sync maintenance.

Elasticsearch

Requires setting up separate clusters, managing nodes, and syncing data from DB.

Conventional way

Options to consider when ultra-fast, web-scale, search-tuned infrastructure is needed.

24 of 41

Postgres for full text search

Searching for “JPMorgan Chase” with a partial query like “jp”
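The naive version is a substring match. It finds the row, but the leading wildcard defeats a plain b-tree index, so the query sequentially scans the table (the `companies` table and `name` column are illustrative):

```sql
SELECT name
FROM   companies
WHERE  name ILIKE '%jp%';   -- matches 'JPMorgan Chase', but seq-scans
```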

25 of 41

Postgres for full text search

For efficiency during lookups
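The standard fix is the pg_trgm extension: a trigram GIN index turns those `ILIKE '%...%'` lookups into index scans:

```sql
CREATE EXTENSION IF NOT EXISTS pg_trgm;

CREATE INDEX companies_name_trgm_idx
    ON companies USING gin (name gin_trgm_ops);
```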

26 of 41

Postgres for full text search

Optimizing Further
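This export doesn't capture exactly which optimization the slide showed; one natural next step with pg_trgm is fuzzy matching ranked by trigram similarity:

```sql
-- % is pg_trgm's "similar enough" operator; similarity() ranks hits
SELECT name,
       similarity(name, 'jp morgan') AS score
FROM   companies
WHERE  name % 'jp morgan'
ORDER  BY score DESC
LIMIT  10;
```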

27 of 41

Postgres for full text search

Full text search
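Proper full-text search switches from substrings to tsvector/tsquery, which brings stemming and language awareness for free:

```sql
SELECT name
FROM   companies
WHERE  to_tsvector('english', name)
       @@ plainto_tsquery('english', 'chase');
```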

28 of 41

Postgres for full text search

Optimized full text search
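The optimized form precomputes the tsvector as a stored generated column (PostgreSQL 12+), indexes it, and ranks matches; the column and index names below are my own:

```sql
ALTER TABLE companies
    ADD COLUMN search_vec tsvector
    GENERATED ALWAYS AS (to_tsvector('english', name)) STORED;

CREATE INDEX companies_search_idx
    ON companies USING gin (search_vec);

SELECT name, ts_rank(search_vec, q) AS rank
FROM   companies,
       plainto_tsquery('english', 'chase') AS q
WHERE  search_vec @@ q
ORDER  BY rank DESC;
```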

29 of 41

The big process vs the Postgres way

A poor man’s perspective

    • Extra infra: clusters/servers to deploy and maintain
    • Data pipelines needed to keep DB and search index in sync
    • Eventual consistency issues between DB and search system
    • Higher cost (compute, storage, licensing/usage fees)

    • All inside the same database
    • Native full-text support (stemming, ranking, multiple languages)
    • Strong consistency with structured + unstructured search together
    • ACID guarantees preserved

30 of 41

Real time features

31 of 41

Pusher

Integrate SDKs, manage channels/events, and pay recurring SaaS fees for scale.

Apache Kafka

Set up clusters, brokers, topics, and a separate pipeline just to stream events.

Firebase Realtime Database

Spin up a separate cloud service, sync app with it, and depend on vendor infra.

Conventional way

Options to consider when live updates need to be handled outside the database.

32 of 41

Postgres for real time updates

Setting up PostgreSQL Triggers
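The usual shape is an AFTER INSERT trigger that publishes each new row as JSON via `pg_notify`; the `messages` table and `new_message` channel below are placeholders:

```sql
CREATE OR REPLACE FUNCTION notify_new_message() RETURNS trigger AS $$
BEGIN
    -- Publish the freshly committed row as JSON on a channel
    PERFORM pg_notify('new_message', row_to_json(NEW)::text);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER messages_notify
    AFTER INSERT ON messages
    FOR EACH ROW
    EXECUTE FUNCTION notify_new_message();
```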

33 of 41

Listening for notifications
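Any connected session, whether psql or an application using its driver's notification API, subscribes with LISTEN:

```sql
LISTEN new_message;
-- From now on, every committed INSERT into messages delivers an
-- asynchronous notification carrying the row's JSON to this session.
```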

34 of 41

The big process vs the Postgres way

A poor man’s perspective

    • Stand up external infra.
    • Sync DB writes with the service
    • Risk of data drift
    • Complex to scale and monitor
    • Higher costs (infra + licensing)

    • Real-time publish/subscribe built into the DB
    • Events tied directly to committed transactions
    • No external pipelines or services to maintain
    • Strong consistency, instant push to clients

35 of 41

Development Complexity

    • Different client libraries and connection patterns
    • Coordinating deployments across multiple services

Operational Overhead

    • Different services to monitor, update, and debug
    • Different scaling patterns and failure modes
    • Multiple configurations to maintain
    • Separate backup and disaster recovery procedures
    • Inconsistent data between systems
    • Different performance tuning approaches

36 of 41

Let’s do some Math

    • Redis: $20/month
    • Message queue: $25/month
    • Search service: $50/month
    • Monitoring for 3 services: $30/month
    • Grand total: $125/month

37 of 41

The Single Database That Scales

Single Postgres cluster handling millions of bookings

Entire DevOps platform on Postgres

38 of 41

The magic is in Postgres's architecture

Proven options for horizontal scaling

Read replicas for query scaling

Partitioning for large tables

Connection pooling for concurrency

Logical replication for distributed setups

39 of 41

Here are some of the best articles and posts on PostgreSQL that I found in my search.

40 of 41

Postgres might be too good for its own good.

41 of 41

Thank you all for being such a fantastic audience!!!

My Linktree QR code