DSPy - Declarative Programming in the era of AI
Jayita Bhattacharyya
@jayitabhattac11
$Whoami
👋🏻 Jayita Bhattacharyya, or JB, is what I go by!
🤺 AKA a vibe-debugger & a glorified if-else coder inside Jupyter notebooks
🫠 Pretending to be a Data Scientist these days.
🪄A Tech Speaker and Hackathon Wizard.
📝 I like to pen down my tech thoughts in Medium blogs
Contribute to open source: jayita13
Started sharing my deep tech opinions on @jayitabhattac11
Volunteer @ Bangpypers (Bangalore Python User Group)
My video tutorials & conferences.
Table of Contents
Ain’t this normal?
But…
The problem or challenge
LLMs are just stochastic parrots and…
We’re developers, not parrots…
So let's change the paradigms with DSPy
DSPy - The Vibe Prompter via Programming
DSPy is an open-source declarative framework for building and optimizing modular AI software, emphasizing programming, not prompting, language models (LMs).
No more prompt engineering guesswork—DSPy handles the 'how' so you focus on the 'what'.
Python-native with type hints, reliable thanks to built-in Pydantic validation, and supports multiple LLMs through the LiteLLM backend. And of course the optimizers are a charm: they fine-tune your prompts based on your dataset and metrics.
Ain’t that cool enough to already get excited about DSPy?
Let's see it in action now!
Signatures - from prompt to programs
How does the actual prompt look?
More Efficient Signatures
CoT
ReAct
Optimizers - the magician in DSPy
Teleprompters are general-purpose optimization strategies that determine how DSPy modules should learn from data. They are designed to automate the task of prompting, ensuring it happens "at a distance, without manual intervention".
Kinds of Optimizers:
Optimizers - some ground rules
BootstrapFewShot
The primary function of dspy.BootstrapFewShot is to automatically synthesize good few-shot examples (demonstrations) for the modules within a DSPy program, optimizing them based on a defined metric, without requiring manual prompt engineering.
Trace with MLFlow
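A configuration sketch, assuming a recent MLflow with the DSPy flavor installed; the experiment name is a placeholder.

```python
import mlflow

mlflow.set_experiment("dspy-demo")
mlflow.dspy.autolog()  # records DSPy module and LM calls as traces
```

With autologging on, every module call during compilation and inference shows up as a trace in the MLflow UI.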
Initial vs Final Prompt
MIPROv2
MIPROv2 (Multiprompt Instruction Proposal Optimizer, v2) is one of the key optimization techniques in DSPy, aiming to find the best instructions and demonstrations to maximize a given metric.
Uses Bayesian Optimization to effectively search over the space of generation instructions/demonstrations across your modules.
Trace with MLFlow
Evaluation Results
Compare Eval runs with MLFlow
Improvement on generation with training
GEPA - the game changer
GEPA (Genetic-Pareto) is a framework for optimizing arbitrary systems composed of text components—like AI prompts, code snippets, or textual specs—against any evaluation metric. It employs LLMs to reflect on system behavior, using feedback from execution and evaluation traces to drive targeted improvements.
Through iterative mutation, reflection, and Pareto-aware candidate selection, GEPA evolves robust, high-performing variants with minimal evaluations, co-evolving multiple components in modular systems for domain-specific gains.
GEPA Results
The optimized prompt that GEPA generates for AIME improves GPT-4.1 Mini's performance from 46.6% to 56.6%, a 10-point improvement on AIME 2025. Note the detail captured in the prompt after just 2 iterations of GEPA. GEPA can be thought of as precomputing some reasoning (during optimization) to come up with a good plan for future task instances.
References