Navigating RAG for Social Science
Atita Arora
Solution Architect, Qdrant
About me
Started 2008
- Computer Applications
- Strategic Business Management
- Vector / Semantic Search
- Language analysis
- Information retrieval
Open source
Loves to travel, eat, cook
Mom of 2 boys
The hottest three-letter word in Gen AI right now... RAG!!!
Adoption of Generative AI
Structured vs Unstructured Data use-cases
Datasphere forecast
The global datasphere will grow to 163 zettabytes by 2025, and about 80% of that will be unstructured
Challenges
The Evolution of Information Retrieval

- Pre 2000: Pattern search, exact search (mostly driven by databases)
- 2000: Text Analysis, using synonyms, stemming, lemmatization
- 2010: Natural Language Processing (intent), using language processing techniques
- 2011: Multi-word synonyms (initial implementation, later developed into its advanced form)
- 2013: Personalization
- 2015: Named Entity Recognition
- 2017: Learning to Rank (LTR)
- 2018: Semantic Search (word embeddings)
- 2022: Closed-source LLMs, multilingual IR
- 2023-Present: Open-source LLMs, multimodal / multilingual IR
The Rise of AI Models

Models by average English MTEB score (y) vs speed (x) vs embedding size (circle size).
https://informationisbeautiful.net/visualizations/the-rise-of-generative-ai-large-language-models-llms-like-chatgpt/
Discovery of a common language!
The magic of embeddings!!
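To make the "common language" concrete, here is a minimal sketch, assuming the sentence-transformers package and the all-MiniLM-L6-v2 model (an illustrative choice, not necessarily the one used in the talk). Semantically related sentences land close together in vector space even with no shared keywords.

```python
# A minimal sketch of embeddings as a "common language".
# Assumes: sentence-transformers installed; all-MiniLM-L6-v2 is an
# illustrative model choice.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "How do I reset my password?",
    "I forgot my login credentials.",
    "Best pizza places in Berlin.",
]
embeddings = model.encode(sentences)

# Related sentences score high even without shared keywords.
print(util.cos_sim(embeddings[0], embeddings[1]))  # high similarity
print(util.cos_sim(embeddings[0], embeddings[2]))  # low similarity
```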
Key discussion points:
Why do you (or anyone) need RAG?

Question → Answer 👎
Why do you (or anyone) need RAG?

Question + Context → Answer 👍
What is RAG?
R - Retrieval of relevant data / information for the user query
A - Augmentation of the LLM prompt context with the retrieved data / information
G - Generation of the answer using the prompt augmented with the relevant context
And how does it compare to fine-tuning?
https://arxiv.org/pdf/2312.10997v1.pdf
Benefits of RAG?
Saves time
Multiple applications
Contextual
Up-to-date
Enhances Engagement
Multilingual*
Saves Cost
Works with custom data
How do you build RAG?
Embedding Storage
Embedding Generation and Ingestion
Document Processing
Query Search and Document Retrieval
Response Synthesis and Answer Generation
How do you build RAG?
Model Evaluation - https://huggingface.co/spaces/mteb/leaderboard
Chunk Visualization - https://github.com/gkamradt/ChunkViz
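To illustrate the chunking step named above, here is a minimal sliding-window splitter in plain Python; the chunk size and overlap are made-up values, and a tool like ChunkViz (linked above) helps tune them per corpus.

```python
# A minimal sliding-window chunker; chunk_size and overlap values are
# illustrative and should be tuned per corpus (see ChunkViz above).
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk.strip():
            chunks.append(chunk)
    return chunks

chunks = chunk_text(open("my_document.txt").read())
print(f"{len(chunks)} chunks")
```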
Flavours of RAG - Naive RAG
• Document Processing:
- Extract text from documents
- Split documents into appropriate chunks
• Embedding Generation and Ingestion:
- Generate embeddings for document chunks
- Store embeddings in vector database
• Query Processing:
- Embed user query
- Retrieve top-k relevant documents from the vector database
• Response Generation:
- Enrich LLM prompt with retrieved documents
- Generate response using LLM
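Put together, the four steps above fit in one short sketch. This assumes qdrant-client and sentence-transformers; the collection name, sample chunks, and prompt template are illustrative, and the final generation call is left to whichever LLM provider you use.

```python
# Naive RAG, end to end: embed chunks, store, retrieve, augment prompt.
# Assumes: qdrant-client and sentence-transformers installed; names,
# model, data, and prompt template are illustrative.
from qdrant_client import QdrantClient, models
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dim embeddings
client = QdrantClient(":memory:")  # in-memory instance for the sketch

# Ingestion: embed document chunks and store them with text as payload.
client.create_collection(
    collection_name="docs",
    vectors_config=models.VectorParams(size=384, distance=models.Distance.COSINE),
)
chunks = ["Qdrant is a vector database.", "RAG augments prompts with context."]
client.upsert(
    collection_name="docs",
    points=[
        models.PointStruct(id=i, vector=encoder.encode(c).tolist(), payload={"text": c})
        for i, c in enumerate(chunks)
    ],
)

# Query processing: embed the question and retrieve top-k chunks.
question = "What is RAG?"
hits = client.search(
    collection_name="docs",
    query_vector=encoder.encode(question).tolist(),
    limit=2,
)

# Response generation: enrich the prompt, then call your LLM of choice.
context = "\n".join(hit.payload["text"] for hit in hits)
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```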
Flavours of RAG - Advanced RAG
• Query Treatment:
• Retrieval Response Treatment:
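A common retrieval-response treatment is reranking: over-fetch candidates from the vector store, then reorder them with a cross-encoder before they reach the prompt. A minimal sketch, assuming sentence-transformers and reusing the hits from the naive sketch above; the cross-encoder model is an illustrative choice.

```python
# Reranking sketch: a cross-encoder rescoring of over-fetched candidates.
# Assumes sentence-transformers; the model choice is illustrative.
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

query = "How do I back up a Qdrant collection?"
candidates = [hit.payload["text"] for hit in hits]  # e.g. top-20 from search
scores = reranker.predict([(query, doc) for doc in candidates])

# Keep only the best few documents for the prompt context.
reranked = [doc for _, doc in sorted(zip(scores, candidates), reverse=True)]
top_docs = reranked[:3]
```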
Flavours of RAG - Agentic / Self-Improving RAG
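One way to picture the self-improving loop: retrieve, generate, grade your own answer, and retry with a rewritten query when the grade is poor. The skeleton below is purely illustrative; the four helpers are hypothetical stubs standing in for real vector-search and LLM calls.

```python
# Self-improving RAG loop; the helpers are hypothetical stubs standing
# in for real vector search and LLM calls.
def retrieve(query: str) -> str:
    return "...retrieved context..."         # stub: vector search, as before

def generate(question: str, context: str) -> str:
    return f"Answer grounded in: {context}"  # stub: LLM call with context

def critique(question: str, answer: str, context: str) -> str:
    return "grounded"                        # stub: LLM-as-judge self-check

def rewrite_query(question: str, answer: str) -> str:
    return question                          # stub: LLM query reformulation

def agentic_rag(question: str, max_rounds: int = 3) -> str:
    query = question
    for _ in range(max_rounds):
        context = retrieve(query)
        answer = generate(question, context)
        if critique(question, answer, context) == "grounded":
            return answer                    # self-check passed
        query = rewrite_query(question, answer)  # refine and retry
    return answer                            # best effort after max_rounds
```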
Enterprise-Ready, Massive-Scale Vector Search Technology for the Next AI Generation
- Performance centric
- Quick and easy to start
- Resource-optimization focused
- All embeddings supported OOTB
- Fully open-source project
- Scalability oriented
The most loved open-source vector search database

- 10K+ adopters worldwide
- 7M+ downloads
- 30K+ community members
Let’s build RAG
Challenges of RAG
How can these challenges affect our applications?
Ref: https://theconversation.com/eat-a-rock-a-day-put-glue-on-your-pizza-how-googles-ai-is-losing-touch-with-reality-230953
RAG Improvement Techniques

ℹ️ Data Cleaning, Leverage Metadata (see the sketch below), Advanced Data Extraction
✂️ Data Chunking, Embedding Model, Retrieval Window
🔍 Indexing Algorithms
🗂️ Multi-Vector Indexing, Document Reranking
LLM: Prompt Engineering, Prompts, Agents
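As one concrete example from the list above, "Leverage Metadata" usually means storing metadata in the point payload and filtering on it at query time. A minimal sketch with qdrant-client, reusing the client and encoder from the naive RAG sketch; the "source" field and its value are hypothetical.

```python
# Metadata filtering sketch: restrict vector search to matching payloads.
# Assumes qdrant-client; the "source" field and its value are hypothetical.
from qdrant_client import models

hits = client.search(
    collection_name="docs",
    query_vector=encoder.encode("How do snapshots work?").tolist(),
    query_filter=models.Filter(
        must=[
            models.FieldCondition(
                key="source",
                match=models.MatchValue(value="admin-guide"),
            )
        ]
    ),
    limit=5,
)
```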
How do you evaluate RAG?
Document Processing Evaluation
Model Evaluation
Retrieval Evaluation
Prompt Evaluation
Response Evaluation
LLM Evaluation
Performance Evaluation
Evaluation is Paramount!!
Why should you evaluate?
Landscape of RAG Evaluation
Evaluation Metrics
Precision & Relevance:
- Answer Relevance
- Context Precision
- Context Relevancy
- Context Recall
- Query Fulfillment

Faithfulness and Groundedness:
- Context Similarity
- Faithfulness
- Groundedness
- Knowledge F1 Score
- ROUGE

Hallucination Management:
- Hallucination
- SelfCheckGPT-NLI

Correctness and Accuracy:
- Answer Correctness
- Exact Match
- F1 Score
- Jaccard Similarity

Semantic and Syntactic Similarity:
- Answer Semantic Similarity
- BERT Sentence Similarity
- BERTScore
- ROUGE
- SacreBLEU

Coherence and Conciseness:
- Coherence
- Conciseness
- Completion Verbosity
- Verbosity Ratio
- No Gibberish

Context Utilization and Sufficiency:
- Context Utilization
- Context Sufficiency

Summarization:
- Summarization Accuracy

Safety & Guardrails:
- Maliciousness
- Harmfulness
- Personal Information Detection
- Prompt Injection
- OpenAI Content Moderation
- Safe for Work
- No Sensitive Topics
- Controversiality
- Misogyny
- Criminality
- Insensitivity
- Toxicity
- Helpfulness
Domain-specific eval is essential for high-quality RAG apps
RAG quality is inherently use-case-dependent. It depends on the database and its contents.
Quantitative
Reliable
Explainable
Debuggable
How to pick relevant metrics?
Take the example of a RAG app built on documentation.
Quality of Answer
├── Answer Correctness
│   ├── Query Fulfillment
│   │   └── Completeness (SelfCheckGPT)
│   └── Faithfulness and Groundedness
│       ├── Context Utilization
│       └── Derived from Document Chunks
│           ├── Context Sufficiency
│           └── Quality of Retrieved Chunks (Precision / Recall / nDCG)
├── Helpfulness
├── Bias-Free
├── Non-Malicious
├── Privacy Compliance
│   └── No Personal Information Shared (PII)
├── Policy Compliance
├── Conciseness
│   └── Designated Number of Tokens (Cost)
└── Latency Requirements
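The leaves of that tree bottom out in retrieval quality, which is easy to measure once relevance labels exist. A minimal precision@k / recall@k sketch in plain Python; the document IDs below are made up.

```python
# Precision@k and recall@k over retrieved chunk IDs; data is made up.
def precision_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    return sum(d in relevant for d in retrieved[:k]) / k

def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    return sum(d in relevant for d in retrieved[:k]) / len(relevant)

retrieved = ["doc3", "doc1", "doc7", "doc2"]   # ranked output of search
relevant = {"doc1", "doc2"}                    # labeled ground truth
print(precision_at_k(retrieved, relevant, 3))  # 1/3
print(recall_at_k(retrieved, relevant, 3))     # 1/2
```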
Evaluation Code Walkthrough
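A minimal sketch of the kind of evaluation shown in the walkthrough, using the ragas library (the qdrant-rag-eval repo under Further Readings explores similar setups). The sample row is made up, and the LLM-judged metrics need a model provider configured (e.g. an OpenAI key).

```python
# RAG evaluation sketch with ragas; the sample row is made up, and the
# judged metrics need an LLM provider (e.g. an OpenAI key) configured.
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import answer_relevancy, context_precision, faithfulness

data = {
    "question": ["What is Qdrant?"],
    "answer": ["Qdrant is an open-source vector database."],
    "contexts": [["Qdrant is a vector similarity search engine and database."]],
    "ground_truth": ["Qdrant is an open-source vector search database."],
}
result = evaluate(
    Dataset.from_dict(data),
    metrics=[faithfulness, answer_relevancy, context_precision],
)
print(result)  # per-metric scores between 0 and 1
```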
Other Evaluation Metrics
Further Experiments
To sum up
Further Readings
ColBERT - https://medium.com/@varun030403/colbert-a-complete-guide-1552468335ae
Multimodal RAG (ColPali) - https://huggingface.co/blog/manu/colpali
DSPy - https://dspy-docs.vercel.app/docs/cheatsheet#dspychainofthought
More RAG and evaluation examples - https://github.com/qdrant/qdrant-rag-eval
RAG in the era of large context windows - https://qdrant.tech/articles/rag-is-dead/
Thank you !!
A free-forever 1GB cluster is included for trying it out; no credit card required.
Feel free to reach us at: info@qdrant.com