Understanding RAG and Fine-Tuning for Large Language Models
Kathmandu
$whoami
Issues with Large Language Models
What is
Retrieval Augmented Generation (RAG)
External Data → Data Preprocessing → Vector Embeddings → Vector Database
Reference: Building RAG- Anyscale Blog
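The four-stage pipeline above (external data → preprocessing → embeddings → vector database) can be sketched end to end. This is a minimal, self-contained illustration: the bag-of-words `embed` function and the in-memory list stand in for a real embedding model and vector database, which the slides do not name.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding' so the example stays self-contained;
    a real pipeline would call an embedding model (an assumption, not
    something the slides specify)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# 1. external data  ->  2. preprocessing (here: whole docs used as chunks)
documents = [
    "RAG retrieves relevant chunks from a vector database.",
    "Fine-tuning updates the weights of a pre-trained model.",
]
# 3. embeddings stored in  ->  4. a "vector database" (here: an in-memory list)
vector_db = [(doc, embed(doc)) for doc in documents]

def retrieve(query, k=1):
    """Rank stored chunks by similarity to the query embedding."""
    q = embed(query)
    ranked = sorted(vector_db, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

# The retrieved context is prepended to the prompt before calling the LLM.
context = retrieve("what does a vector database do in RAG?")[0]
prompt = f"Answer using this context:\n{context}\n\nQuestion: ..."
```

The key design point RAG relies on: the model's weights never change; only the prompt is augmented with retrieved context.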
Implement RAG with GenAIStack
What is
Fine-Tuning?
Let's imagine you have a robot.
After training on more data, the robot smiles.
Pre-trained LLMs timeline:
2018: BERT
2019: GPT-2, T5
2020: GPT-3
2021: MUM
2022: GPT-3.5, PaLM
2023: GPT-4, Llama, Falcon, Mistral
2024: ?
Why Fine-Tune
pre-trained models?
Why Fine-Tune?
Downsides of Fine-Tuning
Parameter-Efficient Fine-Tuning (PEFT) Techniques
Enter LoRA: Low-Rank Adaptation
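LoRA's core idea is that the fine-tuning update to a weight matrix can be approximated by a low-rank product, so only two small matrices are trained. A quick parameter count shows why this is "parameter efficient" (the layer dimensions and rank below are illustrative assumptions, not taken from the slides):

```python
# LoRA freezes the pre-trained weight matrix W (d_out x d_in) and learns
# a low-rank update: W' = W + (alpha / r) * B @ A, where B is d_out x r
# and A is r x d_in with rank r << d. Only A and B receive gradients.

d_in, d_out, r = 4096, 4096, 8   # illustrative layer size and LoRA rank (assumption)

full_params = d_in * d_out            # weights updated by full fine-tuning
lora_params = r * (d_in + d_out)      # weights updated by LoRA

print(full_params, lora_params, lora_params / full_params)
```

For these dimensions LoRA trains 65,536 of 16,777,216 weights, roughly 0.4% of the layer, which is why adapters fit easily in memory and can be swapped per task.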
Quantization
Example: absmax int8 quantization uses the scale 127 / absmax, here 127 / 5.4 ≈ 23.5
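The 127/5.4 figure is the scale factor in absmax int8 quantization: divide the int8 range limit (127) by the largest absolute value in the tensor, so that value maps exactly to 127. A minimal sketch (the example vector is illustrative; its absmax of 5.4 matches the slide's figure):

```python
def absmax_quantize(xs):
    """Absmax int8 quantization: scale = 127 / max(|x|), then round."""
    scale = 127 / max(abs(x) for x in xs)
    q = [round(x * scale) for x in xs]
    return q, scale

xs = [1.2, -0.5, -4.3, 1.2, -3.1, 0.8, 2.4, 5.4]  # absmax = 5.4
q, scale = absmax_quantize(xs)        # scale = 127 / 5.4 ≈ 23.5; 5.4 -> 127

dequant = [v / scale for v in q]      # approximate reconstruction
```

Quantization shrinks memory (int8 is a quarter of float32) at the cost of small rounding error, visible when comparing `dequant` to `xs`.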
QLoRA: Efficient Fine Tuning of Quantized LLMs
QLoRA introduces a number of innovations to save memory without sacrificing performance: (a) 4-bit NormalFloat (NF4), an information-theoretically optimal data type for normally distributed weights; (b) Double Quantization, which quantizes the quantization constants themselves; and (c) Paged Optimizers, to manage memory spikes during training.
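In practice these pieces come together as a configuration sketch. This assumes the Hugging Face `transformers` + `peft` + `bitsandbytes` stack, which the slides do not name, and the model ID is a placeholder; it is not runnable without model weights and a GPU:

```python
# Hedged sketch: load a base model quantized to 4-bit NF4 with double
# quantization, then attach trainable LoRA adapters on top (QLoRA).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",          # 4-bit NormalFloat data type
    bnb_4bit_use_double_quant=True,     # quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "base-model-id",                    # placeholder, not from the slides
    quantization_config=bnb_config,
)
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM"))
```

The frozen base model sits in 4-bit memory while gradients flow only through the small LoRA matrices, which is what lets QLoRA fine-tune large models on a single GPU.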
Vertex AI by Google: Guide to Fine-Tuning Foundation Models (PaLM)
Awesome Fine-Tune LLMs
lucifertrj/Awesome-Fine-Tuning-LLMs
Thank You
Let’s Connect for further discussions on GenAI and ML. Happy learning!
Reference