LangChain makes it as easy as possible to develop LLM-powered applications.
Overview

Building Blocks (open source):
- Document Loaders
- Document Transformers
- Embedding Models
- Vectorstores
- Prompts
- LLMs
- Chains

Use-cases: template apps

Platform: observability, data management, evaluation

Central concepts
Two ways pre-trained LLMs learn

| | Weight updates via fine-tuning | Prompt (e.g., via retrieval) |
| Analogy | Cramming before a test | Open book exam |
| Strength | Bad for factual recall, good for tasks | Good for factual recall |
| Best for | Form (e.g., extraction, text-to-SQL) | Facts (e.g., QA) |
- LLM alone: memory only
- Search engines: retrieval only
- Retrieval Augmented Generation (RAG): add task-related documents to the LLM context window (working memory)
Building Blocks

Private / company data comes in many shapes: unstructured or structured, public or proprietary, files (.pdf, .txt, .json, .md, …) or datastores.

Document Loaders: > 140 integrations
Document Transformers: Text Splitters

Beyond basic splitting: context-aware splitting and function calling (Doctran) split along the document's own structure, so each chunk stays coherent.

Code (split at function / loop boundaries):

def foo(...):
    for section in sections:
        sect = section.find("head")

Markdown (split at headers):

# Introduction
Notion templates are effective . . .

PDF (split at sections):

Abstract
The dominant sequence transduction models . . .
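Context-aware splitting can be sketched without any library. Below is a toy header-aware Markdown splitter (a hypothetical helper, not LangChain's `MarkdownHeaderTextSplitter`) that keeps the header with each chunk so the chunk stays self-describing:

```python
def split_markdown_by_header(text: str, level: int = 1) -> list[dict]:
    """Split markdown at headers of the given level; attach the header
    to each chunk as metadata."""
    prefix = "#" * level + " "
    chunks: list[dict] = []
    header, lines = None, []

    def flush():
        # Emit the accumulated chunk, if any content was collected.
        if lines:
            chunks.append({"header": header, "content": "\n".join(lines).strip()})

    for line in text.splitlines():
        if line.startswith(prefix):
            flush()
            header, lines = line[len(prefix):].strip(), []
        else:
            lines.append(line)
    flush()
    return chunks
```

The same pattern generalizes: split code at function boundaries, PDFs at section titles, always carrying the structural context along with the chunk.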
Embeddings + Storage

Ingestion pipeline: Document Loaders (> 140 integrations) → Document Transformers (e.g., text splitters, OAI functions) → Embeddings (> 30 integrations) → Vector Storage (> 40 integrations).

Both the embeddings and the vectorstore can be hosted or private (on device).
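The embed-and-store step can be sketched end to end. Here a toy bag-of-words "embedding" stands in for a real embedding model, and a minimal in-memory store does Top K retrieval by cosine similarity; all names are hypothetical, not a LangChain API:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a stand-in for a real embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class VectorStore:
    """Minimal in-memory vector store: add texts, retrieve top-k by similarity."""
    def __init__(self):
        self._docs: list[tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        self._docs.append((text, embed(text)))

    def similarity_search(self, query: str, k: int = 2) -> list[str]:
        q = embed(query)
        ranked = sorted(self._docs, key=lambda d: cosine(q, d[1]), reverse=True)
        return [text for text, _ in ranked[:k]]
```

A production setup swaps `embed` for a real model and `VectorStore` for one of the > 40 integrations; the interface (add texts, similarity-search the top k) stays the same.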
LLMs: > 60 integrations
LLM landscape

| | OpenAI | Anthropic | Llama-2 (SOTA OSS) |
| Context window (tokens) | 4k - 32k | 100k | 4k |
| Performance | GPT-4 SOTA (best overall) | Claude-2 getting closer to GPT-4 | 70b on par w/ GPT-3.5-turbo* |
| Cost | $0.06 / 1k tokens (input) | 4-5x cheaper than GPT-4-32K | Free |

*Llama2-70b on par w/ GPT-3.5-turbo on language, but lags on coding
Open Source LLMs

[Figure: landscape of open-source LLMs. Base models (BLOOM, OPT, GPT-J, GPT-NeoX-20b, MPT, Falcon, StableLM, LLaMA, LLaMA-2) with training-token counts ranging from 180B to 2T, and instruction fine-tunes (Alpaca 52k, Vicuna 70k, Koala, GPT4All, StableLM-Tuned, LLaMA-2-Chat) with instruction counts ranging from 15k to 1.5M; the current SOTA is marked.]
OSS models can run on device (private): e.g., Llama2-13b runs at ~50 tok/sec on a Mac M2 Max (32 GB).
Integrations Hub
Use Cases
RAG: Load working memory w/ retrieved information relevant to a task
Pipeline: sources (PDFs, URLs, databases) → Document Loading → Splitting → Storage (embedded splits). At query time, the question retrieves relevant splits, which fill the prompt for the LLM to produce an answer.
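The whole loop can be sketched in a few lines, with a stubbed `llm()` and a toy word-overlap `retrieve()` standing in for a real model and a real vector store (both names are invented for this sketch):

```python
def retrieve(question: str, splits: list[str], k: int = 2) -> list[str]:
    """Toy retriever: rank splits by word overlap with the question."""
    q_words = set(question.lower().split())
    score = lambda s: len(q_words & set(s.lower().split()))
    return sorted(splits, key=score, reverse=True)[:k]

def llm(prompt: str) -> str:
    """Stub: a real call would go to OpenAI, Anthropic, Llama-2, ..."""
    return f"[answer based on prompt of {len(prompt)} chars]"

def rag_answer(question: str, splits: list[str]) -> str:
    """Fill working memory (the prompt) with retrieved splits, then generate."""
    context = "\n\n".join(retrieve(question, splits))
    prompt = (f"Answer the question using only the context below.\n\n"
              f"Context:\n{context}\n\nQuestion: {question}")
    return llm(prompt)
```

The essential move is the middle line of `rag_answer`: retrieved splits become the context block of the prompt, i.e., the LLM's working memory for this one task.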
Pick the desired level of abstraction (most to least abstract):
- VectorstoreIndexCreator: question in, answer out
- RetrievalQA: question in, answer out, with an explicit retriever
- load_qa_chain: you supply the relevant splits yourself
Or, compose the same pipeline with runnables.
[Screenshot: LangSmith trace for a RetrievalQA chain: the trace shows the question, the retrieved docs inserted into the prompt, and the response.]
Distilling useful ideas / tricks to improve RAG
| Idea | Example |
| Base case RAG | Top K retrieval on embedded document chunks; return doc chunks for LLM context window |
| Condensed content embedding | Top K retrieval on embedded document summaries, but return full doc for LLM context window |
| Expanded window retrieval | Top K retrieval on embedded chunks or sentences, but return expanded window or full doc |
| Fine-tune RAG embeddings | Fine-tune the embedding model on your data |
| 2-stage RAG | First-stage keyword search followed by second-stage semantic Top K retrieval |
| Agents | May benefit more complex RAG use-cases |
Useful ideas / tricks

When can top K RAG fail? Top K RAG can fail when we do not ...

Condensed content embedding:
1. For each document, create a condensed representation (a chunk, a summary, or anticipated questions) and embed it.
2. Store the full documents keyed by their condensed-content embeddings.
3. At query time, embed the question, run Top K retrieval against the condensed embeddings, then pass the retrieved full documents to the LLM to produce the answer.
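The steps above can be sketched as a small store; a toy word-overlap score stands in for real embeddings, and every name here is hypothetical. The key asymmetry: retrieval ranks by the condensed representation but returns the full document:

```python
def overlap(a: str, b: str) -> int:
    """Toy similarity: shared words; a stand-in for embedding similarity."""
    return len(set(a.lower().split()) & set(b.lower().split()))

class CondensedStore:
    """Index condensed representations (summary / questions), return full docs."""
    def __init__(self):
        self._entries: list[tuple[str, str]] = []  # (condensed, full_doc)

    def add(self, condensed: str, full_doc: str) -> None:
        self._entries.append((condensed, full_doc))

    def retrieve_full(self, question: str, k: int = 1) -> list[str]:
        ranked = sorted(self._entries,
                        key=lambda e: overlap(question, e[0]), reverse=True)
        return [doc for _, doc in ranked[:k]]
```

This helps exactly where Top K on raw chunks fails: the condensed form can match the question even when no single chunk does, and the LLM still sees the whole document.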
Chat: Persist conversation history
Flow: the question, chat memory, and (optionally) chunks retrieved from storage fill the prompt; the LLM produces an answer, and the exchange is written back to memory.
[Screenshot: LangSmith trace for an LLMChain with a chat model + memory: the prompt includes the chat history, followed by the response.]
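A sketch of buffer memory with a stubbed LLM (names are illustrative, not LangChain's `ConversationBufferMemory` API): each turn's prompt carries the transcript so far, and the new exchange is appended afterwards:

```python
class BufferMemory:
    """Short-term memory: keep the running transcript, inject it into each prompt."""
    def __init__(self):
        self.turns: list[tuple[str, str]] = []

    def as_text(self) -> str:
        return "\n".join(f"{role}: {msg}" for role, msg in self.turns)

def chat(question: str, memory: BufferMemory, llm) -> str:
    # The chat history is prepended to every prompt, as in the trace above.
    prompt = f"Chat history:\n{memory.as_text()}\n\nHuman: {question}\nAI:"
    answer = llm(prompt)
    memory.turns.append(("Human", question))
    memory.turns.append(("AI", answer))
    return answer
```

Because the buffer grows with every turn, real deployments cap it (windowed buffers) or summarize older turns; this is the "short-term buffer" form of agent memory discussed later.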
Summarization: Summarize a corpus of text
Three strategies, depending on whether the corpus fits in the LLM context window:

- Fits in the context window: stuff the documents into the context window and summarize in one call.
- Does not fit: map-reduce. Summarize each chunk (map: "Summarize themes in the group of docs"), then distill the chunk summaries into a final summary (reduce: "Extract final summary from input list").
- Embed-and-cluster: embed and cluster the documents, sample from each cluster, summarize each cluster, then extract the final summary from the list of cluster summaries.
Case-study: applied to thousands of user questions asked about the LangChain docs; themes were summarized using the different methods and LLMs.
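The map-reduce strategy can be sketched with a stubbed `llm` callable (a hypothetical stand-in for a real model call). The reduce step collapses summaries in batches so the intermediate prompts also stay within the context window:

```python
def map_reduce_summarize(docs: list[str], llm, batch_size: int = 3) -> str:
    """Map: summarize each doc. Reduce: repeatedly collapse summaries in
    batches until a single final summary remains."""
    # Map step: one summary per document.
    summaries = [llm(f"Summarize themes in the group of docs:\n{d}") for d in docs]
    # Reduce step: collapse batch_size summaries at a time.
    while len(summaries) > 1:
        batches = [summaries[i:i + batch_size]
                   for i in range(0, len(summaries), batch_size)]
        summaries = [llm("Extract final summary from input list:\n" + "\n".join(b))
                     for b in batches]
    return summaries[0]
```

The batching is what makes this scale: a corpus of any size reduces in O(log n) rounds, with every individual prompt bounded by `batch_size` summaries.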
Extraction: Getting structured output from LLMs
Input: "Alex is 5 feet tall. Claudia is 1 feet taller Alex and jumps higher than him. Claudia is a brunette and Alex is blonde."

Output:
{'name': 'Alex', 'height': 5, 'hair_color': 'blonde'}
{'name': 'Claudia', 'height': 6, 'hair_color': 'brunette'}
The LLM is given a function call describing the structured output we want.

Schema (tell the LLM the schema we want):

schema = {
    "properties": {
        "name": {"type": "string"},
        "height": {"type": "integer"},
        "hair_color": {"type": "string"},
    },
}

Function (tell the LLM the function):

{
    "name": "information_extraction",
    "description": "Extracts information from the passage.",
    "parameters": schema,
}
[Screenshot: LangSmith trace for an LLMChain with function call + output parsing: the prompt, the function-call response, and the parsed output.]
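A sketch of the extraction flow with a stubbed function-calling LLM: the stub returns canned JSON where a real API would return the function-call arguments, and the parsing/validation step is what "output parsing" does in the trace above. The stub and its canned output are invented for illustration:

```python
import json

schema = {
    "properties": {
        "name": {"type": "string"},
        "height": {"type": "integer"},
        "hair_color": {"type": "string"},
    },
}

function = {
    "name": "information_extraction",
    "description": "Extracts information from the passage.",
    "parameters": schema,
}

def fake_llm_function_call(text: str, function: dict) -> str:
    """Stub for a function-calling LLM: returns arguments as a JSON string."""
    return json.dumps([
        {"name": "Alex", "height": 5, "hair_color": "blonde"},
        {"name": "Claudia", "height": 6, "hair_color": "brunette"},
    ])

def extract(text: str) -> list[dict]:
    raw = fake_llm_function_call(text, function)
    records = json.loads(raw)          # output parsing
    for r in records:                  # light validation against the schema
        assert set(r) <= set(schema["properties"])
    return records
```

The value of the function-call route is that the model is constrained to emit arguments matching the schema, so the parsing step is plain `json.loads` rather than prompt-format scraping.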
Text-to-SQL
Flow: the LLM writes a SQL query from the question, the query runs against the database, and the LLM turns the result into an answer. Optional: use a SQL Agent for multi-step interaction with the database.
[Screenshot: LangSmith trace for text-to-SQL: the prompt contains a CREATE TABLE description for each table and three example rows in a SELECT statement, followed by the response.]
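A sketch with sqlite3 and a stubbed LLM. `table_info` builds the prompt context (the CREATE TABLE statement plus a few example rows); one LLM call writes the query, a second turns the result into an answer. The function names and the two-call split are this sketch's assumptions, not a LangChain API:

```python
import sqlite3

def table_info(conn: sqlite3.Connection, table: str, n_rows: int = 3) -> str:
    """Prompt context: the CREATE TABLE statement plus a few example rows."""
    ddl = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type='table' AND name=?", (table,)
    ).fetchone()[0]
    rows = conn.execute(f"SELECT * FROM {table} LIMIT {n_rows}").fetchall()
    return ddl + "\n/* example rows */\n" + "\n".join(map(str, rows))

def text_to_sql_answer(conn, table, question, llm):
    prompt = f"{table_info(conn, table)}\n\nQuestion: {question}\nSQL:"
    query = llm(prompt)                      # first LLM call writes the SQL
    result = conn.execute(query).fetchall()  # run it against the database
    return llm(f"Question: {question}\nSQL result: {result}\nAnswer:")
```

The example rows matter: they show the model actual value formats (dates, enums, capitalization) that the DDL alone does not convey.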
Choosing a chain:
- Needs access to memory? Yes → Chat Chains (e.g., ConversationalRetrievalChain).
- Needs access to tools? Yes → API / Function Chains (e.g., APIChain).
- Neither → Basic LLM.
Agents
An agent pairs an LLM with tools, memory (long-term; short-term buffers), and a plan for choosing the next action. The ecosystem spans autonomous agents, simulation agents, and action agents.
Large agent ecosystem (will focus on ReAct as one example)
ReAct combines multi-step reasoning* with action-observation steps (tool use).

*Condition the LLM to show its work.
[Screenshot: LangSmith trace for a SQL ReAct agent: each step shows the prompt, the (chain-of-thought) reasoning, the chosen tool / action, and the observation, which informs the tool used at the next step.]
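A toy ReAct loop with a single calculator tool and a stubbed LLM. The real prompt format varies by implementation, so the `Thought / Action / Observation` parsing here is only illustrative:

```python
def calculator(expr: str) -> str:
    # Toy tool; never eval untrusted input in real code.
    return str(eval(expr, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

def react(question: str, llm, max_steps: int = 5) -> str:
    """Alternate LLM steps and tool calls until the LLM emits a final answer."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        # Each step is "Thought: ...\nAction: tool[input]" or "Final Answer: ...".
        step = llm(transcript)
        transcript += step + "\n"
        if step.startswith("Final Answer:"):
            return step.removeprefix("Final Answer:").strip()
        tool, _, arg = step.split("Action: ")[1].partition("[")
        observation = TOOLS[tool.strip()](arg.rstrip("]"))
        transcript += f"Observation: {observation}\n"
    return "gave up"
```

The growing `transcript` is the whole trick: each observation is appended so the next LLM call conditions on everything tried so far, exactly what the trace above shows step by step.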
Case-study on reliability: the web researcher started as an agent; a retriever proved better.

Flow: from the research question, the LLM generates queries 1..N; the resulting HTML pages go through a Document Loader, Document Transformation, and into Vector Storage; Document Retrieval + QA then feeds the retrieved chunks to the LLM to produce the answer.
Tooling
Recall the two ways pre-trained LLMs learn: weight updates via fine-tuning (cramming before a test; good for form, e.g., extraction) vs. prompting, e.g., via retrieval (open book exam; good for facts, e.g., QA).
LangSmith Case study: Fine-tuning for extraction of knowledge graph triples
Pipeline: build a dataset from app generations plus LLM-synthesized data → data cleaning → train / test split → fine-tune → eval.
LangSmith evaluation, fine-tuning vs few-shot prompting for triple extraction: base LLaMA-7b-chat gave informal answers with hallucinations; after fine-tuning, answers were closer to the reference.
Case-study lessons
Questions