LlamaIndex: Basics to Production
Ravi Theja:
Data Scientist - Glance (Inmobi)
Open Source Contributor at LlamaIndex.
Context
Use Cases
Question-Answering
Text Generation
Summarization
Planning
LLMs
APIs
Raw Files
SQL DBs
Vector Stores
?
Paradigms for inserting knowledge
Fine-tuning - baking knowledge into the weights of the network
LLM
Before college the two main things I worked on, outside of school, were writing and programming. I didn't write essays. I wrote what beginning writers were supposed to write then, and probably still are: short stories. My stories were awful. They had hardly any plot, just characters with strong feelings, which I imagined made them deep...
RLHF, Adam, SGD, etc.
Paradigms for inserting knowledge
Fine-tuning - baking knowledge into the weights of the network
Downsides:
Paradigms for inserting knowledge
In-context learning - Fix the model, put context into the prompt
LLM
Before college the two main things I worked on, outside of school, were writing and programming. I didn't write essays. I wrote what beginning writers were supposed to write then, and probably still are: short stories. My stories were awful. They had hardly any plot, just characters with strong feelings, which I imagined made them deep...
Input Prompt
Here is the context:
Before college the two main things…
Given the context, answer the following question: {query_str}
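The in-context pattern above is plain string templating: the model stays fixed and retrieved context is stuffed into the prompt. A minimal sketch (the template text mirrors the slide; `build_prompt` is a hypothetical helper, not a LlamaIndex API):

```python
# In-context learning: fix the model, put context into the prompt.
# `build_prompt` is a hypothetical helper for illustration only.
PROMPT_TEMPLATE = (
    "Here is the context:\n"
    "{context_str}\n\n"
    "Given the context, answer the following question: {query_str}"
)

def build_prompt(context_str: str, query_str: str) -> str:
    return PROMPT_TEMPLATE.format(context_str=context_str, query_str=query_str)

prompt = build_prompt(
    context_str="Before college the two main things I worked on were writing and programming.",
    query_str="What did the author work on before college?",
)
```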
Key challenges of in-context learning
LlamaIndex: A data framework for LLM applications
Data Ingestion (LlamaHub 🦙)
Data Structures
Retrieval and Query Interface
LlamaIndex
Knowledge-Intensive LLM Applications
LlamaIndex
Data framework for LLM app development
Foundation Models
Input: rich query description
Output: rich response with references, actions, etc.
Sales
Marketing
Recruiting
Dev
Legal
Finance
…
Data Connectors: powered by LlamaHub 🦙
<10 lines of code to ingest from Notion
Data Indices + Query Interface
Your source documents are stored in a data collection
In-memory, MongoDB
Our data indices help to provide a view of your raw data
Vectors, keyword lookups, summaries
A retriever helps to retrieve relevant documents for your query
A query engine manages retrieval and synthesis given the query.
Vector Store Index
Doc
Doc
Doc
Vector Store
Node1
Node2
Node3
Embedding1
Embedding2
Embedding3
Raw Documents
Stored as Nodes in a vector store
Each Node is indexed with an embedding
Vector Store Index
Response Synthesis
Demo Walkthrough
Let’s play around with LlamaHub + index + query!
Easily ingest data
Use Case: Semantic Search
Answer
The author grew up writing short stories, programming on an IBM 1401, and working on microcomputers. He wrote simple games, a program to predict how high his model rockets would fly, and a word processor. He studied philosophy in college, but switched to AI. He reverse-engineered SHRDLU for his undergraduate thesis and wrote a book about Lisp hacking. He visited the Carnegie Institute and realized he could make art that would last.
from llama_index import GPTVectorStoreIndex, SimpleDirectoryReader
documents = SimpleDirectoryReader('data').load_data()
index = GPTVectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(response_mode="tree_summarize")
response = query_engine.query(
"What did the author do growing up?"
)
Use Case: Summarization
Answer
from llama_index import GPTListIndex, SimpleDirectoryReader
documents = SimpleDirectoryReader('data').load_data()
index = GPTListIndex.from_documents(documents)
query_engine = index.as_query_engine(response_mode="tree_summarize")
response = query_engine.query("Could you give a summary of this article in newline separated bullet points?")
Using Open Source Models - GPT4ALL
# Assumed imports for this snippet (llama_index v0.6-era API with LangChain wrappers)
from langchain.llms import GPT4All
from langchain.embeddings import HuggingFaceEmbeddings
from llama_index import (
    GPTVectorStoreIndex, LLMPredictor, LangchainEmbedding,
    PromptHelper, ServiceContext, SimpleDirectoryReader,
)
from llama_index.node_parser import SimpleNodeParser
from llama_index.langchain_helpers.text_splitter import TokenTextSplitter

documents = SimpleDirectoryReader('data').load_data()
local_llm_path = './ggml-gpt4all-j-v1.3-groovy.bin'
llm = GPT4All(model=local_llm_path, backend='gptj', streaming=True, n_ctx=512)
llm_predictor = LLMPredictor(llm=llm)
embed_model = LangchainEmbedding(HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2"))
# max_chunk_overlap must stay smaller than max_input_size
prompt_helper = PromptHelper(max_input_size=512, num_output=256, max_chunk_overlap=20)
service_context = ServiceContext.from_defaults(
    llm_predictor=llm_predictor,
    embed_model=embed_model,
    prompt_helper=prompt_helper,
    node_parser=SimpleNodeParser(text_splitter=TokenTextSplitter(chunk_size=300, chunk_overlap=20))
)
index = GPTVectorStoreIndex.from_documents(documents, service_context=service_context)
query_engine = index.as_query_engine(similarity_top_k=1, service_context=service_context)
response_stream = query_engine.query("What are the main climate risks to our Oceans?")
Use Case: Building a Unified Query Interface
Can use a “Router” abstraction to route to different query engines.
For instance, can do joint semantic search / summarization
Use Case: Document Comparisons
Say you want to compare the 2021 10-K filings for Uber and Lyft
Question: “Compare and contrast the customer segments and geographies that grew the fastest.”
Generate a query plan over your document sources.
Use Case: Exploiting Temporal Relationships
Given a question, what if we would like to retrieve additional context in the past or the future?
Example question: “What did the author do after his time at Y Combinator?”
Requires looking at context in the future!
Use Case: Recency Filtering / Outdated nodes
Imagine you have three timestamped versions of the same data.
If you ask a question over this data, you want to make sure it’s over the latest document.
Use Case: Masking PII information
NER PII Node processor
My name is Ravi Theja and I live in Bangalore. My email address is ravi.theja@gmail.com and my phone number is +91 9550164716.
My name is [NAME] and I live in [PLACE]. My email address is [EMAIL] and my phone number is [CONTACT]
Masking personally identifiable information (PII) is beneficial before passing text to an LLM.
Use Case: Text-to-SQL (Structured Data)
from llama_index import GPTSQLStructStoreIndex, SQLDatabase
sql_database = SQLDatabase(engine, include_tables=["city_stats"])
# NOTE: the table_name specified here is the table that you
# want to extract into from unstructured documents.
index = GPTSQLStructStoreIndex.from_documents(
wiki_docs,
sql_database=sql_database,
table_name="city_stats",
)
# set Logging to DEBUG for more detailed outputs
query_engine = index.as_query_engine(mode="default")
response = query_engine.query("Which city has the highest population?")
print(response)
SELECT city_name, population FROM city_stats ORDER BY population DESC LIMIT 1
Generated SQL
Use Case: Joint Text-to-SQL and Semantic Search
Query with SQL over structured data, and “join” it with unstructured context from a vector database!
Combine expressivity of SQL with semantic understanding
Evaluation:
Response Evaluator
Takes in the generated response plus its retrieved source context, and evaluates whether the response is supported by that context.
Query Response Evaluator
Takes in the query, the retrieved source context, and the generated response, and evaluates whether the response actually answers the query.
Source Context Evaluation
Takes in the query and each retrieved source individually, and evaluates whether each source is relevant to answering the query.
AlBus
Workflow:
LlamaIndex in production
OpenAI Pydantic Program
Easily perform structured data extraction into a Pydantic object with our `OpenAIPydanticProgram`
Integrations with Ecosystem
Thanks!
Check out our docs for more details