Assignment 2
Ryan Bell
July 18th, 2022
Tailored Learning for Cost Modeling:
An Open Source RAG-Based Tool
Ryan Bell
Ryan Longshore
Dr. Raymond Madachy
Agenda
Abstract
For professionals entering the specialized field of cost modeling, there is often a significant gap in domain-specific knowledge. To address this, we present an AI-driven approach that uses Python’s Haystack package to build a Retrieval-Augmented Generation (RAG) pipeline offering customized interactions with foundational cost modeling resources, including Barry Boehm’s books and other prominent cost modeling documents. This presentation explores the integration of this AI tool to deliver on-demand, conversational access to cost modeling knowledge, along with a secondary feature that generates multiple-choice questions to reinforce learning.
The RAG pipeline enables engineers to query specific documents and receive relevant information from established cost modeling tools, simulating a mentor-mentee interaction and providing a much-needed knowledge repository. By dynamically generating both responses to complex inquiries and formative assessment items, this system aims to facilitate the understanding and application of core cost modeling principles for those new to the field. Initial deployment results will be presented, demonstrating its potential to fill knowledge gaps and improve the ability of engineers to contribute effectively to cost estimation projects.
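The question-generation feature mentioned above can be sketched as a prompt template plus a small parser for the model's formatted reply. The template text and the `parse_mcq` helper below are illustrative assumptions, not the tool's actual implementation.

```python
import re

# Hypothetical prompt template asking the generator for a fixed output format
MCQ_PROMPT = (
    "Write one multiple-choice question about {topic}.\n"
    "Format:\nQ: <question>\nA) ...\nB) ...\nC) ...\nD) ...\nAnswer: <letter>"
)

def parse_mcq(reply: str) -> dict:
    """Parse a formatted LLM reply into question, choices, and answer key."""
    question = re.search(r"^Q:\s*(.+)$", reply, re.MULTILINE).group(1)
    choices = dict(re.findall(r"^([A-D])\)\s*(.+)$", reply, re.MULTILINE))
    answer = re.search(r"^Answer:\s*([A-D])", reply, re.MULTILINE).group(1)
    return {"question": question, "choices": choices, "answer": answer}

# Example reply in the requested format
sample = (
    "Q: Which COCOMO II term scales effort with size?\n"
    "A) PREC\nB) The exponent E\nC) TOOL\nD) DOCU\n"
    "Answer: B"
)
mcq = parse_mcq(sample)
```

Parsing into a structured dict keeps the GUI free to render the choices and grade the learner's selection against the answer key.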
Large Language Model (LLM) Overview
There are other challenges, but these two are the most prominent for this presentation.
Improving LLM Responses
COMPUTE / COMPLEXITY / COST
Training:
The process of teaching an LLM to perform tasks by exposing it to large datasets and optimizing it to recognize patterns, relationships, and context within the data.
Inferencing:
The LLM’s process of generating responses or predictions based on new inputs, using the knowledge it gained during training.
Context Optimization
What the model needs to know
LLM Optimization
How the model needs to act
Building the RAG Pipeline
Base image from Haystack.deepset.ai
Creating the GUI
QUERY
RESPONSE
HISTORY
import gradio as gr

# The function called when the button is clicked
def get_answer(prompt, history):
    # Call the RAG pipeline to get the response
    results = query_pipeline.run({
        "text_embedder": {"text": prompt},
        "prompt_builder": {"query": prompt},
    })

    # Extract the response and reference section
    response = results["generator"]["replies"][0]
    reference = "Cost Modeling Document - Section X.Y"  # Example reference format

    # Add the new interaction to the history
    history.append((prompt, response, reference))

    # Format the history for display, newest first
    history_display = "\n".join(
        f"**Q:** {q}\n\n**A:** {a}\n\n**Reference:** {ref}\n\n"
        for q, a, ref in reversed(history)
    )
    return history_display, history
Demonstration
Contact Information
Some of our other research:
Questions?