Large Language Models for Data Management Tasks
Anna Fariha
University of Utah
Northwest Database Society (NWDS) Annual Meeting 2024
Questions to think about …
Can LLMs do cardinality estimation and query optimization?
Can LLMs help in database index tuning?
Can LLMs help with homogenizing data formats?
Which data-management tasks are well suited for LLMs?
Should I use ChatGPT for cleaning up the addresses?
What factors of a task determine an LLM’s suitability for it?
5. Uncertainty
4. Code Requirement
3. Domain Expertise
2. System Context
1. Data Context
Interviews with 14 data scientists [Chopra et al. 2023]
Results of a survey of 114 data scientists [Chopra et al. 2023]
Should we ask for the mechanism or the result?
Mechanism: more control, difficult to verify, reusable
Result: less control, easy to verify, not reusable
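To make the mechanism-versus-result distinction concrete, here is a minimal sketch of what a "mechanism" answer might look like for a format-homogenization task such as normalizing dates. Everything in it is an illustrative assumption rather than material from the talk: the function name `normalize_date`, the list `KNOWN_FORMATS`, and the sample values are all hypothetical. Asking for the "result" instead would mean handing the raw values to the LLM and receiving only the cleaned values back.

```python
from datetime import datetime
from typing import Optional

# Hypothetical list of formats; a real dataset would need its own.
KNOWN_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d %b %Y", "%B %d, %Y")


def normalize_date(raw: str) -> Optional[str]:
    """Return the date in ISO 8601 form, or None if no known format matches."""
    text = raw.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # this format did not match; try the next one
    return None  # leave unparseable values for manual review


# The same function can be rerun on any later batch of data.
first_batch = ["2024-03-01", "03/01/2024", "1 Mar 2024",
               "March 1, 2024", "sometime in March"]
cleaned = [normalize_date(value) for value in first_batch]
print(cleaned)
```

A function like this can be applied again to new data without another LLM call, but someone still has to check that its format list and fallback behavior are right for the data at hand.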
Identify low-hanging fruit!
Data cleaning
Data organization and categorization
Data summarization
Thank you