Python for Data Retrieval and Visualization
Michael Shensky
Head of Research Data Services
m.shensky@austin.utexas.edu
Ian Goodale
European Studies and Linguistics Librarian
ian.goodale@austin.utexas.edu
What: Workshops covering research data practices and software
When: 12pm - 1:15pm on the dates listed below
Where: Zoom (all dates) / PCL Scholars Lab (select dates)
Fri, January 31 - Open Source GIS: From QGIS to Python
Tue, February 11 - Intro to Python for Data Management
Wed, February 12 - Python for Data Retrieval and Visualization
Thu, February 13 - Making Beautiful Plots in R’s ggplot2
Fri, February 14 - Where and How to Publish Research Data
Fri, February 28 - How to Share Sensitive (Human) Data
Workshop Logistics
Goals for this Workshop
What is Python?
What is an API?
What is a REST API?
Learning to Use a REST API
Recommendations for Using APIs
Using a REST API and Python
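A minimal sketch of what calling a REST API from Python can look like, using only the standard library. The base URL `https://api.example.org/v1/records` and the parameter names `q` and `limit` are placeholders rather than a real service; substitute the endpoint and parameters documented by the API you are working with.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint, used for illustration only.
BASE_URL = "https://api.example.org/v1/records"


def build_request_url(base_url, params):
    """Append URL-encoded query parameters to a REST endpoint."""
    return f"{base_url}?{urlencode(params)}"


def fetch_json(url, timeout=30):
    """Send a GET request and decode the JSON body of the response."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


url = build_request_url(BASE_URL, {"q": "open data", "limit": 10})
print(url)
# data = fetch_json(url)  # uncomment once BASE_URL points at a real API
```

Keeping URL construction separate from the network call makes it easy to inspect the exact request before sending it.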
Why Use Python and APIs to Retrieve Data?
Efficiency: Accessing data with a scripted process can be quicker than downloading it manually.
Ease of Use: Graphical user interfaces for data portals can sometimes be difficult to navigate and use.
Reproducibility: A scripted process for accessing data allows you to reproduce your workflow later and allows others to replicate your work.
Data Updates: If you are accessing frequently updated data from an external source, running a script to retrieve it at regular intervals can be useful.
File Management: Manually downloaded data can get messy if you do not organize it afterward; a download script can enforce naming conventions and a file organization structure.
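The file management point can be sketched as follows. This is one possible convention, not a prescribed one: the folder layout, the `source_dataset_date.json` filename pattern, and the names `exampleportal` and `permits` are all assumptions made for illustration.

```python
import json
import tempfile
from datetime import date
from pathlib import Path


def build_output_path(root, source, dataset, retrieved_on):
    """Return root/source/YYYY/source_dataset_YYYY-MM-DD.json."""
    filename = f"{source}_{dataset}_{retrieved_on.isoformat()}.json"
    return Path(root) / source / str(retrieved_on.year) / filename


def save_records(records, path):
    """Write records as JSON, creating parent folders as needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(records, indent=2), encoding="utf-8")
    return path


# Demo with made-up records; a temporary folder stands in for a project directory.
with tempfile.TemporaryDirectory() as root:
    target = build_output_path(root, "exampleportal", "permits", date(2025, 2, 12))
    save_records([{"id": 1, "status": "open"}], target)
    saved_name = target.name
    reloaded = json.loads(target.read_text(encoding="utf-8"))

print(saved_name)
```

Because the retrieval date is part of the filename, running the script at regular intervals produces a new, clearly labeled file each time instead of overwriting earlier downloads.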
Working with Data Returned by an API Call Using Python
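Many REST APIs return JSON, which Python converts into ordinary dictionaries and lists. The sketch below assumes a made-up response body; the keys `total`, `results`, `title`, `year`, and `downloads` are hypothetical, and a real API will document its own structure.

```python
import json

# Made-up response body in a shape many JSON APIs use; real field names will differ.
SAMPLE_RESPONSE = """
{
  "total": 3,
  "results": [
    {"title": "Survey A", "year": 2021, "downloads": 120},
    {"title": "Survey B", "year": 2023, "downloads": 45},
    {"title": "Survey C", "year": 2023}
  ]
}
"""

payload = json.loads(SAMPLE_RESPONSE)  # JSON text -> nested dicts and lists
records = payload["results"]           # the list of result items

# Pull out selected fields; .get() supplies a default when a key is missing.
rows = [(r["title"], r["year"], r.get("downloads", 0)) for r in records]

recent_titles = [title for title, year, _ in rows if year >= 2023]
total_downloads = sum(downloads for _, _, downloads in rows)

print(recent_titles)
print(total_downloads)
```

Using `.get()` with a default is a small safeguard worth keeping, since API records often omit optional fields.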
Google Colab
Colab FAQs at https://research.google.com/colaboratory/faq.html
PRACTICE: Retrieving Data Using Python and APIs
Please save a copy of this script to your drive so you can edit it
Transition to Data Cleaning and Visualization
Key Technologies
Natural Language Toolkit (NLTK)
Pandas
Matplotlib
Wordcloud and Gensim
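To give a feel for the cleaning step that comes before visualization, here is a rough standard-library stand-in for the tokenize, filter, and count sequence that NLTK and Pandas typically handle. It is not the workshop notebook's code: the stopword list is a tiny illustrative set, and the sample sentence is invented.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; NLTK ships much fuller ones.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "for", "with"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def clean_tokens(tokens, stopwords=STOPWORDS):
    """Drop stopwords and one-letter tokens."""
    return [t for t in tokens if t not in stopwords and len(t) > 1]


text = "The API returns data, and the data is cleaned for plotting with Python."
counts = Counter(clean_tokens(tokenize(text)))
print(counts.most_common(3))
```

Word counts like these are the kind of input a bar chart or word cloud is built from.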
Link to publicly shared Google Colab Notebook #2:
https://colab.research.google.com/drive/16FIcE9QUmvD6p1UhZMtD7g_cFZiPLugL?usp=sharing
PRACTICE: Cleaning and Visualizing Data
Wrap Up
Questions? Comments?
Spring 2025 Data & Donuts
Workshop Recording and Materials
Available later today at https://guides.lib.utexas.edu/data-and-donuts
Next Data & Donuts workshop: Tomorrow!
Making Beautiful Plots in R’s ggplot2
12pm - 1:15pm on Zoom and in the PCL Scholars Lab