(Meta)data Analysis for DEI
The University of Alabama
Brian Clark & Catherine Smith
#coreforum2022
10/15/2022
Impetus & Overview of Methodology
Selected LCSH
Diagram of Workflow
1. MARC records: read into PyMARC (Python)
2. Meets criteria for analysis?
   No: discard
   Yes: write to CSV
3. CSV file: read into R
4. Clean and normalize data (R)
5. Sort/group records by subject headings (R)
6. Produce visualizations of data (Python)
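The "meets criteria, then write to CSV" step of the workflow can be sketched as follows. This is a minimal illustration using only the standard library, not the presenters' script: records are assumed to be already reduced to an identifier plus a list of LCSH strings, and `SELECTED`, `first_a`, and `write_matching` are hypothetical names introduced here.

```python
import csv
import io
import json

# Hypothetical stand-in for the "Selected LCSH" list; the real list is not given in the slides.
SELECTED = {"Women", "Indigenous peoples"}


def first_a(heading):
    """Return the $a value of a heading string such as '$aWomen$zAsia$xHistory.'."""
    for part in heading.split("$"):
        if part.startswith("a"):
            return part[1:].rstrip(".")
    return ""


def write_matching(records, out):
    """Write records with at least one selected heading to CSV; return how many were kept."""
    writer = csv.writer(out)
    writer.writerow(["record_id", "lcsh"])
    kept = 0
    for rec in records:
        if any(first_a(h) in SELECTED for h in rec["lcsh"]):
            # The heading list is stored in a single cell as JSON.
            writer.writerow([rec["id"], json.dumps(rec["lcsh"])])
            kept += 1
    return kept


# Hypothetical sample records for illustration only.
sample = [
    {"id": "rec1", "lcsh": ["$aWomen$zAsia$xHistory.", "$aFeminism$zAsia."]},
    {"id": "rec2", "lcsh": ["$aGeology$zAlabama."]},
]
buffer = io.StringIO()
kept = write_matching(sample, buffer)
rows = list(csv.reader(io.StringIO(buffer.getvalue())))
```

Records that fail the test are simply skipped, which corresponds to the "discard" branch of the diagram.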
Python: PyMARC
Python: 650 field decision tree
650 in record?
No: Append 'NULL' to output row
Yes: check each 650 field
ind 2 = 0?
No: Ignore
Yes: Append to LCSH list
Count items in LCSH list
Contains items?
No: Append 'NULL' to output row
Yes: Append to output row as JSON
Return list
650 fields in record:
["=650 \0$aWomen$zAsia$xHistory.$0http://id.loc.gov/authorities/subjects/sh85147274",
"=650 \0$aWomen$zAsia$xSocial conditions.$0http://id.loc.gov/authorities/subjects/sh85147274",
"=650 \0$aFeminism$zAsia.$0http://id.loc.gov/authorities/subjects/sh85047741",
"=650 \4$aFeminism$zAsia.",
"=650 \4$aWomen$zAsia$xHistory.",
"=650 \4$aWomen$zAsia$xSocial conditions."]
LCSH list:
["$aWomen$zAsia$xHistory.",
"$aWomen$zAsia$xSocial conditions.",
"$aFeminism$zAsia."]
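The decision tree above can be sketched in plain Python. This is an illustration under stated assumptions rather than the presenters' code: it works on the mnemonic-style strings shown in the example instead of PyMARC objects, `parse_650` and `lcsh_cell` are hypothetical helper names, and dropping the `$0` URI subfield is inferred from the example output.

```python
import json


def parse_650(line):
    """Split a mnemonic-style 650 line into (second indicator, [(code, value), ...])."""
    # Everything before the first "$" is the tag plus indicators, e.g. "=650 \0".
    head, _, body = line.partition("$")
    ind2 = head[-1:]  # last character before the subfields
    subfields = [(part[0], part[1:]) for part in body.split("$") if part]
    return ind2, subfields


def lcsh_cell(lines):
    """Return the output-row value for one record: a JSON list of LCSH, or 'NULL'."""
    lcsh = []
    for line in lines:
        if not line.startswith("=650"):
            continue
        ind2, subfields = parse_650(line)
        if ind2 != "0":
            continue  # second indicator is not 0, so the heading is ignored
        # Assumption: the $0 URI is dropped, as in the example output.
        heading = "".join(f"${code}{value}" for code, value in subfields if code != "0")
        if heading:
            lcsh.append(heading)
    # An empty list and a record with no 650 fields both yield 'NULL'.
    return json.dumps(lcsh) if lcsh else "NULL"


# Raw strings keep the backslash that marks the blank first indicator.
example = [
    r"=650 \0$aWomen$zAsia$xHistory.$0http://id.loc.gov/authorities/subjects/sh85147274",
    r"=650 \0$aWomen$zAsia$xSocial conditions.$0http://id.loc.gov/authorities/subjects/sh85147274",
    r"=650 \0$aFeminism$zAsia.$0http://id.loc.gov/authorities/subjects/sh85047741",
    r"=650 \4$aFeminism$zAsia.",
    r"=650 \4$aWomen$zAsia$xHistory.",
    r"=650 \4$aWomen$zAsia$xSocial conditions.",
]
cell = lcsh_cell(example)
```

Storing the list as JSON in a single cell keeps one row per record while still allowing the headings to be split apart later.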
R
Clean
Organize
Analyze
Utility vs. Exploration
Sample Findings
Cooccurrences of LCSH w/ "Indigenous peoples"-related LCSH
Cooccurrences of LCSH w/ "Women"-related LCSH
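A cooccurrence count of this kind can be sketched with `collections.Counter`. This is a simplified illustration, not the analysis behind the charts (which the workflow places in R): it assumes each record is a list of heading strings, matches only on an exact `$a` value rather than on a broader set of related headings, and uses hypothetical helper names and sample records.

```python
from collections import Counter


def main_heading(lcsh):
    """Return the $a value of a heading string such as '$aWomen$zAsia$xHistory.'."""
    for part in lcsh.split("$"):
        if part.startswith("a"):
            return part[1:].rstrip(".").strip()
    return ""


def cooccurrences(records, target):
    """Count how often each other main heading shares a record with `target`."""
    counts = Counter()
    for headings in records:
        # A set, so a heading repeated within one record is counted once.
        mains = {main_heading(h) for h in headings} - {""}
        if target in mains:
            counts.update(mains - {target})
    return counts


# The first record comes from the slide example; the others are hypothetical.
records = [
    ["$aWomen$zAsia$xHistory.", "$aWomen$zAsia$xSocial conditions.", "$aFeminism$zAsia."],
    ["$aWomen$xEmployment.", "$aFeminism."],
    ["$aIndigenous peoples$xLand tenure."],
]
counts = cooccurrences(records, "Women")
```

Counting per record rather than per field avoids inflating totals when a record carries several subdivided forms of the same heading.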
Questions?
B. Clark & C. Smith (2022) "Prioritizing the People: Developing a Method for Evaluating a Collection’s Description of Diverse Populations," Cataloging & Classification Quarterly, DOI: 10.1080/01639374.2022.2090042
GitHub repo: https://github.com/bpclark2/Core-Forum-2022