1 of 40

Historic Black Lives Matter: Recovering Hidden Knowledge in Archives Through Interactive Data Visualization

Lori A. Perine

AIC Research Fellow/Doctoral Candidate

University of Maryland/INFO

Assistant Professor, Mathematics, Statistics, and Data Science

Montgomery College

Presented Online to: Student Datathon at Spelman College

November 1, 2024

2 of 40

Welcome/Today’s Presentation

  • HBLM Background/Motivation
    • Why Visualize Archives
    • Extension of CT-LOS
    • HBLM Design and Research Objectives
  • The Data: Legacy of Slavery Collection at the Maryland State Archives
  • The Design: Interactive Dashboard on Tableau
    • Design Elements
    • Recovered Knowledge
  • Key Takeaways
  • Questions and Answers

3 of 40

Background and Motivation

4 of 40

Why Visualize Archival Records?

  • Opens possibilities to view the historical record through a different lens.
  • Better represents persons and events that tend to be marginalized or “erased” in the records.
  • Aids discovery and enhances access to information and stories buried in archival records.

5 of 40

HBLM extends CT-LOS

  • Interdisciplinary research presented and published at IEEE Big Data 2020, CAS Workshop #5
  • Explore application of computational methods to enhance discovery of histories of marginalized communities.
  • Adopt computational thinking for case study conceptualization and implementation
  • Address socio-technical context in application of computational treatments.
  • Demonstrate experiential, interdisciplinary, team-based learning for information professionals
  • Create learning/teaching artifacts
  • UMD Collaborators: Dr. Richard Marciano, R.K. Gnanasekaran, P. Nicholas, A. Hill
  • Community Partner/Collaborator: Maryland State Archives, Legacy of Slavery Program

Research Topics Proposed by CT-LOS

  • Interactive visualization with existing data sets
  • Textual data-mining with existing data sets
  • Graph database and cross-collection connections
  • Adapt ontology for enslaved populations for better representation of actors, relationships, and events
  • Metadata extensions or “retrofits” to enable connections across collections
  • Probability models and machine learning to automate connection discovery

6 of 40

HBLM Design Overview

  • Extension of the CT-LoS research, focusing on the manumissions data
  • Create a series of web-based visualizations to enhance the MSA’s interactive offerings, which target educators and researchers
  • Explore how design principles and interactive and/or dynamic implementation can enhance user engagement with collections.
  • Create learning/teaching artifacts
  • Community Partner: Maryland State Archives, Legacy of Slavery Program

Research Methodology: Data visualization design and case study. Dissemination via public web-based interactive data visualization and published blog on the VisUMD site via Medium.

Design Objectives and Design Methods

(OBJ1) Ensure representation of the human beings encoded within the data
Method: Multiple interactive interfaces to vary perspective of data engagement. Uniform color coding, element layout, and use of dynamic and interactive elements.

(OBJ2) Provide an interface that allows users to easily engage with the data
Method: Beta testing and user evaluation

(OBJ3) Facilitate discovery and communication of information contained in the collections, while taking into account the limitations of the data.
Method: Selective use of design elements and interactive interfaces to represent 1) people, 2) time and place, 3) geography, and 4) overview.

7 of 40

The Data

8 of 40

Data Sourcing: Freedom Records from the MSA (I)

Manumissions Documents

Certificate of Freedom:

9 of 40

Data Sourcing: Freedom Records from the MSA (I)

A manumission is the legal document freeing an enslaved person. Manumissions can be found in land, probate, and chattel records. There is also a separate record series called Manumissions.

A Certificate of Freedom is a legal document that was issued to African Americans who were required to record proof of their freedom in the county court. The court would then issue them a Certificate of Freedom. If the person had been previously manumitted by an act of the slaveholder, the court clerk or register of wills would look up the manumitting document before issuing a Certificate of Freedom.

10 of 40

Enhanced Data Flow from Original Source Documents to Computational Exploration

11 of 40

Basic Descriptive Analytics: Manumissions & Certificates of Freedom Data

Certificates of Freedom

  • Number of Records: 23,655 (*possible duplicate records)
  • Geographic Coverage: 16 counties & 1 city
  • Year of Issue: 1806-1864
  • Age Range: 3 months to 82 years old
  • Male: 93% (21,887)
  • Female: 5% (1,082)
  • Unknown: 3% (686)

Manumissions

  • Number of Records: 7,399 (initial cleaned)
  • Geographic Coverage: 10 of 24 counties
  • Year of Issue: 1770-1870 (delayed manumissions)
  • Age Range: 0 (infant) to 80 (original max = 237)
  • Male: 47.9%
  • Female: 51.7%
  • Unknown: 523
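The sex breakdown reported for the 23,655-record dataset can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the counts shown on the slide; the variable names are illustrative.

```python
# Counts as reported on the slide for the 23,655-record dataset.
total_records = 23_655
counts = {"Male": 21_887, "Female": 1_082, "Unknown": 686}

# The three categories should account for every record.
assert sum(counts.values()) == total_records

# Share of each category, as a percentage rounded to one decimal place.
shares = {label: round(100 * n / total_records, 1) for label, n in counts.items()}

for label, pct in shares.items():
    print(f"{label}: {pct}% ({counts[label]:,})")
```

Reporting one decimal place makes the effect of rounding to whole percentages visible.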

12 of 40

Basic Data Exploration in the Manumissions Collection

Number of Records:

  • Anne Arundel: 3380
  • Caroline: 38
  • Cecil: 2
  • Carroll: 3
  • Dorchester: 254
  • Harford: 90
  • Kent: 76
  • Montgomery: 159
  • Queen Anne’s: 3017
  • Talbot: 319

Age Distribution by County
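The per-county record counts listed on this slide can be totaled and ranked to show how concentrated the collection is. This is a minimal sketch that uses only the counts from the slide; county names follow their standard Maryland spellings, and the variable names are illustrative.

```python
# Manumission record counts per county, as listed on the slide.
county_records = {
    "Anne Arundel": 3380, "Caroline": 38, "Cecil": 2, "Carroll": 3,
    "Dorchester": 254, "Harford": 90, "Kent": 76, "Montgomery": 159,
    "Queen Anne's": 3017, "Talbot": 319,
}

total = sum(county_records.values())

# Rank counties from most to fewest records.
ranked = sorted(county_records.items(), key=lambda item: item[1], reverse=True)

for county, n in ranked:
    print(f"{county:<14}{n:>6}{100 * n / total:>7.1f}%")

# Combined share of the two largest counties.
top_two_share = 100 * (ranked[0][1] + ranked[1][1]) / total
print(f"Top two counties: {top_two_share:.1f}% of {total:,} records")
```

The listed counts sum to 7,338, and the two largest counties (Anne Arundel and Queen Anne’s) hold about 87% of them, which matters when reading county comparisons elsewhere in the dashboard.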

13 of 40

Data Visualization in R: Frequency Distribution of Manumissions Documents by Year and County

Histogram with ggplot2

Scatterplot with ggvis

14 of 40

Research Topics Proposed by CT-LOS

  • Interactive visualization with existing data sets
  • Textual data-mining with existing data sets
  • Graph database and cross-collection connections
  • Adapt ontology for enslaved populations for better representation of actors, relationships, and events
  • Metadata extensions or “retrofits” to enable connections across collections
  • Probability models and machine learning to automate connection discovery

Graph Image Source: Shimizu, C. et al., The enslaved ontology: Peoples of the historic slave trade. Journal of Web Semantics, Vol. 63, August 2020. https://doi.org/10.1016/j.websem.2020.100567

15 of 40

The Design

16 of 40

Viz Design Challenges

  • Text, text, text!
  • Non-standard information
  • Representation of human beings
  • Engage and Communicate
  • Ideal design vs. technical capability

17 of 40

18 of 40

Viz Demonstration – Tableau Dashboard

  • https://public.tableau.com/app/profile/l.a.perine/viz/ManuVizPresentation/ManumissionsintheStateofMaryland1774-1874

19 of 40

Design Element 1: The People

How old were people when they were granted their freedom? Were they male or female? Are there differences among the counties in who was granted freedom, and when?

  • Interactive population pyramid most directly represents enslaved persons
  • Highlighted by its position on the top left
  • Vertical axis is age group and the horizontal axis is number of manumissions records
  • The tooltip feature of Tableau allows the user to see the exact count in each bar by hovering the mouse over the bar.
  • Female/male sides are distinguished by color.
  • Default view is static, illustrating information for all counties and all years within the data.
  • The general year slider and/or county selector can be used to spotlight information about the population on whom manumission was conferred, by county, by range of years, or both.
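The data behind a population pyramid like the one described here is a count of records per age group and sex. The sketch below shows one way to build it with the standard library. The ten-year age bands, the field layout, the sample rows, and the choice of which sex goes on the left are all assumptions made for illustration; the dashboard itself was built in Tableau.

```python
from collections import Counter

def age_group(age, width=10):
    """Label the age band containing `age`, e.g. 23 -> '20-29'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

# Hypothetical (age in years, sex) rows, invented for illustration only.
people = [(4, "F"), (23, "F"), (27, "M"), (31, "M"), (35, "F"), (62, "M")]

# One count per (age group, sex) pair: the bars of the pyramid.
pyramid = Counter((age_group(age), sex) for age, sex in people)

# Order age groups numerically by their lower bound.
groups = sorted({group for group, _ in pyramid}, key=lambda g: int(g.split("-")[0]))

# Common pyramid convention (assumed here): one sex is drawn to the left
# as negative bar lengths, the other to the right as positive lengths.
for group in groups:
    left = -pyramid[(group, "F")]
    right = pyramid[(group, "M")]
    print(f"{group:>6}  F {left:>3}  M {right:>3}")
```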

20 of 40

Design Element 1: The People

Example of Recovered Knowledge

The distribution of ages at which enslaved people were granted freedom, and whether the person was female or male, provides a window into how legal norms and practices impacted individual lives of Maryland’s Black antebellum population. By using the interactive features, the user can examine how patterns vary over time and by county. It should be noted that this visualization does not allow us to distinguish certain phenomena uncovered through other analyses, such as delayed manumissions, which confer freedom at a later age.

21 of 40

Design Element 2: Time and Place

What are the trends in manumissions over time? How do they compare from county to county?

  • Interactive trend plot of the annual frequency (number) of manumissions by county.
  • Linked directly to the population pyramid, via the year and county selectors, for a complete view of trends in manumissions and the populations they affected
  • Vertical axis represents the number of manumissions. Horizontal axis represents year
  • Interactive slider creates dynamic timeline effect
  • A range of years can be chosen for a spotlight view of the trend lines during a customized period of time.
  • The tooltip feature gives the county, year, and the number of manumissions
  • Common color coding is used to represent the trend lines of individual counties.
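The year slider and county selector described for this element amount to two optional filters applied before counting records per county and year. The sketch below illustrates that logic in plain Python. The function name, the record layout, and the sample rows are hypothetical; the dashboard implements this interactively in Tableau.

```python
from collections import Counter

def annual_trend(records, year_range=None, counties=None):
    """Count records per (county, year) after applying optional filters."""
    trend = Counter()
    for county, year in records:
        if year_range is not None and not (year_range[0] <= year <= year_range[1]):
            continue  # outside the span chosen on the year slider
        if counties is not None and county not in counties:
            continue  # county not picked in the county selector
        trend[(county, year)] += 1
    return trend

# Hypothetical (county, year) rows, invented for illustration only.
records = [
    ("Anne Arundel", 1810), ("Anne Arundel", 1810), ("Kent", 1810),
    ("Kent", 1825), ("Talbot", 1825), ("Anne Arundel", 1832),
]

everything = annual_trend(records)  # default view: all years, all counties
spotlight = annual_trend(records, (1820, 1835), {"Kent", "Talbot"})
print(sorted(spotlight.items()))
```

Because both filters default to `None`, the same function serves the unfiltered view and any spotlight view.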

22 of 40

Design Element 2: Time and Place

Example of Recovered Knowledge

The trends show distinct periods when manumissions rise and fall, with some peaks occurring well in advance of the Civil War and Maryland’s emancipation declaration in 1864. These patterns direct our attention to historical events or movements that facilitated or hindered freedom for enslaved populations.

23 of 40

Design Element 3: Geographic Focus

How do counties rank by numbers of manumissions and how does that change over time?

  • Dynamic ranking plot of the number of manumissions by county.
  • Automated, with values changing in annual increments.
  • Vertical axis lists the counties, using the common color coding
  • Length of each bar represents the annual number of manumissions in the county.
  • Counties are only visible on the vertical axis if data is available for that particular year
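A ranking plot that advances year by year needs, for each year, the counties that have records, ordered by count. The sketch below shows one way to compute those frames. The function name and the sample rows are hypothetical, and this is not the Tableau implementation.

```python
from collections import Counter, defaultdict

def yearly_rankings(records):
    """Map each year to its counties ranked by record count, highest first."""
    per_year = defaultdict(Counter)
    for county, year in records:
        per_year[year][county] += 1
    # A county appears in a year's ranking only if it has records that year.
    return {year: counts.most_common() for year, counts in sorted(per_year.items())}

# Hypothetical (county, year) rows, invented for illustration only.
records = [
    ("Kent", 1810), ("Anne Arundel", 1810), ("Anne Arundel", 1810),
    ("Talbot", 1811),
]

rankings = yearly_rankings(records)
for year, ranked in rankings.items():
    print(year, ranked)
```

Stepping through `rankings` in key order reproduces the annual increments of the animation.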

24 of 40

Design Element 3: Geographic Focus

Example of Uncovered Knowledge

This visualization presents a more granular view of the level of manumissions in counties. Manumissions peaked in different counties at different times. Observing these phenomena invites us to research further the potential temporal drivers and inhibitors for conferring legal freedom.

25 of 40

Design Element 4: The Big Picture

What are the years, counties, and numbers of manumissions records represented in the dataset? What are the general trends throughout the period when records are kept?

  • Classic stacked frequency bar chart provides a baseline visualization of the dataset.
  • Visualization is static, to provide a backdrop and reference for the other elements of the dashboard
  • Horizontal axis represents year in five-year intervals and the height of the bar (vertical axis) is the total number of manumissions records.
  • Length of bars represents the annual number of manumissions in the county.
  • The bars are segmented by county, using the common colors.
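If the bars aggregate records into five-year intervals, which is one reading of the axis description, the stacked chart reduces to a count per interval and county. The sketch below illustrates that reading only. The binning rule, the names, and the sample rows are assumptions, not the dashboard's actual configuration.

```python
from collections import Counter, defaultdict

def five_year_bin(year, width=5):
    """Return the first year of the interval containing `year` (1832 -> 1830)."""
    return year - year % width

# Hypothetical (county, year) rows, invented for illustration only.
records = [
    ("Anne Arundel", 1810), ("Anne Arundel", 1812), ("Kent", 1814),
    ("Kent", 1831), ("Talbot", 1833),
]

# stacks[interval start][county] = number of records (one bar segment).
stacks = defaultdict(Counter)
for county, year in records:
    stacks[five_year_bin(year)][county] += 1

for start in sorted(stacks):
    segments = stacks[start]
    total = sum(segments.values())  # full bar height
    print(f"{start}-{start + 4}: total={total} segments={dict(segments)}")
```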

26 of 40

Design Element 4: The Big Picture

Example of Uncovered Knowledge

There is a fairly robust pattern of manumissions in Maryland during the first decades of the 19th century, which dropped sharply in the 1830s and stayed at that level until statewide emancipation in 1864. This pattern can be mapped to key historical events in Maryland that first encouraged, then suppressed freedom.

27 of 40

Conclusions/Takeaways

28 of 40

Key Takeaways

  • Dynamic and interactive elements help to engage interest and tell a story
  • Position of elements aids interaction and engagement
  • Variety in elements supports investigation
  • Similar information presented with different marks and interactivity can support broader communication
  • Static elements and background provide context/reference points
  • Color palette is important – colorblind user feedback
  • Fewer elements + high interactivity = interest + usability

L. Perine. (Dec. 2020). Historic Black Lives Matter: Visualizing Hidden Heritage in Legacy of Slavery Collections. [Blog] VisUMD: Visualization at University of Maryland, Available at: https://medium.com/visumd/historic-black-lives-matter-visualizing-hidden-heritage-in-legacy-of-slavery-collections-23d3266dd0c5

29 of 40

Thank You

Questions & Discussion

CONTACT:
PROF. LORI A. PERINE
LPERINE@MONTGOMERYCOLLEGE.EDU
LPERINE@UMD.EDU

30 of 40

Backup Slides

31 of 40

Computational Treatments to Recover Erased Heritage: A Legacy of Slavery Case Study (CT-LoS)

AN INITIAL EXPLORATION OF OPPORTUNITIES AND LIMITATIONS

32 of 40

CT-LoS Project Overview

  • Interdisciplinary research presented and published at IEEE Big Data 2020, CAS Workshop #5
  • Explore application of computational methods to enhance discovery of histories of marginalized communities.
  • Adopt computational thinking for case study conceptualization and implementation
  • Address socio-technical context in application of computational treatments.
  • Demonstrate experiential, interdisciplinary, team-based learning for information professionals
  • Create learning/teaching artifacts
  • UMD Collaborators: Dr. Richard Marciano, R.K. Gnanasekaran, P. Nicholas, A. Hill
  • Community Partner/Collaborator: Maryland State Archives, Legacy of Slavery Program

Research Questions and Research Methods

(RQ1) What are the opportunities and limitations for using computational methods and open source tools to characterize data encoded within records of enslavement and to discover new patterns and relationships in that data?
Method: Apply computational methods associated with “big data” to information contained within text-based records of archival collections related to slavery.

(RQ2) How does knowledge of social and cultural systems impact those opportunities and limitations?
Method: Investigate, first, the socio-cultural context in which the original source artifacts were created and collected; and second, the socio-technical context for converting those artifacts into digital formats.

Research Methodology: Exploratory case-study organized around use of primarily open source data analytics tools and methods applied to two datasets from the MSA collections. Dissemination via peer-reviewed paper, presentations, and Jupyter notebooks.

33 of 40

Computational Thinking Practices Identified for CT-LoS

34 of 40

Data Sourcing Methods

  • Direct access to underlying SQL database using web scraping techniques (Python)
  • Extraction of primary fields (Python)
  • Translation into basic .CSV data tables for easy access (Python to Excel)
  • Data elements may be lost in technical translation, as well
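The extraction and CSV translation steps above can be sketched with Python's standard `csv` module. The field names, the `to_csv` helper, and the sample row are hypothetical and do not reflect the MSA database schema. The sketch also reports which fields were dropped, since that is where data elements can be lost in translation.

```python
import csv
import io

# Hypothetical list of "primary fields"; the real schema may differ.
PRIMARY_FIELDS = ["county", "year", "name", "age", "sex"]

def to_csv(rows, fields=PRIMARY_FIELDS):
    """Write the chosen fields to CSV text and report which fields were left out."""
    dropped = set()
    buffer = io.StringIO()
    # extrasaction="ignore" silently skips keys not listed in `fields`.
    writer = csv.DictWriter(buffer, fieldnames=fields, extrasaction="ignore")
    writer.writeheader()
    for row in rows:
        dropped.update(set(row) - set(fields))
        writer.writerow(row)
    return buffer.getvalue(), sorted(dropped)

# Hypothetical scraped row, invented for illustration only.
scraped = [
    {"county": "Kent", "year": 1810, "name": "(name)", "age": 30,
     "sex": "F", "remarks": "free-text note"},
]

csv_text, dropped_fields = to_csv(scraped)
print(csv_text)
print("Fields not carried into the CSV:", dropped_fields)
```

Keeping a record of the dropped fields documents what the tabular version no longer contains.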

35 of 40

CT-LoS Scope in Work Flow

36 of 40

Data Sourcing Considerations for Computational Exploration

  • Relational SQL databases created from original source documents
    • Non-standard primary sources
    • Transcription errors
    • Data entry errors
  • Limitations of two-dimensional technical tools
  • Assumptions regarding definitions of data fields and what/how to “mine” from original documents
  • How well are meaning and relationships captured by the digitized representations? What data elements are lost or ignored? Why? Who decides???

The sociotechnical context for datafication of the original sources is an important element of the data exploration.

37 of 40

Data “Biography”: Key sociotechnical consideration for applying computational methods

  • Application of computational thinking invites us to examine how socio-technical context informs the potential for extending metadata and methods: when, what, who, why, how, where
    • Original documents
    • Assumptions and technology for the digitization and datafication processes
    • Computational exploration, visualization, and linking
  • We need to do this in all phases of the workflow:
      • data preparation – abstraction as metadata?
      • data wrangling – decisions for addressing non-standard entries, transcription errors, possible duplicates, and missing information
      • data analysis and visualization – transparency, explainability, and interpretation
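One way to keep wrangling decisions transparent is to flag questionable records and leave them in the data for later review. The sketch below is illustrative only. The duplicate key (normalized name, county, year), the age threshold, the field names, and the sample rows are all assumptions, not the project's actual rules.

```python
from collections import Counter

MAX_PLAUSIBLE_AGE = 110  # assumed threshold; the source notes an original maximum of 237

def record_key(row):
    """Key used to spot possible duplicates (an assumed rule)."""
    return (row["name"].strip().lower(), row["county"], row["year"])

def flag_records(rows):
    """Return copies of the rows with a list of data-quality flags attached."""
    seen = Counter(record_key(row) for row in rows)
    flagged = []
    for row in rows:
        flags = []
        if seen[record_key(row)] > 1:
            flags.append("possible_duplicate")
        if row["age"] is None:
            flags.append("missing_age")
        elif not 0 <= row["age"] <= MAX_PLAUSIBLE_AGE:
            flags.append("implausible_age")
        flagged.append({**row, "flags": flags})
    return flagged

# Hypothetical rows, invented for illustration only.
rows = [
    {"name": "Person A", "county": "Kent", "year": 1810, "age": 30},
    {"name": "person a ", "county": "Kent", "year": 1810, "age": 30},
    {"name": "Person B", "county": "Talbot", "year": 1825, "age": 237},
    {"name": "Person C", "county": "Talbot", "year": 1825, "age": None},
]

for row in flag_records(rows):
    print(row["name"].strip(), row["flags"])
```

Flagging keeps each decision visible and reversible, in line with the transparency and explainability goals listed for the analysis phase.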

38 of 40

Data “Biography” Influences Computational Decisions

Original Documents

Digitization and Datafication

Computational Exploration

Provenance

X

X

X

Legal Context

X

X

Historical Context

X

X

X

Geographic Context

X

X

Transcription/ Translation

X

X

Technology Tools and Methods

X

X

when, what, who, why, how, where

39 of 40

Identifying Patterns: What happened around 1831/32?

40 of 40

LoS Data Biography: Maryland Historical Context

1642- The first cargo ship with 13 Africans arrives in St. Mary's City. The legal status of indentured servants and slaves in Maryland remains in contention.

1664- Maryland legalizes slavery.

1775- The Revolutionary War begins.

1783- Maryland prohibits the importation of slaves.

1783- The Maryland Gazette published "Vox Africanorum", an editorial denouncing the inequality in the newly formed America, which promoted liberty and freedom while enslaving thousands.

1796- The Maryland General Assembly liberalizes the state's manumission laws regarding how and when a slave owner can free his/her slaves.

1831- The Maryland Colonization Society forms to colonize Maryland blacks in Africa.

1832- In response to the Nat Turner Revolt, Maryland's legislature prohibits free blacks from entering the state.

1857- The U.S. Supreme Court hands down the Dred Scott decision, which denied African Americans equal rights as citizens.

1860- The Maryland General Assembly outlaws manumission by deed or will.

1861- The Civil War begins.

1862- Slavery is abolished in the District of Columbia.

1863- Lincoln issues the Emancipation Proclamation, which frees all slaves in the territories currently in rebellion.

1864- On November 1, slavery is abolished in Maryland.

1865- Slavery is abolished in all of the states by the 13th Amendment.