Introduction to
Digital Humanities
Week 2.1 – Data Culture(s)
database <-> narrative
> Lev Manovich, “Database as Symbolic Form” (1999)
Microsoft Encarta ’95 CD-ROM case
searchers vs. browsers
> searchers
Use the Princeton Postcard Collection to find how your college (e.g., Princeton or its colleges/buildings) is represented in the collection.
> browsers
Freely explore the Princeton Postcard Collection without a specific search term. Focus on discovering interesting, unexpected, or surprising items related to Princeton history.
searchers vs. browsers (ca. 10 mins)
humanities <-> data?
The Lonedale Operator (1911), D.W. Griffith
melodrama in silent film
Chaucer’s vs. Shakespeare’s language
Page from Oprah Winfrey’s Diary, Sept. 22, 1970
turned into data
| date      | event            | person mentioned         |
| 9-22-1970 | asked on a date  | Anthony                  |
| 9-25-1970 | study in library | Bella, Anthony, Thomas   |
| 9-27-1970 | birthday party   | Mila, Joan, Jess, Marc   |
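To make "turned into data" concrete, here is a minimal sketch of how the three diary rows above might be stored and queried in Python. The field names (`date`, `event`, `people`) and the ISO date format are assumptions chosen for illustration, not part of the original table.

```python
from collections import Counter

# The three diary rows, one dictionary per entry.
# Field names and the YYYY-MM-DD date format are illustrative assumptions.
diary_rows = [
    {"date": "1970-09-22", "event": "asked on a date", "people": ["Anthony"]},
    {"date": "1970-09-25", "event": "study in library", "people": ["Bella", "Anthony", "Thomas"]},
    {"date": "1970-09-27", "event": "birthday party", "people": ["Mila", "Joan", "Jess", "Marc"]},
]

# Count how often each person is mentioned across all entries.
people_counts = Counter(person for row in diary_rows for person in row["people"])
print(people_counts.most_common(1))
```

Note what the structure leaves out: the tone, the handwriting, and anything on the page that does not fit one of the three columns.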
Miriam Posner’s Silent Film & Melodramatic Conventions Dataset
screenshot of just one of my many, many ‘data’ folders
n-gram revolution
is literature data?
Google Books, in its way, represents an even more profound shift than the printing press, because it ends the relationship to the codex [...]
Before EEBO [Early English Books Online] arrived, every English scholar of the Renaissance had to spend time at the Bodleian library in Oxford; that’s where one found one’s material. But actually finding the material was only a part of the process of attending the Bodleian, where connections were made at the mother university in the land of the mother tongue. Professors were relics; they had snuffboxes and passed them to the right after dinner, because port is passed left. EEBO ended all that, because the merely practical reason for attending the Bodleian was no longer justifiable when the texts were all available online.
> you imply . . .
when you call something ‘data’
your n-gram-based research
n-gram revolution
Cora
Todd
Todd
Carl
Theo
my experience
Experiment: Who is mentioned more, Shakespeare or Milton?
PROGRAM CountMentionsInFiles:
    DATA txt_files
    VARIABLES:
        - txt_files_mentioning_milton = 0
        - txt_files_mentioning_shakespeare = 0
    FOR EACH txt_file in list of txt_files:
        file_content = read_file_content(txt_file)
        IF ' Milton ' is found in file_content:
            ADD 1 to txt_files_mentioning_milton
        IF ' Shakespeare ' is found in file_content:
            ADD 1 to txt_files_mentioning_shakespeare
    DISPLAY txt_files_mentioning_milton
    DISPLAY txt_files_mentioning_shakespeare
END PROGRAM
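The pseudocode above could be written in Python roughly as follows. This is a sketch, not the script actually used for the experiment: the function name `count_files_mentioning`, the assumption that the corpus is a folder of `.txt` files, and the UTF-8 encoding are all illustrative choices.

```python
from pathlib import Path

def count_files_mentioning(folder, names):
    """Count how many .txt files in `folder` contain each name at least once."""
    counts = {name: 0 for name in names}
    for txt_file in sorted(Path(folder).glob("*.txt")):
        # Assumes UTF-8 text; undecodable bytes are replaced rather than raising.
        file_content = txt_file.read_text(encoding="utf-8", errors="replace")
        for name in names:
            # Same naive test as the pseudocode: the name surrounded by spaces.
            if f" {name} " in file_content:
                counts[name] += 1
    return counts

# Example call ("corpus" is a placeholder folder name):
# count_files_mentioning("corpus", ["Milton", "Shakespeare"])
```

Like the pseudocode, this counts files rather than individual mentions, and it only matches a name with a space on both sides.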
Experiment: Who is mentioned more, Shakespeare or Milton?
Milton is mentioned in 2854 texts.
Shakespeare is mentioned in 2040 texts.
→ Milton : Shakespeare ratio = 1.3990196078
😎
Experiment: Who is mentioned more, Shakespeare or Milton?
the illustrations from Shakespeare in the notes
characteristics mentioned by Milton are found in Vergil
paſſage of Narciſſus probably gave Milton the hint
the great poets,— of even Shakespeare himself
as Shakespeare, Milton and Pope, the writers
such court-fools as Shakespeare might have
works of Shakespeare and John Mil-
ton have no
😬
😰
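The snippets above show why searching for a name with a space on each side misses cases: names followed by a comma, and names hyphenated across a line break. One possible repair, sketched here with Python's standard `re` module, is to rejoin hyphenated line breaks and match on word boundaries. The helper names `rejoin_hyphenated` and `mentions` are hypothetical.

```python
import re

def rejoin_hyphenated(text):
    """Join words split across a line break, e.g. 'Mil-' + newline + 'ton' -> 'Milton'."""
    return re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)

def mentions(name, text):
    """True if `name` occurs as a whole word, whatever punctuation surrounds it."""
    cleaned = rejoin_hyphenated(text)
    return re.search(rf"\b{re.escape(name)}\b", cleaned) is not None

sample = "works of Shakespeare and John Mil-\nton have no"
print(" Milton " in sample)        # naive test from the experiment
print(mentions("Milton", sample))  # boundary-aware test
```

A caveat on the design: rejoining every end-of-line hyphen also fuses genuine compounds that happen to break at a line end ("court-" + "fools" becomes "courtfools"), so this trades one kind of error for another.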
Shakespeare: 2674
Shakespere: 193
Shakspeare: 1467
Shakespear: 274
Shaksper: 22
Shackspeare: 8
Shackspear: 1
Shaxpere: 5
Shaxberd: 6
Shakspere: 691
Shakesspeare: 0
Shackspere: 4
Shakesphere: 2
shakespeare: 33
shakespere: 4
shakspeare: 40
shakespear: 24
shaksper: 1
shackspeare: 0
shackspear: 0
shaxpere: 0
shaxberd: 0
shakspere: 12
shakesspeare: 0
shackspere: 0
shakesphere: 0
Shakefpeare: 36
Shakospeare: 4
😭
Shakeſpeare: 125
Incidence of the word-forms "laft" and "last" in English documents from 1700 to 1900, according to Google's web n-grams database. The counts are based on OCR scans of books, which can misidentify the long s (ſ) as "f".
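One way to cope with case differences, the long s, and variant spellings is to normalise the text first and then count a list of known variants. The sketch below assumes the variant list from the slides (which is certainly incomplete) and counts individual occurrences in a string, which may differ from how the figures above were produced. The names `VARIANTS`, `normalise`, and `count_variants` are hypothetical.

```python
import re
from collections import Counter

# Spellings taken from the slides; assumed incomplete.
VARIANTS = [
    "Shakespeare", "Shakespere", "Shakspeare", "Shakespear", "Shaksper",
    "Shackspeare", "Shackspear", "Shaxpere", "Shaxberd", "Shakspere",
    "Shakesspeare", "Shackspere", "Shakesphere",
    "Shakefpeare", "Shakospeare",  # OCR misreadings seen in the corpus
]

def normalise(text):
    """Replace the long s with a round s, then lowercase."""
    return text.replace("ſ", "s").lower()

def count_variants(text, variants=VARIANTS):
    """Count whole-word occurrences of each known variant after normalisation."""
    alternatives = "|".join(re.escape(normalise(v)) for v in variants)
    pattern = re.compile(rf"\b(?:{alternatives})\b")
    return Counter(pattern.findall(normalise(text)))

print(count_variants("Shakeſpeare and SHAKSPEARE; shakespear, Shakefpeare."))
```

OCR errors such as "Shakefpeare" have to be listed explicitly: replacing every "f" with "s" would corrupt ordinary words, which is the same problem the "laft"/"last" figure illustrates.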
discussion
Data Biography
> Create a Data Biography for a humanities dataset: a narrative about the dataset’s lifecycle, including its creation, usage, and milestones.
> Learn more about it here: Krause, Heather. “Data Biographies: Getting to Know Your Data.” Global Investigative Journalism Network, 27 Mar. 2017.
Background check on a dataset:
“Getting to know your data can reveal crucial gaps, bias, misinformation, or overlooked details in your story.”
Data Biography
Your Data Biography should tell a story about the dataset that addresses the following key aspects:
Data Biography
> Some additional notes:
> Submit a 3-5 page paper (1,000-1,500 words) by email by October 15, 11:59 PM
> See full assignment description here
for next session
> Pre-Class Annotations (no reflection this time!)
references
Krause, Heather. “Data Biographies: Getting to Know Your Data.” Global Investigative Journalism Network, 27 Mar. 2017, https://gijn.org/stories/data-biographies-getting-to-know-your-data/.
Marche, Stephen. “Literature Is Not Data: Against Digital Humanities.” Los Angeles Review of Books, 28 Oct. 2012, https://lareviewofbooks.org/article/literature-is-not-data-against-digital-humanities/.
Michel, Jean-Baptiste, et al. “Quantitative Analysis of Culture Using Millions of Digitized Books.” Science, vol. 331, no. 6014, Jan. 2011, pp. 176–82.
Posner, Miriam. “Humanities Data: A Necessary Contradiction.” 25 June 2015.
Ramsay, Stephen. “The Hermeneutics of Screwing Around; or What You Do with a Million Books.” Pastplay: Teaching and Learning History with Technology, edited by Kevin B. Kee, University of Michigan Press, 2014, pp. 111–20.
Rosenberg, Daniel. “Data before the Fact.” Raw Data Is an Oxymoron, edited by Lisa Gitelman, MIT Press, 2013, pp. 15–40.