Data and Project Management

20-22 January 2022

Susanna Allés Torrent (University of Miami)

&

Gimena del Rio Riande (CONICET, Argentina)

About us

  • Designing and working on DH projects in Spain, the USA, and Argentina since 2010
  • Teaching (esp. postgraduate courses)
  • Building a global Spanish-speaking DH community
  • Focus on: Digital Scholarly Editing, Text Encoding Initiative, Minimal Computing, Open Access

Breakdown of the session

12:15 - 1:15 pm: Introduction and good practices

1:15 - 2 pm: Lessons Learnt from…

2 - 2:15 pm: Break

2:15 - 2:30 pm: Activity (breakout rooms)

  • 10 min. Case Scenarios
  • 20 min. Questions

2:30 - 2:45 pm: Final discussion: tell us about your project

Introduction and good practices

12:15 - 1:15 pm

Starting a DH project…

https://www.washington.edu/research/myresearch-lifecycle/

MyResearch Project Lifecycle

Before you start

Define the scope of the project:

  • What is/are the goal(s) of your project?
  • What problem(s) are you trying to solve?
  • Why are you doing this project?
  • What can you do with the resources available?
  • Who are you talking to?
  • What resources do you have available right now, without funding?
  • What funding opportunities exist for such a project?

Also, when you are a grad student:

  • How much time do you have, really?
  • Is this going to count towards your PhD (talk to your advisor)?
  • How does it fit into your big picture, and how could you keep developing it after your PhD?

Questions

  • Is your institution offering you any digital resources (e.g. people, technical support, software, hardware, etc.)?
  • Time constraints: What are your deadlines? What about other people’s deadlines?
  • What will your deliverables be? What do you hope to show as a result at the end of the project?

Think

  • Detailed plan
  • Roles & responsibilities
  • Ethics
  • Communication plan
  • Resources
  • Time constraints
  • Deliverables
  • Risks and success
  • Assessment

Resources for your DH project

Resources for your DH Project: Communication

Resources for your DH Project: Data Management, Storage and Preservation

  • Create a Data Management Plan: https://dmptool.org/. Examples: https://dmptool.org/public_plans
  • Create a project space to store and preserve datasets, images, documents, etc. during the project, considering size and access control, e.g. OSF (https://osf.io) or an institutional repository

Resources for your DH Project: Storage and Preservation

  • Develop or use software, scripts, and/or workflows and establish a team repository. E.g. GitHub (documentation can also live in GitHub Pages)
  • Preserve software, scripts, and/or workflows and establish a community. E.g. Zenodo, OSF
  • Preserve conference, training or workshop materials. E.g. Zenodo, OSF
  • Develop digital object management tracking tools (such as a spreadsheet, or database) for datasets, software, conference presentations, posters, preprints, and publications. E.g. Sheets in Google Drive.
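
As one possible shape for such a tracking tool, here is a minimal Python sketch that builds an inventory of project objects as CSV text, which can then be imported into Google Sheets or any other spreadsheet program. The column names and the two example rows are illustrative assumptions, not a standard; adapt them to your own project.

```python
import csv
import io

# Hypothetical columns for a project inventory (an assumption, not a standard).
FIELDS = ["id", "type", "title", "location", "license", "status"]

# Invented example rows, for illustration only.
ROWS = [
    {"id": "obj-001", "type": "dataset", "title": "Corpus v1",
     "location": "repository (DOI pending)", "license": "CC BY 4.0",
     "status": "draft"},
    {"id": "obj-002", "type": "poster", "title": "Project overview",
     "location": "shared drive", "license": "CC BY 4.0",
     "status": "published"},
]


def inventory_to_csv(rows, fields=FIELDS):
    """Return the inventory as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def count_by_type(rows):
    """Count how many tracked objects there are of each type."""
    counts = {}
    for row in rows:
        counts[row["type"]] = counts.get(row["type"], 0) + 1
    return counts


if __name__ == "__main__":
    # Print the CSV so it can be pasted or redirected into a file.
    print(inventory_to_csv(ROWS))
    print(count_by_type(ROWS))
```

Keeping the inventory in a plain CSV file means it can be versioned alongside the project and still be edited by team members who prefer a spreadsheet interface.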

Your data: How are you going to produce or recover your data?

Preliminary questions:

  • Where is your data? Is it already out there, or will you have to produce it?
  • What structure do you need for your research?

Data models:

  • Linear (unstructured)
  • Hierarchical meta-markup languages (semi-structured)
  • Tabular (structured)
  • Relational (structured)
  • Binary
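
To make the data models listed above more concrete, the following Python sketch expresses one small bibliographic record in each of them, using only the standard library. The record, the element names, the column names, and the table layout are all illustrative assumptions chosen for this example.

```python
import csv
import io
import sqlite3
import xml.etree.ElementTree as ET

# Linear (unstructured): plain running text with no explicit fields.
linear = "Oliver Twist, by Charles Dickens, was published in 1838."

# Hierarchical (semi-structured): nested elements in a markup language (XML).
xml_source = """<book>
  <title>Oliver Twist</title>
  <author>Charles Dickens</author>
  <date>1838</date>
</book>"""
book = ET.fromstring(xml_source)
xml_title = book.findtext("title")

# Tabular (structured): rows and columns, here as CSV.
csv_source = "title,author,year\nOliver Twist,Charles Dickens,1838\n"
table = list(csv.DictReader(io.StringIO(csv_source)))

# Relational (structured): several tables linked by keys, here in SQLite.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE books (title TEXT, year INTEGER, "
    "author_id INTEGER REFERENCES authors(id))"
)
conn.execute("INSERT INTO authors VALUES (1, 'Charles Dickens')")
conn.execute("INSERT INTO books VALUES ('Oliver Twist', 1838, 1)")
relational = conn.execute(
    "SELECT books.title, authors.name FROM books "
    "JOIN authors ON books.author_id = authors.id"
).fetchall()

# Binary: raw bytes that need a dedicated program to interpret them,
# e.g. an image. These eight bytes are the signature that opens a PNG file.
png_signature = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])

if __name__ == "__main__":
    print(linear)
    print(xml_title)
    print(table[0]["year"])
    print(relational)
    print(png_signature[1:4])
```

The more structure a model imposes, the easier the data is to query, but the more decisions have to be made up front about what counts as a field.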

File naming and storage/backup strategies

Depending on your data, you will need to create a file naming system:

  • Formats: txt, xml, csv, jpg, tiff, mp3, etc.
  • Be organized: clear structure of folders
  • Consistent and transparent file names (e.g. dickens_01, dickens_02,...)

Always use a cloud service or a version control system (e.g. GitHub), and/or keep a local backup copy
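
A naming convention like the one above is easier to keep when it is generated and checked by a script. The sketch below is one possible approach in Python: `make_name` builds names such as dickens_01.txt, and `is_consistent` checks whether an existing name follows the same pattern. The two-digit padding, the lowercase rule, and both function names are assumptions made for this example. Note that any character outside a-z and 0-9, including accented letters, is replaced by an underscore.

```python
import re

# Assumed convention: lowercase words joined by underscores, a zero-padded
# sequence number of at least two digits, and a lowercase extension.
PATTERN = re.compile(r"^[a-z0-9]+(?:_[a-z0-9]+)*_\d{2,}\.[a-z0-9]+$")


def make_name(prefix, number, extension, width=2):
    """Build a file name such as 'dickens_01.txt' from its parts."""
    # Replace every run of characters outside a-z and 0-9 with one underscore.
    slug = re.sub(r"[^a-z0-9]+", "_", prefix.lower()).strip("_")
    ext = extension.lower().lstrip(".")
    return f"{slug}_{number:0{width}d}.{ext}"


def is_consistent(filename):
    """Return True if the file name follows the assumed convention."""
    return PATTERN.match(filename) is not None


if __name__ == "__main__":
    for n in range(1, 4):
        print(make_name("Dickens", n, "txt"))
    print(is_consistent("dickens_01.txt"))
    print(is_consistent("Dickens 1.TXT"))
```

Zero-padded numbers keep files in the right order when they are sorted alphabetically, which is why dickens_02 is preferable to dickens_2.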

Platforms

Different ways to create your project:

  • Using existing platforms or frameworks: WordPress, Omeka (archives, collections), Drupal
    • Pros: Backed with structured data (relational databases); interoperability; popular systems; metadata systems in place; web interface with search, browsing, taxonomy, tags, etc.
    • Cons: More difficult to customize
  • Creating your own
    • Pros: Creating a digital infrastructure and interface that responds specifically to your needs.
    • Cons: Lack of interoperability; standalone projects; need for a reliable developer who will maintain the project or write consistent documentation

Funding: Think big, but start small

  • Start with a proof of concept
  • Be realistic about what you can achieve (time + resources)
  • Apply for small grants (e.g. at the university)
  • Scaffold your project (start with data query). Remember: Data query and curation is a scholarly activity

When to wrap up the project

Take into consideration:

  • Availability of data
  • Time
  • Resources
  • Team
  • Funding

Lessons Learnt from our projects

1:15 - 2 pm

Lessons learnt from our projects

  • Don’t be too ambitious
  • Plan in advance
  • Know your team
  • Make subteams, be clear about roles
  • Plan short activities
  • Document everything (minutes for calls, activities, etc.)
  • Open as much as you can
  • Report
  • Publish

Digital Narratives of COVID-19

https://covid.dh.miami.edu/

  • Project proposed from a very specific context
  • Conceived as an experiment
  • Conceived as a collaboration

More about this: “Exploring Digital Narratives of COVID-19 through Twitter” LASA Crisis Global. Desigualdades y Centralidad de la vida, May 27th, 2021

Goals:

  • Investigate online conversations about the current pandemic through the mining and examination of Twitter discourses in English and in Spanish worldwide
  • Create a corpus of tweets written in Spanish worldwide and from specific areas related to the coronavirus pandemic
  • Implement text analysis techniques from the digital humanities perspective by combining distant and close reading
  • Study what and how the public talks about the Covid-19 pandemic on social media, and how the discourses vary by language and region

Main Deliverables:

  • Downloadable collections of tweets (Mexico, Argentina, Colombia, Peru, Ecuador, Spain, Miami)
  • Release of Jupyter Notebooks to explore our corpus (frequency, concordances, sentiment analysis, …)
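
To give a sense of the kind of exploration mentioned above (frequency and concordances), here is a minimal Python sketch. It is not the project's notebook code: the sample texts are invented for this example rather than taken from the corpus, and the function names and the simple regular-expression tokenizer are assumptions.

```python
import re
from collections import Counter

# Invented sample texts, standing in for a small collection of tweets.
SAMPLE = [
    "Stay home and wash your hands",
    "Wash your hands before you eat",
    "Vaccines arrive next week",
]


def tokenize(text):
    """Split a text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def frequencies(texts):
    """Count how often each token occurs across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def concordance(texts, keyword, window=2):
    """Return (left context, keyword, right context) for every occurrence."""
    keyword = keyword.lower()
    lines = []
    for text in texts:
        tokens = tokenize(text)
        for i, token in enumerate(tokens):
            if token == keyword:
                left = " ".join(tokens[max(0, i - window):i])
                right = " ".join(tokens[i + 1:i + 1 + window])
                lines.append((left, token, right))
    return lines


if __name__ == "__main__":
    print(frequencies(SAMPLE).most_common(3))
    for left, word, right in concordance(SAMPLE, "hands"):
        print(f"{left:>12} [{word}] {right}")
```

Frequency counts support distant reading across the whole collection, while the concordance lines bring each occurrence back into its immediate context for close reading.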

Other resources

Text Technologies Hub

https://tthub.io/

More about this: “TEI en español: Why and How” TEI Conference and Members' Meeting 2021

"Next-Gen TEI" 2021

TTHub is a work in progress that gathers teaching materials in Spanish and other resources devoted to digital editing and, more generally, to text technologies. The site hosts several tutorials in its “Aprende” section, which can be read online or downloaded as PDFs from “TEI en español” on the academic repository Zenodo.

Mellon Grant (Public Knowledge):

  1. Better access to the Guidelines: Translation platform
  2. More teaching / pedagogical materials (intro, advanced, XSLT, different software, etc.)
  3. More Scholarly Narratives and Outputs: CFP for data articles in Spanish - Journal of the TEI
  4. More Training: TEI Schools in Spanish
  5. Survey on the uses and needs of the TEI in the Spanish-speaking world, bitly.com/encuestaTEI

Community-Led Projects

Asociación Argentina de Humanidades Digitales (AAHD)

  • Community of practice
  • Community Work without Funding or Staff
  • Self-organized group that grew out of several semi-formal and formal meetings dating back to 2013
  • SIGs // projects & activities

Characteristics of Community-Led Projects at AAHD

  • No direct funding
  • Participation is voluntary
  • Projects are collaborations between researchers, postgraduate students or groups
  • Participants define the project and its duration (1 or 2 years, or ongoing)
  • They address issues in DH that are difficult (or impossible) to overcome alone
  • All community members have access to participation and feedback (mailing list)
  • AAHD is owned by the community
  • Community-centered means more labor in relationship-building and feedback processes

Podcast Project

Five questions to digital humanists

  • Open call for training with certificates (two workshops)
  • Choose one or two digital humanists you’d like to interview
  • Schedule interviews (15-20 min)
  • Prepare podcast, upload, disseminate (trainers)
  • Article

Journal incubator project

  • OJS training with certificates
  • Editorial practices training with certificates
  • Editorial team
  • Reviewers team

Some of our projects… check them out… and ask us!

Break

2 - 2:15 pm

Activities & Discussion

2:15 - 2:45 pm

Option 1: Project Review

Imagine you are a reviewer working for Reviews in Digital Humanities (https://reviewsindh.pubpub.org/). During the process you are asked to answer this list of questions:

  • Are the scope and goals of the project clear?
  • Does the project have documentation available?
  • Are contents available in different languages?
  • Are authorship and citable information clearly available?
  • Is the data of the project available in an online repository?

Option 2: Digital Exhibition with Big Grant

Imagine you win a US$5,000,000 Mellon grant to develop a digital exhibition to digitize, exhibit, and explore important historical documents (you can imagine any type of object, e.g. private archives containing unpublished documentation, old photographs, …)

  • Who would be your audience?
  • How would you…
    • Organize your team and roles
    • Plan activities
    • Document
    • Spend the budget

Would you…

  • Hire a Project Manager?
  • Engage librarians? How and what for?
  • Engage citizens?

As an example, consider this: https://photogrammar.org/maps

Option 3: Student Oriented Project

Last year you taught an early modern literature course, and one of the texts your students worked on aroused a lot of interest: the memoir of a young lady who was enslaved (there is only one extant manuscript and an edition from the 19th century, which you have in your university library). For the next iteration of the course, which has a digital component, you decide to propose a collective digital edition of this text, to be prepared by all your students. Try to answer the following questions:

  • How would you organize the students? Would you assign different roles?
  • Which steps do you foresee the class would need?
  • Which tasks do you think would be required?
  • How would you imagine the workflow (text transcription, text encoding, storage, web interface, etc.)?
  • How do you balance technical expertise / labor with scholarly content?

As an example, consider https://minilazarillo.github.io/ or https://alicer98.github.io/DIGIT-110-AJC-Survey/document.html

Some bibliography and useful resources

Morgan, Paige & Yvonne Lam (2018). “Making Choices About Your Data.” Digital Humanities Summer Institute

PM4DH: Project Management for the Digital Humanities, https://scholarblogs.emory.edu/pm4dh/

Reed, Ashley (2014). “Managing an Established Digital Humanities Project: Principles and Practices from the Twentieth Year of the William Blake Archive.” Digital Humanities Quarterly 8.1

Siemens, Lynne (2018). “Issues in Large Project Planning and Management.” Digital Humanities Summer Institute

Siemens, Lynne (2020). “Project Management.” Digital Pedagogy in the Humanities. Concepts, Models, and Experiments. https://digitalpedagogy.hcommons.org/keyword/Project-Management
