1 of 13

Data Awareness and Data Management

Benito Trollip

18-19 October 2022

DH-Ignite

Coastlands Hotel, Umhlanga

benito.trollip@nwu.ac.za

License: CC BY 4.0

2 of 13

STRUCTURE 

  • Introduction
  • What is data?
  • What makes data important?
    • FAIR + CARE
  • Where and how can data be stored?
  • What is a repository?
    • Why SADiLaR’s repository?
  • Training and involvement

3 of 13

INTRODUCTION

  • Reasons for data awareness and management
    • Value of data
    • Labour intensive to produce data
    • Data can be seen as a form of output
    • Reusability
    • Advancing the field
  • Benefits (for you)
    • Visibility
    • Citability
    • Stored
    • Collaboration

4 of 13

WHAT IS DATA?

Please go to menti.com

Enter the following code

4102 6910

5 of 13

WHAT IS DATA? [1]

  • Consider this definition by Harrower et al. (2020):

“We could then define data in the humanities broadly as all materials and assets scholars collect, generate and use during all stages of the research cycle.”

6 of 13

WHAT IS DATA? [2]

  • It could therefore include:
    • Datasets
      • Corpora, wordlists, frequency lists
      • Interviews, qualitative questionnaire answers
    • Methodology and process
      • Code, methods used, workflow
    • Application(s)
      • Executable files or tools
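
To make the dataset types above concrete, here is a minimal sketch of how a frequency list and a wordlist might be derived from a small corpus. The function name `frequency_list` and the two sample sentences are invented for illustration; a real workflow would likely use a proper tokenizer for the language in question.

```python
from collections import Counter
import re


def frequency_list(texts):
    """Return (word, count) pairs, most frequent first, ties in alphabetical order."""
    counts = Counter()
    for text in texts:
        # Lowercase, then keep runs of letters only (drops punctuation and digits).
        counts.update(re.findall(r"[^\W\d_]+", text.lower()))
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Invented two-sentence "corpus", for illustration only.
corpus = [
    "Data is valuable.",
    "Data can be reused, and reused data saves labour.",
]

freqs = frequency_list(corpus)
# A wordlist is simply the sorted set of distinct words.
wordlist = sorted({word for word, _ in freqs})

for word, count in freqs[:3]:
    print(word, count)
```

Note that the script itself falls under "Methodology and process": sharing it alongside the resulting lists lets others reproduce or adapt the workflow.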

7 of 13

WHAT IS METADATA?

Please go to menti.com

Enter the following code

4102 6910

8 of 13

WHAT MAKES DATA IMPORTANT: FAIR + CARE principles

  • Findable
  • Accessible
  • Interoperable
  • Reusable

+

  • Collective benefit
  • Authority to control
  • Responsibility
  • Ethics
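
As a rough illustration of how the FAIR half of this list can translate into practice, the sketch below checks a dataset's metadata record for a few fields that loosely support each principle. The field names, the mapping in `FAIR_FIELDS`, and the sample record are all assumptions made for this example; they do not come from any specific repository or metadata standard, and passing such a check is not a FAIR assessment. The CARE principles concern governance and ethics and are not something a field check can capture.

```python
# Hypothetical field names, grouped by the FAIR principle they loosely support.
FAIR_FIELDS = {
    "Findable": ["identifier", "title"],
    "Accessible": ["access_url"],
    "Interoperable": ["file_format"],
    "Reusable": ["license"],
}


def missing_fair_fields(record):
    """Map each FAIR principle to the fields that are absent or empty in the record."""
    gaps = {}
    for principle, fields in FAIR_FIELDS.items():
        missing = [field for field in fields if not record.get(field)]
        if missing:
            gaps[principle] = missing
    return gaps


# Invented sample record, for illustration only.
record = {
    "title": "Example frequency list",
    "identifier": "",  # no persistent identifier assigned yet
    "access_url": "https://example.org/dataset",  # placeholder URL
    "file_format": "text/csv",
    "license": "CC BY 4.0",
}

print(missing_fair_fields(record))
```

Here the empty `identifier` is reported under "Findable", which mirrors a common gap: data that exists and is licensed but cannot be reliably cited or located.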

9 of 13

Box 2 in Wilkinson et al. (2016)

10 of 13

WHERE AND HOW CAN DATA BE STORED? [1]

  • Analogue - in a steel drawer? With a key?
  • Semi-digitally - a USB drive or CD-ROM
  • Other?
  • What is a repository?
  • Licensing

11 of 13

WHY SADiLaR’s REPOSITORY?

  • Specialist repository with different options
  • SADiLaR’s repository
    • Other NLP/ML outputs are there
    • Findable, also on Google Scholar
    • Linked to CLARIN
    • Open-source, free to access

12 of 13

TRAINING AND INVOLVEMENT

  • ESCALATOR programme:
    • ESCALATOR’s webpage with flyer
    • Join the Slack channel
  • Steps to submit to the repository are available here
  • Request workshops here
  • Contact me at benito.trollip@nwu.ac.za

13 of 13

THANK YOU FOR YOUR ATTENTION.

PRESENTATION AVAILABLE AT bit.ly/DHI-data

PLEASE FEEL FREE TO GET IN TOUCH:

benito.trollip@nwu.ac.za

info@sadilar.org