DATA LITERACY

Carrie Pirmann and Ben Hoover

07 June 2024

Goals

  • Understand what it means to be “data literate”
  • Know the steps of working with data
  • Learn to recognize what makes good (and bad) visualizations

What is data?

DATA* = FACTUAL INFORMATION THAT IS SYSTEMATICALLY RECORDED AND ANALYZED TO ANSWER A QUESTION

*Definition may vary by discipline

Data comes in many different forms!

Harvard College Alcohol Survey 2001

Darwin’s finches from the Galapagos Islands (beak adaptation to specific types of foods present on different islands inspired Darwin’s theory of evolution by natural selection)

Rosalind Franklin’s x-ray diffraction image of crystallized DNA (evidence of a double helix structure)

Types of data

  • Quantitative data deals with quantities, i.e., information that can be counted, measured, or otherwise expressed using numbers
    • Summarized and analyzed using traditional statistics or related methods
  • Mixed data combines quantitative and qualitative data
    • Analyzed using mixed (quantitative and qualitative) methods
  • Qualitative data deals with qualities, i.e., information that is descriptive and conceptual in nature, and cannot be easily expressed in numbers
    • Sources of qualitative data: text documents, interview transcripts, images, audio and video recordings, and other media
    • Requires qualitative summary and analysis (not statistics)
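
As a rough illustration of the distinction above, the sketch below summarizes a small quantitative variable with descriptive statistics and tallies researcher-assigned themes for a qualitative one. All values, theme labels, and variable names are invented for illustration. Counting themes is only one small piece of qualitative analysis, which mostly depends on careful interpretation.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical quantitative data: hours of sleep reported by six respondents.
sleep_hours = [6.5, 7.0, 8.0, 5.5, 7.5, 7.5]

# Quantitative data can be summarized with descriptive statistics.
quantitative_summary = {
    "n": len(sleep_hours),
    "mean": mean(sleep_hours),
    "median": median(sleep_hours),
}

# Hypothetical qualitative data: themes a researcher assigned to interview excerpts.
interview_themes = ["stress", "workload", "stress", "social life", "stress", "workload"]

# A tally shows how often each theme appears; it does not replace interpretation.
theme_counts = Counter(interview_themes)

print(quantitative_summary)
print(theme_counts.most_common())
```

A mixed-methods project might report both kinds of summary side by side.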

Data literacy & why it’s important

DATA LITERACY = THE ABILITY TO READ, WORK WITH, ANALYZE, VISUALIZE, INTERPRET, ARGUE WITH, AND USE DATA TO MAKE DECISIONS AND SOLVE PROBLEMS

  • Data literacy skills are essential 21st-century skills!
  • Necessary to advance scientific knowledge and to solve problems, big and small, in business, medicine, government, and other fields
  • You don’t need to be a data scientist – just keep developing the data literacy skills relevant to your academic & professional interests

Activity

Tell us about your project.

  1. What types of data are you planning to use?
  2. Where do you plan to search for them? (Who may have already collected the data, how, and why?)

Steps of working with data

  1. Formulate a question or hypothesis
  2. Acquire the data (collect new data or find a dataset you trust)
  3. Get to know your data (incl. the research methods and ethical guidelines)
  4. Prepare/“clean” the data for analyses and visualizations
  5. Decide on appropriate analyses or visualizations
  6. Interpret your results and tell a story with your data

1. Formulate a question or hypothesis

  • What do we (people) already know about the topic? (literature review)
  • What do you want to know?
  • What do you expect to find (e.g., pattern of results, differences between groups or conditions, cause and effect), and why?

2. Acquire the data (new or existing)

Two primary approaches:

  • Collect new data
    • You can optimize the study design to your research question or hypothesis
    • A new research study and data collection can be time-consuming & costly
  • Find an existing dataset 🡺 OPEN DATA
    • Carefully evaluate the source: Do you trust the authors? Is the data of high quality? Does it have all required documentation (codebook, IRB approval)?
    • Make sure you have permission to use the data and present/publish the results

2 strategies to find open data

There are many open-access, publicly available datasets online that you can use!

  1. Find an established data repository, and search for a dataset by topic or other attributes.
  2. Find a published research article and locate the original dataset used. (Many peer-reviewed professional journals require data sharing as a condition of publication. Some journals also curate a list of recommended data repositories.)

Open data repositories

From the Bertrand Library main page, go to Research by Subject Guides: Data Services 🡺 Data 🡺 Open Data

https://researchbysubject.bucknell.edu/c.php?g=956824&p=6906764

Some examples:

  • Inter-University Consortium for Political & Social Research (ICPSR) 🡺 built-in data analysis tools!

  • https://www.data.gov/ 🡺 home of U.S. government data
  • Census data (people) and Economic Census data (businesses)

3. Get to know your data

Even if you didn’t collect the data, understanding the research methods is critical to interpreting the results!

  • How was the data collected? Who collected it? When and where? With what measures or instruments? Using what study design? What was the study’s primary research question? Who funded it?
  • What were the relevant ethical research guidelines, and were they followed (e.g., IRB approval and informed consent)?
  • Which variables will you look at to answer your question or test your hypothesis? (The Codebook is your friend.)
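
One way to put a codebook to work is to check each value in a dataset against the values the codebook allows. The sketch below is a minimal, hypothetical example: the `codebook` dictionary, the variable names, the records, and the `find_invalid_values` helper are all invented for illustration and are not taken from any real study.

```python
# Hypothetical codebook: variable name -> description and allowed values.
codebook = {
    "age": {"description": "Age in years", "valid": range(18, 100)},
    "class_year": {"description": "Class year (1-4)", "valid": {1, 2, 3, 4}},
}

# Hypothetical records; the last one has a value the codebook does not allow.
records = [
    {"age": 19, "class_year": 1},
    {"age": 21, "class_year": 3},
    {"age": 20, "class_year": 7},
]


def find_invalid_values(records, codebook):
    """Return (row index, variable, value) for each value outside the codebook."""
    problems = []
    for index, record in enumerate(records):
        for variable, entry in codebook.items():
            value = record.get(variable)  # None if the variable is absent
            if value not in entry["valid"]:
                problems.append((index, variable, value))
    return problems


problems = find_invalid_values(records, codebook)
print(problems)
```

A flagged value is not necessarily an error. It is a prompt to go back to the study documentation and find out what the value means.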

Data quality & integrity is key!

We use data to learn something about the world, to draw valid and accurate conclusions, and to make the best, most informed decisions.

Good quality data = useful and beneficial

Poor quality data = useless and potentially harmful

3. Get to know your data

Bottom line:

Get to know your data and the methods used to collect it!

If you understand your data, you can:

  • Ask new questions and test new hypotheses
  • Be creative but also rigorous in your analyses
  • Understand the limitations of the data
  • Better interpret your results

Data ethics

Data ethics is part of research ethics – and it’s about trust

3 principles of responsible conduct of human subject research:

  • Respect for persons (a person needs to give informed consent to participate in any research study; they have the right to know what the study is about, as well as its risks and benefits; and they can withdraw their consent at any point)
  • Beneficence (minimize the risk while maximizing the benefit)
  • Justice (the risks and the benefits should be fairly distributed)

Data privacy

Increasingly important (again, it’s about trust!) – but the ethical guidelines and legal regulations are still evolving

  • What kinds of data can be ethically and/or legally collected on people? And under what conditions?
  • Who owns the data? (The person supplying the data? The researcher who collects it? The funding agency or the business that paid for the study? The government of the country?)
  • Who has the right to see the data? Use the data to make decisions (and what kinds of decisions)? Sell it and profit from the data?

Be an informed and responsible data user!

Break

Steps of working with data

  1. Formulate a question or hypothesis
  2. Acquire the data (collect new data or find a dataset you trust)
  3. Get to know your data (including the research methods and ethical guidelines)
  4. Prepare/“clean” the data for analyses and visualizations
  5. Decide on appropriate analyses or visualizations
  6. Interpret your results and tell a story about your data

4. Prepare/“clean” the data

Tips: Document all the changes you make to your data files, no matter how small, so you (or someone else) can repeat/replicate your processing steps, your analyses, and ultimately your results

  • Save your working data file with a new name; keep the original secure
  • Consider the tool you will use for data analyses or visualizations – and structure your data for that tool
  • Check for missing data, and decide how to deal with it
  • Be careful and consistent at each step to avoid errors
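
The tips above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not a prescribed workflow: the file names, the sample survey rows, and the helper functions `make_working_copy`, `count_missing`, and `log_change` are all hypothetical.

```python
import csv
import shutil
import tempfile
from pathlib import Path


def make_working_copy(original, working):
    """Copy the original file so that all edits happen on the copy."""
    shutil.copyfile(original, working)
    return Path(working)


def count_missing(path):
    """Count empty cells per column in a CSV file that has a header row."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        missing = {name: 0 for name in reader.fieldnames}
        for row in reader:
            for name in reader.fieldnames:
                if (row[name] or "").strip() == "":
                    missing[name] += 1
    return missing


def log_change(log_path, note):
    """Append one line describing a change made to the working file."""
    with open(log_path, "a") as log:
        log.write(note + "\n")


# Demonstration with an invented three-row survey file in a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    original = Path(folder) / "survey_original.csv"
    original.write_text("id,age,major\n1,19,Biology\n2,,History\n3,21,\n")

    working = make_working_copy(original, Path(folder) / "survey_working.csv")
    missing = count_missing(working)
    log_change(Path(folder) / "changes.log",
               f"Created {working.name}; missing values per column: {missing}")
    print(missing)
```

How to handle the missing values (drop the rows, fill them in, or analyze them as their own category) depends on the research question. Whatever you decide, record the decision in the change log.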

5. Analyze and visualize the data

The goal is to find the right analysis or the right visualization to answer your question or test your hypothesis.

Things to consider:

  • Type of data (e.g., quantitative, qualitative, etc.)
  • Study design used to collect the data
  • Limitations of the data
  • How the results will be used and by whom
  • Tool(s) you intend to use (e.g., statistical software)
  • Be as simple as you can be – but no simpler

Activity 1a

Share your visualization (take 2 minutes to review the Perception Deception reading)

  1. What are good aspects?
  2. What are bad aspects?

Activity 1b

Pair-Share

  • Create common criteria for evaluating visualizations

  • Share out your criteria

Activity 2

Good, Bad, and Ugly

  • Share out your review of one good and one bad visualization

Activity 3

XKCD

  • Share why you find the visualization interesting and your review (using your group’s criteria)

6. Interpret the results and tell a story about the data

  • Go back to your initial research question or hypothesis – and now answer it with data

  • Consider your audience, their needs, interests, and level of knowledge, and how they will use the results

  • The goal is to tell a clear, accurate, logical, and compelling story with your data

LinkedIn Learning is your friend!

LinkedIn Learning: https://www.bucknell.edu/linkedinlearning

  • Set up your LinkedIn professional profile
  • Use LinkedIn Learning online courses to learn or improve your data literacy skills (you can search by topic, skill, tool, etc.)
  • Display the completed courses on your LinkedIn profile to demonstrate your learning

Thank you!