1 of 17

Contemporary Research Data

Libby Hemphill

Resource Center for Minority Data, ICPSR

School of Information

SHARING DATA TO ADVANCE SCIENCE

2 of 17

Road Map

  • Digital exhaust as rich social science data
  • Capturing social media data
  • Creating impact with linked data
  • Research Examples

10/10/17

3 of 17

Digital Exhaust

4 of 17

Twitter Sunrise

https://carto.com/gallery/twitter-sunrise/

5 of 17

Cityways

http://senseable.mit.edu/cityways/

6 of 17

Capturing Social Media Data

Command Line

  • STACK
  • TCAT
  • Scrape
  • Roll your own

GUI/SaaS

  • Netlytic
  • Sysomos
  • NodeXL

7 of 17

Choosing a Capturing Strategy

  • What data do you need to answer your question(s)?
  • What data do you need to link appropriately to other data sets?
  • How comfortable are you writing Python/PHP?
  • How much storage do you have available?
  • Will you need a stream or a snapshot?
  • What data formats are you comfortable working with?
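
The storage question above can be roughed out before any capture starts. The sketch below is a back-of-envelope estimate only; the function name and every number in it (posts per day, bytes per post, capture length) are hypothetical placeholders, not figures from any project in this deck.

```python
def estimate_storage_gb(posts_per_day, avg_bytes_per_post, days):
    """Rough storage need, in gigabytes, for a continuous stream capture."""
    total_bytes = posts_per_day * avg_bytes_per_post * days
    return total_bytes / 1e9


# Hypothetical inputs: 50,000 posts/day, about 5 KB of JSON each, 90 days.
needed_gb = estimate_storage_gb(50_000, 5_000, 90)
print(f"{needed_gb:.1f} GB")
```

A snapshot capture would use the same arithmetic with the total post count in place of posts per day times days.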

8 of 17

STACK

9 of 17

Scraping

10 of 17

NodeXL

11 of 17

Creating Impact by Linking

  • Localized analysis
  • Surface demographic/regional differences
  • Trace information/ideas among platforms
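
One way linking might look in practice: join captured posts to a census extract on a shared geographic key, then compare regions on a per-capita basis. This is a minimal sketch under stated assumptions. The field names, the region codes, and the populations are all made up for illustration, and real captures would first need posts assigned to a region.

```python
from collections import Counter

# Hypothetical captured posts, each already tagged with a region code.
posts = [
    {"id": 1, "region_code": "99001"},
    {"id": 2, "region_code": "99001"},
    {"id": 3, "region_code": "99002"},
]

# Hypothetical census extract keyed by the same code (made-up populations).
census = {
    "99001": {"population": 1000},
    "99002": {"population": 4000},
}


def posts_per_1000(posts, census):
    """Link posts to census rows by region code; return posts per 1,000 residents."""
    counts = Counter(p["region_code"] for p in posts)
    return {
        code: 1000 * counts[code] / row["population"]
        for code, row in census.items()
    }


rates = posts_per_1000(posts, census)
for code, rate in sorted(rates.items()):
    print(code, rate)
```

Normalizing by population is what lets regional differences surface; raw post counts would mostly track where more people live.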

12 of 17

Research Examples

Project                                   Platform    Linked Data Sources           Capture Strategy
----------------------------------------  ----------  ----------------------------  ----------------
Congress, Twitter, and the Public Agenda  Twitter     The New York Times, GovTrack  Roll Your Own
FlashPoint: Detecting Cyberbullying       Instagram   HateBase, ProfaneBase         Scrape
Rural - Urban USA                         Twitter     US Census                     STACK
Perpetuating Segregation                  EveryBlock  US Census                     Roll Your Own

13 of 17

Congress, Twitter, and The New York Times

14 of 17

Not Quite EveryBlock

15 of 17

Challenges

  • Metadata
  • Platforms’ terms of service
  • Technical and computational resources

16 of 17

Your Needs

  • What do your users ask about?
  • What could ICPSR do to help?

17 of 17

RCMD Needs Your Input

  • What data should we be archiving?
  • How else can we help?

Libby Hemphill

libbyh@umich.edu
