1 of 17

Teaching Data Analysis &

Computational Methods to Social Scientists

Alex Hanna, Sociology

March 19, 2015

@alexhanna // alex-hanna.com

2 of 17

Slides are available at http://tinyurl.com/bigdata-socsci

Follow along at home and check out links!

3 of 17

The problem

How do we teach data analysis and data literacy to social science students?

How do we teach computational methods to social scientists who want to start big data projects?

4 of 17

The problem

Social scientists use SPSS or Stata

But not R, Python, or Hadoop

Literacy is important for both new and veteran scholars

6 of 17

Old tasks

Data munging

Regression

Graphing

New tasks

Data munging

Web scraping

Large-scale networks

Automated text analysis

9 of 17

Pedagogical approach

Meet people where they are

How can you get them involved in a meaningful way?

Provide a lab setting for working through problems

Guide people along with hands-on workshops

Make code and instructions integrated and available on the web

Markdown, IPython/Project Jupyter notebooks

10 of 17

Example 1:

Introduction to RStudio

Goal: Traditional tasks

Data handling

Plotting

Univariate and bivariate analysis

Audience: Introductory methods course

Undergrad sociology students

Some with Stata experience

http://alex-hanna.com/teaching/soc357/lab/

11 of 17

RStudio:

Using Examples

Code blocks and interface

Allowing for a “do-it-yourself” puzzle after initial instructions

12 of 17

Example 2:

Blogclub “tworkshops”

Goal: From zero to Hadoop for social media data

Basic UNIX terminal, Python

Various types of analysis

Audience: Mix of ~10 faculty and PhD students in SJMC

Labs take place over the span of a year

http://alex-hanna.com/tworkshops

13 of 17

Tworkshop syllabus

1. Twitter API and an introduction to the terminal

2. More terminal and your first Python script

3. Basic Python

4. Python modules and I/O

5. Hadoop and MapReduce

6. Basic sentiment analysis

7. Network analysis
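
The MapReduce pattern from session 5 can be tried without a cluster. What follows is a minimal, hypothetical sketch in plain Python, not the workshop's actual material: the sample tweets and the function names `map_hashtags`, `shuffle`, `reduce_counts`, and `count_hashtags` are assumptions made for illustration. On Hadoop, the map and reduce steps would run as separate processes over far larger data.

```python
from collections import defaultdict

# Hypothetical sample tweets; a real lab would read collected Twitter data instead.
TWEETS = [
    "Learning #python at the #tworkshop",
    "First #Python script done!",
    "No hashtags here",
]


def map_hashtags(tweet):
    """Map step: emit a (hashtag, 1) pair for each hashtag in one tweet."""
    for token in tweet.split():
        if token.startswith("#") and len(token) > 1:
            # Lowercase and strip trailing punctuation so "#Python!" matches "#python".
            yield token.lower().rstrip(".,!?:;"), 1


def shuffle(pairs):
    """Shuffle step: group the emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def count_hashtags(tweets):
    """Run map, shuffle, and reduce in sequence over a list of tweets."""
    pairs = (pair for tweet in tweets for pair in map_hashtags(tweet))
    return reduce_counts(shuffle(pairs))


if __name__ == "__main__":
    for tag, count in sorted(count_hashtags(TWEETS).items()):
        print(tag, count)
```

Keeping the three steps as separate functions mirrors how the work is split on a cluster, so the same logic can later be moved into a mapper script and a reducer script.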

14 of 17

Example 3:

HSE Summer School

Goal: Python coding and network data collection

Audience: ~30 PhD students

10-day Summer School for Internet Research at Higher School of Economics in St. Petersburg, Russia

15 of 17

HSE Summer School: Three labs

Introduction to network data

Conceptual, showing data

Collecting Twitter data

Python (tweepy)

Scraping web networks

Python (scrapy)
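
To show how collected tweets can become network data, the sketch below builds a directed edge list from @-mentions and counts each account's in-degree. It is a hypothetical example in plain Python, not code from the labs: the sample records and the names `mention_edges` and `in_degree` are assumptions, and in a real lab the input would come from a collector such as tweepy or scrapy.

```python
import re
from collections import Counter

# Hypothetical (author, text) records standing in for collected tweets.
RECORDS = [
    ("ana", "Great lab today @boris @chen"),
    ("boris", "Thanks @ana!"),
    ("chen", "@ana can you share the slides?"),
]

# Matches an @ sign followed by letters, digits, or underscores.
MENTION = re.compile(r"@(\w+)")


def mention_edges(records):
    """Return directed (source, target) edges, one per mention, skipping self-mentions."""
    edges = []
    for author, text in records:
        for target in MENTION.findall(text):
            target = target.lower()
            if target != author:
                edges.append((author, target))
    return edges


def in_degree(edges):
    """Count how many edges point at each node."""
    return Counter(target for _, target in edges)


if __name__ == "__main__":
    edges = mention_edges(RECORDS)
    for node, degree in in_degree(edges).most_common():
        print(node, degree)
```

An edge list like this one can be loaded into most network analysis tools for further work.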

16 of 17

Takeaway

Provide tasks that students can connect to their own research

Make it hands-on from the outset

Think of it as a stepping stone to more activities

17 of 17

Thanks!

ahanna@ssc.wisc.edu
