1 of 17

R code for Twitter Scrape and Estimation

Keyword plot

Tweet count plot

KliqueFinder (code)

basic

advanced

Other data source

snapshots

movies

Crystalized sociograms

Parameter estimates

Code must be adapted for scraped data

2 of 17

Scraping Twitter in R

  • Check out Chapter 11 of General introduction to R
  • Get developer access here
    • May take 24 hours
  • Rmarkdown for scraping twitter

3 of 17

Getting access codes

4 of 17

Copy all these and save them to paste into R code

5 of 17

Setting up tokens: fill in the empty quotes ("") in the Scraping Twitter file. Please download it and save it to a file on your machine.
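
One way to avoid pasting the access codes directly into a script is to keep them in environment variables and read them at run time. This is a minimal sketch of that alternative, not the approach used in the Scraping Twitter file; the four variable names are hypothetical and should match whatever labels your developer account shows.

```r
# Hypothetical environment variable names (assumptions, not from the slides).
token_names <- c("TWITTER_API_KEY", "TWITTER_API_SECRET",
                 "TWITTER_ACCESS_TOKEN", "TWITTER_ACCESS_SECRET")

# Sys.getenv() returns a named character vector; unset variables come back as "".
creds <- Sys.getenv(token_names)

# Report any code that still needs to be filled in before scraping.
not_set <- names(creds)[creds == ""]
if (length(not_set) > 0) {
  message("Not set yet: ", paste(not_set, collapse = ", "))
}
```

The values in `creds` can then be passed to whichever scraping function the course file uses, in place of the quoted strings.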

6 of 17

Conceptualization: Network analysis is about what flows

  • What flows through twitter?
    • Changes in beliefs?
    • Behaviors?
    • Level of activity?
    • Use of language?

7 of 17

Conceptualization: Think about the network model of influence

[Figure: network model of influence. Labels: Yi’t-1, Wii’t-1→t, Yit; time points t-1 and t; actors B, C, D.]

8 of 17

Process for Choosing a topic

  • Twitter ethnography:
    • Look at popular media
    • Look at tweets
      • Keywords vs hashtags
    • Identify key actors (@somebody)
    • Advanced searches
  • Narrow down:
    • Geography: but Twitter is global
    • By timing
    • By participants

reflection

9 of 17

Conceptualization: Think about the process of selection

[Figure: process of selection. Labels: Wii’t-1→t; time points t-1 and t; actors B, C, D.]

10 of 17

Choose your keywords

reflection

11 of 17

Timing: Think about the network model of influence

[Figure: network model of influence. Labels: Yi’t-1, Wii’t-1→t, Yit; time points t-1 and t; actors B, C, D.]

12 of 17

Timing: The Formal Model of Influence (the Network Effect)

  • $w_{ii',\,t-1 \to t}$
    • Interactions between $i$ and $i'$ between $t-1$ and $t$.
  • $y_{it}$
    • Outcome (dependent variable). An attitude or behavior of actor $i$ at time $t$.
  • $\sum_{i'} w_{ii',\,t-1 \to t}\; y_{i',\,t-1}$
    • Exposure. Sum of attributes of others with whom the actor interacted between $t-1$ and $t$.
  • $y_{it} = \rho \sum_{i'} w_{ii',\,t-1 \to t}\; y_{i',\,t-1} + \gamma\, y_{i,\,t-1} + e_{it}$
    • Model. Errors are assumed iid normal, with mean zero and variance $\sigma^2$.
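
The exposure term and the model above can be illustrated with simulated data. This is a minimal sketch in base R, not the estimation code from the course files: the data, the sample size, and the true coefficient values (0.3 on exposure, 0.5 on the prior outcome) are all made up for illustration.

```r
set.seed(42)
n <- 50

# Simulated attribute of each actor at t-1.
y_prev <- rnorm(n)

# Simulated interaction matrix: w[i, j] = 1 if i interacted with j
# between t-1 and t. No self-interactions.
w <- matrix(rbinom(n * n, 1, 0.1), n, n)
diag(w) <- 0

# Exposure: for each actor i, the sum over others i' of w[i, i'] * y_prev[i'].
exposure <- as.vector(w %*% y_prev)

# Simulated outcome at t, using made-up coefficients and normal errors.
y_now <- 0.3 * exposure + 0.5 * y_prev + rnorm(n, sd = 0.5)

# Regress the outcome on exposure and the prior outcome.
# "0 +" drops the intercept to mirror the equation as written.
fit <- lm(y_now ~ 0 + exposure + y_prev)
coef(fit)
```

The coefficient on `exposure` is the estimate of the network effect; the coefficient on `y_prev` controls for the actor's own prior outcome.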

13 of 17

Scrape Twitter Timeline for Diffusion: See Influence Model

[Figure: scrape timeline with time points T0, T1, T2, T3.
  time_interval1 (T0 to T1): pre-interaction sentiments (Yi’t-1)
  time_interval2 (T1 to T2): interaction interval: mention, retweet, reply (Wii’t-1→t)
  time_interval3 (T2 to T3): post-interaction sentiments (Yit)]

14 of 17

Time interval code
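
A minimal sketch of what interval code of this kind might look like, assuming the scraped tweets sit in a data frame with a timestamp column. The cut points, the column names `screen_name` and `created_at`, and the toy rows are hypothetical; substitute the dates that fit your topic.

```r
# Hypothetical cut points T0, T1, T2, T3 (UTC).
cuts <- as.POSIXct(c("2021-01-01", "2021-01-08", "2021-01-15", "2021-01-22"),
                   tz = "UTC")

# Toy stand-in for scraped tweets.
tweets <- data.frame(
  screen_name = c("a", "b", "a", "c"),
  created_at = as.POSIXct(c("2021-01-02 10:00:00", "2021-01-09 12:00:00",
                            "2021-01-16 09:30:00", "2021-01-25 00:00:00"),
                          tz = "UTC"),
  stringsAsFactors = FALSE
)

# Assign each tweet to an interval; each interval includes its start time
# and excludes its end time. Tweets outside T0..T3 get NA.
tweets$interval <- cut(as.numeric(tweets$created_at),
                       breaks = as.numeric(cuts),
                       labels = c("time_interval1", "time_interval2",
                                  "time_interval3"),
                       right = FALSE)

table(tweets$interval, useNA = "ifany")
```

Tweets in the first interval would supply the pre-interaction sentiments, the second the interactions, and the third the post-interaction sentiments.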

15 of 17

Check size of scrape

16 of 17

Keyword plot over time

Reduce size as needed: scrape can take a long time
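
Checking the size of a scrape and plotting keyword use over time can both be done in base R once the tweets are in a data frame. The following is a sketch under assumptions: the column names `created_at` and `text`, the keyword, and the toy rows are hypothetical.

```r
# Toy stand-in for scraped tweets.
tweets <- data.frame(
  created_at = as.POSIXct(c("2021-01-01 08:00:00", "2021-01-01 17:30:00",
                            "2021-01-02 09:15:00", "2021-01-03 11:45:00"),
                          tz = "UTC"),
  text = c("New vaccine trial announced", "Lunch was great",
           "Vaccine rollout begins", "Thoughts on the VACCINE news"),
  stringsAsFactors = FALSE
)

# Size of the scrape: number of tweets collected.
nrow(tweets)

# Flag tweets containing the (hypothetical) keyword, ignoring case.
keyword <- "vaccine"
hits <- grepl(keyword, tweets$text, ignore.case = TRUE)

# Count matching tweets per day and plot the counts over time.
daily <- table(as.Date(tweets$created_at[hits]))
plot(as.Date(names(daily)), as.vector(daily), type = "b",
     xlab = "Day", ylab = paste("Tweets mentioning", keyword))
```

If the daily counts are very large, narrowing the date range or the keyword before the full scrape keeps the run time manageable.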

17 of 17

Execute the full tweet scrape: produces an edge list and node data
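
To illustrate the two outputs, the sketch below builds an edge list and a node table from @mentions in tweet text. It uses base R only and is not the course's scraping code; the column names and toy tweets are hypothetical, and retweets and replies are picked up here only through the @handle they contain.

```r
# Toy stand-in for scraped tweets.
tweets <- data.frame(
  screen_name = c("ann", "bob", "cat"),
  text = c("@bob I agree", "RT @ann: great point", "no mentions here"),
  stringsAsFactors = FALSE
)

# Extract every @handle from each tweet (a list with one element per tweet).
mentions <- regmatches(tweets$text, gregexpr("@[A-Za-z0-9_]+", tweets$text))

# Edge list: one row per (author, mentioned account) pair.
edges <- data.frame(
  from = rep(tweets$screen_name, lengths(mentions)),
  to = sub("^@", "", unlist(mentions)),
  stringsAsFactors = FALSE
)

# Node data: every account that tweeted or was mentioned.
nodes <- data.frame(
  name = sort(unique(c(tweets$screen_name, edges$to))),
  stringsAsFactors = FALSE
)

edges
nodes
```

Node attributes such as sentiment scores for each interval can then be merged onto `nodes` by `name`.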