1 of 22

Social Media Scraping for QDA

2 of 22

Data Services at NYU

3 of 22

Goals

  1. What is QDA?
  2. Why use QDA for social media?
  3. What are some considerations and tools?
  4. How do we use these tools?
  5. Other Considerations

4 of 22

What is Qualitative Data Analysis (QDA)?

The systematic collection, organization and interpretation of unstructured data (e.g., interview transcripts, public documents, focus groups, open-ended survey questions).

After a close reading of the qualitative data, researchers will assign “codes” or themes to relevant quotes and/or sections of video/audio/images.

5 of 22

What qualitative research is *not*

Anything automatic is not qual. Qual = deep engagement

  • Word counts/clouds (e.g. Voyant)
  • Machine learning e.g. text mining, keyword extraction, sentiment analysis
  • Artificial Intelligence e.g. predictive learning
  • Mixed methods
    • Qual is *one part* of mixed methods (think square <> rectangle)

6 of 22

Why qualitative methods?

Used by a wide range of fields, such as anthropology, education, nursing, information science, psychology, sociology, history, and user experience (UX)/marketing.

Qualitative data is varied: observations, interview or focus group transcripts, public or archival documents, memos, jottings, media (newspapers, magazines, tweets), visuals and more.

7 of 22

Why qualitative methods? (part 2)

Qualitative methods can generate rich, detailed data that leave cultural contexts and individuals’ perspectives intact, allowing for a deeper understanding of phenomena.

Answers questions that quantitative studies cannot.

Helps researchers understand the “why”.

8 of 22

How QDA works

  • Collect data
  • Organize: Gather documents and source materials
  • Patterns: Articulating connections, relationships, patterns
  • Coding: Assigning concepts to parts of source material
  • Analysis: Create theoretical frameworks that match data

9 of 22

QDA Tools Comparison

Software    | Platforms             | Feature Set
NVivo       | Windows, Mac          | Coding, Aggregation, Query, Visualisation, Stats
MAXQDA      | Windows, Mac          | Coding, Aggregation, Query, Visualisation, Stats
Atlas.ti    | Windows, Mac          | Coding, Aggregation, Query, Visualisation
QualCoder   | Linux, Windows (beta) | Coding, Aggregation, Query
Qcoder (R)  | All OSes              | Coding, Aggregation, Query; access to R pkgs
Taguette    | All OSes & browser    | Coding, Aggregation, Query

10 of 22

What Qualitative Data Software Does

  • Coding: Assigning Concepts
  • Classifying: Store Description Information
  • Memos: Writing & Describing Conceptual Ideas
  • Diagram: Articulating Connections, Relationships, Patterns

11 of 22

Qualitative Data Analysis Software (QDA)

QDA Software v. Traditional Qualitative Analysis

  • Discovery of Conceptual Outputs: Relationships, Discursive Patterns, Themes
  • Discovery of Measurable Outputs: Frequencies, Scales, Directionality
  • Data Management Perspective: Organizing Your Source Material

Atlas.ti, NVivo, Dedoose, Taguette

12 of 22

Documents in QDA Research

TEXT (PDF, DOC, RTF, TXT)

IMAGES (JPEG, PNG)

AUDIO (MP3)

VIDEO

SPREADSHEET (EXCEL)

*Focus Groups.

13 of 22

Tools for Social Media Research (Just a Few)

  • Capturable
    • Built-in scraping with Atlas.ti or MAXQDA (limitations on captures)
    • R and Python scraping (more complex and more versatile)
    • Social media APIs and data repositories (there are lots, just take time to find them)
  • Semi- or non-capturable
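
As a rough illustration of the "R and Python scraping" option above, here is a minimal Python sketch of the parsing half of a scrape. It runs on an inline HTML fragment rather than a live site; the fragment, its tag and class names, and the `PostParser` and `parse_posts` names are all assumptions made up for this example. A real platform will use different markup and may restrict scraping in its terms of service.

```python
from html.parser import HTMLParser

# Hypothetical page fragment standing in for a downloaded page; the tag
# and class names are assumptions, not any real platform's markup.
SAMPLE_HTML = """
<div class="post"><span class="user">@ana</span><p class="text">Loving the #library today</p></div>
<div class="post"><span class="user">@ben</span><p class="text">Study break #library #coffee</p></div>
"""

class PostParser(HTMLParser):
    """Collect the user and text of each post in the fragment."""

    def __init__(self):
        super().__init__()
        self.posts = []      # one dict per post
        self._field = None   # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if css_class == "post":
            self.posts.append({"user": "", "text": ""})
        elif css_class in ("user", "text") and self.posts:
            self._field = css_class

    def handle_data(self, data):
        # Only keep text that sits inside a user or text element.
        if self._field:
            self.posts[-1][self._field] += data

    def handle_endtag(self, tag):
        if tag in ("span", "p"):
            self._field = None

def parse_posts(html):
    """Return a list of {"user": ..., "text": ...} dicts."""
    parser = PostParser()
    parser.feed(html)
    return parser.posts
```

Each resulting row could then be written to a spreadsheet and imported into a QDA package as a document for coding.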

14 of 22

Considerations for Social Media Research

  • What networks are you looking at?
    • People, information, space, etc.
  • What do you want to learn from your analysis?
    • Are you looking at one platform or many?
    • What is the importance of your platform, and how might the capture misconstrue the data or the real function of the platform?
  • What are your constraints?
    • Time, technical skill, overwhelming amounts of data, etc.
  • How might these considerations shape the tools you choose?

15 of 22

Other Considerations

  • Who/What are your research subjects?
  • What is your data management plan?
  • Do you need a digital hygiene or digital security plan?
  • What do you want your outputs/deliverables to be?
  • What methodology do you plan to use to get these deliverables?
  • What are the constraints on your project and how might this affect the outputs?
  • What are the ethical considerations of your research? How will you address these?
  • What might you be misinterpreting?

16 of 22

Supported by NYU Data Services

  • MAXQDA 2022
    • Twitter capture and analysis
      • Spreadsheet format (capture limited to the previous 7 days) for qualitative analysis
      • Quantitative analytics (frequent hashtags, posts per day, etc.)
      • Sentiment analysis
    • YouTube comment capture
  • Atlas.ti
    • Can only scrape from Twitter; captures directly into the program.
    • Captures Images and links to them online.
    • Evernote Web Clipper import.
    • Strong auto-coding (by hashtag, user handle, etc.)
    • Sentiment analysis, regular expression/named entity coding
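
To make "auto-coding by hashtag or user handle" and "frequent hashtags" concrete, here is a minimal Python sketch of the general idea. It is not how MAXQDA or Atlas.ti implement these features; the function names, the `hashtag:`/`user:` code labels, and the sample posts are assumptions made up for illustration.

```python
import re
from collections import Counter

HASHTAG = re.compile(r"#(\w+)")
HANDLE = re.compile(r"@(\w+)")

def auto_code(text):
    """Return sorted codes for one post: one per hashtag and one per handle."""
    codes = {f"hashtag:{tag.lower()}" for tag in HASHTAG.findall(text)}
    codes |= {f"user:{name.lower()}" for name in HANDLE.findall(text)}
    return sorted(codes)

def hashtag_counts(texts):
    """Count how often each hashtag appears across all posts."""
    counts = Counter()
    for text in texts:
        counts.update(tag.lower() for tag in HASHTAG.findall(text))
    return counts

# Made-up sample posts, not real captured data.
posts = [
    "Thanks @example_user for the #QDA workshop",
    "More #qda tips please #research",
]
codes_per_post = [auto_code(p) for p in posts]
top_hashtags = hashtag_counts(posts).most_common(1)
```

Auto-codes like these only sort posts by surface features; the close reading and interpretive coding still has to be done by the researcher.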

17 of 22

Using QDA Software

  • For the practicalities of using the two software packages in this tutorial, you can find links to instructional slides below.

  • Atlas.ti
  • MAXQDA 2022

18 of 22

Atlas.ti Web Scraping

  • LIMITED DATA COLLECTION: only collects data from the previous week
  • MORE ANALYSIS CAPABILITIES
    • Create co-occurrence tables
    • Relational data in codes with Booleans
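
For readers unfamiliar with co-occurrence tables and Boolean code queries, the following Python sketch shows the underlying idea on a toy dataset. It is an illustration under assumptions, not Atlas.ti's implementation; the code names and segments are invented.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(coded_segments):
    """Count how often each pair of codes is applied to the same segment."""
    pairs = Counter()
    for codes in coded_segments:
        for a, b in combinations(sorted(set(codes)), 2):
            pairs[(a, b)] += 1
    return pairs

# Each set holds the codes applied to one segment (invented example data).
segments = [
    {"trust", "privacy"},
    {"trust", "privacy", "humor"},
    {"humor"},
]
table = cooccurrence(segments)

# A Boolean query: segments coded "trust" AND NOT "humor".
trust_not_humor = [s for s in segments if "trust" in s and "humor" not in s]
```

Pairs are stored in alphabetical order, so the count for "privacy" with "trust" lives under the key `("privacy", "trust")`.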

19 of 22

Atlas.ti Web Scraping

This takes place through the software interface.

To Capture

  • Go to document, Import from Twitter.
  • Sign in to your account.
  • Specify Query (your hashtag or user account)
  • Select what you want auto-coded
  • Click “Import Tweets”
  • Atlas.ti isn’t great for things other than Twitter.

20 of 22

NCapture and Web Scraping

NCapture is a free web browser extension, developed by QSR, that enables you to gather material from the web to import into NVivo. You can use NCapture to collect a range of content—for example, articles or blog posts. You can also collect social media content from Facebook, Twitter and YouTube.

To add this to Chrome

  • Search for “NCapture for Chrome”
    • Select “Add to Chrome”
  • The NCapture icon will appear in the top right corner

21 of 22

NCapture and Web Scraping

In your browser, display the web page that you want to capture.

Click the NCapture button at the top of your browser. You can choose how you want to capture the web page and provide additional information about the captured content.

a. Select a Source type. You can choose to capture the full web page as a PDF or just the main article on the page as a PDF. The 'Article as PDF' option is designed for pages that contain one large block of text, such as a news article or blog post.

b. Review the default Source name. This will be the name of the source once it is imported into NVivo. If you want, you can change the name.

c. You can enter a short Description or a Memo:

  • The description could describe what you are capturing—it becomes one of the source's properties.
  • The memo could record your reasons for capturing this data and how it relates to your research. It becomes a linked memo when the source is imported.

22 of 22

NCapture and Web Scraping

d. In the Code at nodes box, you can enter one or more nodes that you want to code the content to. Then, when you import the content into NVivo, the entire source is coded at the nodes you entered.

  • If you enter a node that does not exist in your project—or incorrectly type the name of a node—a new node will be created during import. If you want to code to an existing node in your project (for example, a child node in a hierarchy), you can enter the hierarchical node name—for example, economy\tourism.

e. Click Capture. The captured content is saved as an NCapture file which you can import into your NVivo project.

The same process is used to capture social media data, videos, etc.
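
The note on hierarchical node names (for example, economy\tourism) describes behavior that can be pictured with a short Python sketch. This is only a model of the described behavior, not NVivo's code; the `add_node` function and the nested-dictionary representation are assumptions made for illustration.

```python
def add_node(tree, path, sep="\\"):
    """Walk a hierarchical node name, creating any missing nodes."""
    node = tree
    for name in path.split(sep):
        # setdefault reuses an existing node or creates an empty one.
        node = node.setdefault(name, {})
    return tree

# A project that already has an "economy" node.
nodes = {"economy": {}}
add_node(nodes, "economy\\tourism")  # adds a child under the existing node
add_node(nodes, "helth")             # a mistyped name becomes a new node
```

The second call shows why it is worth double-checking spelling before capture: a typo does not raise an error, it quietly adds an extra node to clean up later.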