1 of 33

Qualitative Coding and Business

Penn State

Smeal College of Business

M&O DBA | October 2023

2 of 33

Introduction

Anthropologist (2012-2016)

Researched how business owners navigate changes in policy and culture in Thailand, Vietnam, China, and the US.

Research Manager (2017-2020)

Helped a large multinational non-profit revamp its core strategy and engage Gen Z employees.

University Instructor (2015-2021)

Taught classes on applied qualitative methods and intercultural communication. Trained over 50 researchers to use applied qualitative methods.

Ph.D. Student (2022-Present)

Studying deep-level diversity and how new ideas spread in teams and organizations.

3 of 33

Agenda

  • Products of Qual Coding
  • Coding Examples
    • "An Anthropologist walks into a bar"
    • Other business examples
    • Academic Examples
  • "Fake it till you make it" & Coding

4 of 33

Qualitative Coding

  • Focused on Interpretation and meaning
  • Analogy: Speedometer versus Roadmap
  • Optimizing Outcomes versus Creating a Conceptual map

5 of 33

Qualitative Coding Types

Starting Point

  • Grounded Coding: Raw data (inductive) OR general lens (abductive)
  • Top-down Coding: Pre-defined codebook (deductive)

Definitions of what “counts” as a code

  • Grounded Coding: Researcher gets to create the rules, typically starting with in-vivo quotes (direct quotes from the text)
  • Top-down Coding: Strict rules and definitions (see next slide for an example)

Outcomes & Uses

  • Grounded Coding:
    • New codebooks (scripts)
    • New ways of seeing the world (lenses)
    • Deep, context-specific understandings (movies)
  • Top-down Coding:
    • Counts of codes per document
    • Percentages of code overlaps
    • Interrater reliability scores
    • Compare populations
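As a rough illustration of the top-down outcomes listed above (code counts and interrater reliability), here is a minimal Python sketch. It assumes two coders each assigned exactly one code per item, and it uses Cohen's kappa as the reliability score; the slides do not say which statistic was used. The code labels and data are hypothetical.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each assigned one code per item."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must code the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items where both raters chose the same code
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement by chance, from each rater's own code frequencies
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[code] * counts_b[code] for code in counts_a) / (n * n)
    if expected == 1.0:  # both raters used a single identical code throughout
        return 1.0
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned to six tweets by two coders
coder_1 = ["access", "access", "innovation", "donation", "access", "innovation"]
coder_2 = ["access", "donation", "innovation", "donation", "access", "access"]

code_counts = Counter(coder_1)          # counts of codes for coder 1
kappa = cohens_kappa(coder_1, coder_2)  # agreement corrected for chance
print(code_counts, round(kappa, 2))
```

A kappa near 1 suggests the codebook definitions are being applied consistently; a low value is usually a prompt to revisit the definitions and edge cases rather than a verdict on the coders.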

6 of 33

Top-down Codebook Template & Sample Entry

This codebook template and entry come from a project attempting to categorize all the kinds of pro-social responses companies made to the Covid-19 pandemic using corporate Twitter data.

TEMPLATE

Code Name

Definition: (1-5 sentences that describe the “yes” criteria for this code)

Keywords: (3-10 unique words or phrases, separated by commas, that would make this code instantly recognizable)

Core Example:

  • Company name: what the company is doing. (Explanation of why it is a clear example of this code.)

Edge case, Yes:

  • Company name: what the company is doing. (Explanation of what makes this fit even though it might seem like it shouldn’t.)

Edge case, No:

  • Company name: what the company is doing. (Explanation of why this does not fit the code, even though it might seem like it should.)

EXAMPLE

Access to Product & Services

Definition: A product or service is made free, discounted, or more widely available to particular groups.

Keywords: Free access, donate, waive fees, extended free trial

Core Example:

  • Adobe offers online learning resources for free to students during Covid-19. (Adobe is offering something it already created, and it is related to the cause it is trying to help.)

Edge case, Yes:

  • Havana eats offers PB&Js to hungry students during Covid-19. (Havana eats doesn’t typically offer PB&Js on its menu, but this is squarely in the purview of a restaurant and is not a new product or controversial.)

Edge case, No:

  • Everlywell creates a for-profit Covid-19 test kit. (This creates a product that did not exist before, so it belongs under innovation.)
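One way to keep a top-down codebook consistent across coders is to store each entry in a fixed structure. The sketch below is only an illustration: the `CodebookEntry` class and its `keyword_hits` helper are hypothetical names, not part of the original project, and the sample tweet is invented. The keywords are taken from the sample entry above.

```python
from dataclasses import dataclass

@dataclass
class CodebookEntry:
    """One codebook entry, mirroring the fields of the template."""
    name: str
    definition: str
    keywords: list[str]
    core_example: str = ""
    edge_case_yes: str = ""
    edge_case_no: str = ""

    def keyword_hits(self, text: str) -> list[str]:
        """Return the keywords that appear in the text (case-insensitive)."""
        lowered = text.lower()
        return [kw for kw in self.keywords if kw.lower() in lowered]

access = CodebookEntry(
    name="Access to Product & Services",
    definition="Free or discounted or more available for different groups",
    keywords=["free access", "donate", "waive fees", "extended free trial"],
)

# Hypothetical tweet text, used only to show the screening step
tweet = "We will waive fees and offer an extended free trial to students."
hits = access.keyword_hits(tweet)
print(hits)
```

Keyword hits only flag candidate documents. A human coder still applies the definition and the edge cases to decide whether the code fits.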

7 of 33

Potential Products from Coding

  • Script. A theory that is intended for testing with other methods.
  • Lens. Structures to help see new insights in other phenomena.
  • Movie. Thick descriptions of a phenomenon in its context.

Adapted from a table describing the products of grounded theory research. See O’Connor, M. K., Netting, F. E., & Thomas, M. L. (2008). Grounded Theory: Managing the Challenge for Those Facing Institutional Review Board Oversight. Qualitative Inquiry, 14(1), 28–45. https://doi.org/10.1177/1077800407308907

Grounded theory (Glaser & Strauss; Eisenhardt)

Constructivist grounded theory (Charmaz)

8 of 33

An Anthropologist Walks into a Bar

  • How do customers view our promotions and practices? (Movie)
    • After using more traditional marketing analyses, Beerco tried qualitative methods.
    • Themes included customers viewing the promotions as a "box of crap" and female servers feeling "hot pantsed", or overly sexualized.
    • They used these findings to revamp sales and improve employee experiences.
  • How do we compete in a changing environment? (Script)
    • Lego developed and diversified product lines that made building easier and "cooler", but they were seeing small returns.
    • They found that kids wanted to achieve mastery, not just do an easy build.
    • They used that finding to cancel products that did not fit their strategy.
  • How can we capture new sources of growth? (Lens)
    • Coloplast wanted to explore the experiences of living with an ostomy bag to see if their assumptions were correct.
    • They found that customers stopped complaining because they had lost hope, not because the product was effective.
    • They created new product lines that helped them provide unique value to customers.

9 of 33

Other Business Cases for Qualitative Coding

  • How do we break down silos? (Movie)

Tech consulting firm. $900 million in revenues, 15,000 employees, recent merger and large growth.

  • How do we compete against incumbent companies? (Script)

B2B software company. $300 million revenue, 300 employees, growth-stage startup.

  • How do we attract and retain Gen Z talent? (Lens)

Big Non-profit. 60,000 full-time Gen Z volunteers, 7,500 full-time employees in 120 countries.

  • How do we expand into a new country? (Lens)

App company. $1 billion revenue, 1,400 employees.

10 of 33

Academic Case for Qualitative Coding

  • Why do employees stay silent about good ideas they have? (Milliken et al., 2003) (Script)
  • What happens to ideas after they are raised? (Satterstrom et al., 2020) (Movie)
  • How do workers deal with the ethics of a new technology implementation? (Rauch & Ansari, 2022) (Lens)
  • How do public failures impact organizational legitimacy? (Chai & Doshi, 2022) (Script)
  • How are strategic change initiatives implemented? (Gioia & Thomas, 1996) (Movie)

11 of 33

General Overview of the Coding Process

An iterative process between:

  • Initial Coding (or open coding)
  • Secondary Coding
  • Axial Coding

Figure 2.1 is from Saldaña, J. (2013). The coding manual for qualitative researchers (2nd ed). SAGE.

12 of 33

Part-Whole Analysis

  • Hermeneutics: a style of analysis that looks for many possible interpretations.
  • Iterates between looking at small parts (like individual comments, or even lines in a comment) and comparing them to the whole (like the full body of comments).
  • Note: You will see me moving from part to whole, and whole to part in my analysis.

13 of 33

Coding Example (MaxQDA)

Initial Steps:

  1. Upload documents into program
  2. Explore documents
  3. Highlight important lines
  4. Assign codes
    1. Initial Coding (Pink codes)
    2. Secondary Coding (Blue codes)
    3. Axial Coding
  5. Analyze codes

Screenshot from MaxQDA 2020, a computer-assisted qualitative data analysis software (CAQDAS).

14 of 33

Memo: Initial Coding

  • Initial coding of NYT article (pink codes)
  • Coded the following entries
      • Size of investment

“There is more money at stake, so it just changes the calculus,” (The End of Faking It in Silicon Valley - The New York Times, p. 2)

      • Lack of money

"when the easy money dries up, everyone parrots the Warren Buffett's proverb about finding out who is swimming naked when the tide goes out" (The End of Faking It in Silicon Valley - The New York Times, p. 2)

  • Grouped entries as “What causes people to care about fraud”

15 of 33

Memo: Initial Coding

  • Coded entries (only one example shown below)

"Start-ups have many of the conditions most associated with fraud, Mr. Dyck said. They tend to employ novel business models, their founders often have significant control and their backers do not always enforce strict oversight." (The End of Faking It in Silicon Valley - The New York Times, p. 4)

  • Grouped entries as “What do people believe leads to fraud”
    • Novel business models
    • Founders with control
    • Backers not enforcing oversight

16 of 33

Memo: Initial Coding

  • Found two typologies (scripts) to explore in the comments:
    • Bad behaviors
      • Playing Fast and Loose, Transferring money to shell companies, Misappropriations, Lying, Unethical behavior, Falsifying, Fraud...
    • Legal actions
      • Conviction, Scandals, Allegations, Lawsuits...
  • Next phase: see what everyday people think leads people to care about fraud and what they believe leads to fraud.

17 of 33

Memo: Secondary Coding (Blue Codes)

Secondary Coding: What do laypeople think?

  • When do people care about fraud?
  • What do people believe leads to fraud?

Screenshot from MaxQDA 2020, a computer-assisted qualitative data analysis software (CAQDAS).

18 of 33

Memo: Secondary Coding

  • When do people care about fraud?
    • Size of investment
    • Lack of money
    • (Threat to) privilege

“Based on this article two things are clear, the only people persecuted for being fraudsters are women and of color. White male privilege and the absolution of criminal activity.” (Comment 15, Paragraph 1)

“As a member of the privileged older white male demographic, I find it immensely gratifying that these interlopers finally get their comeuppance. Spectacular wealth gained from exploiting other people is an exclusive right we have reserved to ourselves for centuries and it has been most distressing to see it squandered on this undeserving rabble. It good to see the plebians’ hard earned tax dollars put to work fighting this terrible to my exclusive good. Go get ‘em tigers!” (Comment 4, Paragraph 1)

19 of 33

Memo: Secondary Coding (Round 2)

Additional Questions to Review in the Secondary Coding:

  • Do demographic characteristics play a role in whether people care about who commits fraud?
  • Two views of fraud seem to be emerging: Lack of common sense and bad intent.
  • Do commenters view investors and VCs as more culpable than CEOs?

20 of 33

Memo: Secondary Coding (Round 2)

Do demographic characteristics play a role in whether people care about who commits fraud?

    • Age, Race, Gender, East Coast (banks and Wall Street) versus West Coast (tech startups/children)

“The REAL FAKING happens in the east coast BANKS. The Silicon Valley faking is from naïveté and obsessive ‘optimism’.” (Comment 3, Paragraph 1)

“Based on this article two things are clear, the only people persecuted for being fraudsters are women and of color. White male privilege and the absolution of criminal activity.” (Comment 15, Paragraph 1)

21 of 33

Memo: Secondary Coding (Round 2)

Two emerging views of fraud: Lack of common sense and bad intent.

    • Lack of common sense (comparing fraudsters to children, naive, optimistic)
    • Lack of integrity (mentions of bad intent, knowing what they are doing, etc.)

“The REAL FAKING happens in the east coast BANKS. The Silicon Valley faking is from naïveté and obsessive ‘optimism’.” (Comment 3, Paragraph 1)

“One is tempted to be a cranky old timer and say, see, these kids are getting their comeuppance. On the other hand, one hundred years ago, young entrepreneurs building the 20th century economy were doing the same dumb, dishonest, experimental stuff in their own new world. They had to crash, then learn lessons and rebuild their economy for the new century. So these kids need to try and fail and hopefully create something substantial at some point.” (Comment 10, Paragraph 1)

22 of 33

Memo: Secondary Coding (Round 2)

Do commenters view investors and VCs as more culpable than CEOs?

    • Count the number of responses that mention CEOs versus investors as being culpable.

I explore this by:

  1. Creating a word frequency table to see words that are related to investors and founders
  2. Searching comments for additional themes related to investors and founders
  3. Coding new ideas

23 of 33

Memo: Secondary Coding (Round 2)

Word Frequency Table

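| Rank | Word | Documents | Documents % | Frequency | Frequency % | Word length |
|---|---|---|---|---|---|---|
| 1 | fraud | 126 | 15.24 | 178 | 0.86 | 5 |
| 2 | investor | 114 | 13.78 | 153 | 0.74 | 8 |
| 3 | company | 112 | 13.54 | 159 | 0.77 | 7 |
| 4 | work | 107 | 12.94 | 123 | 0.60 | 4 |
| 5 | business | 98 | 11.85 | 132 | 0.64 | 8 |
| 6 | valley | 76 | 9.19 | 96 | 0.46 | 6 |
| 7 | fake | 74 | 8.95 | 96 | 0.46 | 4 |
| 8 | silicon | 74 | 8.95 | 86 | 0.42 | 7 |
| 9 | time | 74 | 8.95 | 84 | 0.41 | 4 |
| 10 | look | 70 | 8.46 | 82 | 0.40 | 4 |
| 11 | only | 69 | 8.34 | 77 | 0.37 | 4 |
| 12 | come | 68 | 8.22 | 77 | 0.37 | 4 |
| 13 | tech | 66 | 7.98 | 83 | 0.40 | 4 |
| 14 | take | 62 | 7.50 | 69 | 0.33 | 4 |
| 15 | founder | 61 | 7.38 | 81 | 0.39 | 7 |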

Table created in MaxQDA 2020.
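For readers without MaxQDA, a similar word frequency table can be approximated in plain Python. This is a sketch under stated assumptions, not a reproduction of MaxQDA's method: it does no lemmatization and applies no stop-word list, and it ranks words by the number of documents (comments) they appear in, with ties broken by total frequency, which is the ordering the table above appears to follow. The function name and sample comments are hypothetical.

```python
import re
from collections import Counter

def word_frequency_table(documents, top_n=15):
    """Rank words by how many documents contain them, then by total frequency."""
    doc_counts = Counter()   # number of documents containing each word
    freq = Counter()         # total occurrences of each word
    total_words = 0
    for text in documents:
        words = re.findall(r"[a-z']+", text.lower())
        total_words += len(words)
        freq.update(words)
        doc_counts.update(set(words))  # count each word once per document
    ranked = sorted(freq, key=lambda w: (-doc_counts[w], -freq[w], w))
    rows = []
    for rank, word in enumerate(ranked[:top_n], start=1):
        rows.append({
            "rank": rank,
            "word": word,
            "documents": doc_counts[word],
            "documents_pct": round(100 * doc_counts[word] / len(documents), 2),
            "frequency": freq[word],
            "frequency_pct": round(100 * freq[word] / total_words, 2),
            "word_length": len(word),
        })
    return rows

# Hypothetical comments, standing in for the real comment data
comments = [
    "Fraud is fraud. Investors should know better.",
    "The founder committed fraud.",
    "Investors enabled the founder.",
]
table = word_frequency_table(comments, top_n=3)
for row in table:
    print(row)
```

In practice a stop-word list and lemmatization (so that "investor" and "investors" count together) would bring the output closer to what a CAQDAS tool reports.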

24 of 33

Memo: Secondary Coding (Round 2)

Comments By Key Terms (151 comments with either “Founder” or “Investor”)

Screenshot from MaxQDA 2020.
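The key-term split shown in the screenshot can be sketched roughly as follows. This assumes simple prefix matching on the two terms, so "investors" counts as "investor"; the function names and the sample comments are hypothetical, and MaxQDA's own search may behave differently.

```python
import re

def mentions(text: str, term: str) -> bool:
    """True if any word in the text starts with the term, ignoring case."""
    return re.search(rf"\b{re.escape(term)}", text, flags=re.IGNORECASE) is not None

def tally_key_terms(comments, term_a="founder", term_b="investor"):
    """Count comments mentioning only term_a, only term_b, both, or neither."""
    tally = {term_a: 0, term_b: 0, "both": 0, "neither": 0}
    for comment in comments:
        has_a, has_b = mentions(comment, term_a), mentions(comment, term_b)
        if has_a and has_b:
            tally["both"] += 1
        elif has_a:
            tally[term_a] += 1
        elif has_b:
            tally[term_b] += 1
        else:
            tally["neither"] += 1
    return tally

# Hypothetical comments for illustration
sample = [
    "Investors should have done their due diligence.",
    "The founder lied, plain and simple.",
    "Founders and investors share the blame.",
    "Banks on the East Coast are worse.",
]
tally = tally_key_terms(sample)
either = len(sample) - tally["neither"]  # comments with either key term
print(tally, either)
```

The counts only show where to look. Whether a comment treats a founder or an investor as culpable still has to be read and coded by hand.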

25 of 33

Memo: Secondary Coding (Round 2 summary)

Analysis Notes:

  • Do commenters view investors and VCs as more culpable than CEOs?
    • Possibly investors, but this might be a reaction to the article's focus on founders.
  • Do demographic characteristics play a role in whether people care about who commits fraud?
    • Age
    • Race
    • Gender
    • East Coast (banks and Wall Street) versus West Coast (tech startups/children)
  • Two views of fraud seem to be emerging
    • Lack of common sense (comparing fraudsters to children, naive, optimistic, etc.)
    • Lack of integrity (mentions of bad intent, knowing what they are doing, etc.)

26 of 33

Tentative developments for Round 3

Next steps:

  • Theoretical saturation of ideas
    • Be comfortable letting go of each of these ideas
    • Keep coding comments until no new ideas pop up in our categories of interest.
  • Theoretical sampling of next data to gather or view
    • Negative cases that might add nuance to our current cases
    • Deeper level data

27 of 33

Tentative developments for Round 3

Next steps (in the data we have):

  • Investors viewed as more culpable than CEOs by the public
    • Code comments with references to founders or investors. See if different themes emerge when looking at each party.
  • New money -> less culpable than old money
    • Naïveté, inexperience, and trying new things are attributes associated with founders and CEOs.
    • Willful ignorance and bad intentions are attributed to financial institutions like banks, Wall Street, and venture capital firms.
  • A difference between doing and creating an idea
    • Look for references to "create nothing" or "real work"

28 of 33

Tentative developments for Round 3

Next steps (other data we might collect):

  • Investors viewed as more culpable than CEOs by the public
    • Interviews of investment firms and 401k investors about perceived culpability.
    • Experiment that manipulates a news story, see the reactions of participants.
  • New money -> less culpable than old money
    • View records of fraud trials and see what appeals lawyers make (in different states)
    • View Pew survey or GSS survey results to see if there are questions about attitudes towards different institutions.
  • A difference between doing and creating an idea
    • Look for themes of "create nothing" or "real work" in the media.
    • Ethnography of accelerators to see how “doing” and “thinking” are or are not related.

29 of 33

Bias and coding

Beliefs about truth

  • Positivism: There is one underlying truth that the researcher is trying to discover.
  • Constructivism: There are multiple valid interpretations that a researcher can justify.

Intended outcomes

  • Positivism: Comparing to other situations; counting.
  • Constructivism: Deep understanding of a situation; creating a lens.

Views of the researcher

  • Positivism: Researchers can and should be free from bias. The more detached from the topic, the better.
  • Constructivism: Researchers bias research at every stage, through their initial research questions, the methods they use, the data they select, the way they code, etc.

Relationship of a researcher’s bias to study outcomes

  • Positivism: Undesirable, a threat to validity.
  • Constructivism: Valuable for uncovering special insights that computers cannot.

What should be done about bias

  • Positivism: Eliminate or control for biases by using controls, interrater reliability, statistical methods, and random sampling.
  • Constructivism:
    • Transparency. Researchers should take notes on how they think their biases are influencing questions, responses, and interpretations.
    • Triangulation. Seek other data to uncover other ways of seeing a phenomenon.

30 of 33

Thank You!


31 of 33

Appendix: Coding Software Comparison Rubric

  • CAQDAS (Computer-assisted Qualitative Data Analysis Software)
  • Link to rubric*

| Name | Subscription Costs | Perpetual Cost | Uploading Data (Avg Score) | Theme Finding and Coding (Avg Score) | Analysis Ease (Avg Score) | Reports & Visualizations (Avg Score) | Overall Avg Score |
|---|---|---|---|---|---|---|---|
| MaxQDA Plus | $55 (110 for 2 years) | $740.00 | 7 | 9 | 8 | 9.5 | 8.4 |
| Nvivo 12 Pro | $118/year (Student license) | $1,068 | 8.2 | 8.25 | 9 | 8.5 | 8.5 |
| QDA Miner | $238.00 | $595.00 | 3.8 | 5 | 7.5 | 5.5 | 5.5 |
| QDA Miner W/ Wordstat | $398.00 | $995.00 | 3.8 | 6.25 | 9 | 7.5 | 6.6 |
| Excel | N/A | N/A | 5.6 | 4.75 | 4 | 2.5 | 4.2 |

*Rubric created in 2019. It does not cover every option, such as Dedoose, ATLAS.ti, or R with tidytext.

Feel free to reach out if you are considering a software package; I can provide further insights.
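The overall scores in the rubric appear to be the unweighted mean of the four category scores, rounded to one decimal place. That is an inference from the numbers, not something the rubric states. A short check, using the category scores as listed in the rubric:

```python
from decimal import Decimal, ROUND_HALF_UP

# Category scores from the rubric, in order:
# uploading data, theme finding and coding, analysis ease, reports & visualizations
scores = {
    "MaxQDA Plus": ["7", "9", "8", "9.5"],
    "Nvivo 12 Pro": ["8.2", "8.25", "9", "8.5"],
    "QDA Miner": ["3.8", "5", "7.5", "5.5"],
    "QDA Miner W/ Wordstat": ["3.8", "6.25", "9", "7.5"],
    "Excel": ["5.6", "4.75", "4", "2.5"],
}

def overall_average(category_scores):
    """Unweighted mean of the category scores, rounded half-up to one decimal."""
    values = [Decimal(score) for score in category_scores]
    mean = sum(values) / len(values)
    return mean.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)

overall = {name: overall_average(vals) for name, vals in scores.items()}
for name, score in overall.items():
    print(f"{name}: {score}")
```

Decimal arithmetic is used here so that a value like 5.45 rounds to 5.5 reliably. Anyone who weights the categories differently (for example, valuing analysis ease over uploading) can replace the mean with a weighted sum.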

32 of 33

MaxQDA Resources

*Note: I don’t get a commission from MaxQDA. I just like it, and it is cost-effective.
