1 of 94

Reputation Inflation

Apostolos Filippas, John Horton & Joseph Golden

1

2 of 94

The "sharing" or "gig" economy

2

Banking

Food

Hotels

Real Estate

Retailing

Healthcare

Transportation

Diversified Labor

Personal Services

Corporate Services

Rental Cars

3 of 94

Feedback is overwhelmingly positive

3

Distribution of reputation on eBay (Nosko and Tadelis, 2015)

4 of 94

Feedback is overwhelmingly positive

4

Distribution of reputation on Airbnb (Zervas, Proserpio & Byers, 2016)

5 of 94

Is this "good"?

How would we know?

5

6 of 94

Empirical context for this paper

  • A dataset of feedback ratings from an online labor market
    • More than 5 million transactions, spanning more than 10 years

6

7 of 94

Apostolos F.

7

Employer profile

Worker profile

bilateral reputation

8 of 94

Employer feedback at end of contract

8

numerical public feedback

textual public feedback

9 of 94

Feedback is overwhelmingly positive

9

10 of 94

Feedback has become overwhelmingly positive

3.7/5

10

17 of 94

Feedback has become overwhelmingly positive

4.8/5

17

19 of 94

Feedback has become overwhelmingly positive

19

Feedback scores pool in the highest “bin”!

22 of 94

Two potential reasons for the increase

  1. Improvements in "fundamentals"
  2. "Reputation Inflation" i.e., lower standards

22

23 of 94

Why do we care?

  • If it is inflation, problems that online reputation solves make a comeback
    • adverse selection and moral hazard
  • Top-censoring causes a loss of information (even if the cause is fundamentals)

23

24 of 94

Disentangling “fundamentals” & “inflation”

  • Painstaking approach
    • Cross out (control for) every hypothesis that pertains to fundamentals
  • Alternative approach
    • Find other measures of rater satisfaction and see if they inflate

24

25 of 94

New platform-created "Private" feedback as an alternative measure

25

Would you work with this freelancer again if you had a similar project?

  1. Definitely Yes
  2. Probably Yes
  3. Probably No
  4. Definitely No

26 of 94

26

27 of 94

27

28 of 94

28

29 of 94

29

24% of dissatisfied employers publicly gave 4+ stars
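
A statistic of this kind can be computed from paired private and public ratings. The sketch below is illustrative only: the records are hypothetical toy data, and treating "Probably No" and "Definitely No" as "dissatisfied" is an assumption, since the slides do not state the exact definition.

```python
def share_high_public_among_dissatisfied(records, dissatisfied_answers, min_stars=4.0):
    """Share of privately dissatisfied raters whose public rating is >= min_stars.

    records: iterable of (private_answer, public_stars) pairs.
    """
    stars = [s for answer, s in records if answer in dissatisfied_answers]
    if not stars:
        raise ValueError("no dissatisfied raters in the sample")
    return sum(s >= min_stars for s in stars) / len(stars)


# Hypothetical toy records, not the platform's data.
toy_records = [
    ("Definitely No", 5.0),
    ("Definitely No", 2.0),
    ("Probably No", 3.5),
    ("Probably No", 1.0),
    ("Definitely Yes", 5.0),
    ("Probably Yes", 4.5),
]

# Assumed definition of "dissatisfied" (not confirmed by the slides).
DISSATISFIED = {"Probably No", "Definitely No"}

toy_share = share_high_public_among_dissatisfied(toy_records, DISSATISFIED)
print(f"{toy_share:.0%} of dissatisfied raters gave 4+ stars in the toy sample")
```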

30 of 94

Private and public feedback over time while both were collected

30

31 of 94

A second alternative measure: written text

31

32 of 94

Extracting sentiment from text

  1. training set and labels
    • written feedback (input) and corresponding scores (labels) from a short period of time
  2. fit a predictive model
    • standard NLP/ML pipeline
      • NLTK: cleaning, NER tagging, POS tagging, stemming, stopword removal
      • scikit-learn: multiple algorithms, grid search, 5-fold CV
  3. apply model out-of-sample to get proxy of rater satisfaction
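
The three steps above can be sketched in a few lines. To stay dependency-free, this sketch swaps the NLTK and scikit-learn pipeline for a much simpler stand-in (the average star rating attached to each word in the training window), so it is not the authors' model. All reviews and function names are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def fit_word_scores(labeled_reviews):
    """Step 2: learn the mean star rating attached to each word.

    labeled_reviews: iterable of (text, stars) pairs from the training window.
    Returns (word -> mean stars, overall mean stars).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    all_stars = []
    for text, stars in labeled_reviews:
        all_stars.append(stars)
        for word in set(tokenize(text)):
            totals[word] += stars
            counts[word] += 1
    word_mean = {word: totals[word] / counts[word] for word in totals}
    return word_mean, sum(all_stars) / len(all_stars)


def predict_stars(text, word_mean, fallback):
    """Step 3: score unseen text using only what was learned in training."""
    known = [word_mean[w] for w in set(tokenize(text)) if w in word_mean]
    return sum(known) / len(known) if known else fallback


# Step 1: hypothetical training window of (text, stars); not platform data.
train = [
    ("great work, fast delivery", 5),
    ("great communication", 5),
    ("poor quality, late delivery", 2),
    ("late and poor communication", 1),
]
word_mean, base_mean = fit_word_scores(train)

# Hypothetical later reviews: similar wording, higher stars.
later = [("poor quality", 4), ("great work", 5)]
predicted = [predict_stars(text, word_mean, base_mean) for text, _ in later]
actual = [stars for _, stars in later]
gap = sum(actual) / len(actual) - sum(predicted) / len(predicted)
print(f"mean actual minus mean predicted stars: {gap:.3f}")
```

A positive gap means the stars rose faster than the wording of the reviews would predict, which is the pattern the text-based measure is meant to detect.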

32

33 of 94

Actual versus predicted scores

33

34 of 94

Actual versus predicted scores

34

35 of 94

Average scores for phrases over time

35

36 of 94

At least 50% of the increase is inflation.

Why did this happen?

36

37 of 94

Causes of reputation inflation

  • Raters might dislike giving "bad" feedback
    • Raters don't want to harm workers
    • Workers might avoid "harsh" raters
    • Workers retaliate
  • But this would only explain a bias, not the trend
    • what counts as “good” and “bad” feedback changes over time

37

38 of 94

38

39 of 94

A model of reputation inflation

workers

employers

39

40 of 94

Buyer is ex ante uncertain about type

???

qL or qH

  • quality qH, qL
  • util. gain α
  • fraction θ

40

41 of 94

Buyer observes most recent feedback ("good" or "bad")

  • quality qH, qL
  • util. gain α
  • fraction θ
  • feedback s = {G, B}

41

42 of 94

Wage is conditioned on what buyer sees

w|s

  • pay wage Δw = w|G - w|B

  • quality qH, qL
  • util. gain α
  • fraction θ
  • feedback s = {G, B}

42

43 of 94

New output produced

  • pay wage Δw = w|G - w|B
  • receive output

output

  • quality qH, qL
  • util. gain α
  • fraction θ
  • feedback s = {G, B}

43

44 of 94

If the output is "good", rating is good

  • pay wage Δw = w|G - w|B
  • receive output
  • if good, positive feedback

output

s = G

  • quality qH, qL
  • util. gain α
  • fraction θ
  • feedback s = {G, B}

44

45 of 94

If the output is "bad", conundrum

  • pay wage Δw = w|G - w|B
  • receive output
  • if good, positive feedback
  • if bad, compare
    • b: truth-telling benefit
    • ci Δw: truth-telling cost, where ci ~ F(μ) is idiosyncratic

output

s=G or s=B

???

  • quality qH, qL
  • util. gain α
  • fraction θ
  • feedback s = {G, B}

45

46 of 94

Buyer loves truth, but dislikes harming

  • pay wage Δw = w|G - w|B
  • receive output
  • if good, positive feedback
  • if bad, compare
    • b: truth-telling benefit
    • ci Δw: truth-telling cost, where ci ~ F(μ) is idiosyncratic

output

s = B

  • quality qH, qL
  • util. gain α
  • fraction θ
  • feedback s = {G, B}

46

b > ci Δw

47 of 94

If harm great enough, gives "good"

  • pay wage Δw = w|G - w|B
  • receive output
  • if good, positive feedback
  • if bad, compare
    • b: truth-telling benefit
    • ci Δw: truth-telling cost, where ci ~ F(μ) is idiosyncratic

output

s = G

  • quality qH, qL
  • util. gain α
  • fraction θ
  • feedback s = {G, B}

47

b < ci Δw

48 of 94

Future buyers consider lying prevalence

  • pay wage Δw = w|G - w|B
  • receive output
  • if good, positive feedback
  • if bad, compare
    • b: truth-telling benefit
    • ci Δw: truth-telling cost, where ci ~ F(μ) is idiosyncratic

output

s=G or s=B

???

  • quality qH, qL
  • util. gain α
  • fraction θ
  • feedback s = {G, B}

48

% of truthful reporting p

49 of 94

Fraction of employers telling the truth matters

  • wage Δw(p) = w|G(p) - w|B(p)
  • receive output
  • if good, positive feedback
  • if bad, compare
    • b: truth-telling benefit
    • ci Δw(p): truth-telling cost, where ci ~ F(μ) is idiosyncratic

output

s=G or s=B

???

  • quality qH, qL
  • util. gain α
  • fraction θ
  • feedback s = {G, B}

49

% of truthful reporting p

50 of 94

An equilibrium truth-telling fraction

output

s = G

50

equilibrium truthful reporting pE = F(b/Δw(pE))

  • decreasing in idiosyncratic costs μ
  • increasing in truth-telling benefit b
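
The equilibrium condition follows from the rater's decision rule on the earlier slides. A short derivation, assuming Δw(p) > 0 and reading F as the distribution function of the idiosyncratic cost ci:

```latex
\begin{align*}
s_i = B \;&\iff\; b > c_i\,\Delta w(p) \;\iff\; c_i < \frac{b}{\Delta w(p)}, \\
p \;&=\; \Pr\!\left(c_i < \frac{b}{\Delta w(p)}\right) \;=\; F\!\left(\frac{b}{\Delta w(p)}\right), \\
p_E \;&=\; F\!\left(\frac{b}{\Delta w(p_E)}\right).
\end{align*}
```

The first line is the choice of a rater who received bad output; the second aggregates over raters; the third imposes that the p buyers expect equals the p that raters actually choose.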

51 of 94

Equilibrium truth-telling

51

[Figure: equilibrium truth-telling fraction; axes labeled pE and b/μ]

Parameters: qH=0.8, qL=0.2, θ=0.5, b=1, ci~N(μ,1)+
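
The fixed point pE = F(b/Δw(pE)) can be found numerically. The sketch below uses the parameter values listed above, but several pieces are assumptions not stated on the slides: the wage is taken to equal the expected probability of good output given the signal (scale normalized to 1), qH and qL are read as probabilities of good output, θ as the share of high-quality workers, and "N(μ,1)+" as a normal truncated below at zero.

```python
import math


def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def cost_cdf(x, mu):
    """CDF of c ~ N(mu, 1) truncated below at 0 (assumed reading of N(mu,1)+)."""
    if x <= 0.0:
        return 0.0
    below_zero = norm_cdf(-mu)
    return (norm_cdf(x - mu) - below_zero) / (1.0 - below_zero)


def wage_gap(p, q_h=0.8, q_l=0.2, theta=0.5):
    """Assumed wage premium: E[q | s=G] - E[q | s=B] when a share p of
    raters report bad output truthfully."""
    # Probability of a "good" signal for each type: good output, or bad
    # output that is misreported as good.
    good_h = q_h + (1.0 - q_h) * (1.0 - p)
    good_l = q_l + (1.0 - q_l) * (1.0 - p)
    post_h_good = theta * good_h / (theta * good_h + (1.0 - theta) * good_l)
    # A "bad" signal only follows bad output, so p cancels in the posterior.
    bad_h = theta * (1.0 - q_h)
    post_h_bad = bad_h / (bad_h + (1.0 - theta) * (1.0 - q_l))

    def expected_q(post_h):
        return post_h * q_h + (1.0 - post_h) * q_l

    return expected_q(post_h_good) - expected_q(post_h_bad)


def equilibrium_truth_share(b, mu, tol=1e-10):
    """Solve p = F(b / wage_gap(p)) by bisection on [0, 1].

    Under the assumptions above the wage gap rises with p, so the right-hand
    side falls with p and the fixed point is unique.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cost_cdf(b / wage_gap(mid), mu) > mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for mu in (2.0, 5.0, 8.0):
    print(f"mu={mu}: p_E={equilibrium_truth_share(b=1.0, mu=mu):.3f}")
```

Under these assumptions the solution moves in the directions the slide states: it falls as μ rises and rises with b.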

52 of 94

Evolution of the truth-telling fraction

  • employers match with workers
  • most employers start off truthful
  • employers may stop being truthful

52

[Figure: truth-telling fraction at T = 0; labeled points p0 and pE]

53 of 94

  • employers match with workers
  • most employers start off truthful
  • employers may stop being truthful

53

[Figure: truth-telling fraction at T = 1; labeled points p0, p1, and pE]

54 of 94

  • employers match with workers
  • most employers start off truthful
  • employers may stop being truthful

54

[Figure: truth-telling fraction at T = 2; labeled points p0, p1, p2, and pE]

55 of 94

  • employers match with workers
  • most employers start off truthful
  • employers may stop being truthful

55

[Figure: truth-telling fraction at T = 3; labeled points p0, p1, p2, p3, and pE]

56 of 94

  • employers match with workers
  • most employers start off truthful
  • employers may stop being truthful

56

[Figure: truth-telling fraction at equilibrium; labeled points p0, p1, p2, p3, and pE]

57 of 94

57

[Figure: truth-telling fraction path; labeled points p0, p1, p2, p3, and pE]
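
The path p0, p1, p2, ... toward pE can be illustrated with a simple simulation. The slides do not spell out the adjustment rule, so this sketch assumes partial adjustment: each period a hypothetical share `adjust` of employers re-optimizes against the current wage gap. The wage rule and the truncated-normal cost distribution are also assumptions, as is the choice of μ = 5.

```python
import math


def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def cost_cdf(x, mu):
    """CDF of c ~ N(mu, 1) truncated below at 0 (assumed cost distribution)."""
    if x <= 0.0:
        return 0.0
    below_zero = norm_cdf(-mu)
    return (norm_cdf(x - mu) - below_zero) / (1.0 - below_zero)


def wage_gap(p, q_h=0.8, q_l=0.2, theta=0.5):
    """Assumed wage premium E[q | G] - E[q | B] given truthful share p."""
    good_h = q_h + (1.0 - q_h) * (1.0 - p)
    good_l = q_l + (1.0 - q_l) * (1.0 - p)
    post_h_good = theta * good_h / (theta * good_h + (1.0 - theta) * good_l)
    bad_h = theta * (1.0 - q_h)
    post_h_bad = bad_h / (bad_h + (1.0 - theta) * (1.0 - q_l))
    return (post_h_good - post_h_bad) * (q_h - q_l)


def best_response(p, b=1.0, mu=5.0):
    """Share of raters who would report truthfully if the current share is p."""
    return cost_cdf(b / wage_gap(p), mu)


def simulate_truth_share(p0, periods, adjust=0.2, b=1.0, mu=5.0):
    """Partial adjustment (hypothetical rule): a share `adjust` of employers
    switches to the best response each period; the rest keep their behavior."""
    path = [p0]
    for _ in range(periods):
        p = path[-1]
        path.append((1.0 - adjust) * p + adjust * best_response(p, b, mu))
    return path


# Most employers start off truthful, then the share drifts down.
path = simulate_truth_share(p0=0.95, periods=60)
for t in (0, 1, 2, 3, 60):
    print(f"T={t}: p={path[t]:.3f}")
```

With these assumed values the truthful share declines period by period and settles where it equals the best response, mirroring the sequence shown on the slides.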

58 of 94

A direct test: a change in the cost of bad feedback

  • Platform began revealing private scores (batched)
    • Workers do not know which rater gave which rating
  • If raters only care about retaliation, batching should "work"
  • But if raters care about harm, then public revelation of private feedback should kick off inflation in private scores

58

59 of 94

Private feedback revelation

59

60 of 94

Private feedback revelation

60

61 of 94

Private feedback revelation

61

62 of 94

Conclusions

  • Much of the increase in feedback scores is due to inflation rather than improvements in "fundamentals"
  • A desire not to harm, rather than simply fear of retaliation, is enough to cause inflation
  • What can be done?
    • Lots of room for future research

62

63 of 94

Title: "Reputation Inflation"

Authors: Filippas, Horton & Golden

Link: http://www.john-joseph-horton.com/papers/longrun.pdf

63

64 of 94

Backup slides

64

65 of 94

Questions & Findings

  • Feedback score increase due to inflation or fundamentals?
    • reputation inflation occurs
    • at least 50% of the increase is due to inflation
  • Why do feedback scores inflate?
    • a theoretical model of inflation due to (i) reflected costs and (ii) endogenous feedback costs
    • platform intervention suggests that model’s predictions hold

65

66 of 94

Implications

  • Online reputation systems likely to suffer from inflation
    • sharing economy/P2P platforms more susceptible
    • product/firm review platforms less susceptible
  • Platform design
    • insight: addressing long-run behavior is crucial
    • insight: private & public performance of new features
    • method: written feedback as a way to monitor inflation

66

67 of 94

Thank you.

67

68 of 94

Platform DON’Ts: additional levels of reputation

68

“This Is Spinal Tap,” 1984

69 of 94

Appendix

69

70 of 94

Future Research

  • Private feedback experiment
    • experimental revelation to employers
    • employers turn to less inflated signals of worker quality
    • improvements in allocative efficiency
  • Platform policy change
    • car-sharing platform removed all feedback information
    • consumers anonymously rate transactions
    • platform versus provider reputation

70

71 of 94

Inflation Matters: Informativeness of feedback over time

[Regression equation; terms labeled: worker ability, time trend, noise]

  • If noise explains more of the variance in feedback scores over time, then feedback is becoming less informative
    • a Bayesian employer would not update her ability posterior in the limit
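
One way to track this is the share of score variance within each quarter that is accounted for by worker identity. The sketch below is an assumption about how such a measure could be built (the R² from regressing scores on worker indicators, quarter by quarter); the regression on the slide itself did not survive extraction, and the rows and quarter labels are hypothetical.

```python
from collections import defaultdict


def worker_share_of_variance(rows):
    """For each quarter, return 1 - SS_within / SS_total.

    rows: iterable of (quarter, worker_id, score).
    This equals the R^2 of regressing scores on worker indicators within the
    quarter; the remainder is variation not explained by worker identity.
    """
    by_quarter = defaultdict(lambda: defaultdict(list))
    for quarter, worker, score in rows:
        by_quarter[quarter][worker].append(score)

    shares = {}
    for quarter, by_worker in sorted(by_quarter.items()):
        scores = [s for ws in by_worker.values() for s in ws]
        grand_mean = sum(scores) / len(scores)
        total = sum((s - grand_mean) ** 2 for s in scores)
        within = sum(
            (s - sum(ws) / len(ws)) ** 2 for ws in by_worker.values() for s in ws
        )
        shares[quarter] = 1.0 - within / total if total > 0 else float("nan")
    return shares


# Hypothetical toy rows: scores track the worker early on, then stop doing so.
toy_rows = [
    ("early", "A", 5), ("early", "A", 5), ("early", "B", 3), ("early", "B", 3),
    ("late", "A", 5), ("late", "A", 3), ("late", "B", 5), ("late", "B", 3),
]
shares = worker_share_of_variance(toy_rows)
print(shares)
```

A falling share over time would indicate that scores carry less information about who the worker is.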

71

72 of 94

Inflation Matters: Informativeness of feedback over time

72

Fit the regression in each quarter, report

73 of 94

Truth-telling fraction and cost of bad feedback

73

74 of 94

Feedback has become overwhelmingly positive

74

Feedback over time from another online P2P market

75 of 94

Private feedback question

75

“Would you work with this freelancer again, if you had a similar project?”

76 of 94

Online reputation

% of U.S. adults who say online reputation vs. government oversight helps “a lot” to:
(Pew Research Center, 2017)

                                                   Online reputation   Government
Make consumers feel confident about transactions          46              25
Make firms accountable to their consumers                 45              30
Ensure safety of products and services                    41              33

76

77 of 94

Disentangling “fundamentals” and “inflation”

78 of 94

Disentangling “fundamentals” and “inflation”

Alt. Measure

79 of 94

Disentangling “fundamentals” and “inflation”

  • Painstaking approach
    • cross out every hypothesis that pertains to fundamentals
    • problem: there is always another, hard to measure, …
  • Approach #2: employ alternative measures of rater satisfaction
    • private feedback - inconsequential
    • sentiment of written feedback - harder to change the meaning of words, harder to complain about “tone” than stars, not aggregated

79

80 of 94

Disentangling “fundamentals” and “inflation”

  • Painstaking approach
    • cross out every hypothesis that pertains to fundamentals
    • problem: there is always another, hard to measure, …
  • Approach #2: employ alternative measures of quality
    • private feedback - it is … private
    • sentiment of written feedback - hard to change the meaning of words, harder to complain about “tone” than stars, not aggregated

80

81 of 94

Private feedback as an alternative measure

  • The platform began collecting "private feedback" information in addition to public feedback
    • "Private" feedback was not accessible by other employers

81

82 of 94

Causes for feedback score increase

  1. improvements in “fundamentals”
    • platform-specific improvements
    • improvements in cohort quality
    • “bad” employers/workers drop out of the platform
    • employer/worker learning
  2. reputation “inflation” - raters are not more satisfied
    • drifting standards in what “good” and “bad” feedback is
    • cost of giving “bad” feedback increases
    • increasing unwillingness to leave bad feedback

82

83 of 94

Questions & Findings

  • Feedback score increase due to inflation or fundamentals?
    1. reputation is subject to considerable inflation in a large online P2P marketplace
    2. decreasing usefulness of reputation
  • Why do feedback scores inflate?
    • a theoretical model of inflation due to (i) reflected costs and (ii) endogenous feedback costs
    • test model’s predictions exploiting a platform intervention

83

84 of 94

Questions & Findings

  • Feedback score increase due to inflation or fundamentals?
    • reputation is subject to considerable inflation in a large online P2P marketplace

84

85 of 94

Questions & Findings

  • Feedback score increase due to inflation or fundamentals?
    • reputation is subject to considerable inflation in a large online P2P marketplace
    • (in the paper) deteriorating informativeness of reputation
  • Why do feedback scores inflate?
    • a theoretical model of inflation due to (i) reflected costs and (ii) endogenous feedback costs
    • test model’s predictions exploiting a platform intervention

85

87 of 94

Questions & Findings

  • Feedback score increase due to inflation or fundamentals?
    • reputation is subject to considerable inflation in a large online P2P marketplace
    • (in the paper) deteriorating informativeness of reputation
  • Why do feedback scores inflate?
    • a theoretical model of inflation due to (i) reflected costs and (ii) endogenous feedback costs
    • test model’s predictions exploiting a policy change

87

88 of 94

Robustness - additional checks

  • fixed cohorts
  • fixed job types
  • first-time employers
  • experienced employers

88

89 of 94

Questions & Findings

  • Feedback score increase due to inflation or fundamentals?
    • reputation inflation occurs
    • at least 50% of the increase is due to inflation
  • Why do feedback scores inflate?
    • a theoretical model of inflation due to (i) reflected costs and (ii) endogenous feedback costs
    • test model’s predictions exploiting a platform intervention

89

91 of 94

Why do feedback scores inflate?

  • “Bad” feedback is costly
    • employers genuinely don’t want to harm the worker
    • future workers might avoid “harsh” raters
    • workers retaliate (angry emails, unavailable for future consulting, … )
  • “Bad” feedback is endogenous
    • what constitutes “bad” feedback changes over time

91

93 of 94

Private feedback

  • The platform began collecting private feedback information in addition to public feedback (2013, 2014)
  • Private feedback question and private numerical score (2014)
  • Private feedback was not accessible by other employers

93

94 of 94

Private feedback revelation

  • The platform found that private feedback was more informative, and believed they had solved the problem
  • The platform announced that private feedback would start being employed in an anonymized, aggregated public score
  • Exogenous shock to the cost that private feedback imposes on workers

94