1 of 23

When Sheep Shop

Measuring Herding Effects in Product Ratings with Natural Experiments

By Gael Lederrey and Robert West

WWW2018, Lyon

2 of 23

Motivation

Why?

Cheers!

3 of 23

Back to our example (Standardized ratings)

Lost Rhino Ice Breaker IPA

4 of 23

Studying Herding Effects

  • Herding effects are difficult to study
  • Randomized experiments can be risky
  • Naïve observational studies are not suited
  • => Natural experiment

5 of 23

Data

Breweries

Beers

Users

Matched

6 of 23

Methodology

[Figure: matched beers. Each beer's first rating on a site is High, Medium, or Low; the later ratings (?) are the outcome to be measured.]

7 of 23

Methodology

8 of 23

[Figure: four causal diagrams]

  • Naïve Observational Study
  • Randomized Experiment
  • Good Natural Experiment
  • Bad “Natural Experiment”

Nodes: T = Treatment, P = Product, S = Rating Site, O = Outcome

9 of 23

Step-by-Step

  • Match the products
  • Define the paired-treatment groups
  • Balance the paired-treatment groups
  • Aggregate paired-treatment groups
  • Results

10 of 23

Assumptions

  • Treatment assignment is haphazard.

  • Matched dataset reflects full dataset

11 of 23

1. Match the products

  • Represent names as TF-IDF vectors
  • Compute all pairwise cosine similarities
  • Iterate through the smaller set to find a match in the larger set
  • Keep a match if
    • High similarity
    • Large difference between 1st and 2nd match
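
The matching rule above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the whitespace tokenization, the smoothed IDF formula, the function names (tfidf_vectors, cosine, match_products), and the thresholds min_sim=0.8 and min_gap=0.2 are all assumptions, since the slides only ask for "high similarity" and a "large difference".

```python
import math
from collections import Counter


def tfidf_vectors(names):
    """Return one L2-normalised TF-IDF dict per name (whitespace tokens)."""
    docs = [Counter(name.lower().split()) for name in names]
    n_docs = len(docs)
    # Document frequency: in how many names each token appears.
    doc_freq = Counter(tok for doc in docs for tok in doc)
    vectors = []
    for doc in docs:
        # Smoothed IDF (an assumption; other variants would work too).
        vec = {
            tok: tf * (math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1.0)
            for tok, tf in doc.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({tok: w / norm for tok, w in vec.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity of two already L2-normalised sparse vectors."""
    return sum(w * v.get(tok, 0.0) for tok, w in u.items())


def match_products(small, large, min_sim=0.8, min_gap=0.2):
    """Map each name in `small` to its best match in `large`, if unambiguous."""
    vectors = tfidf_vectors(small + large)
    small_vecs, large_vecs = vectors[:len(small)], vectors[len(small):]
    matches = {}
    for name, vec in zip(small, small_vecs):
        sims = sorted(
            ((cosine(vec, lv), cand) for lv, cand in zip(large_vecs, large)),
            reverse=True,
        )
        if not sims:
            continue
        best_sim, best = sims[0]
        second_sim = sims[1][0] if len(sims) > 1 else 0.0
        # Keep only confident matches: high similarity AND a clear runner-up gap.
        if best_sim >= min_sim and best_sim - second_sim >= min_gap:
            matches[name] = best
    return matches
```

Requiring a gap between the best and second-best candidate discards names that are close to several products at once, trading recall for precision.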

12 of 23

1.5 Standardize the ratings

Lost Rhino Ice Breaker IPA
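
One common way to standardize ratings is a z-score. The sketch below assumes that this is what is meant and that the reference population is simply the list of ratings passed in (for example, all ratings on one site); the slides do not specify either point, and the function name standardize is hypothetical.

```python
from statistics import mean, pstdev


def standardize(ratings):
    """Return z-scores: (rating - mean) / standard deviation of the input."""
    mu = mean(ratings)
    sigma = pstdev(ratings)  # population standard deviation
    if sigma == 0:
        # All ratings identical: every z-score is zero by convention here.
        return [0.0 for _ in ratings]
    return [(r - mu) / sigma for r in ratings]
```

Standardizing makes ratings comparable across sites that use different scales or have different average generosity.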

13 of 23

2. Define the paired-treatment groups

3 groups:

  • High: rating > percentile 85
  • Low: rating < percentile 15
  • Medium otherwise
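
The three groups can be assigned with two percentile cutoffs. In this sketch the cutoffs are computed with linear interpolation over whatever pool of ratings is supplied; which pool the authors used is not stated on the slide, and the names percentile and treatment_group are hypothetical.

```python
def percentile(values, q):
    """Linear-interpolation percentile of `values`, with q in [0, 100]."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def treatment_group(first_rating, low_cut, high_cut):
    """Return 'H', 'L', or 'M' for a first rating, given the two cutoffs."""
    if first_rating > high_cut:
        return "H"
    if first_rating < low_cut:
        return "L"
    return "M"


# Usage: cutoffs at the 15th and 85th percentiles of a pool of ratings.
pool = [float(x) for x in range(101)]
low_cut, high_cut = percentile(pool, 15), percentile(pool, 85)
groups = [treatment_group(r, low_cut, high_cut) for r in (10.0, 50.0, 90.0)]
```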

14 of 23

3. Balance the paired-treatment groups

4. Aggregate paired-treatment groups

Rather well balanced “out of the box”

First-rating groups of matched beers, BeerAdvocate × RateBeer (H = High, M = Medium, L = Low):

          H        M        L
  H     585    1,213      116
  M   1,210    6,593    1,242
  L     138    1,255      568

Aggregated pairs:

     HH       HM      HL       MM       ML      LL
    585    2,413     254    6,593    2,497     568
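
The aggregation step treats a pair as unordered, so a beer rated High on one site and Medium on the other counts as HM regardless of which site gave which rating. A minimal sketch on toy data follows; the function name aggregate_pairs and the example pairs are illustrative only.

```python
from collections import Counter

# Fixed label order so that ("M", "H") and ("H", "M") both become "HM".
GROUP_ORDER = {"H": 0, "M": 1, "L": 2}


def aggregate_pairs(pairs):
    """Count unordered (site A group, site B group) pairs as HH, HM, HL, MM, ML, LL."""
    counts = Counter()
    for group_a, group_b in pairs:
        first, second = sorted((group_a, group_b), key=GROUP_ORDER.__getitem__)
        counts[first + second] += 1
    return counts


# Toy example: four matched beers with their first-rating groups on two sites.
toy_pairs = [("H", "M"), ("M", "H"), ("L", "H"), ("M", "M")]
toy_counts = aggregate_pairs(toy_pairs)
```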

15 of 23

Validity of the assumptions

16 of 23

Treatment assignment is haphazard

  • Is Treatment independent of Site and Product?

  • Confounds captured by:
    • Style
    • Country

Treatment assignment (approx.) independent of Site and Product!

HM

                            #(H)         Pr(H)
Style                      BA    RB     BA     RB
Amer. IPA                 105    88   0.54   0.46
Amer. Double/Imp. IPA     100    81   0.55   0.45
Amer. Pale Ale             32    32   0.50   0.50
Amer. Wild Ale             27    36   0.43   0.57

Country                    BA    RB     BA     RB
United States             544   493   0.52   0.48
Canada                     24    37   0.39   0.61
Belgium                    30    31   0.49   0.51
England                     7    13   0.35   0.65

ML

                            #(H)         Pr(H)
Style                      BA    RB     BA     RB
Amer. IPA                  64    64   0.50   0.50
Amer. Pale Ale             36    44   0.45   0.55
Fruit/Vegetable Beer       30    25   0.55   0.45
Amer. Amber/Red Ale        32    20   0.61   0.39

Country                    BA    RB     BA     RB
United States             445   449   0.50   0.50
Canada                     84    53   0.61   0.39
Belgium                    28    36   0.44   0.56
Germany                    20    21   0.49   0.51

HL

                            #(H)         Pr(H)
Style                      BA    RB     BA     RB
Amer. IPA                   7     5   0.58   0.42
Amer. Double/Imp. IPA       8     2   0.80   0.20
Amer. Pale Ale              3     3   0.50   0.50
Russian Imperial Stout      0     5   0.00   1.00

Country                    BA    RB     BA     RB
United States              46    38   0.55   0.45
Canada                      7     7   0.50   0.50
Germany                     3     6   0.33   0.67
Belgium                     5     2   0.71   0.29
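
The Pr(H) columns are each site's share of the count, for example 105 / (105 + 88) ≈ 0.54 for American IPAs in the HM group. The sketch below reproduces that share and adds an exact two-sided binomial test against a fair coin as one possible way to judge whether a split is compatible with haphazard assignment. The test is an illustrative assumption; the slides do not say which check, if any, the authors ran, and the function names are hypothetical.

```python
from math import comb


def shares(count_ba, count_rb):
    """Return each site's share of the total, as in the Pr(H) columns."""
    total = count_ba + count_rb
    return count_ba / total, count_rb / total


def fair_coin_two_sided_p(k, n):
    """Exact two-sided binomial p-value for k successes in n trials, p = 0.5.

    Sums the probabilities of all outcomes that are no more likely than the
    observed one. Integer arithmetic keeps the sum exact for large n.
    """
    observed = comb(n, k)
    extreme = sum(c for c in (comb(n, i) for i in range(n + 1)) if c <= observed)
    return extreme / 2 ** n


# Amer. IPA, HM group: 105 on BeerAdvocate vs. 88 on RateBeer.
pr_ba, pr_rb = shares(105, 88)
p_value = fair_coin_two_sided_p(105, 105 + 88)
```

Small cells deserve caution: a 0 vs. 5 split, as for Russian Imperial Stout in the HL group, looks extreme as a proportion but rests on only five beers.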

17 of 23

Matched dataset reflects full dataset

  • BeerAdvocate more U.S.-centric / smaller
  • => Ideal: Match all beers from BA and skew RB

Unbiased w.r.t. BeerAdvocate

(Should also hold for RateBeer)

18 of 23

Results

19 of 23

Herding

20 of 23

Conclusion & Future work

21 of 23

Conclusion

  • Fix problem => Hide reviews
  • Conclusions hold when communities overlap
  • Setup is cheap and does not interfere with the system
  • Applicable to other websites!

22 of 23

Future work

  • Currently applying the methodology to Amazon UK vs. Amazon US.

  • Are newcomers more prone to herding?
  • Does early exposure lastingly alter one’s behaviour?
  • Combine ratings from many websites => more truthful score?

23 of 23

Thank you!

Code available on Github:

https://github.com/epfl-dlab/when_sheep_shop

Contact the authors:

Gael Lederrey: gael.lederrey@epfl.ch

Robert West: robert.west@epfl.ch

dlab@EPFL: http://dlab.epfl.ch
