Final Presentation
Challenge Team
Salt baes
Objectives
User Groups
User group | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 |
Elite | 5392 | 8296 | 10523 | 14889 | 15042 | 16293 | 20827 | 22334 |
Non-Elite At Least 1yr Elite | 38190 | 34532 | 29513 | 27471 | 25171 | 20963 | 16676 | 20564 |
Slacker | 1118 | 2100 | 2619 | 2683 | 4508 | 3732 | 4182 | 6512 |
Potential | 5004 | 4776 | 7119 | 4661 | 4983 | 8716 | 8019 | 294 |
Non-Elite Never Elite | 979728 | 979728 | 979278 | 979278 | 979278 | 979278 | 979278 | 979278 |
Elite, Non-elite detection
Metadata Analysis
How is the number of tips related to a Yelp user’s Elite status?
We looked for distinguishing characteristics in the tip-giving patterns among different user groups.
Elites and Non-Elites have well-defined traits, but Potentials and Slackers are difficult to distinguish based on tip behavior alone.
Statistical Analysis
Comparison | Statistic | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 |
Elite-Potential | T-statistic / p-value | 0.7873 / 0.4343 | 1.7259 / 0.0846 | 3.1529 / 0.0016 | 1.9074 / 0.0565 | 3.2430 / 0.0011 | 4.6098 / 4.1e-06 | 5.1559 / 2.6e-07 | 1.4068 / 0.2952 |
Elite-Slacker | T-statistic / p-value | 1.1692 / 0.2465 | 1.5195 / 0.1289 | 0.8958 / 0.3704 | 1.6833 / 0.0923 | 3.8020 / 0.0001 | 2.9549 / 0.0031 | 1.9329 / 0.0533 | 3.6995 / 0.0002 |
Elite-Non-Elite | T-statistic / p-value | 1.4716 / 0.1455 | 3.9304 / 8.9e-05 | 4.8191 / 1.5e-06 | 6.751 / 1.6e-11 | 7.1942 / 7.3e-13 | 7.0875 / 1.5e-12 | 4.3850 / 1.2e-05 | 5.2215 / 1.9e-07 |
Potential-Slacker | T-statistic / p-value | 1.2447 / 0.2311 | 0.2287 / 0.8191 | -1.626 / 0.1040 | -0.1304 / 0.8962 | 0.4338 / 0.6644 | -0.1909 / 0.8485 | -1.6784 / 0.0949 | 0.4047 / 0.6857 |
Potential-Non-Elite | T-statistic / p-value | 0.7015 / 0.4883 | 1.6456 / 0.1003 | 1.7564 / 0.0792 | 3.5339 / 0.0004 | 3.0934 / 0.0020 | 2.6759 / 0.0075 | 0.8409 / 0.4005 | 2.1613 / 0.0310 |
Slacker-Non-Elite | T-statistic / p-value | -0.616 / 0.5443 | 1.1635 / 0.2451 | 2.9218 / 0.0035 | 4.4498 / 9.1e-06 | 2.2711 / 0.0232 | 1.9021 / 0.0573 | 1.7178 / 0.0861 | 2.6972 / 0.0070 |
Significant at the 10% level
Significant at the 5% level
Significant at the 1% level
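The slides do not say which t-test variant produced the values in the table above. As a minimal sketch, assuming Welch's unequal-variance test and a normal approximation to the two-sided p-value (reasonable only for large groups), a pairwise comparison and the three significance levels could be computed as follows. The tip counts and all function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two independent samples (unequal variances)."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def two_sided_p(t):
    """Two-sided p-value under a normal approximation (assumes large samples)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


def significance(p):
    """Map a p-value to the strictest of the 10%, 5%, and 1% levels it meets."""
    if p < 0.01:
        return "1%"
    if p < 0.05:
        return "5%"
    if p < 0.10:
        return "10%"
    return "not significant"


# Hypothetical tips-per-user samples, for illustration only.
elite_tips = [4, 7, 5, 9, 6, 8, 7, 5]
non_elite_tips = [1, 0, 2, 1, 3, 0, 1, 2]

t_stat = welch_t(elite_tips, non_elite_tips)
print(t_stat, significance(two_sided_p(t_stat)))
```

With small groups, an exact t-distribution p-value would be preferable to the normal approximation used here.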
Text Analysis
Reviews reduced to three dimensions using t-SNE
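How the reviews were turned into numeric features is not stated. The sketch below assumes TF-IDF vectors and scikit-learn's `TSNE` with `n_components=3`; the review strings, the perplexity, and the random seed are placeholders chosen so the example runs on a tiny sample.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.manifold import TSNE

# Hypothetical review texts, for illustration only.
reviews = [
    "great tacos and friendly staff",
    "the pasta was cold and bland",
    "best coffee in the neighborhood",
    "slow service but tasty burgers",
    "amazing sushi, will come back",
    "overpriced drinks and loud music",
    "cozy spot with excellent pastries",
    "the pizza crust was perfectly crisp",
    "rude waiter ruined the evening",
    "fresh salads and quick service",
    "noodles were salty but filling",
    "lovely patio and great brunch",
]

# Assumption: TF-IDF features; the deck does not name the text representation.
features = TfidfVectorizer().fit_transform(reviews).toarray()

# Perplexity must be smaller than the number of samples.
embedding = TSNE(
    n_components=3, perplexity=3.0, init="random", random_state=0
).fit_transform(features)

print(embedding.shape)
```

Each row of `embedding` is one review in three dimensions, ready to be plotted and colored by user group.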
Text Analysis
Panels: Non-Elite, Elite, Slacker, Potentials
Time Series Analysis
Regression Models
SVC:
R^2 0.6695, MAE 0.0869, MSE 0.00809
Kernel Ridge:
R^2 0.6935, MAE 0.02598, MSE 0.00405
Decision Tree (depth = 3):
R^2 0.8711, MAE 0.02120, MSE 0.00143
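For reference, the three scores reported above follow the standard definitions of R^2, mean absolute error, and mean squared error. The sketch below computes them from scratch; the target and prediction values are hypothetical, since the deck does not show the underlying data or the regression target.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    y_mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical targets and model predictions, for illustration only.
y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.0, 4.0]

print(r_squared(y_true, y_pred))
print(mean_absolute_error(y_true, y_pred))
print(mean_squared_error(y_true, y_pred))
```

Higher R^2 and lower MAE and MSE indicate a better fit, which is the sense in which the depth-3 decision tree scores best of the three models listed.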
Anomaly Detection
Kernel Density Estimation
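The slides name kernel density estimation but not the feature, bandwidth, or cutoff that was used. One common way to use it for anomaly detection, sketched here under those assumptions, is to estimate a Gaussian kernel density over the observations and flag the points that fall in the lowest-density fraction. The data, bandwidth, and fraction below are hypothetical.

```python
from math import exp, pi, sqrt


def gaussian_kde(samples, bandwidth):
    """Return a function that evaluates a 1-D Gaussian kernel density estimate."""
    scale = 1.0 / (len(samples) * bandwidth * sqrt(2.0 * pi))

    def density(x):
        return scale * sum(exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


def flag_anomalies(samples, bandwidth, fraction):
    """Return the samples whose estimated density is in the lowest `fraction`."""
    density = gaussian_kde(samples, bandwidth)
    scores = [density(x) for x in samples]
    k = max(1, int(len(samples) * fraction))
    cutoff = sorted(scores)[k - 1]
    return [x for x, score in zip(samples, scores) if score <= cutoff]


# Hypothetical weekly activity counts for one user, for illustration only.
weekly_counts = [10, 11, 9, 10, 12, 11, 10, 9, 11, 40]

print(flag_anomalies(weekly_counts, bandwidth=1.0, fraction=0.1))
```

The bandwidth controls how isolated a point must be before its density drops, so it should be tuned to the scale of the data.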
Dynamic Time Warping Clustering
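Clustering time series with dynamic time warping starts from a pairwise DTW distance, which aligns two series that share a shape but are shifted or stretched in time. The sketch below implements the classic dynamic-programming distance with an absolute-difference cost; the clustering algorithm applied on top of it is not named in the slides, and the example series are hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical activity series, for illustration only.
series = [[1, 2, 3], [1, 2, 2, 3], [2, 3, 4]]

# Pairwise distance matrix that a clustering step could consume.
distances = [[dtw_distance(x, y) for y in series] for x in series]
print(distances)
```

The resulting matrix can be passed to any clustering method that accepts precomputed distances.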
Questions?
Subteam: Boss Llama
1st Topic: Survival Analysis
2nd Topic: Competition Analysis
Features Considered
Fit a Poisson Process
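The slides do not say which events were modeled or whether the rate was allowed to vary over time. As a minimal sketch, assuming a homogeneous Poisson process, the maximum-likelihood rate is the number of observed events divided by the length of the observation window, and the fitted rate gives the probability of seeing a given number of events in any interval. The event times and function names are hypothetical.

```python
from math import exp, factorial


def fit_rate(event_times, horizon):
    """Maximum-likelihood rate of a homogeneous Poisson process on [0, horizon]."""
    return len(event_times) / horizon


def poisson_pmf(k, rate, duration):
    """Probability of exactly k events in an interval of the given duration."""
    expected = rate * duration
    return exp(-expected) * expected ** k / factorial(k)


# Hypothetical event times (in months) within an 8-month window.
event_times = [1.0, 2.0, 3.0, 4.0]
rate = fit_rate(event_times, horizon=8.0)

print(rate)
print(poisson_pmf(0, rate, duration=2.0))
```

Comparing the fitted probabilities with the observed event counts per interval is a simple check of whether the constant-rate assumption is plausible.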
Geographical “clusters”
Identifying Clusters
Topic Modeling
Topic Top Words
Other Characteristics of Restaurants
Types of Restaurants
Identifying User Clusters
Steps:
Result
The End
Thank you!