NYCe TAXI!
PROJECT GOALS
DATASET
Source, Size, Relevant Fields
1
SOURCE
Source: http://publish.illinois.edu/dbwork/open-data/
SIZE
RELEVANT FIELDS
Unique Fields
Trip Fields
Fare Fields
ADDITIONAL SOURCE
TOOLS USED
Libraries, Infrastructure
2
LIBRARIES & INFRASTRUCTURE
DATA EXPLORATION & FEATURES
Data Cleaning, Data Visualization, Feature Engineering
3
DATA EXPLORATION
VISUALIZATIONS
Tip Amount Density Function
Tip Percentage Density Function
CORRELATIONS
Tip Amount vs Payment Type
Tip Amount vs Tolls Amount
Tip Amount vs Fare Amount
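As a rough illustration of how the numeric pairs above (tip amount vs fare amount, tip amount vs tolls amount) could be compared, here is a minimal sketch of the Pearson correlation coefficient. The slides do not say which correlation measure was used, so this is an assumption; the function name and the toy values are hypothetical and not taken from the dataset. Payment type is categorical, so this measure would not apply to it directly.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy values (assumption): tips exactly proportional to fares.
toy_fares = [5.0, 10.0, 15.0, 20.0]
toy_tips = [1.0, 2.0, 3.0, 4.0]
r = pearson(toy_fares, toy_tips)
print(round(r, 3))
```

A value near 1 or -1 suggests a strong linear relationship; a value near 0 suggests little linear relationship.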
FEATURE ENGINEERING
MODELS & METRICS
Models, Algorithms, Analysis
4
MODELS
Tip Class w/ Zero Tip Data
Tip - No Tip
Tip Class w/o Zero Tip Data
Tip Percent Class w/ Zero Tip Data
Tip Percent Class w/o Zero Tip Data
Tip Amount
ALGORITHMS
METRICS - CLASSIFICATION & REGRESSION
| Model                               | Baseline | SVM    | Decision Tree | Random Forest | AdaBoost |
| Tip - No Tip                        | 52.33    | 98.516 | 98.249        | 98.259        | 98.299   |
| Tip Class w/ Zero Tip Data          | 47.66    | 81.253 | 81.565        | 81.593        | 81.208   |
| Tip Class w/o Zero Tip Data         | 45.07    | 67.98  | 68.002        | 68.002        | 66.72    |
| Tip Percent Class w/ Zero Tip Data  | 47.66    | 68.10  | 68.08         | 68.097        | 68.013   |
| Tip Percent Class w/o Zero Tip Data | 41.47    | 41.58  | 42.12         | 42.14         | 41.97    |
| Model      | Baseline (Mean Abs Error) | Linear | SVM  | Lasso |
| Tip Amount | 1.38                      | 0.75   | 0.79 | 1.254 |
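The slides do not state how the baselines were computed. One common choice, sketched here as an assumption, is to always predict the most frequent class for classification and the mean of the target for regression. The function names and toy data are hypothetical.

```python
from collections import Counter
from statistics import mean


def majority_class_accuracy(labels):
    """Accuracy (in percent) of always predicting the most frequent class."""
    counts = Counter(labels)
    most_common_count = counts.most_common(1)[0][1]
    return 100.0 * most_common_count / len(labels)


def mean_baseline_mae(values):
    """Mean absolute error of always predicting the mean of the values."""
    center = mean(values)
    return mean([abs(v - center) for v in values])


# Hypothetical toy data (assumption), not taken from the dataset.
toy_labels = ["tip", "tip", "no_tip", "tip"]
toy_tip_amounts = [0.0, 1.0, 2.0, 5.0]

print(majority_class_accuracy(toy_labels))  # 3 of 4 labels are "tip" -> 75.0
print(mean_baseline_mae(toy_tip_amounts))   # mean is 2.0; errors 2, 1, 0, 3 -> 1.5
```

A model is only useful to the extent that it beats such a trivial predictor, which is why the Baseline column is the reference point for the other columns.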
CONFUSION MATRIX
Tip - No Tip
Tip Class w/o Zero Tip
Tip Class w/ Zero Tip
Tip % w/ Zero Tip
INFERENCES & ROADMAP
Insights Gained, Future Work
5
INFERENCES
FUTURE WORK
QUESTIONS?
Thanks!