MenuSights
Helping at-risk patients choose low-cholesterol items at local restaurants
31.7%
of Americans have high LDL cholesterol
High cholesterol can be a result of:
High cholesterol can lead to heart disease
… but following a low cholesterol diet can reduce blood cholesterol by 10-15%
Source: CDC
...many independent restaurants don't have the resources to provide detailed nutritional information
While it’s easy to find nutritional information for chain restaurant menus...
MenuSights ranks menu items for cholesterol content using natural language processing and nutritional analysis
Training
Ground Truth
Logistic Regression: assigning probabilities to cholesterol category
Chicken Pad Thai
2-gram classification coefficient
Word classification coefficient
0.2
0.01
0.3
0.0
0.5
V. high
High
Medium
Low
Data workflow
Collect data
Process text
Model, apply and present
Coefficients describe not just cholesterol content of foods, but also other words commonly associated with a particular cholesterol level
Word | Coefficient |
“shrimp” | 4.2 |
“rum” | 4.2 |
“quiche” | 4.6 |
“frittata” | 4.86 |
“seafood” | 5.2 |
Very High Cholesterol
Word | Coefficient |
“veggie” | 2.83 |
“cauliflower” | 3.02 |
“japanese” | 3.10 |
“vegan” | 3.15 |
“moroccan chicken” | 4.4 |
Low Cholesterol
The model successfully predicts many high-cholesterol items, but fails with creatively-named menu items
Item | Prediction | True cholesterol |
“Popcorn Shrimp” | high | high�A serving of shrimp contains 64% of daily cholesterol |
“The Anti-Salad” | low | high�Description reveals that item is a meat platter! |
The MenuSights Logistic Regression Model predicts cholesterol with 57% accuracy
Tested model against ground truth
57% accuracy
(random chance predicts 25%)
Ground truth cholesterol
Predicted cholesterol
Low
Med
High
V. high
Low
Med
High
V. high
Andy Lane, Ph.D.
Postdoc: CRISPR Bioinformatics/Molecular Biology
Ph.D. in Molecular and Cell Biology
Why did I use home recipes to train a restaurant analysis engine?
Initial strategy:
Can I cluster similar recipes by ingredient?
Has extra data in the form of ingredient lists
Just recipe names & some descriptions
Match accuracy is refined using ingredient cluster matching
Chicken Pad Thai
Spinach Pizza
Mary’s Special Chicken Soup
Adjectivalized nouns
before dish name
Low-frequency uninformative descriptors
first if present
Primary dish name�end words
Scoping the problem: restaurant menu items aren’t the same as home-cooked recipes
Qualitative differences between recipes and menu items:
Sirloin Steak, 9 oz
Pete’s Awesome Sirloin