Hierarchical Probabilistic Modelling in Real Life
Benjamin Batorsky, PhD
Matt Moocarme, PhD
PyData NYC 2018
https://github.com/moocarme/pydata_2018
Slides: https://goo.gl/aBHpuX
bpben.github.io
@bpben2
Data Scientist, Viacom
PhD, Physics
Matthew Moocarme
Benjamin Batorsky
github.com/moocarme
First - Run the Docker container
...or clone the GitHub repo
Powering the conversation about marketing spend
Can we produce a range of “realistic spends”?
Jain, D., & Singh, S. S. (2002). Customer lifetime value research in marketing: A review and future directions.
Rossi, P. E., & Allenby, G. M. (2003). Bayesian statistics and marketing.
Probabilistic Modelling
What is a Bayesian model?
Discrete
Continuous
P(hypothesis | data) = P(data | hypothesis) P(hypothesis) / P(data)
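A minimal sketch of Bayes' rule for a discrete set of hypotheses. The hypothesis names and the probabilities are made up for illustration and are not taken from the talk.

```python
def posterior(prior, likelihood):
    """Apply Bayes' rule over a discrete set of hypotheses.

    prior[h]      -> P(hypothesis h)
    likelihood[h] -> P(data | hypothesis h)
    """
    unnormalized = {h: likelihood[h] * prior[h] for h in prior}
    evidence = sum(unnormalized.values())  # P(data)
    return {h: value / evidence for h, value in unnormalized.items()}


# Hypothetical example: two competing explanations, equally likely a priori.
prior = {"campaign_works": 0.5, "campaign_fails": 0.5}
# Hypothetical probability of the observed data under each explanation.
likelihood = {"campaign_works": 0.8, "campaign_fails": 0.2}

post = posterior(prior, likelihood)
print(post)
```

The denominator P(data) only rescales the result so the posterior sums to one, which is why samplers can work with the unnormalized product alone.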
What is a “hierarchical model”?
Typical approach (pooled):
Hierarchical (partially pooled):
http://twiecki.github.io/blog/2014/03/17/bayesian-glms-3/
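The difference between pooled and partially pooled estimates can be sketched with the textbook normal-normal shrinkage formula. This is an illustration only: the group names and data are invented, the within-group and between-group standard deviations (`sigma`, `tau`) are assumed known, and the grand mean is plugged in for the population mean. A full hierarchical model would estimate all of these jointly.

```python
from statistics import mean


def pooled_estimate(groups):
    """Pooled: ignore group membership and use one grand mean."""
    return mean(y for obs in groups.values() for y in obs)


def partially_pooled_estimates(groups, sigma=1.0, tau=1.0):
    """Partially pooled: precision-weighted blend of each group's mean
    and the grand mean (sigma and tau are assumed known here)."""
    mu = pooled_estimate(groups)
    estimates = {}
    for name, obs in groups.items():
        data_precision = len(obs) / sigma**2  # grows with group size
        prior_precision = 1.0 / tau**2        # pull toward the grand mean
        estimates[name] = (
            data_precision * mean(obs) + prior_precision * mu
        ) / (data_precision + prior_precision)
    return estimates


# Hypothetical data: one group with many observations, one with a single point.
groups = {
    "client_a": [2.0, 2.2, 1.8, 2.1, 1.9, 2.0],
    "client_b": [5.0],
}

grand = pooled_estimate(groups)
est = partially_pooled_estimates(groups)
print(grand, est)
```

The group with a single observation is pulled much further toward the grand mean than the group with six, which is the behaviour that makes partial pooling useful for sparse groups.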
Why use this approach?
http://twiecki.github.io/blog/2014/03/17/bayesian-glms-3/
Light lines are samples of Beta, dark line is average of the samples
ThriveHive’s data
A bit about the products (and what we expect)
Sampling from the posterior
Hamiltonian MCMC
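A toy Hamiltonian Monte Carlo sampler, written from scratch to show the mechanics: draw a random momentum, simulate a trajectory with leapfrog steps, then accept or reject. The target here is a standard normal standing in for a posterior; the function name, step size and trajectory length are arbitrary choices for this sketch, not the settings used in the talk. Libraries such as PyMC use an adaptive variant (NUTS) rather than this fixed-length version.

```python
import math
import random


def hmc_sample(log_prob, grad_log_prob, x0, n_samples,
               step_size=0.1, n_leapfrog=20, seed=0):
    """One-dimensional HMC with a fixed step size and trajectory length."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)  # fresh momentum for each proposal
        x_new, p_new = x, p
        # Leapfrog integration: half momentum step, full steps, half momentum step.
        p_new += 0.5 * step_size * grad_log_prob(x_new)
        for i in range(n_leapfrog):
            x_new += step_size * p_new
            if i < n_leapfrog - 1:
                p_new += step_size * grad_log_prob(x_new)
        p_new += 0.5 * step_size * grad_log_prob(x_new)
        # Metropolis correction using the total energy before and after.
        current_h = -log_prob(x) + 0.5 * p * p
        proposed_h = -log_prob(x_new) + 0.5 * p_new * p_new
        if rng.random() < math.exp(min(0.0, current_h - proposed_h)):
            x = x_new
        samples.append(x)
    return samples


# Toy target: standard normal, log density known up to a constant.
samples = hmc_sample(
    log_prob=lambda x: -0.5 * x * x,
    grad_log_prob=lambda x: -x,
    x0=0.0,
    n_samples=2000,
)
sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / (len(samples) - 1)
print(sample_mean, sample_var)
```

Using the gradient lets each proposal travel far from the current point while keeping a high acceptance rate, which is the main advantage over random-walk proposals.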
Bayesian model evaluation
Watanabe (2010) https://dl.acm.org/citation.cfm?id=1953045
Piironen (2015) https://arxiv.org/pdf/1503.08650.pdf
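Assuming the evaluation criterion meant here is WAIC, the quantity introduced in the Watanabe (2010) reference, a minimal sketch of how it is computed from pointwise log-likelihoods follows. The input matrix is invented for illustration; in practice it would come from evaluating the model's likelihood at each posterior draw.

```python
import math
from statistics import variance


def waic(log_lik):
    """Compute WAIC on the deviance scale.

    log_lik[s][i] is the log-likelihood of observation i under posterior
    draw s. Returns (waic, p_waic); lower WAIC suggests better expected
    out-of-sample fit.
    """
    n_draws = len(log_lik)
    n_obs = len(log_lik[0])
    lppd = 0.0    # log pointwise predictive density
    p_waic = 0.0  # effective number of parameters
    for i in range(n_obs):
        column = [log_lik[s][i] for s in range(n_draws)]
        # Numerically stable log of the mean likelihood across draws.
        m = max(column)
        lppd += m + math.log(sum(math.exp(v - m) for v in column) / n_draws)
        # Penalty: variance of the log-likelihood across draws.
        p_waic += variance(column)
    return -2.0 * (lppd - p_waic), p_waic


# Hypothetical log-likelihoods: 3 posterior draws, 3 observations.
example_log_lik = [
    [-1.0, -2.1, -0.9],
    [-1.2, -1.9, -1.1],
    [-0.9, -2.0, -1.0],
]
example_waic, example_p = waic(example_log_lik)
print(example_waic, example_p)
```

When comparing candidate models on the same data, the one with the lower WAIC is preferred, with the penalty term discouraging models whose fit varies strongly across posterior draws.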
Predictions with Bayesian models
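A sketch of how posterior draws turn into a predictive range: simulate one new observation per draw, then read off percentiles. The draws below are synthetic stand-ins for real sampler output, and the normal likelihood, the 5%/95% cut-offs and the function name are assumptions made for illustration.

```python
import random


def posterior_predictive_interval(posterior_draws, lower=0.05, upper=0.95, seed=0):
    """Simulate one observation per (mu, sigma) draw and return the
    empirical lower and upper percentiles of the simulated values."""
    rng = random.Random(seed)
    simulated = sorted(rng.gauss(mu, sigma) for mu, sigma in posterior_draws)
    lo_idx = int(lower * (len(simulated) - 1))
    hi_idx = int(upper * (len(simulated) - 1))
    return simulated[lo_idx], simulated[hi_idx]


# Synthetic stand-in for posterior draws of a mean and a standard deviation.
draw_rng = random.Random(1)
draws = [
    (draw_rng.gauss(100.0, 5.0), abs(draw_rng.gauss(20.0, 2.0)))
    for _ in range(4000)
]

low, high = posterior_predictive_interval(draws)
print(low, high)
```

Because each simulated value uses a different parameter draw, the resulting range reflects both the noise in the data and the uncertainty in the parameters, so it is wider than an interval built from point estimates alone.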
Implementing in production pipeline
Spend range: SEM
Spend range: Social
Takeaways
THANKS: PyData, PyMC, and you!
https://xkcd.com/2059/