1 of 13

Building Theory With Atheoretical Methods: What Can Machine Learning Tell Us About Risk Updating?

SAMUEL DEWITT

PHD STUDENT

RUTGERS NEWARK CRIMINAL JUSTICE

2 of 13

The Risk Updating Process

  • Past Work
    • Correlations & Temporal Ordering Problems
    • Experiential effects found, but disparate results persist

  • Contemporary Work
    • Bayesian-updating frameworks
      • Posterior = Prior + Experience
    • Conditional Predictions
      • Weighted personal/vicarious experiences
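The Bayesian-updating idea above can be sketched as a Beta-Binomial update in which personal and vicarious arrest experiences are blended before entering the posterior. The 0.8/0.2 weights and the counts are hypothetical, purely for illustration.

```python
# Illustrative Beta-Binomial updating of a perceived arrest risk.
# The 0.8/0.2 personal-vs-vicarious weights are assumptions, not
# estimates from any of the studies cited on this slide.
def update_risk(prior_a, prior_b, personal, vicarious,
                w_personal=0.8, w_vicarious=0.2):
    """Blend personal and vicarious (arrests, escapes) counts
    into a Beta posterior over the probability of arrest."""
    arrests = w_personal * personal[0] + w_vicarious * vicarious[0]
    escapes = w_personal * personal[1] + w_vicarious * vicarious[1]
    post_a = prior_a + arrests
    post_b = prior_b + escapes
    # Posterior mean = updated risk perception
    return post_a, post_b, post_a / (post_a + post_b)

# One personal arrest in five offenses; ten vicarious escapes.
a, b, risk = update_risk(2, 8, personal=(1, 4), vicarious=(0, 10))
print(round(risk, 3))
```

Heavier personal weighting means one's own arrest moves the posterior far more than a peer's.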

Image courtesy of:

http://www.lancaster.ac.uk/pg/jamest/Group/stats1.html

3 of 13

Moving Forward

  • Expanding Upon Current Theory
    • Frequentist Tradition
      • Risk aversion & updating magnitude/direction
      • Placing “experience” along a probability continuum
        • Appropriate distribution assumptions?
    • Bayesian Tradition
      • Modeling prior + posterior distributions explicitly
        • Bayesian Neural Networks
        • Random Forests

Source: http://www.fortnightly.com/fortnightly/2007/02/future-imperfect-ii-managing-strategic-risk-age-uncertainty

4 of 13

How Can Machine Learning Help Us?

  • Alternative Cost Functions
    • Typical Cost function – Quadratic
      • Problems?
    • Asymmetric Costs
      • Weighting of Negative & Positive Errors

  • Atheoretical Theory Building
    • Patterns we do not anticipate
      • Some may be important, others less so (e.g., shoe size)
      • Can still be dangerous – treading into unknown territory

  • Priors & Posteriors
    • Bayesian Neural Network Process
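The asymmetric-cost idea above can be made concrete with a small sketch: the same absolute error is penalized differently depending on its sign. The 3:1 ratio and the direction of the penalty are illustrative assumptions, not values from the analysis.

```python
# Sketch of an asymmetric quadratic cost. Here over-predictions are
# weighted 3x under-predictions; both the ratio and the direction
# are assumed for illustration.
def asymmetric_loss(y_true, y_pred, w_over=3.0, w_under=1.0):
    err = y_pred - y_true
    weight = w_over if err > 0 else w_under
    return weight * err ** 2

# Same absolute error (0.2), different costs:
loss_over = asymmetric_loss(0.5, 0.7)   # over-prediction
loss_under = asymmetric_loss(0.5, 0.3)  # under-prediction
```

Swapping this in for the usual symmetric quadratic lets a model encode risk aversion directly in the cost function.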

Image courtesy of:

http://www.pr-owl.org/basics/bn.php

5 of 13

Structure of the Analysis: Bayesian Neural Networks (BNN)

  • Three Classes of “Layers”
    • Input
      • Theory-Informed
    • Hidden (Neurons)
      • Atheoretical (e.g., shoe size)
    • Output
      • Blended (averaged across posteriors)

  • Typical Use
    • Classification problems
      • May be extended to continuous outputs
      • See Gelman, Bois, & Jiang (1996)

  • Training & Test Sets
    • Testing generalizability
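The input → hidden → output structure on this slide can be sketched as a minimal forward pass. Layer sizes and weights below are arbitrary, not the values from the analysis; a true BNN would place priors over W1/W2 and average predictions across posterior weight samples rather than using a single fixed draw.

```python
import numpy as np

# Minimal one-hidden-layer network: theory-informed inputs feed
# atheoretical hidden "neurons", which feed a single continuous
# output. Sizes (6 inputs, 3 hidden units) are assumptions.
rng = np.random.default_rng(0)
n_in, n_hidden = 6, 3

W1 = rng.normal(size=(n_in, n_hidden))  # input -> hidden weights
W2 = rng.normal(size=(n_hidden, 1))     # hidden -> output weight

def forward(x):
    h = np.tanh(x @ W1)       # hidden-layer activations
    return (h @ W2).item()    # single continuous output

y = forward(rng.normal(size=n_in))
```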

Image courtesy of:

http://www.statsoft.com/textbook/statistics-glossary/n

6 of 13

Details of Analysis: Data, Covariates, & Methods

Data

  • NLSY97
    • Training set – Waves 2-4
      • N = 19,946
    • Testing set – Wave 5
      • N = 8,545
    • Sampling weights used
      • Normed across waves

Covariates

    • DV
      • Perceived risk of arrest (%) – auto theft
    • IVs
      • Prior perception (lagged DV)
      • Perceptual ambiguity
      • Experienced arrest rate (weighted)
      • Personal/Vicarious Gang Membership
      • Objective Risk – regional auto theft arrest rate
      • Personal Demographics

Methods

    • Models
      • Multiple Linear Regression
      • Bayesian Neural Network

    • Software
      • R version 3.1.1
      • “caret” package
        • Version 6.0-35
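The linear-model step above can be sketched in miniature: fit on a training set, then score R-squared on a held-out test set. The slides used R's "caret"; this is a hedged Python analog on simulated data, not NLSY97.

```python
import numpy as np

# Simulated stand-in data (NOT NLSY97): sample sizes, predictors,
# and coefficients below are all assumed for illustration.
rng = np.random.default_rng(42)
n_train, n_test, p = 200, 80, 4

X = rng.normal(size=(n_train + n_test, p))
beta = np.array([0.5, -0.3, 0.2, 0.0])
y = X @ beta + rng.normal(size=n_train + n_test)

X_tr, X_te = X[:n_train], X[n_train:]
y_tr, y_te = y[:n_train], y[n_train:]

# Ordinary least squares with an intercept, fit on training data only
A_tr = np.column_stack([np.ones(n_train), X_tr])
coef, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)

# Out-of-sample R-squared on the held-out test set
pred = np.column_stack([np.ones(n_test), X_te]) @ coef
r2 = 1 - np.sum((y_te - pred) ** 2) / np.sum((y_te - y_te.mean()) ** 2)
```

Comparing training and test R-squared in this way is the generalizability check the deck applies to both the linear model and the BNN.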

7 of 13

Details of Analysis: Results of Linear Model

Model Statistics

  • Significant Covariates
    • Largely conform to expectations

  • Predictive Accuracy
    • R-squared (Training)
      • 16.3% of variation explained
    • Model Fit - Testing
      • ~17% of variation explained

8 of 13

Details of Analysis: Results of BNN

Model Statistics

  • Number of Hidden Layers = 1
    • Input Layer 1: 15 weights
    • Hidden Layer 2: 30 weights
    • Output Layer 3: 45 weights

  • Predictive Accuracy (R-squared)
    • At Layer 1: 16.8%
    • At Layer 2: 19.8%
    • At Layer 3: 20.2%

9 of 13

10 of 13

11 of 13

Conclusions

  • Improvements?
    • Minor, worth the computational effort?
    • Parameters make sense?
      • Not really…

  • What’s next?
    • Classification models
      • Regularized/Non-regularized BNNs
      • Random Forests – Ceilings & Basements
    • Accounting for risk aversion/seeking
      • Asymmetric costs

12 of 13

Building Theory With Atheoretical Methods:�What Can Machine Learning Tell Us About Risk Updating?

SAMUEL DEWITT

PHD STUDENT

RUTGERS NEWARK CRIMINAL JUSTICE

13 of 13

Details of Analysis: Results from Linear Model (List)

  • Departures from expectations
    • Weighted arrest ratio nonsignificant
      • General sample problem?
    • Ambiguity – Negative
      • How to explain this result?