1 of 17

ACTUARIES & DATA SCIENCE

Jerome Tuttle, FCAS, CPCU

Retired Actuary

2 of 17

What is an actuary?

The mathematicians of the insurance industry.
A business professional who deals with the financial impact of risk and uncertainty.
Analyzes, manages, and measures the financial impact of risk and uncertainty.
Develops and validates models and communicates results to guide decision-making.
Actuaries in movies:

Jack Nicholson in About Schmidt (2002)
Ben Stiller in Along Came Polly (2004)

3 of 17

Insurance is a unique business

We don’t know our cost (claims) when we sell the policy, and with some claims we don’t know for many years.
We are not required to sell to everyone – similar to bank loans and college admissions.
We do not charge the same price to everyone. This is REQUIRED by law, e.g., FL Statute 627.062:

■ Rates may not be unfairly discriminatory.

■ A rate is unfairly discriminatory to a group of risks if the rate does not bear a reasonable relationship to the expected loss experience among the risks.

4 of 17

What is data science?

The intersection of math/stats, computer science, and subject-matter knowledge, used to extract meaningful insights from data that translate into tangible business value.

5 of 17

Examples of data science

Internet search engine algorithms.
Targeted advertising and recommendations.
Target Stores sent diaper coupons to the pregnant teenager before she told her father. (Folklore?)
Moneyball and sports analytics.
Better singles matching on dating websites.
Disease diagnosis, personalized healthcare recommendations.
Data-driven crime prediction, facial recognition, terrorist forecasts.
Which tweets did Trump write, and which did his staff write?

6 of 17

Actuaries and data science

“Actuaries were among the first data scientists.” (Colin Priest, actuary turned data scientist at DataRobot, Singapore.)
Actuaries are strongest at math/stat and domain knowledge (we study insurance, besides math/stat).
Data scientists are strongest at computer science, especially coding, data manipulation and joining tables, theory of machine learning (training versus testing, overtraining), and machine learning algorithms.
Actuarial exams now include:

Generalized linear models, K-nearest neighbors, K-means clustering, Bayes classifier, decision trees, random forest, principal component analysis. Also a predictive analytics specialty.

7 of 17

Randomly split data into training versus testing data

RMSE on test data = √[∑ (Actual – Predicted)² / n]
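The split and the RMSE formula above can be sketched in a few lines of Python. This is a minimal illustration, not a recommended workflow: the data, the 30% test fraction, the seed, and the trivial "predict the training mean" model are all assumptions made for the example.

```python
import math
import random

def train_test_split(rows, test_fraction=0.3, seed=0):
    """Randomly split rows into (train, test) lists."""
    shuffled = rows[:]                      # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)   # seeded so the split is repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def rmse(actual, predicted):
    """Root mean squared error: sqrt(sum((actual - predicted)^2) / n)."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

# Hypothetical data: (x, y) pairs invented for illustration
data = [(x, 2.0 * x + 1.0) for x in range(10)]
train, test = train_test_split(data)

# Placeholder "model": predict the training-set mean of y for every test row
mean_y = sum(y for _, y in train) / len(train)
test_rmse = rmse([y for _, y in test], [mean_y] * len(test))
print(f"RMSE on test data: {test_rmse:.3f}")
```

The model is fitted on the training rows only; the test rows are held back so that the RMSE measures performance on data the model has not seen.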

8 of 17

Some actuarial examples of data science techniques

If predictive modeling refers to estimating insurance costs, then actuaries have been doing this forever.
Today predictive modeling is computationally intensive, often testing all possible permutations of variables, transformations, etc.
The 2 broad categories in data science are prediction and classification. Classification is predicting a category.
Prediction often involves types of regression. Linear regression is being replaced by more flexible Generalized Linear Models.
Classification and related grouping techniques include:

Decision trees: underwriting

Clustering: territories

Principal component analysis: detect fraud

In the following examples, assume n independent variables and p data values.

9 of 17

For insurance rating, we group (hopefully) similar customers into classes and charge an average rate for the class. Classification is rarely perfect.

[Figure: two panels, “Before classification” and “After classification”]

10 of 17

Insurance classes may include age, gender, urban / rural territory, marital status, miles driven, claims history, car type, car age, etc.

But within each n-dimensional slice, there is still considerable variability. A company wants to choose the better-than-average customers within each class to make a profit.

11 of 17

Generalized Linear Models: pricing

Traditionally we used classical linear regression, and we treated our pricing by class as multiplicative:
Base rate = $100

Times factor for Age i = 1.50

Times factor for Gender j = 1.20

Times factor for Territory k = 1.40, ... , etc.
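As a worked example of the multiplicative structure, the three factors shown give 100 × 1.50 × 1.20 × 1.40 = $252. The sketch below computes that rate and also shows that the same rate is additive on the log scale, which is why a GLM with a log link produces multiplicative factors. The variable names are assumptions for the example; only the base rate and factor values come from the slide.

```python
import math

base_rate = 100.0
# Factor values from the slide; the dictionary keys are illustrative labels
factors = {"age": 1.50, "gender": 1.20, "territory": 1.40}

# Multiplicative form: base rate times each class factor
rate = base_rate
for factor in factors.values():
    rate *= factor

# Equivalent additive form on the log scale:
# log(rate) = log(base) + log(f_age) + log(f_gender) + log(f_territory)
log_rate = math.log(base_rate) + sum(math.log(f) for f in factors.values())

print(round(rate, 2))
print(round(math.exp(log_rate), 2))
```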

This disregards interactions between classes and makes assumptions on normality and common variances.
GLMs cover a wider range of models, with the distribution of the response variable assumed to be a member of the exponential family.
Compared with the traditional approach, this results in some factors being reduced and others increased.
Other applications of GLM:

Effect of telematics on claims

Underwriting score cards

Predict claims likely to settle far above their initial estimate

12 of 17

Decision trees: underwriting

Sequentially splits data into categories having similar values of the dependent variable.
Uses a statistic such as the Gini index to choose each split.
Possible variables: number of years renewed, occupation, premium payment history, telematics (speed, braking, time of day, etc.)
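A minimal sketch of how a single split might be chosen with the Gini index follows. It searches one numeric variable for the threshold that gives the lowest weighted impurity. The policy records, the field names `years_renewed` and `claim`, and the helper functions are hypothetical; a real tree repeats this search over every variable and then recurses into each branch.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def split_gini(rows, feature, threshold):
    """Weighted Gini impurity after splitting rows on feature <= threshold."""
    left = [r["claim"] for r in rows if r[feature] <= threshold]
    right = [r["claim"] for r in rows if r[feature] > threshold]
    n = len(rows)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

def best_threshold(rows, feature):
    """Try each observed value as a threshold; return the one with lowest impurity."""
    # Skip the largest value, which would leave the right-hand branch empty
    candidates = sorted({r[feature] for r in rows})[:-1]
    return min(candidates, key=lambda t: split_gini(rows, feature, t))

# Hypothetical policies, invented for illustration (claim: 1 = had a claim)
policies = [
    {"years_renewed": 0, "claim": 1},
    {"years_renewed": 1, "claim": 1},
    {"years_renewed": 2, "claim": 1},
    {"years_renewed": 4, "claim": 0},
    {"years_renewed": 6, "claim": 0},
    {"years_renewed": 9, "claim": 0},
]

threshold = best_threshold(policies, "years_renewed")
print(threshold, split_gini(policies, "years_renewed", threshold))
```

An impurity of 0 means each branch contains only one outcome; real data rarely splits this cleanly.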

13 of 17

Clustering: territories

Partitions data into classes based on how closely data is grouped. Iteratively updates centers and re-partitions.
There is no dependent variable.
Another use is clustering similar occupations.
Florida has 28 rating territories in auto.

Yao, J. (2008). Clustering in ratemaking: applications in territories clustering. Casualty Actuarial Society Predictive Modeling Seminar.
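The "update centers and re-partition" loop described above is the idea behind K-means. The sketch below is a bare-bones version under stated assumptions: the points are hypothetical two-dimensional locations, the starting centers are supplied by hand, and distance is plain Euclidean. It is not the method used in the cited paper, and actual territory work would normally cluster on loss experience as well as geography.

```python
import math

def assign(points, centers):
    """Partition step: put each point in the cluster of its nearest center."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
        clusters[nearest].append(p)
    return clusters

def update(clusters, centers):
    """Update step: move each center to the mean of its cluster's points."""
    new_centers = []
    for members, old in zip(clusters, centers):
        if members:
            new_centers.append(tuple(sum(coord) / len(members) for coord in zip(*members)))
        else:
            new_centers.append(old)  # keep the old center if a cluster is empty
    return new_centers

def kmeans(points, centers, max_iter=50):
    """Alternate the two steps until the centers stop moving."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        new_centers = update(assign(points, centers), centers)
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)

# Hypothetical locations, invented for illustration
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centers, clusters = kmeans(points, centers=[(0, 0), (10, 10)])
print(centers)
print(clusters)
```

Note that no dependent variable appears anywhere in the code: the grouping depends only on how close the points are to one another.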

14 of 17

Principal component analysis: fraud detection

Reduces a large number of variables to a smaller number of mutually uncorrelated variables that preserve as much variability as possible.

Auto fraud (staged accident, inflated bills, collusive medical or body shops) is hard to detect by first-level claim examiners.
No dependent variable. Data doesn’t say which claim is definitely fraud. Much fraud is undetected.
Data often ordinal, e.g. suspicion level = {1, 2, …, 5} for each variable (number of chiropractor visits, high-volume medical provider).
Goal is overall fraud suspicion score, iteratively weighting individual variables based on their consistency and correlation to overall score.
Why do we study Linear Algebra? PCA uses orthogonal transformations, eigenvectors.
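To make the eigenvector connection concrete, here is a minimal sketch that finds the first principal component by power iteration on the covariance matrix and uses it as a single suspicion score. It is plain PCA, offered as an assumption about what such a score could look like, and not the specific iterative weighting scheme referred to above. The six claims and their suspicion levels are hypothetical.

```python
import math

def covariance_matrix(rows):
    """Return the sample covariance matrix and the mean-centered rows."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    centered = [[r[j] - means[j] for j in range(k)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(k)]
           for a in range(k)]
    return cov, centered

def first_component(cov, n_iter=200):
    """Power iteration: converges to the eigenvector with the largest eigenvalue.

    Assumes the covariance matrix is not all zeros and that the starting
    vector is not orthogonal to the leading eigenvector.
    """
    k = len(cov)
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(n_iter):
        w = [sum(cov[a][b] * v[b] for b in range(k)) for a in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Hypothetical suspicion levels (1 to 5) on three indicators for six claims
claims = [
    [1, 1, 2],
    [2, 1, 1],
    [2, 2, 2],
    [3, 3, 4],
    [4, 5, 4],
    [5, 4, 5],
]

cov, centered = covariance_matrix(claims)
weights = first_component(cov)

# Overall score: projection of each centered claim onto the first component
scores = [sum(w * x for w, x in zip(weights, row)) for row in centered]
for claim, score in zip(claims, scores):
    print(claim, round(score, 2))
```

The component weights play the role of variable weights in the overall score: indicators that move together with the others receive more weight.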

15 of 17

Credit scoring

A numerical score of a person’s creditworthiness. Ideally correlates with claims experience and provides additional predictive ability beyond traditional rating variables.
Permitted by FL Statute 626.9741.
Criticized as unfair to minorities and low-income people, although analysts dispute the criticism.
Variables include debt/asset ratio, late payment history…
Data science techniques: clusters, trees, GLM, PCA, …

■ Loss ratio = f (variables X1, …, Xn)

Often used in the decision whether or not to offer insurance, but not used to determine the price.
Many credit risk databases for credit cards and loans are publicly available, e.g., the Kaggle competition at Kaggle.com/c/GiveMeSomeCredit/

■ Probability of default = f (variables X1, …, Xn)
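One common choice for f in "Probability of default = f (variables X1, …, Xn)" is a logistic regression, which is itself a GLM. The sketch below only evaluates such a model; it does not fit one. The coefficients, intercept, and feature names are hypothetical values invented for illustration and are not drawn from any dataset.

```python
import math

def default_probability(features, coefficients, intercept):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = intercept + sum(coefficients[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients, invented for illustration only (not fitted to data)
coefficients = {"debt_ratio": 2.0, "late_payments": 0.8}
intercept = -3.0

low_risk = default_probability(
    {"debt_ratio": 0.2, "late_payments": 0}, coefficients, intercept)
high_risk = default_probability(
    {"debt_ratio": 0.9, "late_payments": 3}, coefficients, intercept)

print(round(low_risk, 3), round(high_risk, 3))
```

The logistic form keeps every prediction between 0 and 1, which a straight linear regression on the same variables would not guarantee.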

16 of 17

Text Analysis

Most data is numerical and is neatly captured in fields.
Free-form text is a potential gold mine of information, but it requires effort to extract gold nuggets.
Challenges include misspellings, synonyms, and word endings like “ing” and “ed”.
Look for frequent words and groups of words appearing together.
What kinds of claims are occurring? “Water” may be a captured field, but “water & basement” or “water & ceiling” may be more helpful in finding trends.
What words signal potential large claim amount?
How do people feel about insurance ads? Identify sentiments in customer surveys & tweets.
FL Statute 627.4145 requires that insurance policies score at least 45 on the Flesch readability test.
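Counting frequent words and words that appear together, such as “water & basement”, can be sketched as below. The claim notes and the stopword list are hypothetical, and the tokenizer is deliberately crude: it does not correct misspellings, merge synonyms, or strip word endings, all of which a real pipeline would need.

```python
import re
from collections import Counter
from itertools import combinations

def tokenize(text):
    """Lowercase the text and keep runs of letters only."""
    return re.findall(r"[a-z]+", text.lower())

def word_and_pair_counts(notes, stopwords=frozenset()):
    """Count, per note, each distinct word and each pair of distinct words."""
    words, pairs = Counter(), Counter()
    for note in notes:
        # Sorting makes every pair come out in alphabetical order
        tokens = sorted(set(tokenize(note)) - stopwords)
        words.update(tokens)
        pairs.update(combinations(tokens, 2))
    return words, pairs

# Hypothetical claim notes, invented for illustration
notes = [
    "Water in basement after heavy rain",
    "Water stain on ceiling, roof leak",
    "Basement flooded, water heater failed",
    "Kitchen fire, smoke damage",
]
STOPWORDS = frozenset({"in", "on", "after", "the", "a"})

words, pairs = word_and_pair_counts(notes, STOPWORDS)
print(words.most_common(3))
print(pairs[("basement", "water")], pairs[("ceiling", "water")])
```

Here “water” alone appears in three notes, but the pairs separate the basement claims from the ceiling claim, which is the kind of detail a single captured field would miss.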

17 of 17

References

Frees, E. W., et al. (2016). Predictive modeling applications in actuarial science. New York: Cambridge University Press.
Healy, K. (2018). Data visualization. Princeton, NJ: Princeton University Press.
James, G., et al. (2017). An introduction to statistical learning with applications in R. New York: Springer.
Zhao, Y. (2013). R and data mining. San Diego: Academic Press.