The variation of car insurance among the US states
Sameer C.
Aryan S.
Baltej N.
12/10/22
Overview
Describe the Data
States with more teenage drivers have fewer fatal crashes. The fitted line plot is consistent, and the data points lie close to one another with very little scatter.
Insight 1-Sameer C
With a sample of 50 states, is there strong enough evidence to conclude that the true average number of people who commute 45 mins or more exceeds people who commute 4 mins or less by 25% at a significance level of .05?
Commute 45 mins or more vs. commute 4 mins or less: P-value = 1.00 > .05, so we fail to reject the claim that the number of people commuting 45 mins or more is at least 25% higher than the number commuting 4 mins or less.
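A minimal sketch of how a one-sided test of this kind could be set up, following the question as worded on the slide (the alternative is that the long-commute count exceeds the short-commute count by the margin). The slides do not say which test or software produced the P-value, so the function name, the made-up counts, and the normal approximation to the t distribution are all assumptions for illustration.

```python
import math
from statistics import NormalDist, mean, stdev

def one_sided_exceeds_test(long_commute, short_commute, margin=0.25):
    """Test H0: mean(long - (1 + margin) * short) <= 0 against H1: > 0.

    Uses a normal approximation to the t distribution (an assumption;
    reasonable for about 50 states, rough for very small samples).
    """
    diffs = [l - (1 + margin) * s for l, s in zip(long_commute, short_commute)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    z = mean(diffs) / standard_error
    p_value = 1 - NormalDist().cdf(z)  # upper-tail P-value
    return z, p_value

# Hypothetical counts for five states (not the project data).
long_c = [120, 95, 140, 80, 110]
short_c = [100, 90, 120, 85, 100]
z, p = one_sided_exceeds_test(long_c, short_c)
print(f"z = {z:.2f}, p = {p:.3f}")
```

Under this setup, a P-value close to 1 means the sample difference points away from the alternative, so the sample gives it no support.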
Hypothesis 2-Sameer C
Insight
There was a strong correlation between the cost of living and the avg cost of insurance. In automatic learning in OpenMarkov, an arrow directly linked the two, indicating a strong relationship.
Appendix 2: Sameer C
Goals -Sameer C.
Query 1
State’s Average Car Insurance VS Commute To Work
Identifying the outliers
Part 1: Which states have the longest commutes to work, and how does that affect their car insurance?
California
New York
Texas
Michigan
Connecticut
Massachusetts
Florida
New York and Louisiana
We identified the outliers and chose to keep them because they are necessary. We can't remove the outlier states because different variables are affecting the average
We identified the outliers and chose to keep them because the states with the highest numbers of people commuting to work 45 mins or more are geographically big and have more traffic than the rest of the country.
Part 1: Which states have the longest commutes to work, and how does that affect their car insurance?
This fitted line plot shows the average cost of auto insurance increasing as more people commute for 45 minutes or more. The scatter in the plot, however, indicates that the relationship is not strong, given the low confidence.
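A fitted line plot of this kind boils down to a least squares line plus a measure of fit. The sketch below shows the arithmetic with made-up numbers; the variable names and values are hypothetical, and the slides do not state which statistics package drew the plot.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares fit y = a + b*x; returns (a, b, r_squared)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx                        # slope
    a = my - b * mx                      # intercept
    r_squared = sxy ** 2 / (sxx * syy)   # share of variation explained
    return a, b, r_squared

# Hypothetical data: long-commute index vs. average insurance cost.
commuters = [1, 2, 3, 4, 5]
cost = [1000, 1400, 1100, 1600, 1300]
a, b, r2 = fit_line(commuters, cost)
print(f"cost = {a:.0f} + {b:.0f} * commuters, R-squared = {r2:.2f}")
```

A positive slope together with a low R-squared is the pattern described above: cost rises with commuting on average, but the points are widely scattered around the line.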
Part 2: Is there a relation to state commute to work and fatal crashes?
These 2 graphs show a somewhat strong correlation. States where more people commute to work for 45 minutes or longer have more fatal crashes.
Query 1: Conclusion
Part 1: The average cost of insurance increases as commute times increase, but the correlation is weak. This may be because drivers have to disclose their monthly mileage when they apply for auto insurance, so it stands to reason that those with more mileage will pay more for coverage.
Part 2: There is a somewhat strong correlation between the fatal crash rate and the number of people who commute to work for 45 mins or more. However, fatal crashes do not have much direct effect on the average cost of insurance.
Query 2
State’s Average Car Insurance VS Crime Rate
Part 1: Which states have the highest crime rates, and how does that affect their car insurance?
Florida
Texas
California
This was anticipated because these outlier states are the most populous, which explains why more crimes are reported there.
Part 1: Which states have the highest crime rates, and how does that affect their car insurance?
Florida= 2947
California= 1752
Texas= 1752
According to the correlation, the cost of car insurance does not depend on the number of crimes reported in the state.
Even though these three states have the highest crime rates, their car insurance is still approximately average. This indicates that there is no direct correlation between the number of crimes and the cost of car insurance.
Part 1: Which states have the highest crime rates, and how does that affect their car insurance?
Conditional Probability: I noticed that lower reported crime and a lower crime index result in a decrease in fatal accidents, teen drivers, commuting, and uninsured health coverage, and an increase in alcohol consumption.
Even though the crime index doesn't directly affect average car insurance, it does have a pretty strong impact on other factors that may indirectly affect average car insurance.
Adjusting the data to an index
Part 2: Is there a relation between state crime rate and race?
Part 2: Is there a relation between state crime rate and race? Does this correlation cause increased car insurance?
Indexing the data shows that the high recorded crime counts are most likely due to the large populations of those states. With the index, the data is adjusted for population.
Individually observing the two variables
Part 3: Is there a relation to state commute to work and fatal crashes?
Part 3: Is there a relation to state crime rate and alcohol consumption? Does this correlation cause increased car insurance?
There is a downward slope between alcohol consumption and crime index. I expected the complete opposite. However, the fitted line plot does not have much confidence to it.
Query 2: Conclusion
Part 1: Crime rate in general does not have much direct effect on the average car insurance.
Part 2: The conclusion from Part 2, based on the correlation plot, is that in this data the Asian population is far less associated with crime than the Black and White populations, which make up the majority of the United States.
Part 3: Alcohol surprisingly had a negative correlation with crime rate. This was concluded by comparing each state's alcohol consumption rate with its crime rate.
Query 3
State’s Average Car Insurance VS Fatal Accidents
Part 1: Which states have the most teen drivers, and how does that affect their car insurance?
Outliers: Texas and California. Instead of taking them out, I made an index of teen drivers.
Texas
California
Part 1: Which states have the most teen drivers, and how does that affect their car insurance?
I made the index by dividing the number of teen drivers by the population of each state, because the raw counts were biased: the outliers were the states with the highest populations.
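The indexing step can be sketched as a simple per-capita calculation. The state labels and numbers below are hypothetical placeholders, not the project data.

```python
def per_capita_index(counts, populations):
    """Divide each state's raw count by its population."""
    return {state: counts[state] / populations[state] for state in counts}

# Hypothetical values: a large state and a small state.
teen_drivers = {"State A": 900_000, "State B": 30_000}
population = {"State A": 30_000_000, "State B": 600_000}

index = per_capita_index(teen_drivers, population)
for state, value in index.items():
    print(f"{state}: {value:.3f} teen drivers per resident")
```

In this made-up example the large state has far more teen drivers in raw numbers, but the small state has the higher index, which is why the index is less biased toward populous states.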
Part 1: Which states have the most teen drivers, and how does that affect their car insurance?
We see a negative correlation between the average price of car insurance and the teen driver index. I believe this goes back to lower average income resulting in lower car insurance costs, because lower-income drivers tend to choose no-fault insurance, which is cheaper than full coverage.
Part 2: Is there a relation to state fatalities and teen drivers? Does this correlation cause increased car insurance?
After adjusting the data, we see a negative slope: the more teen drivers, the fewer fatal accidents.
Query 3: Conclusion
Part 1: Found a negative correlation between teen drivers and car insurance. I believe this goes back to lower average income resulting in lower car insurance costs, because lower-income drivers tend to choose no-fault insurance, which is cheaper than full coverage.
Part 2: Found a negative correlation between teen drivers and fatal crashes. The more teen drivers, the fewer fatal accidents.
Query 4
Part 1: Which states have the most higher education, and how does that affect their car insurance?
Massachusetts has the largest Advanced degree population, followed by Maryland.
States car insurance cost in relation to education
Mean of Advanced degree = 0.09794
Median of Advanced degree = 0.092
Range of Advanced degree = 0.103
Standard Deviation of Advanced degree = 0.0248394
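Summary statistics like these can be reproduced with a few standard-library calls. The sketch below uses a short made-up list of advanced-degree shares (the 50 state values are not included in the slides) and assumes the sample standard deviation is the one reported.

```python
from statistics import mean, median, stdev

def summarize(values):
    """Return the four summary statistics listed on the slide."""
    return {
        "mean": mean(values),
        "median": median(values),
        "range": max(values) - min(values),
        "stdev": stdev(values),  # sample standard deviation (n - 1)
    }

# Hypothetical advanced-degree shares for five states.
shares = [0.07, 0.08, 0.09, 0.10, 0.16]
summary = summarize(shares)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

A mean that sits above the median, as in the reported figures, usually points to a few states with unusually high values pulling the mean upward.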
Part 1: Which states have the most higher education, and how does that affect their car insurance?
Comparing Advanced degree to Max Insurance vs Avg Insurance
Comparing the fitted linear equations: the line for max insurance starts at 2311, compared with where the line for avg insurance starts.
Part 2: Is there a relation between state higher education and the max amount spent on car insurance? Does this correlation cause an increase in average car insurance?
Advanced and Bachelor degrees increase the average cost of insurance the most.
There is no correlation between advanced degrees and commute to work. However, higher education was associated with less uninsured health coverage.
States car insurance cost in relation to education
Part 3: Is there a relation between state commute to work and higher education? Does this correlation cause increased car insurance?
Query 4: Conclusion
Part 1: Massachusetts has the largest Advanced degree population, followed by Maryland.
Part 2: Advanced and Bachelor degrees increase the average cost of insurance the most.
Part 3: There is no correlation between advanced degrees and commute to work. However, higher education was associated with less uninsured health coverage.
With a sample of 50 states, is there strong enough evidence to conclude that the true average number of bachelor's degrees exceeds the number of Advanced degrees by 15% at a significance level of .05?
Hypothesis 1
Bachelor to Advanced: P-value = .001 < .05, so we reject the claim that the number of bachelor's degrees is at least 15% higher than the number of advanced degrees.
With a sample of 50 states, is there strong enough evidence to conclude that the true average number of people who commute 45 mins or more exceeds people who commute 4 mins or less by 25% at a significance level of .05?
Commute 45 mins or more vs. commute 4 mins or less: P-value = 1.00 > .05, so we fail to reject the claim that the number of people commuting 45 mins or more is at least 25% higher than the number commuting 4 mins or less.
Hypothesis 2
Hypothesis 3
Insight 1
States with more teenage drivers have fewer fatal crashes. The fitted line plot is consistent, and the data points lie close to one another with very little scatter.
Insight 1
Another interesting finding was the downward slope between alcohol consumption and crime index. I expected the complete opposite. The fitted line plot does not have much confidence to it, but it is still a negative correlation.
Insight 2
Appendix 3
Baltej
Outliers
This is a boxplot of the average cost of car insurance for all the states. The boxplot shows some extreme outliers, and I think that in most cases removing them would give a more accurate mean and more accurate results.
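Boxplots usually flag outliers with the 1.5 x IQR rule. Here is a small sketch of that rule with made-up costs; the values are hypothetical, and statistics packages compute quartiles in slightly different ways, so the exact cutoffs may differ from the ones behind the slide.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)  # quartiles (default 'exclusive' method)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical average insurance costs for eight states.
costs = [1200, 1300, 1350, 1400, 1450, 1500, 1600, 3800]
print(iqr_outliers(costs))
```

With these numbers only the 3800 value falls outside the fences, which is the kind of point a boxplot would draw as an extreme outlier.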
Is there a normal distribution of the avg car insurance across the United States?
With outliers
Without outliers
Conclusion
With the outliers, the distribution was skewed to the left. Since most of these are special cases that don't fit the majority, I removed them to find a better avg for the country as a whole.
With the outliers removed, the distribution of the data is normal, as shown by the histogram with a line of best fit on the previous slide.
Topic 1
Is the average cost of car insurance a normal distribution?
Is there a normal distribution of snowfall, and rainfall across the United States?
Is there a correlation between the amount of snowfall, or rainfall, and the cost of car insurance?
Rainfall and Snowfall Histograms
From the histograms in the previous slides we can tell that rainfall is skewed to the right and snowfall is skewed to the left. Both histograms also have 2 peaks.
This means that in America as a whole there is a higher amount of rainfall than snowfall. We are a more tropical country in general.
The histogram for rainfall more closely resembles that of the avg car insurance in America, leading me to believe they are more closely correlated.
Rainfall, Snowfall, and Car Insurance Probability Plot
Conclusion
From the probability plots of average car insurance cost, snowfall, and rainfall we can conclude that only average car insurance has a normal distribution. Rainfall is closer to a normal distribution than snowfall, but neither is fully normal.
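One way to put a number on what a probability plot shows is to correlate the sorted data with the matching normal quantiles: the closer the points are to a straight line, the closer the correlation is to 1. The sketch below is an illustration with made-up data; the plotting positions (i - 0.5)/n are one common convention and may not match the software used for the slides.

```python
from statistics import NormalDist, mean

def normal_plot_correlation(values):
    """Correlation between sorted data and standard normal quantiles."""
    n = len(values)
    ordered = sorted(values)
    # Theoretical quantiles at plotting positions (i - 0.5) / n.
    theo = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    mt, mv = mean(theo), mean(ordered)
    sxy = sum((t - mt) * (v - mv) for t, v in zip(theo, ordered))
    sxx = sum((t - mt) ** 2 for t in theo)
    syy = sum((v - mv) ** 2 for v in ordered)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical samples: one roughly symmetric, one strongly skewed.
symmetric = [8, 9, 10, 10, 11, 12]
skewed = [1, 1, 1, 2, 2, 30]
r_symmetric = normal_plot_correlation(symmetric)
r_skewed = normal_plot_correlation(skewed)
print(f"symmetric: {r_symmetric:.3f}, skewed: {r_skewed:.3f}")
```

The symmetric sample scores close to 1 and the skewed sample scores noticeably lower, mirroring how a skewed variable bends away from the straight line on a probability plot.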
Correlation between car insurance and Rainfall
There is a negative correlation between rainfall and the average cost of car insurance. Surprisingly, on average, the higher the amount of rainfall, the lower the cost of car insurance.
Correlation between car insurance and Snowfall
There is also a negative correlation between the cost of car insurance and snowfall. Surprisingly, just like rainfall, the more it snows in a state, the cheaper car insurance is predicted to be. Because the x coefficient in the line of best fit equation is larger in magnitude than for rainfall, the fitted line predicts a bigger drop in cost per unit of snowfall; the strength of the correlation itself is measured by r rather than by the slope.
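One caution when comparing the x coefficients of two fitted lines: the slope depends on the units of x, while the correlation coefficient r does not. The sketch below uses made-up snowfall and cost values, and the inches-to-feet conversion is only an assumed example of a unit change.

```python
from statistics import mean

def slope_and_r(x, y):
    """Least squares slope and Pearson correlation of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx, sxy / (sxx * syy) ** 0.5

# Hypothetical data: snowfall in inches vs. average insurance cost.
snow_inches = [10, 20, 30, 40, 50]
cost = [1500, 1450, 1480, 1400, 1390]
b_in, r_in = slope_and_r(snow_inches, cost)

# Same data with snowfall expressed in feet.
snow_feet = [v / 12 for v in snow_inches]
b_ft, r_ft = slope_and_r(snow_feet, cost)

print(f"inches: slope = {b_in:.2f}, r = {r_in:.3f}")
print(f"feet:   slope = {b_ft:.2f}, r = {r_ft:.3f}")
```

The slope becomes 12 times larger when the same snowfall is measured in feet, but r is unchanged, so r (or R-squared) is the safer number for comparing how strongly rainfall and snowfall relate to cost.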
Insight on Snowfall and Rainfall
OpenMarkov shows that as snowfall or rainfall increases, the avg cost of car insurance decreases. This shows that snowfall and rainfall are not major factors in the increase in average car insurance across the country.
Topic 2
Is there a correlation between the total population and the average cost of car insurance?
How does the race population in a state affect the avg cost of car insurance?
Total Population and Avg. Cost of Car Insurance
When every population variable is set to its highest level in an area, the probability for the avg cost of insurance jumps from .2287 to .2917, a significant increase in the chance of paying for car insurance on the higher end.
This fitted line plot also shows a positive correlation between the population and the average price of car insurance.
Race Population and Avg. Cost of Car Insurance
White
Indian
Asian
Other
Black
Insight on race population, and avg. cost of car insurance
In conclusion, the race populations associated with an increase in car insurance were, in order, Asian, Other, White, Black, and Indian. This means that when a state has a large Asian population, the likelihood of having to pay more for car insurance increases more than for any other race population.
Topic 3
Does cost of living affect the avg car insurance around the country?
How does the average cost of car insurance compare around the country? The states that I will test are Ohio, California, and Texas.
Cost of Living index and Avg Car Insurance
There is a positive correlation between the cost of living index, and avg cost of car insurance. The higher the cost of living index the more likely there is to be a high cost for car insurance.
Insight
When the cost of living index was increased to the max, the probability for the avg cost of car insurance showed a significant increase from .2287 to .3326, much higher than the population effect.
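The two jumps can be compared directly as relative increases over the same .2287 baseline. This is only arithmetic on the figures quoted in the slides (.2917 for the population effect and .3326 for the cost of living effect); the function name is made up for the example.

```python
def relative_increase(before, after):
    """Fractional change from `before` to `after`."""
    return (after - before) / before

baseline = 0.2287
population_effect = relative_increase(baseline, 0.2917)
cost_of_living_effect = relative_increase(baseline, 0.3326)

print(f"population:     {population_effect:.1%}")      # about 27.5%
print(f"cost of living: {cost_of_living_effect:.1%}")  # about 45.4%
```

On these figures the cost of living jump is roughly 45% of the baseline against roughly 28% for population, which supports the comparison made above.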
Distribution Plots
Conclusion
States like California, which have a high population and a high cost of living, have car insurance costs higher than 96% of states.
States like Texas, which have a high population but an avg cost of living, are in the 74th percentile.
But states like Ohio, which have a low cost of living and an avg population, pay low amounts for car insurance, more than only 4% of states.
This shows that the cost of living has a bigger effect than the state population.
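Percentiles like these can be read off a normal distribution once its mean and standard deviation are known. The sketch below assumes a normal model for state insurance costs and uses made-up parameters; the slides do not give the actual mean, standard deviation, or state costs used for the distribution plots.

```python
from statistics import NormalDist

def percent_of_states_below(cost, mean_cost, sd_cost):
    """Percent of a normal distribution lying below `cost`."""
    return 100 * NormalDist(mean_cost, sd_cost).cdf(cost)

# Hypothetical national mean and standard deviation (not the project values).
pct = percent_of_states_below(2025, mean_cost=1500, sd_cost=300)
print(f"{pct:.0f}% of states would be expected to pay less")
```

A cost 1.75 standard deviations above the mean lands at about the 96th percentile under this model, the same kind of reading as the figures quoted above.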
Hypothesis 1
Hypothesis: The avg cost of car insurance is lower than Michigan's (3785), because Michigan's is very high.
Significance Level: .01
By Hand
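A hand calculation for this kind of hypothesis typically computes a one-sample t statistic against the Michigan value of 3785 and compares it with a critical value. The sketch below shows that arithmetic on a made-up five-state sample; the sample values are hypothetical, and the choice of a left-tailed one-sample t test is an assumption since the working itself is not in the text.

```python
import math
from statistics import mean, stdev

def t_statistic(sample, mu0):
    """One-sample t statistic: (sample mean - mu0) / (s / sqrt(n))."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / math.sqrt(n))

# Hypothetical average costs for five states (not the project data).
costs = [1400, 1500, 1600, 1700, 1800]
t = t_statistic(costs, mu0=3785)

# Left-tailed critical value at .01 with 4 degrees of freedom,
# taken from a standard t table.
T_CRIT_DF4_01 = 3.747
reject_null = t < -T_CRIT_DF4_01
print(f"t = {t:.2f}, reject H0: {reject_null}")
```

With all 50 states the degrees of freedom would be 49, so the critical value would need to be looked up for that row of the table instead.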
Hypothesis 2
Hypothesis: States with a cost of living higher than 100% will have a 5% higher avg car insurance cost than states that are under 100%
Significance Level: .01
The P-value is .146 > .01, so the null hypothesis is not rejected. The result is consistent with states that have a cost of living index above 100% having avg car insurance costs more than 5% above the overall average.