Auto-Pedestrian Crashes
Data Analysis
By: Savani Vaidya
The Dataset
Research Questions
**Broader implications will be discussed later
Data Cleaning
This bubble graph shows the frequency of crashes by weather condition.
Disclaimer: A significant share of the entries were recorded as unknown, so we excluded those records.
It is commonly assumed that most crashes happen in poor weather (rain or snow). In this data, however, most crashes occurred under clear skies.
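The filtering and counting step behind a chart like this can be sketched as follows. This is a minimal illustration in Python, not the original analysis code: the column name `weather`, the set of labels treated as unknown, and the sample rows are all assumptions made for the example.

```python
from collections import Counter

# Assumed labels for "unknown" entries; the real dataset's coding may differ.
UNKNOWN_LABELS = {"", "unknown"}

def weather_counts(records, column="weather"):
    """Count crashes per weather condition, skipping unknown entries."""
    counts = Counter()
    for record in records:
        value = (record.get(column) or "").strip()
        if value.lower() in UNKNOWN_LABELS:
            continue  # excluded, as described in the disclaimer
        counts[value] += 1
    return counts

# Made-up rows for illustration only; not taken from the dataset.
sample = [
    {"weather": "Clear"},
    {"weather": "Rain"},
    {"weather": "Unknown"},
    {"weather": "Clear"},
    {"weather": ""},
]

for condition, n in weather_counts(sample).most_common():
    print(condition, n)
```

Reporting how many rows were dropped as unknown alongside the chart would let readers judge how much the exclusion could affect the comparison.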
This bar graph shows the frequency of crashes based on the day of the month.
One of our initial visualizations.
Nothing special can be interpreted from this graph.
Generally, the number of crashes was about the same on every day of the month, with little variability.
This bar graph shows the frequency of crashes based on the day of the week.
Again, nothing too special can be interpreted from this graph.
One plausible reading is that traffic is heaviest on Friday and lightest on Sunday. The other days of the week show similar patterns.
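A day-of-week tally like the one plotted can be derived from crash dates. The sketch below is in Python and assumes the dates are available as ISO strings (`YYYY-MM-DD`); the function name and the sample dates are hypothetical.

```python
from collections import Counter
from datetime import date

DAY_ORDER = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def crashes_by_weekday(iso_dates):
    """Return (day name, crash count) pairs in Monday-to-Sunday order."""
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    counts = Counter(DAY_ORDER[date.fromisoformat(d).weekday()]
                     for d in iso_dates)
    return [(day, counts.get(day, 0)) for day in DAY_ORDER]

# Made-up dates for illustration only.
sample_dates = ["2021-01-01", "2021-01-03", "2021-01-04", "2021-01-08"]

for day, n in crashes_by_weekday(sample_dates):
    print(f"{day:<9} {n}")
```

Keeping the days in calendar order, rather than sorting by count, makes it easier to compare weekdays with the weekend at a glance.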
This bar graph shows the frequency of crashes based on lighting conditions.
Although it may be assumed that most crashes occur when it is dark, most actually happen in broad daylight.
This is likely because traffic is heaviest during the day.
This bar graph shows the frequency of crashes for each of the 5 injury types, split into hit-and-run and non-hit-and-run cases.
There are clearly more non-hit-and-run cases than hit-and-run cases.
The most common outcome was possible injury. The distribution resembles a Normal distribution, but it is weighted toward the less severe injuries. More crashes happened at lower speed limits, which may help explain why more crashes resulted in less severe injuries.
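The counts behind a grouped bar chart like this amount to a two-way table of injury type against hit-and-run status. The Python sketch below shows one way to build it; the field names `injury` and `hit_and_run` and the injury labels in the sample are assumptions, not the dataset's actual coding.

```python
from collections import Counter

def injury_by_hit_and_run(records):
    """Count crashes for each (injury type, hit-and-run flag) pair."""
    return Counter((r["injury"], bool(r["hit_and_run"])) for r in records)

# Made-up rows with hypothetical injury labels, for illustration only.
sample = [
    {"injury": "Possible injury", "hit_and_run": False},
    {"injury": "Possible injury", "hit_and_run": True},
    {"injury": "Possible injury", "hit_and_run": False},
    {"injury": "No injury", "hit_and_run": False},
    {"injury": "Fatal", "hit_and_run": True},
]

table = injury_by_hit_and_run(sample)
for (injury, is_hit_and_run), n in sorted(table.items()):
    label = "hit-and-run" if is_hit_and_run else "not hit-and-run"
    print(f"{injury:<16} {label:<16} {n}")
```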
This bar graph shows the frequency of crashes based on the speed limit.
The most crashes happened where the speed limit was 25. Generally, crashes occurred at lower speed limits.
Pedestrians are more likely to be in areas with lower speed limits. Further, speed limits of 25 are typically found in residential areas.
Amount of Crashes Based on Time of Day (cont.)
The pie charts on the left show how many crashes at intersections were hit-and-runs.
Although most crashes were not hit-and-runs, the percentage of hit-and-runs is higher among crashes that do not happen at an intersection.
Specifically, 36.5% of non-intersection crashes were hit-and-runs, compared with 35.5% of intersection crashes.
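The percentages compared here use the crashes in each location group as the denominator. A minimal Python sketch of that calculation follows; the field names `at_intersection` and `hit_and_run` and the sample rows are assumptions made for illustration.

```python
from collections import defaultdict

def hit_and_run_share(records):
    """Return the percentage of hit-and-runs within each location group."""
    totals = defaultdict(int)
    hit_runs = defaultdict(int)
    for r in records:
        group = "intersection" if r["at_intersection"] else "non-intersection"
        totals[group] += 1          # denominator: all crashes in the group
        if r["hit_and_run"]:
            hit_runs[group] += 1    # numerator: hit-and-runs in the group
    return {g: 100.0 * hit_runs[g] / totals[g] for g in totals}

# Made-up rows for illustration only; not taken from the dataset.
sample = [
    {"at_intersection": True, "hit_and_run": True},
    {"at_intersection": True, "hit_and_run": False},
    {"at_intersection": True, "hit_and_run": False},
    {"at_intersection": True, "hit_and_run": False},
    {"at_intersection": False, "hit_and_run": True},
    {"at_intersection": False, "hit_and_run": False},
]

for group, pct in hit_and_run_share(sample).items():
    print(f"{group}: {pct:.1f}% hit-and-run")
```

Dividing by the group total, rather than by the total number of hit-and-runs, is what makes the two percentages comparable even when the groups differ in size.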
This bubble graph shows crash frequency by city or township.
The largest bubble is Detroit, and it is far larger than the bubbles for the other cities and townships.
The data suggest that many auto-pedestrian crashes happen in Detroit, likely because it is a bustling city.
All R Code
Significance and Broader Implications