1 of 17

Team: E-Z Coders

Daniel, Darren, Dwayne, Jessica, and Uzma

2 of 17

Overview

One in five U.S. adults live with a mental illness

Any Mental Illness (AMI) and Serious Mental Illness (SMI)

Educational attainment, median household income, and AMI cases

According to the National Institute of Mental Health,

"Nearly one in five U.S. adults live with a mental illness."

Two broad categories can be used to describe mental illnesses: Any Mental Illness (AMI) and Serious Mental Illness (SMI). AMI encompasses all recognized mental illnesses, while SMI is defined as one or more mental, behavioral, or emotional disorders resulting in serious functional impairment. This project focuses only on AMI cases.
The purpose of this group project is to determine whether the number of AMI cases across the United States rose during the COVID-19 pandemic compared with the previous three years.
Socioeconomic factors such as educational attainment and median household income will also be analyzed to see how they relate to AMI cases.

3 of 17

Questions

1. What impact did COVID-19 have on any mental illness (AMI) in adults?

2. Was every state/region affected the same way, or were some states affected more than others?

3. How accurately will our ML model predict mental health issues across states/regions based on features?

4 of 17

5 of 17

6 of 17

Primary Data Search

Finding a suitable variable for analysis
Recognizing and adapting to limitations of data as a byproduct of COVID-19
Initial analysis focused on four regions and three age groups
Shifting from prevalence estimates to raw totals in hundreds of thousands for AMI

7 of 17

Data Transformation

How best to organize the data for use by the ML model
Working from and modifying a high-level summary table to avoid wide data
Additional years
Organizing by Year rather than by State
Avoiding compromising analysis by expanding search
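
The wide-to-long reshaping described on this slide could look roughly like the sketch below. It assumes pandas; the column names (`state`, `ami_cases`) and all values are hypothetical placeholders, not the project's actual table.

```python
import pandas as pd

# Hypothetical wide summary table: one row per state, one column per year.
# Values are illustrative placeholders, not the project's data.
wide = pd.DataFrame({
    "state": ["Alabama", "Alaska"],
    "2018": [10.0, 2.0],
    "2019": [11.0, 2.5],
    "2020": [12.0, 3.0],
})

# Reshape so each row is one (state, year) observation instead of
# one column per year.
long = wide.melt(id_vars="state", var_name="year", value_name="ami_cases")
long["year"] = long["year"].astype(int)

# Organize by year first, then by state.
long = long.sort_values(["year", "state"]).reset_index(drop=True)
print(long)
```

With this layout, adding another year means adding rows rather than another column, which keeps the feature set stable for the model.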

8 of 17

Database Process

PostgreSQL as our database to store the final AMI and socioeconomic tables
SQLAlchemy facilitated the connection for loading dataframes from the Jupyter Notebook file into the local PostgreSQL server
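
A minimal sketch of writing a dataframe to a database through SQLAlchemy is shown below. The helper `save_table`, the table name, and the sample row are hypothetical. The demo uses an in-memory SQLite URL so it runs without a server; for a local PostgreSQL server the URL would take the form `postgresql://user:password@localhost:5432/dbname`, with placeholder credentials and database name.

```python
import pandas as pd
from sqlalchemy import create_engine


def save_table(df, table_name, db_url):
    """Write df to table_name at db_url and read it back (hypothetical helper)."""
    engine = create_engine(db_url)
    # Replace the table if it already exists; do not store the index column.
    df.to_sql(table_name, engine, if_exists="replace", index=False)
    return pd.read_sql(f"SELECT * FROM {table_name}", engine)


# Placeholder row, not the project's data.
sample = pd.DataFrame({"state": ["Alabama"], "year": [2020], "ami_cases": [1.0]})

# Assumption: SQLite in memory stands in for the local PostgreSQL server.
round_trip = save_table(sample, "ami", "sqlite:///:memory:")
print(round_trip)
```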

9 of 17

MACHINE LEARNING MODEL SELECTION

EDA was performed to get insight into the dataset.

Supervised regression, starting with linear regression, was chosen because our output feature is a numeric value rather than a class label

10 of 17

FEATURE ENGINEERING & SELECTION

11 of 17

Find the Best Performing Model using GridSearchCV

Linear Regression

DecisionTreeRegressor

GradientBoostingRegressor
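
A comparison like this can be sketched with scikit-learn as follows. The synthetic data and the parameter grids are assumptions made for illustration; they are not the grids or the dataset used in the project.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption): 200 rows, 3 numeric features.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Each candidate model paired with an illustrative parameter grid.
candidates = {
    "LinearRegression": (
        LinearRegression(),
        {"fit_intercept": [True, False]},
    ),
    "DecisionTreeRegressor": (
        DecisionTreeRegressor(random_state=42),
        {"max_depth": [2, 4, 8]},
    ),
    "GradientBoostingRegressor": (
        GradientBoostingRegressor(random_state=42),
        {"n_estimators": [50, 100], "learning_rate": [0.05, 0.1]},
    ),
}

# Run a 5-fold grid search per model and record the best mean R^2.
results = {}
for name, (model, grid) in candidates.items():
    search = GridSearchCV(model, grid, scoring="r2", cv=5)
    search.fit(X, y)
    results[name] = (search.best_score_, search.best_params_)

best_name = max(results, key=lambda n: results[n][0])
print(best_name, results[best_name])
```

Which model wins depends entirely on the data; on this synthetic example the ranking says nothing about the AMI dataset.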

12 of 17

REGRESSION MODEL TESTING

Model Name	Configuration	R^2 Score
Multivariate Linear Regression	X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.25, random_state=42)	Testing Score: 0.89
Gradient Boosting Regressor	gbr_params = {'n_estimators': 1000, 'max_depth': 3, 'min_samples_split': 5, 'learning_rate': 0.01, 'loss': 'ls'}	Testing Score: 0.95
Ridge Regression	Ridge_regressor = GridSearchCV(ridge, parameters, scoring='r2', cv=5)	Testing Score: 0.49
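
For reference, the split and the gradient boosting parameters from the table can be combined into a test-set R^2 roughly as sketched below. The data here is synthetic (an assumption), so the resulting score is not comparable to the values in the table. The `'loss': 'ls'` entry is left out because least squares is the default loss, and newer scikit-learn releases spell it `'squared_error'`.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption), not the project's AMI table.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = X[:, 0] ** 2 + 2.0 * X[:, 1] + rng.normal(scale=0.1, size=300)

# Same split settings as in the table: 25% held out, fixed seed.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Hyperparameters from the table; loss is left at its default
# (least squares).
gbr_params = {
    "n_estimators": 1000,
    "max_depth": 3,
    "min_samples_split": 5,
    "learning_rate": 0.01,
}
gbr = GradientBoostingRegressor(**gbr_params, random_state=42)
gbr.fit(X_train, y_train)

# R^2 on the held-out test set.
test_r2 = r2_score(y_test, gbr.predict(X_test))
print(f"Testing Score: {test_r2:.2f}")
```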

13 of 17

Gradient Boosting Regressor Predictions vs Actual Values

Predictions vs Actual Data

14 of 17

Interactive Tableau Dashboard

Link to Tableau Public Storyboard:

15 of 17

Conclusion

16 of 17

Find the total cases for different types of mental illness at the county level within each state

Collect more data on COVID-19 cases after two years

Population and geographic factors could be included

17 of 17

Things We Would Have Done Differently

Add more mental illnesses to compare how specific types increased under the impact of COVID-19
Go back as many as five more years to get AMI cases
Create a web app to deploy our machine learning model for end users to predict the outcomes.