BigPharmaML

Final Project

Description

We used two independent tables: community pharmacy locations and county-level demographic information for the United States.

In this project, we tried to use data mining and a few machine learning algorithms to predict the number of pharmacies in a county from population density and other demographic factors. Both the community pharmacy and demographic datasets were collected at the county level. Linking pharmacies to counties and to minority populations lets us visualize the results. We also examined the correlations between the features in the data.

Motivation

The motivation behind this project is to show how certain features (columns) relate to states and to population density.

These relationships could be used to predict the number of pharmacies required in counties where this information is missing. The prevalence of certain diseases could also be predicted from relevant features such as population density and the number of people in particular age groups.

Dataset

Our dataset includes the initial demographic and pharmacy location information, plus the county-level pharmacy data we generated from it. The demographic dataset "Demography_USA" is already at the county level. We generated the county-level pharmacy location dataset in the file Pharmacy-County.csv.
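As a rough illustration of how pharmacy locations can be rolled up to the county level and attached to the demographic table, here is a minimal sketch in plain Python. The field names (`state`, `county`, `population`, `pharmacy_count`) and all the rows are hypothetical placeholders, not the actual schema or contents of Demography_USA or Pharmacy-County.csv.

```python
from collections import Counter

# Hypothetical pharmacy records: one row per pharmacy, tagged with its county.
pharmacies = [
    {"name": "Pharmacy 1", "state": "XX", "county": "County A"},
    {"name": "Pharmacy 2", "state": "XX", "county": "County A"},
    {"name": "Pharmacy 3", "state": "XX", "county": "County B"},
]

# Hypothetical county-level demographic rows (toy numbers).
demography = [
    {"state": "XX", "county": "County A", "population": 1000},
    {"state": "XX", "county": "County B", "population": 500},
    {"state": "XX", "county": "County C", "population": 200},
]


def count_pharmacies_by_county(rows):
    """Count pharmacy rows per (state, county) key."""
    return Counter((row["state"], row["county"]) for row in rows)


def join_counts(demography_rows, counts):
    """Attach a pharmacy count to each demographic row.

    Counties absent from the pharmacy data get None (information missing)
    rather than 0, so they can be told apart later.
    """
    return [
        {**row, "pharmacy_count": counts.get((row["state"], row["county"]))}
        for row in demography_rows
    ]


county_counts = count_pharmacies_by_county(pharmacies)
merged = join_counts(demography, county_counts)
```

Keying on the (state, county) pair rather than the county name alone avoids collisions between counties that share a name in different states.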

We needed to carry out a lot of cleaning and sorting for the Pharmacy-County dataset.

These datasets contain information about the counties in each state. This includes the number of people with different illnesses, the number of males and females, and population counts by ethnicity, age group, and so on.

Visualizations - Statewise pharmacy count

Visualizations - Statewise CVD cases

Visualizations - Statewise diabetes cases

Visualizations - Statewise hypertension cases

Visualizations - Percentage of females with diabetes in each state

Visualizations - Percentage of males with diabetes in each state

Visualizations - Correlation matrix between all the features
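A correlation matrix of this kind can be computed as the pairwise Pearson correlation between numeric columns. The sketch below shows one way to do that in plain Python; the column names and values are toy placeholders, and the slides do not state which correlation measure or library was actually used.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return a nested dict: matrix[a][b] is the correlation of columns a and b."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Toy county-level columns (hypothetical names and values).
features = {
    "population_density": [10.0, 20.0, 30.0, 40.0],
    "pharmacy_count": [1.0, 2.0, 3.0, 4.0],
    "median_age": [40.0, 35.0, 38.0, 30.0],
}

matrix = correlation_matrix(features)
```

The result is symmetric with ones on the diagonal, which is what a heatmap of the matrix would display.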

Using ML techniques to predict information about pharmacies
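The slides do not say which models were used, so the following is only an illustrative sketch: an ordinary least-squares line fitted on counties with a known pharmacy count and then applied to a county where the count is missing. The single feature (population density) and all numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predict the target for a single feature value."""
    return slope * x + intercept


# Toy training data: counties where the pharmacy count is known.
known_density = [100.0, 200.0, 300.0]
known_pharmacies = [5.0, 10.0, 15.0]

slope, intercept = fit_line(known_density, known_pharmacies)

# Estimate the count for a hypothetical county with no pharmacy data.
estimated = predict(400.0, slope, intercept)
```

A real model would likely use several demographic features and hold out some counties with known counts to check the predictions.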