Adult Census Income Prediction

Objective:

To develop an annual income prediction model. It lets users predict their income based on their own qualities and attributes.

Benefits:

    • To discover one’s earning potential.

    • To identify the key factors that affect one’s salary.

    • To get salary insights for specific job roles.

Architecture

Data Preprocessing

    • Most of the time, the data available to us is raw and not in the desired format. We need to transform it into a format that can be passed to the model. Handling of null values and punctuation, and conversion of the data types to the correct format, are done during the preprocessing stage.

    • Once the above step is complete, we do feature engineering. Here we derive new features from the existing ones based on our requirements or domain knowledge.

    • The categorical features are encoded.

    • The numerical features are scaled.

    • Finally, we select the most relevant features for building the model and drop the irrelevant ones.
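
The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the column names, the "?" missing-value marker, the derived `capital_net` feature, and the tiny in-memory sample are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Tiny illustrative sample; the column names are assumptions, not the project's schema.
raw = pd.DataFrame({
    "age": [39, 50, 38, 53],
    "hours_per_week": [40, 13, 40, 60],
    "workclass": [" State-gov", " ?", " Private", " Private"],
    "capital_gain": [2174, 0, 0, 0],
    "capital_loss": [0, 0, 0, 1902],
})

# Features kept for modelling (assumed selection); everything else is dropped.
NUMERIC = ["age", "hours_per_week", "capital_net"]
CATEGORICAL = ["workclass"]


def clean(df):
    """Handle missing markers, engineer one feature, and keep the selected columns."""
    df = df.copy()
    for col in CATEGORICAL:
        # Strip stray whitespace and treat "?" as a null value (assumed marker).
        values = df[col].str.strip().replace("?", np.nan)
        # Fill nulls with the most frequent category.
        df[col] = values.fillna(values.mode()[0])
    # Feature engineering: one derived feature from two existing ones (hypothetical).
    df["capital_net"] = df["capital_gain"] - df["capital_loss"]
    return df[NUMERIC + CATEGORICAL]


# Scale the numeric columns and one-hot encode the categorical ones.
preprocessor = ColumnTransformer(
    [
        ("num", StandardScaler(), NUMERIC),
        ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
    ],
    sparse_threshold=0,  # always return a dense array
)

features = preprocessor.fit_transform(clean(raw))
print(features.shape)
```

Keeping the scaler and encoder inside a single fitted `ColumnTransformer` means the same object can later be reused on user-supplied data.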

Model Building

    • After preprocessing is complete, the dataset is used to train various models. Initially, all the models are trained with default parameters and their AUC scores are checked.

    • Next, we tune the hyperparameters. For each model, a grid of parameters is passed to GridSearchCV, which finds the most suitable parameters for that model based on its cross-validated score.

    • All the models are then trained with their best parameters and their F1 scores are evaluated.

    • The best model is chosen based on its F1 score and AUC score.
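
A rough sketch of this workflow is shown below. The candidate models, parameter grids, synthetic data, `roc_auc` scoring inside the grid search, and the rule of ranking by F1 first and AUC second are all assumptions chosen for illustration; the original does not list them.

```python
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the preprocessed census features and income labels.
X, y = make_classification(n_samples=400, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical candidates; each maps to (estimator, parameter grid).
candidates = {
    "logreg": (LogisticRegression(max_iter=1000), {"C": [0.1, 1.0, 10.0]}),
    "forest": (
        RandomForestClassifier(n_estimators=50, random_state=0),
        {"max_depth": [3, 6, None]},
    ),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # Step 1: baseline fit without tuning, scored by AUC.
    baseline = clone(estimator).fit(X_train, y_train)
    baseline_auc = roc_auc_score(y_test, baseline.predict_proba(X_test)[:, 1])

    # Step 2: cross-validated grid search over the parameter grid.
    search = GridSearchCV(estimator, grid, scoring="roc_auc", cv=3)
    search.fit(X_train, y_train)

    # Step 3: evaluate the refitted best estimator on held-out data.
    tuned = search.best_estimator_
    results[name] = {
        "baseline_auc": baseline_auc,
        "best_params": search.best_params_,
        "tuned_auc": roc_auc_score(y_test, tuned.predict_proba(X_test)[:, 1]),
        "f1": f1_score(y_test, tuned.predict(X_test)),
    }

# Step 4: rank by F1, breaking ties with AUC (assumed selection rule).
best_name = max(results, key=lambda n: (results[n]["f1"], results[n]["tuned_auc"]))
print(best_name, results[best_name])
```

`GridSearchCV` refits the best parameter combination on the full training split by default, so `best_estimator_` can be evaluated directly.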

Cloud Setup

    • In this step we build the web API.

    • The database and the logging are set up.

    • The web API is built with the Flask web framework, HTML, and CSS. For the database we use Astra DB.

    • Logging helps us record the details of every action our API performs, including errors and informational messages.
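
As an illustration of the Flask and logging setup described above, a minimal sketch might look like this. The `create_app` factory, the `income_api` logger name, the `/health` route, and the log file path are hypothetical names chosen for the example; the database connection is omitted.

```python
import logging

from flask import Flask, jsonify, request
from werkzeug.exceptions import HTTPException


def create_app(log_path="app.log"):
    """Build a Flask app that logs every request and every unhandled error."""
    logger = logging.getLogger("income_api")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers if called twice
    handler = logging.FileHandler(log_path)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)

    app = Flask(__name__)

    @app.before_request
    def log_request():
        # Informational message for every action the API performs.
        logger.info("%s %s", request.method, request.path)

    @app.errorhandler(Exception)
    def log_error(exc):
        # Let ordinary HTTP errors (404, 405, ...) pass through unchanged.
        if isinstance(exc, HTTPException):
            return exc
        # Record the full traceback for unexpected failures.
        logger.exception("Unhandled error on %s", request.path)
        return jsonify(error="internal server error"), 500

    @app.route("/health")
    def health():
        # Hypothetical endpoint used here only to exercise the logging.
        return jsonify(status="ok")

    return app
```

For local testing, the app could be started with `create_app().run()`; each request then appends a timestamped line to the log file.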

Deployment

    • Once our API is built and tested locally, we move on to deploying it on a server. This allows users to access the income prediction model.

    • We use the Heroku platform to deploy our API. Certain steps must be completed before deployment so that the model works smoothly for all users.

User Input

After the web API is deployed to Heroku, it is accessible to users. Whenever a user accesses the API, they are asked to provide certain information, which is used to predict their annual income category. The data provided by users is stored in a database.
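
Collecting the user's information could be sketched along these lines. The field names, the `UserInput` type, the `parse_form` helper, and the age range check are illustrative assumptions; the original does not specify which fields the form asks for.

```python
from dataclasses import dataclass

# Hypothetical subset of form fields; the real form may ask for more.
REQUIRED_FIELDS = ("age", "hours_per_week", "workclass")


@dataclass
class UserInput:
    age: int
    hours_per_week: int
    workclass: str


def parse_form(form):
    """Convert raw form strings into a typed UserInput, raising ValueError on bad input."""
    missing = [name for name in REQUIRED_FIELDS if not str(form.get(name, "")).strip()]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    try:
        age = int(form["age"])
        hours = int(form["hours_per_week"])
    except ValueError as exc:
        raise ValueError("age and hours_per_week must be whole numbers") from exc
    if not 0 < age < 120:  # illustrative sanity bound, not from the source
        raise ValueError("age is out of range")
    return UserInput(age=age, hours_per_week=hours, workclass=str(form["workclass"]).strip())


record = parse_form({"age": "39", "hours_per_week": "40", "workclass": " Private "})
print(record)
```

Validating before storing keeps malformed submissions out of the database and gives the user an immediate, specific error message.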

Preprocessing of User Data

Once the user provides their data, the API collects it and preprocesses it according to the model's requirements.
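
One way to do this is to reuse the transformer that was fitted on the training data, so a single user record is encoded and scaled exactly as the model expects. The sketch below assumes hypothetical column names and a small stand-in training frame; `preprocess_user_record` is an illustrative helper, not part of the original project.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

NUMERIC = ["age", "hours_per_week"]  # assumed column names
CATEGORICAL = ["workclass"]

# Stand-in for the training data the transformer was originally fitted on.
training = pd.DataFrame({
    "age": [39, 50, 38, 53],
    "hours_per_week": [40, 13, 40, 60],
    "workclass": ["Private", "State-gov", "Private", "Self-emp"],
})

preprocessor = ColumnTransformer(
    [
        ("num", StandardScaler(), NUMERIC),
        ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
    ],
    sparse_threshold=0,  # always return a dense array
).fit(training)


def preprocess_user_record(record):
    """Turn one user record (a dict) into a model-ready feature row."""
    # Build a one-row frame with the same column order used during training.
    frame = pd.DataFrame([record], columns=NUMERIC + CATEGORICAL)
    # transform (not fit_transform): reuse the statistics learned from training data.
    return preprocessor.transform(frame)


row = preprocess_user_record({"age": 45, "hours_per_week": 50, "workclass": "Never-seen"})
print(row.shape)
```

With `handle_unknown="ignore"`, a category that never appeared in training is encoded as all zeros instead of raising an error at prediction time.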

Loading the Model, Prediction and Saving to the Database

    • The pretrained model is loaded in the backend, ready to predict the user’s annual income category.

    • The user’s preprocessed data is passed to the model, which predicts the user’s income category.

    • The model’s output is stored in the database.
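
The three steps above might be sketched as follows. Several things here are assumptions made so the example runs on its own: the model is a small stand-in trained on synthetic data and saved with `pickle`, the category labels `<=50K` and `>50K` are assumed, and an in-memory SQLite table stands in for Astra DB, whose client and schema are not described in the original.

```python
import json
import os
import pickle
import sqlite3
import tempfile

import numpy as np
from sklearn.linear_model import LogisticRegression

LABELS = {0: "<=50K", 1: ">50K"}  # assumed income categories

# Stand-in for the training stage: fit a small model and save it to disk.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
model_path = os.path.join(tempfile.mkdtemp(), "model.pkl")  # hypothetical path
with open(model_path, "wb") as fh:
    pickle.dump(LogisticRegression().fit(X, y), fh)


def load_model(path):
    """Load the pretrained model from disk."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def predict_and_store(model, features, conn):
    """Predict the income category for one feature row and save the result."""
    label = LABELS[int(model.predict([features])[0])]
    conn.execute(
        "INSERT INTO predictions (features, label) VALUES (?, ?)",
        (json.dumps(list(features)), label),
    )
    conn.commit()
    return label


# SQLite is used only as a local stand-in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE predictions (id INTEGER PRIMARY KEY, features TEXT, label TEXT)")

model = load_model(model_path)
label = predict_and_store(model, [2.0, 2.0, 0.0], conn)
print(label)
```

Loading the model once at startup, instead of on every request, keeps prediction latency low; only pickle files from a trusted source should be loaded this way.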

Thank You