1 of 12

DIABETES PREDICTION

USING MACHINE LEARNING

2 of 12

OUR TEAM

Dhananjay Kumar Kushwaha (2001920100093)
Divy Narayan (2001920100096)

3 of 12

AGENDA

Introduction

Objective

Support Vector Machine (SVM)

Workflow

Design & Methodology

Conclusion

References

4 of 12

INTRODUCTION

Diabetes is one of the world's most common chronic metabolic disorders.

Modern eating habits raise the risk of diabetes because of the added sugar and fat in food.

Symptoms are key factors in predicting any disease.

A standard dataset was used for this work, split 75:25 into training and testing data. The results of the existing methods and the proposed method were compared to show that the proposed method achieves high accuracy.

To predict diabetes, our proposed system takes the dataset as input.

The dataset consists of the following features: Pregnancies, Glucose, Blood Pressure, Skin Thickness, Insulin, BMI, Diabetes Pedigree Function, and Age.


5 of 12

OBJECTIVES

Aim of the Project :

To diagnose diabetes at an early stage using machine learning algorithms.

Scope of the Project :
The project aims to produce better results than existing systems, with higher prediction accuracy.
Using machine learning techniques, we can also reduce the time needed to process the data.


6 of 12

SUPPORT VECTOR MACHINE (SVM)


Support-Vector Machines (SVMs, also support-vector networks) are supervised learning models with associated learning algorithms that analyze data for classification and regression analysis.

If the training data is linearly separable, we can select two parallel hyperplanes that separate the two classes of data so that the distance between them is as large as possible. The region bounded by these two hyperplanes is called the margin.
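As a small illustration of the margin idea, the sketch below assumes the two parallel hyperplanes are written as w·x + b = +1 and w·x + b = -1 (a common convention, not stated on the slide), so the distance between them is 2 / ||w||. The weights, bias, and points are made-up toy values, and the function names are hypothetical.

```python
import math

def decision_value(w, b, x):
    """Signed score w·x + b; its sign tells which side of the separator x lies on."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def margin_width(w):
    """Distance between the hyperplanes w·x + b = +1 and w·x + b = -1."""
    return 2.0 / math.sqrt(sum(wj * wj for wj in w))

# Hypothetical separator over two toy features (not fitted to any real data).
w, b = [3.0, 4.0], -1.0

print(margin_width(w))                     # 2 / ||w||, with ||w|| = 5
print(decision_value(w, b, [1.0, 1.0]))    # positive: one class
print(decision_value(w, b, [-1.0, 0.0]))   # negative: the other class
```

A smaller ||w|| gives a wider margin, which is why training an SVM amounts to keeping ||w|| small while still separating the classes.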

7 of 12

WORKFLOW

Diabetes data → Data pre-processing → Train-test split → Support Vector Machine classifier

New data → SVM → Diabetic (or) Non-diabetic prediction
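The last stages of this workflow (fit a classifier, then label new data) can be sketched as follows. This is a minimal pure-Python linear SVM trained by sub-gradient descent on the hinge loss. It is a stand-in chosen to keep the example self-contained, not necessarily the implementation used in the project. The function names, the hyperparameters, and the two-feature toy records are all assumptions for illustration.

```python
import random

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=100, seed=0):
    """Fit weights w and bias b on labels +1 / -1 using the regularised hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            # The L2 regularisation term shrinks w slightly on every step.
            w = [wj * (1.0 - lr * lam) for wj in w]
            # Points inside the margin (or misclassified) pull the separator.
            if y[i] * score < 1:
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b

def predict(w, b, x):
    """Map the sign of the decision value to the two output labels."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return "Diabetic" if score >= 0 else "Non-diabetic"

# Toy, already-scaled records with two made-up features (not real patient data).
X_train = [[2, 2], [3, 3], [2, 3], [3, 2],
           [-2, -2], [-3, -3], [-2, -3], [-3, -2]]
y_train = [1, 1, 1, 1, -1, -1, -1, -1]   # +1 = diabetic, -1 = non-diabetic

w, b = train_linear_svm(X_train, y_train)
print(predict(w, b, [2.5, 2.5]))
print(predict(w, b, [-2.5, -2.5]))
```

In practice a library SVM would normally replace `train_linear_svm`; the shape of the flow (train once, then call a predict step on each new record) stays the same.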

8 of 12

DESIGN & METHODOLOGY

DATA SET :
To predict diabetes, our proposed system takes the dataset as input.

The dataset consists of the following features: Number of Pregnancies, Glucose, Blood Pressure, Skin Thickness, Insulin, BMI, Diabetes Pedigree Function, and Age.

9 of 12


DATA PRE-PROCESSING :
Data pre-processing is the first and an important step in the proposed model.
It is a technique used to convert raw data into a clean dataset.

Data collected from the real world can contain redundant, incomplete, or irrelevant records, and it may contain errors. Data pre-processing removes these issues.
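A minimal sketch of such a cleaning pass is shown below, assuming each record is a Python dict. The function name `preprocess`, the "negative value means error" rule, the `Note` column, and all sample values are hypothetical and chosen only to show one case of each issue.

```python
def preprocess(rows, wanted_columns):
    """Return a cleaned copy of rows (a list of dicts)."""
    cleaned, seen = [], set()
    for row in rows:
        # Irrelevant data: keep only the columns the model uses.
        record = {col: row.get(col) for col in wanted_columns}
        # Incomplete data: skip records with a missing value.
        if any(value is None for value in record.values()):
            continue
        # Errors: skip records with a negative measurement.
        if any(value < 0 for value in record.values()):
            continue
        # Redundant data: skip exact duplicates of a record already kept.
        key = tuple(record[col] for col in wanted_columns)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(record)
    return cleaned

# Made-up sample records, not taken from the project's dataset.
raw = [
    {"Glucose": 120, "BMI": 30.5, "Age": 45, "Note": "walk-in"},
    {"Glucose": 120, "BMI": 30.5, "Age": 45, "Note": "repeat"},  # duplicate once Note is dropped
    {"Glucose": None, "BMI": 28.0, "Age": 33, "Note": ""},       # incomplete
    {"Glucose": 95, "BMI": 22.1, "Age": -4, "Note": ""},         # error
    {"Glucose": 88, "BMI": 24.0, "Age": 29, "Note": ""},
]

clean = preprocess(raw, ["Glucose", "BMI", "Age"])
print(len(clean))
```

Dropping incomplete records is only one option; replacing a missing value with the column median is a common alternative when the dataset is small.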


10 of 12


TRAIN TEST SPLIT :
In our proposed system, the data is divided into training data and testing data.
The split ensures that the model learns only from the training data and that its performance is evaluated on the testing data.

The training data contains 75% of the total dataset and the testing data contains the remaining 25%.
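The 75:25 split can be sketched with the standard library alone. The function name `split_75_25`, the fixed seed, and the placeholder records are assumptions for illustration. Shuffling before cutting is also an assumption, since the slides do not say how rows are assigned to each part.

```python
import random

def split_75_25(records, seed=42):
    """Shuffle a copy of records and cut it into 75% training and 25% testing data."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)   # fixed seed keeps the split reproducible
    cut = int(len(shuffled) * 0.75)
    return shuffled[:cut], shuffled[cut:]

# Placeholder records: the integers 0..99 stand in for 100 dataset rows.
records = list(range(100))
train, test = split_75_25(records)
print(len(train), len(test))
```

Every record lands in exactly one of the two parts, so no test row is ever seen during training.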


11 of 12

REFERENCE

YouTube

Teacher guidance

Online resources

12 of 12

THANK YOU!

Feel free to ask if you have any questions.