1 of 29

DATABASE MANAGEMENT SYSTEMS Unit IV Normalization

K Vydehi

Assistant Professor

Department of CSE

Aditya University

2 of 29

DBMS K Vydehi, Asst Professor, CSE Dept.

2

Normalization is a process of making relations by decomposing them into smaller relations to

  1. Reduce redundancy:- some information is stored repeatedly.
  2. Eliminate update anomaly:- if one copy of repeated data is updated, an inconsistency is created unless all copies are similarly updated.
  3. Eliminate delete anomaly:- it may not be possible to delete some information without losing other information.
  4. Eliminate insert anomaly:- it may not be possible to store some information unless some other information is stored as well.

Definitions

3 of 29

DBMS K Vydehi, Asst Professor, CSE Dept.

3

Hourly Emps(ssn, name, lot, rating, hourly wages, hours _worked)

Hourly_Emps

Hourly_Emps2

Wages

4 of 29

DBMS K Vydehi, Asst Professor, CSE Dept.

4

An attribute is functionally dependent on another. If we can use the value of one attribute to determine the value of another.

y is functionally dependent on x

or

x determines y

Ex:-ssn Employee_name

ssn can be used to uniquely identify by an employee_name

Functional Dependency

FID

FNAME

CID

CNAME

3102

SHARMA

C01

DBMS

2352

PHANI

C02

CO

1201

HEMA

C03

PPL

353

SASTRY

C04

JAVA

FID FNAME

CID CNAME

5 of 29

DBMS K Vydehi, Asst Professor, CSE Dept.

5

Closure of a set of Functional Dependencies

Closure is a set of FDs is a set of all possible FDs that can be derived from a given set of FDs. This closure set is denoted by F+.

Armstrong Rules

x,y,z denote set of attributes over a relation R

6 of 29

DBMS K Vydehi, Asst Professor, CSE Dept.

6

Closure of a set of Functional Dependencies

Step 1:

Step2:

Step 3:

Step 4:

7 of 29

DBMS K Vydehi, Asst Professor, CSE Dept.

7

Attribute Closure

Input:-Set of Functional dependencies(F) and set of attributes(α)

Output:-stored in a variable result. Finding the super key

8 of 29

DBMS K Vydehi, Asst Professor, CSE Dept.

8

Attribute Closure

(AG)+ is a super key as result variable contains all the attributes in R

9 of 29

DBMS K Vydehi, Asst Professor, CSE Dept.

9

Normal Forms

First Normal Form :-A relation is in 1NF if every attribute in every row can contain only one single value.

10 of 29

DBMS K Vydehi, Asst Professor, CSE Dept.

10

First Normal Form

SUPPLIER_NO

STATUS

CITY

PART_NO

QUANTITY

S1

20

BOMBAY

P1

300

S1

20

BOMBAY

P2

200

S1

20

BOMBAY

P3

400

S2

10

CHENNAI

P1

300

S2

10

CHENNAI

P2

200

S3

10

CHENNAI

P3

400

S4

20

BOMBAY

P2

200

S4

20

BOMBAY

P1

300

FIRST(SUPPLIER_NO,STATUS,CITY,PART_NO,QUANTITY)

PRIMARY KEY(SUPPLIER_NO,PART_NO)

FD:- SUPPLIER_NO🡪CITY

CITY🡪STATUS

PART_NO🡪QUANTITY

11 of 29

DBMS K Vydehi, Asst Professor, CSE Dept.

11

First Normal Form

Insert Anomaly:-we cannot insert supplier information until that supplier supplies atleast one part.

Delete Anomaly:- If we delete a tuple in FIRST with supplier_no s3 and part_no p3, we lose the information that s3 is located in chennai.

Update Anomaly:- If supplier s1 moves from bombay to ahmedabad, to find every tuple s1 and bombay and to replace it.

Table FIRST contains too much information packed together. Unpack-to place shipment information in one table and supplier information in another.

12 of 29

DBMS K Vydehi, Asst Professor, CSE Dept.

12

Second Normal Form

A relation is in 2NF if and only if it is in 1NF and every nonkey attribute is dependent on the primary key.(no partial dependency)

SECOND(SUPPLIER_NO,STATUS,CITY)

PRIMARY KEY(SUPPLIER_NO)

SHIPMENTS(SUPPLIER_NO,PART_NO,QUANTITY)

PRIMARY KEY(SUPPLIER_NO,PART_NO)

FOREIGN KEY(SUPPLIER_NO) REFERENCES SECOND(SUPPLIER_NO)

SUPPLIER_NO

STATUS

CITY

S1

20

BOMBAY

S2

10

CHENNAI

S3

10

CHENNAI

S4

20

BOMBAY

SUPPLIER_NO

PART_NO

QUANTITY

S1

P1

300

S1

P2

200

S1

P3

400

S2

P1

300

S2

P2

200

S3

P3

400

S4

P2

200

S4

P1

300

13 of 29

DBMS K Vydehi, Asst Professor, CSE Dept.

13

Second Normal Form

Insert Anomaly:-we can insert information that s5 is located in Delhi,even s5 doesn’t currently supply any parts.

Delete Anomaly:-we can delete s3p3 in shipments table, we don't lose information that s3 is located in chennai.

Update Anomaly:-we can change the city for s1 from bombay to ahmedabad by changing it once in SECOND table.

Primary key attributes-Supplier_no,Part_no

Nonkey attributes-status,city,quantity

14 of 29

DBMS K Vydehi, Asst Professor, CSE Dept.

14

Third Normal Form

A relation is in 3NF if & only if it is in 2NF and every non key attribute is non transitively dependent on the primary key.(No transitive dependency).

Supplier_no🡪City

City🡪Status

Then Supplier_no🡪status

Insert Anomaly:-we cannot insert a supplier in bangalore have status of 50 until some supplier actually located in that city.

Delete Anomaly:-we can delete supplier s5 without losing status for delhi is 30. unpack- supplier information in one table & city information in another.

Update Anomaly:-we need to change the status for bombay from 20 to 40 we need to find every tuple for bombay for changing it.

15 of 29

DBMS K Vydehi, Asst Professor, CSE Dept.

15

Third Normal Form

SUPPLIER_NO

CITY

S1

BOMBAY

S2

CHENNAI

S3

CHENNAI

S4

BOMBAY

S5

DELHI

CITY

Status

BOMBAY

20

CHENNAI

10

DELHI

60

BANGALORE

50

CS(CITY,STATUS)

PRIMARY KEY(CITY)

SC(SUPPLIER_NO,CITY)

PRIMARY KEY(SUPPLIER_NO)

FOREIGN KEY(CITY) REFERENCES CS(CITY)

16 of 29

DBMS K Vydehi, Asst Professor, CSE Dept.

16

Boyce-codd normal form

A relation is in 3NF and every determinant is a candidate key. If we join BCNF two relations then 3NF relation is obtained.

Supp(supplier_no,city,part_no,quantity)

FD:- Supplier_no🡪city

part_no🡪quantity

Supplier_no,part_no are determinants & candidate key

SUPPLIER_NO

CITY

PART_NO

QUANTITY

S1

BOMBAY

P1

300

S1

BOMBAY

P2

200

S2

CHENNAI

P1

300

S2

CHENNAI

P3

400

S3

LUCKNOW

P4

500

17 of 29

DBMS K Vydehi, Asst Professor, CSE Dept.

17

Boyce-codd normal form

SUPPLIER_NO

CITY

S1

BOMBAY

S2

CHENNAI

S3

LUCKNOW

SUPPLIER_NO

PART_NO

QUANTITY

S1

P1

300

S1

P2

200

S2

P1

300

S2

P3

400

S3

P4

500

SUPPL(SUPPLIER_NO,CITY)

PRIMARY KEY(SUPPLIER_NO)

SHIPMENT(SUPPLIER_NO,PART_NO,QUANTITY)

PRIMARY KEY(SUPPLIER_NO,PART_NO)

FOREIGN KEY(SUPPLIER_NO) REFERENCES SUPPL(SUPPLIER_NO)

18 of 29

DBMS K Vydehi, Asst Professor, CSE Dept.

18

Properties of Decomposition

  1. Loss-less join decomposition

Let R be a relation schema and let F be a set of FDs over R. A decomposition of R into two schemas is said to be loss less join, if we recover original relation from decomposed relation. Otherwise it is lossy join.

We take projections of a relation and recombine them using natural join, to obtain original relation.

R

R1 R2

19 of 29

DBMS K Vydehi, Asst Professor, CSE Dept.

19

Dependency Preservation

It allows us to enforce all FDs by examining a single relation on each insertion or modification of a tuple.

Ex:-R(SUPPLIER_NO,STATUS,CITY,PART_NO,QUANTITY)

FD:- SUPPLIER_NO🡪CITY

CITY🡪STATUS

PART_NO🡪QUANTITY

R1(SUPPLIER_NO,STATUS,CITY)

FD1,FD2

R2(SUPPLIER_NO,PART_NO,QUANTITY)

FD3

R1,R2 Satisfies dependency preservation.

20 of 29

DBMS K Vydehi, Asst Professor, CSE Dept.

20

Fourth Normal Form

A relation is in 4NF if and only if it is in BCNF and all Multivalued dependencies in R are FDs out of keys.

Ex:-

Two or more multivalued facts about the same attribute occur in the same table.

R(A,B,C)

A,B,C-Attributes

B,C-Multivalued facts about A

MVD:-A🡪🡪B and A🡪🡪C

B is multi-dependent on A

Every FD is an MVD. But there exist MVDs that are not FDs.

21 of 29

DBMS K Vydehi, Asst Professor, CSE Dept.

21

Fourth Normal Form

EMP_NO

PROJECT_NO

SKILL

E1

1

ANALYSIS

E1

1

DESIGN

E1

1

PROGRAM

E1

2

ANALYSIS

E1

2

DESIGN

E1

2

PROGRAM

FDS: EMPNO🡪SKILL

EMPNO🡪PROJECT_NO

MVDS: EMPNO🡪🡪SKILL

EMPNO🡪🡪PROJECT_NO

22 of 29

DBMS K Vydehi, Asst Professor, CSE Dept.

22

Fourth Normal Form

EMP_NO

PROJECT_NO

E1

1

E1

2

EMP_NO

SKILL

E1

ANALYSIS

E1

DESIGN

E1

PROGRAM

EMP_PROJECT(EMP_NO,PROJECT_NO)

EMP_SKILL(EMP_NO,SKILL)

23 of 29

DBMS K Vydehi, Asst Professor, CSE Dept.

23

Fifth Normal Form

A relation is in 4NF and that has no join dependency

if we combine 2 decompositions or projections using natural join we need to obtain loss less join decomposition--join dependency

Dept->->Subject Dept->->Student

DEPT

SUBJECT

STUDENT

CSE

CS101

RAVI

IT

IT501

YOGI

CSE

CS102

SUMA

CSE

CS103

RAMA

ME

ME201

SUSHMA

EC

EC301

RANI

24 of 29

DBMS K Vydehi, Asst Professor, CSE Dept.

24

Fifth Normal Form

Dept->->Subject Dept->->Student

DEPT

SUBJECT

CSE

CS101

IT

IT501

CSE

CS102

CSE

CS103

ME

ME201

EC

EC301

DEPT

STUDENT

CSE

RAVI

IT

YOGI

CSE

SUMA

CSE

RAMA

ME

SUSHMA

EC

RANI

25 of 29

DBMS K Vydehi, Asst Professor, CSE Dept.

25

Fifth Normal Form

If we combine the two decompositions we obtained this table

It is lossy join decomposition and join dependency

DEPT

SUBJECT

STUDENT

CSE

CS101

RAVI

CSE

CS101

SUMA

CSE

CS101

RAMA

IT

IT501

YOGI

CSE

CS102

RAVI

CSE

CS102

SUMA

CSE

CS102

RAMA

CSE

CS103

RAVI

CSE

CS103

SUMA

CSE

CS103

RAMA

ME

ME201

SUSHMA

EC

EC301

RANI

26 of 29

DBMS K Vydehi, Asst Professor, CSE Dept.

26

Fifth Normal Form

We need to decompose in the relation into two relations

Dept->->Subject Subject->->Student

DEPT

SUBJECT

CSE

CS101

IT

IT501

CSE

CS102

CSE

CS103

ME

ME201

EC

EC301

SUBJECT

STUDENT

CS101

RAVI

IT501

YOGI

CS102

SUMA

CS103

RAMA

ME201

SUSHMA

EC301

RANI

27 of 29

DBMS K Vydehi, Asst Professor, CSE Dept.

Fifth Normal Form

We combine two relations using natural join. We obtain this relation. It is loss less join decomposition and no join dependency.

Surrogate Key: It is unique,notnull and updatable.

Ex:- Mobile_number

DEPT

SUBJECT

STUDENT

CSE

CS101

RAVI

IT

IT501

YOGI

CSE

CS102

SUMA

CSE

CS103

RAMA

ME

ME201

SUSHMA

EC

EC301

RANI

28 of 29

DBMS K Vydehi, Asst Professor, CSE Dept.

28

PROPERTIES

1NF

2NF

3NF

BCNF

DEFINITION

Every attribute in every row contains single value

Relation is in 1 NF and All the non-key attributes of the table are fully functionally dependent on the Primary key of the table.

Relation is in 2 NF and every non key attribute is non transitively dependent on the primary key.

A relation is in 3NF and every determinant is a candidate key

Loss less join decomposition

Always achievable

Always achievable

Always achievable

Sometimes not achievable

Dependency preservation

-

-

Possible

Either loss less join or dependency preservation is possible. not both

Anomalies

May allow some anomalies

May allow some anomalies

May allow some anomalies

Always eliminates anomalies

29 of 29

DBMS K Vydehi, Asst Professor, CSE Dept.

29

Thank You