1 of 27

CSE 344: Section 4

Database Design, BCNF, ER

Oct 21st, 2021

2 of 27

Outline

BCNF decomposition

  1. Check if any function dependencies violates BCNF
  2. If at least one does, use it to decompose relation

ER review

3 of 27

Functional Dependencies

A functional dependency (FD) for relation R is a formula of the form A → B

where A and B are sets or individual attributes of R.

The meaning of the functional dependency is that for every value of A, there is a unique value of B. We say the FD holds for R if, for any instance of R, whenever two tuples agree in value for all attributes of A, they also agree in value for all attributes of B.

4 of 27

FD Example

For the relation Student(studentID, name, DateOfBirth, phoneNumber), assuming each student has only one name, then the following functional dependency holds

{studentID} → {name, DateOfBirth}

However, assuming a student may have multiple phone numbers, then the FD

{studentID} → {phoneNumber} does not hold for the table.

5 of 27

Keys

We call the attribute(s) that determines all other attributes in a schema to be a superkey.

Ex. for R(A, B, C), the set {A, B, C} determines all attributes in R

If no subset of a superkey is also a superkey, then that superkey is a minimal key or key for short. If a relation has multiple keys, each key is a candidate key

Ex. if AB → ABC and C → ABC, then {AB} and {C} are candidate keys for R

6 of 27

Closure Algorithm

Goal:

We want everything that an attribute/set of attributes determine

Observation:

  • If we have A → B and B → C, then A → C
  • Since A → A is always true, A → (A, B, and C)
  • Formal notation is {A}+ = {A, B, C}
  • Since the closure of A is all attributes, A is a key

7 of 27

Conceptual Design

Name

SSN

PhoneNumber

City

Fred

123-45-6789

206-555-1234

Seattle

Fred

123-45-6789

206-555-6543

Seattle

Joe

987-65-4321

908-555-2121

Westfield

SSN Name, City

8 of 27

Conceptual Design

Anomalies:

• Redundancy: repeated data for Fred

• Update anomalies: what if Fred moves to “Bellevue”?

• Deletion anomalies: what if Joe deletes his phone number?

9 of 27

Normalization

Removing redundant data from relations, using lossless decomposition to create multiple relations instead of one big table

  • 1NF
  • BCNF
  • 3NF

10 of 27

Conceptual Design

The BCNF (Boyce-Codd Normal Form) ---- A relation R is in BCNF if every set of attributes S is either a superkey or its closure is S itself

{S}+ = {S} or {S}+ = {all attributes in R}

11 of 27

BCNF Decomposition Algorithm

Normalize(R):

find X s.t.: X ≠ X+ and X+ ≠ [all attributes]

if (not found) then R is in BCNF

else let Y = X+; Z = [all attributes] + X - X+

decompose R into R1(X Y) and R2(X Z)

Normalize(R1); Normalize(R2);

X +

12 of 27

Example

The relation is R (A, B, C, D, E)

FDs : A → E, BC → A, and DE → B

Question : Decompose R into BCNF.

13 of 27

Solution

Notice that {A}+ = {A,E}, which violates the BCNF condition.

We split R to R1(A,E) and R2(A,B,C,D).

R1 satisfies BCNF now, but R2 does not because: {B,C}+ = {B,C,A}.

Notice that there is no E in R2 table so we don't need to consider the FD DE → B!

Split R2 to: R21(B,C,A) and R22(B,C,D)

R (A, B, C, D, E)

FDs : A → AE, BC → ABC, and DE → BDE

14 of 27

Decomposition

Lossless Decomposition: Rejoining all the decomposed relations will result in exactly the original data

Ex. R(A, B, C, D) → S1(A, B) and S2(B, C, D)

Lossy Decomposition: Rejoining the decomposed relations may add extra tuples to the original data

Ex. R(A, B, C, D) → S1(A, B) and S2(C, D)

15 of 27

Vertical Partitioning

16 of 27

Vertical Partitioning

17 of 27

Vertical Partitioning

18 of 27

Vertical Partitioning Applications

  • Advantages
    • Speeds up queries that touch only a small fraction of columns
    • Single column can be compressed effectively, reducing disk I/O
  • Disadvantages
    • Updates are very expensive!
    • Need many joins to access many columns
    • Repeated key columns add overhead

19 of 27

Horizontal Partitioning

20 of 27

Horizontal Partitioning

21 of 27

Horizontal Partitioning

22 of 27

Horizontal Partitioning Applications

  • Performance optimization
    • Especially for data warehousing
    • E.g., one partition per month
    • E.g., archived applications and active applications
  • Distributed and parallel databases

23 of 27

Integrity Constraints in ER Diagrams

Product

price

name

category

makes

makes

makes

Keys

Single Value Constraints (many-one) [possibly none]

Referential Integrity Constraints (many-one) [exactly one]

Other Constraints

>2

24 of 27

Examples of Relationships

  • one-one: ssn - student id
  • one-many: ssn - phone#
  • many-many: store - product
  • is-a: computer - PC and computer - Mac
  • has-a: country - city
    • What country does the city of Cambridge belong to?
  • exactly-one: a city belongs to exactly one country

25 of 27

Ex: Creating an ER diagram

Create an ER diagram containing the following information:

  • The id, name, gender, name of country of birth, and region of world (continent) for students and faculty
  • The major of students
  • The salary of faculty
  • The courses that each student takes(identified by course_id)
  • The department each course is offered in

⇒ What are the entities?

⇒ What are their relationships?

26 of 27

Ex: ER diagram to Table declarations

What would this database look like?

Table1(primarykey,Reference.key, attribute)

Table2(primarykey2,Reference.key, attribute2)

.

.

.

27 of 27

Ex: ER diagram to Table declarations

Country (name, region)

People (id,Country.name, gender, Pname)

Faculty (People.id, salary)

Student (People.id, major)

Course (courseID, dept)

Took (Student.People.id, Course.courseID)