1 of 23

CSE 414: Section 5

FDs, Keys, and BCNF

10/24/24

2 of 23

Announcements

  • Homework 2 grades released
  • Homework 3 late due date is tomorrow at 11pm
  • Check pinned Ed post for common mistakes

3 of 23

Functional Dependencies

A functional dependency (FD) for relation R is a formula of the form A → B

where A and B are sets or individual attributes of R.

We say the FD holds for R if, for any instance of R, whenever two tuples agree in value for all attributes of A, they also agree in value for all attributes of B.

DEPENDENCY != CAUSATION

*in other words: A determines B, or A is kinda sorta a key for B

4 of 23

Example

Student(studentID, name, dateOfBirth, phoneNumber)

studentID

name

dateOfBirth

phoneNumber

1234

Allison

01-09-1999

253-876-1028

1235

Ben

03-10-2000

206-999-1923

1236

Chris

05-23-1999

206-127-1010

1234

Allison

01-09-1999

253-307-1678

5 of 23

Example

Do the following FDs hold?

{studentID} → {name}

{studentID} → {name, dateOfBirth}

{studentID} → {phoneNumber}

studentID

name

dateOfBirth

phoneNumber

1234

Allison

01-09-1999

253-876-1028

1235

Ben

03-10-2000

206-999-1923

1236

Chris

05-23-1999

206-127-1010

1234

Allison

01-09-1999

253-307-1678

6 of 23

Example

Do the following FDs hold?

{studentID} → {name}

{studentID} → {name, dateOfBirth}

{studentID} → {phoneNumber}

studentID

name

dateOfBirth

phoneNumber

1234

Allison

01-09-1999

253-876-1028

1235

Ben

03-10-2000

206-999-1923

1236

Chris

05-23-1999

206-127-1010

1234

Allison

01-09-1999

253-307-1678

7 of 23

Example

Do the following FDs hold?

{studentID} → {name}

{studentID} → {name, dateOfBirth}

{studentID} → {phoneNumber}

studentID

name

dateOfBirth

phoneNumber

1234

Allison

01-09-1999

253-876-1028

1235

Ben

03-10-2000

206-999-1923

1236

Chris

05-23-1999

206-127-1010

1234

Allison

01-09-1999

253-307-1678

8 of 23

Example

Do the following FDs hold?

{studentID} → {name}

{studentID} → {name, dateOfBirth}

{studentID} → {phoneNumber}

There are two tuples that agree on studentID but don’t agree on phoneNumber!

studentID

name

dateOfBirth

phoneNumber

1234

Allison

01-09-1999

253-876-1028

1235

Ben

03-10-2000

206-999-1923

1236

Chris

05-23-1999

206-127-1010

1234

Allison

01-09-1999

253-307-1678

9 of 23

Trivial FDs

Axiom of Reflexivity

A trivial FD is one where the RHS is a subset of the LHS

That is, every attribute on the RHS is also on the LHS

The two FDs below are trivial:

{name} → {name}

{studentID,name} → {name}

10 of 23

Inference Rules

  • If b is a subset of a, then ab (reflexivity or trivial FD)
    • then a → a (if a and b are subsets of each other, aka a = b!)

  • If a → b then a, c → b, c (augmentation)
    • then a, c → b
    • but not a → b, c
    • “Adding on the left is chill. Adding on the right though 🚩🚩🚩🚩🚩🚩”

  • If a → b & b → c then a → c (transitivity)
    • If abc and c → d then a → bd (pseudo transitivity) or even a -> bcd

11 of 23

Closure

Goal:

  • We want everything that an attribute/set of attributes determines

Observation:

  • Suppose we have the attributes A → B and B → C
    • To determine the closure of A we start with trivial FD: A → A and X = {A}
    • Since we have A → B, we add B to X = {A, B}
    • With this we can know that B → C, so we can add C to X = {A, B}
    • We now have {A, B, C}
    • Formally, the closure of A is written as {A}+ = {A, B, C}

12 of 23

Closures (quicker)

Goal:

  • We want everything that an attribute/set of attributes determines

Observation:

  • If we have A → B and B → C, then A → C
  • So really, A → B, C
  • Add A to the RHS so A → A, B, C
  • Formally, the closure of A is written as {A}+ = {A, B, C}

13 of 23

Example

Given studentID → name, dateOfBirth:

{studentID}+ = {studentID, name, dateOfBirth}

studentID

name

dateOfBirth

phoneNumber

1234

Allison

01-09-1999

253-876-1028

1235

Ben

03-10-2000

206-999-1923

1236

Chris

05-23-1999

206-127-1010

1234

Allison

01-09-1999

253-307-1678

14 of 23

Keys

We call an attribute (or a set of attributes) that determines all other attributes in a schema to be a superkey.

If it is the smallest set of attributes (in terms of cardinality) that does this we call that set a minimal key or just key

By definition, all minimal keys are superkeys, but a superkey is not necessarily a minimal key.

15 of 23

Superkeys and Minimal Keys

R(A, B, C, D)

A -> B

B -> C, D

Question: is {A, B} a superkey, minimal key, neither, or both?

Question: is {B} a superkey, minimal key, neither, or both?

Question: is {A} a superkey, minimal key, neither or both?

superkey

neither

both

A superkey determines all other attributes in a schema

A minimal key is the smallest set of attributes that determines all other attributes in a schema

By definition, all minimal keys are superkeys, but a superkey is not necessarily a minimal key.

16 of 23

Superkeys and Minimal Keys

R(A, B, C, D)

A -> B

C -> D

Question: is {A, C} a superkey, minimal key, or both?

Question: is {A, B, C, D} a superkey, minimal key, or both?

Question: is {A, C, D} a superkey, minimal key, or both?

both

superkey

superkey

A superkey determines all other attributes in a schema

A minimal key is the smallest set of attributes that determines all other attributes in a schema

By definition, all minimal keys are superkeys, but a superkey is not necessarily a minimal key.

17 of 23

Boyce-Codd Normal Form (BCNF)

A relation R is in BCNF if every set of attributes in R is either a superkey or its closure is the same set (trivial FD).

*that is, either {a}+ -> a OR {a}+ -> {all cols}, where a is any col in the table

  • BCNF reduces redundancy at the expense of creating additional tables.
  • BCNF may have several correct decompositions.
  • Some functional dependencies may be lost.

18 of 23

Boyce-Codd Normal Form (BCNF)

Not BCNF! SSN is not a trivial FD or a superkey

19 of 23

BCNF Decomposition Algorithm

X +

20 of 23

BCNF Decomposition Algorithm

C = the set of all attributes in the relation R

Look for an attribute (or a set of attributes) that meets the following conditions (non-BCNF flag!):

  • X+ != X
  • X+ != C

If such attribute exists, it is a violation of the BCNF condition; decompose R:

  • R1 = X+
  • R2 = (C - X+) union (X)
  • Normalize each table (recursive call)

Else, the table is in BCNF

21 of 23

Example

Relation: R (A, B, C, D, E)

FDs:

  • A → E
  • BC → A
  • DE → B

Goal: Decompose R into BCNF

22 of 23

Example

R (A, B, C, D, E)

R (A, E)

R (A, B, C, D)

R (B, C, A)

R (B, C, D)

A+ = {A, E}

BCNF compliant

BCNF compliant

BCNF compliant

{B, C}+ = {B, C, A}

23 of 23

Solution

Notice that {A}+ = {A,E}, which violates the BCNF condition

  • We split R to R1(A,E) and R2(A,B,C,D)
  • R1 satisfies BCNF
  • R2 does not satisfy BCNF because: {B,C}+ = {B,C,A}

Notice there is no E in R2 so we don't need to consider FD DE → B

  • Split R2 to: R21(B,C,A) and R22(B,C,D)
  • R21 and R21 satisfy BCNF now