CSE 414: Section 5
FDs, Keys, and BCNF
10/24/24
Announcements
Functional Dependencies
A functional dependency (FD) for relation R is a formula of the form A → B
where A and B are sets or individual attributes of R.
We say the FD holds for R if, for any instance of R, whenever two tuples agree in value for all attributes of A, they also agree in value for all attributes of B.
DEPENDENCY != CAUSATION
*in other words: A determines B, or A is kinda sorta a key for B
Example
Student(studentID, name, dateOfBirth, phoneNumber)
studentID | name | dateOfBirth | phoneNumber |
1234 | Allison | 01-09-1999 | 253-876-1028 |
1235 | Ben | 03-10-2000 | 206-999-1923 |
1236 | Chris | 05-23-1999 | 206-127-1010 |
1234 | Allison | 01-09-1999 | 253-307-1678 |
Example
Do the following FDs hold?
{studentID} → {name}
{studentID} → {name, dateOfBirth}
{studentID} → {phoneNumber}
studentID | name | dateOfBirth | phoneNumber |
1234 | Allison | 01-09-1999 | 253-876-1028 |
1235 | Ben | 03-10-2000 | 206-999-1923 |
1236 | Chris | 05-23-1999 | 206-127-1010 |
1234 | Allison | 01-09-1999 | 253-307-1678 |
Example
Do the following FDs hold?
{studentID} → {name}
{studentID} → {name, dateOfBirth}
{studentID} → {phoneNumber}
studentID | name | dateOfBirth | phoneNumber |
1234 | Allison | 01-09-1999 | 253-876-1028 |
1235 | Ben | 03-10-2000 | 206-999-1923 |
1236 | Chris | 05-23-1999 | 206-127-1010 |
1234 | Allison | 01-09-1999 | 253-307-1678 |
Example
Do the following FDs hold?
{studentID} → {name}
{studentID} → {name, dateOfBirth}
{studentID} → {phoneNumber}
studentID | name | dateOfBirth | phoneNumber |
1234 | Allison | 01-09-1999 | 253-876-1028 |
1235 | Ben | 03-10-2000 | 206-999-1923 |
1236 | Chris | 05-23-1999 | 206-127-1010 |
1234 | Allison | 01-09-1999 | 253-307-1678 |
Example
Do the following FDs hold?
{studentID} → {name}
{studentID} → {name, dateOfBirth}
{studentID} → {phoneNumber}
There are two tuples that agree on studentID but don’t agree on phoneNumber!
studentID | name | dateOfBirth | phoneNumber |
1234 | Allison | 01-09-1999 | 253-876-1028 |
1235 | Ben | 03-10-2000 | 206-999-1923 |
1236 | Chris | 05-23-1999 | 206-127-1010 |
1234 | Allison | 01-09-1999 | 253-307-1678 |
Trivial FDs
Axiom of Reflexivity
A trivial FD is one where the RHS is a subset of the LHS
That is, every attribute on the RHS is also on the LHS
The two FDs below are trivial:
{name} → {name}
{studentID,name} → {name}
Inference Rules
Closure
Goal:
Observation:
Closures (quicker)
Goal:
Observation:
Example
Given studentID → name, dateOfBirth:
{studentID}+ = {studentID, name, dateOfBirth}
studentID | name | dateOfBirth | phoneNumber |
1234 | Allison | 01-09-1999 | 253-876-1028 |
1235 | Ben | 03-10-2000 | 206-999-1923 |
1236 | Chris | 05-23-1999 | 206-127-1010 |
1234 | Allison | 01-09-1999 | 253-307-1678 |
Keys
We call an attribute (or a set of attributes) that determines all other attributes in a schema to be a superkey.
If it is the smallest set of attributes (in terms of cardinality) that does this we call that set a minimal key or just key
By definition, all minimal keys are superkeys, but a superkey is not necessarily a minimal key.
Superkeys and Minimal Keys
R(A, B, C, D)
A -> B
B -> C, D
Question: is {A, B} a superkey, minimal key, neither, or both?
Question: is {B} a superkey, minimal key, neither, or both?
Question: is {A} a superkey, minimal key, neither or both?
superkey
neither
both
A superkey determines all other attributes in a schema
A minimal key is the smallest set of attributes that determines all other attributes in a schema
By definition, all minimal keys are superkeys, but a superkey is not necessarily a minimal key.
Superkeys and Minimal Keys
R(A, B, C, D)
A -> B
C -> D
Question: is {A, C} a superkey, minimal key, or both?
Question: is {A, B, C, D} a superkey, minimal key, or both?
Question: is {A, C, D} a superkey, minimal key, or both?
both
superkey
superkey
A superkey determines all other attributes in a schema
A minimal key is the smallest set of attributes that determines all other attributes in a schema
By definition, all minimal keys are superkeys, but a superkey is not necessarily a minimal key.
Boyce-Codd Normal Form (BCNF)
A relation R is in BCNF if every set of attributes in R is either a superkey or its closure is the same set (trivial FD).
*that is, either {a}+ -> a OR {a}+ -> {all cols}, where a is any col in the table
Boyce-Codd Normal Form (BCNF)
Not BCNF! SSN is not a trivial FD or a superkey
BCNF Decomposition Algorithm
X +
BCNF Decomposition Algorithm
C = the set of all attributes in the relation R
Look for an attribute (or a set of attributes) that meets the following conditions (non-BCNF flag!):
If such attribute exists, it is a violation of the BCNF condition; decompose R:
Else, the table is in BCNF
Example
Relation: R (A, B, C, D, E)
FDs:
Goal: Decompose R into BCNF
Example
R (A, B, C, D, E)
R (A, E)
R (A, B, C, D)
R (B, C, A)
R (B, C, D)
A+ = {A, E}
BCNF compliant
BCNF compliant
BCNF compliant
{B, C}+ = {B, C, A}
Solution
Notice that {A}+ = {A,E}, which violates the BCNF condition
Notice there is no E in R2 so we don't need to consider FD DE → B