1 of 64

Relational Model�

Prof. S. Mehrotra

ICS Department

University of California at Irvine

2 of 64

Relational Model

  • The primary modeling concept in that of a relation.
    • In relational model, everything (e.g., entity set, relationship set) are all modeled as a relation.

  • A relation is a collection of tuples/rows.

  • A relation has a relational schema: name(attributes)
    • Example: employee(ssno,name,salary)

  • Attributes:
    • Each attribute has a domain.
    • Each attribute is atomic: we cannot refer to or directly see a subpart of the value.

  • A row is a set of attribute values.

2

2

3 of 64

Database Schema

  • A database schema consists of
    • A set of relation schemas, e.g., S = (R1, R2… Rn)
    • A set of constraints over the relation schemas

3

3

4 of 64

Relation Example

4

4

Account

Account

(No order in attributes or tuples)

Relation:

account = { (150, 20, 11000), (160, 23, 2300), (180, 23, 32000) }

Relation schema: R=(attributes)

Account = (AccountId, CustomerId, Balance)

Constraints: no two accounts can have the same AccountID)

5 of 64

Attributes & Nulls

  • Attributes:
    • Each attribute has a domain.
      • The set of allowed values for each attribute.
    • Each attribute is atomic.
      • We cannot refer to or directly see a subpart of the value.
      • Composite values and multivalued attributes are not allowed.
    • The number of attributes is called the degree of the relation. (while the number of tuples in a relation is called its cardinality)

  • Attributes can have a special value: NULL
    • Can mean not known: we don’t know Jack’s address
    • Or, does not exist: savings account 1001 does not have “overdraft”

5

5

Customer(Id, Name, Addr)

6 of 64

A Relation is a Set

  • A relation does not contain duplicate tuples that have the same value for all attributes

6

6

ID

Name

Age

1

Mark

21

2

Amelia

20

3

John

20

ID

Name

Age

1

Mark

21

2

Amelia

20

2

Amelia

20

Illegal

Legal

7 of 64

Constraints - Why?

Allow designers to specify the semantics of the data.

      • E.g., account balance should be >= 0.

Enable DBMSs to check that new data satisfies the semantics.

Make application programmers’ lives easier.

      • If DBMS guarantees account >=0, the debit application programmer need not worry about overdrawn accounts.

Enable identification of redundancy in schemas (we will see this soon)

      • Helps in good database design.

Help the DBMS in query processing (you need to take 222P for this one)

      • Constraints can help the query optimizer choose a good query execution plan (e.g., by helping it with intermediate result size estimation)

7

7

8 of 64

Constraints - Examples

  • Example Constraint types :
    • Domain constraints.
    • Entity identity.
    • Key constraints.
    • Functional dependencies (generalization of key constraints).
    • Referential integrity.
    • Inclusion dependencies (generalization of referential integrity).
    • Denial constraints.

8

8

9 of 64

Domain Constraints

  • Every attribute has a type (and a domain).
    • integer, float, date, boolean, string, etc.
  • An attribute can have a domain constraint, e.g.:
    • Id > 0
    • Salary > 0.0
    • Age < 100
    • City in {Irvine, LA, Riverside}
  • An insertion may violate the domain constraint
    • DBMS checks if insertion violates domain constraint and rejects the insertion

9

9

Integer

String

String

violations

10 of 64

Key Constraints

  • Each relation has a primary key which is a set of attributes such that no two tuples have the same value for those attributes.

10

10

Integer

String

String

23

violation since no two tuples should have the same value for Id, if Id is a primary key

11 of 64

Superkey Concept

A superkey is a set of attributes such that if no two tuples in the relation have “exactly” the same values for all of the attributes in the superkey.

Let R(A1, A2, …, An) be a relation. Let Attr = {A1, A2, .. An} be a relation. A subset of attributes K (i.e., K is a subset of Attr) is a superkey if and only if for any two distinct tuples t1 and t2 that can exist in the database, t1[K] ≠ t2[K]

11

11

12 of 64

SuperKey – Examples

  • Since relations are sets, the set of all attributes always form a superkey.

  • There are usually multiple superkeys.

  • Example:
    • Suppose LogID is a superkey, i.e., no two records can have the same value of logID

    • Superkeys: a superset of a superkey is also a superkey.

12

12

Log(LogId, AccountId, Xact#, Time, Amount)

Illegal!

13 of 64

Candidate Key

  • Candidate key (or key) is a superkey that is Minimal
    • no proper subset of which is a superkey).
    • {LogID} is a candidate key, but {logID,time} though it is a superkey, it is not a candidate key
    • {AccountID,Xact#} is also a candidate key

  • If more than one key: choose one as the primary key

13

13

LogID

AccountID

Xact#

Time

Amount

111

111

2

1/12/02

$100

122

222

2

12/28/01

$20

123

222

5

12/11/23

$25

14 of 64

Entity Identity Constraints:�Disallowing Null Values

  • Some fields are too important to contain null values.
    • E.g., Sales (customer, salesman, date, amount, saleID). Here, we do not want ‘customer’ to contain a null value.
  • Entity identity constraint:
    • A primary key cannot contain a null value
    • Otherwise, we are not able to differentiate tuples

14

14

Customer

Not what we want! (Solution: Disallow NULLs for this attribute)

15 of 64

Referential Integrity Constraints

  • Specified between two relations to maintain the correspondence between tuples in these relations.
  • Definition (relates to the notion of foreign key):
    • Relation schemas: R and S.
    • R has a primary key K (a set of attributes).
    • A set of attributes FK is a foreign key of S if:
      • The attributes in FK have same domains as the attributes (or correspond to the attributes) in K.
      • The values for FK in each tuple s in S either occur as values of K of a tuple in r of R or else FK is entirely NULL.
  • A referential integrity constraint from attributes FK of S to R means that FK is a foreign key that refers to the primary key K of R.

15

15

S

R

K (primary key of R)

FK (Foreign key)

s

r

16 of 64

Examples of Referential Integrity�

16

16

Account

Customer

Account.CustomerId to Customer.Id

Student.dept to Dept.Name

(Every value of Student.Dept must also be a value of Dept.Name)

Student

Dept

17 of 64

Inclusion Dependencies�

  • Generalization of referential integrity constraints
  • Inclusion dependency S[Y1,...,Yn] ⊆ R[X1,...,Xn] means: values in relation S refer to values in relation R

17

17

S

Y1,…,Yn

R

X1,…,Xn

s

r

X1,...,Xn may not be a key of R

18 of 64

Inclusion Dependencies (cont.)�

  • S[Y1,...,Yn] ⊆ R[X1,...,Xn] iff:
    • for each tuple s in S, there exists a tuple r in R such that s[Y1, …, Yn] = r[X1, …, Xn]
  • Referential integrity is an inclusion dependency in which {X1,…,Xn} is the primary key of R

18

18

S

Y1,…,Yn

R

X1,…,Xn

s

r

19 of 64

Example

19

19

DID

Name

MgrName

1

HR

Mark

2

IT

Maria

EmpID

Name

Salary

DID

1

Maria

80,000

2

2

Mark

90,000

1

3

Maria

95,000

2

Referential Integrity

Inclusion Dependencies

20 of 64

Mapping ER to Relational Model

21 of 64

Notes 04

21

(Strong) Entity Sets to Relations

Relation: Employee(ssno, salary, name)

Key: ssno

lastname, firstname)

Relation: Employee_Phone(ssno, tel)

Key: ssno, tel

Employee

ssno

name

first name

last name

salary

[ tel ]

22 of 64

Relation: WorksOn(ssno,proj#,startdate)

Key: ssno,proj#

IND (INclusion Dependencies):

workson[proj#] project[proj#]

workson[ssno] employee[ssno]

Notes 04

22

WorksOn

Relationship Sets to Relations

startdate

Employee

ssno

name

salary

Project

proj#

manager

23 of 64

Relation: WorksOn(ssno,proj#,startdate)

Key: ssno

IND (Inclusion Dependencies):

workson[proj#] project[proj#]

workson[ssno] employee[ssno]

Notes 04

23

Cardinality Constraints

An employee works on at most 1 project

WorksOn

startdate

Employee

ssno

name

salary

Project

proj#

manager

24 of 64

For Many - 1 relationship: We can push the extra fields to relations corresponding to entity set on Many side as follows

Relation: Employee(ssno, name, salary, Proj#,startdate)

Key: ssno

Relation: Project(Proj#,mgr)

Key: Proj#

IND (Inclusion Dependencies): Employee[Proj#] ⊆ Project[Proj#]

Note that since employee entity set participation is partial, employees not connected to any projects will have NULL values for proj# and startdate fields.

Notes 04

24

Alternate Approach: Combining Relations

An employee works on at most 1 project

WorksOn

startdate

Employee

ssno

name

salary

Project

proj#

manager

25 of 64

For Many - 1 relationship: We can also do the following

Relation: Employee(ssno,proj#,startdate, name, mgr, salary)

Key: ssno

  • Note that since employee entity set participation is partial, employees not connected to any projects will have NULL values for proj#, startdate and mgr fields.
  • Manager information stored repeatedly, once for each employee
  • Also, cannot store a project information about a project with no employee since ssno cannot be NULL!

What if the relationship was 1-1?

Notes 04

25

Yet Another Alternative: Combining Relations

An employee works on at most 1 project

WorksOn

startdate

Employee

ssno

name

salary

Project

proj#

manager

26 of 64

Relation: WorksOn(ssno, proj#,startdate)

Key: ssno,proj#

IND (Inclusion Dependencies):

workson[proj#] ⊆ project[proj#]

workson[ssno] ⊆ employee[ssno]

employee[ssno] ⊆ workson[ssno]

ICS122A/EECS116

Notes 04

26

Participation Constraints

An employee works on at least 1 project

participation!

WorksOn

startdate

Employee

ssno

name

salary

Project

proj#

manager

27 of 64

Relation: WorksOn(ssno,proj#,startdate)

Key: ssno

IND (Inclusion Dependencies):

workson[proj#] ⊆ project[proj#]

workson[ssno] ⊆ employee[ssno]

employee[ssno] ⊆ workson[ssno]

Notes 04

27

Multiple Constraints

An employee works on 1 and only 1 project

WorksOn

startdate

Employee

ssno

name

salary

Project

proj#

manager

28 of 64

Relation: WorksOn(ssno,proj#,startdate)

Key: ssno

IND (Inclusion Dependencies):

workson[proj#] ⊆ project[proj#]

workson[ssno] ⊆ employee[ssno]

employee[ssno] ⊆ workson[ssno]

project[proj#] ⊆ workson[proj#]

28

Multiple Constraints

An employee works on 1 and only 1 project

each project has an employee

WorksOn

startdate

Employee

ssno

name

salary

Project

proj#

manager

29 of 64

Relation: Employee(ssno,proj#,startdate, name, mgr, salary)

Key: ssno

Proj# fields must not be NULL

Notes 04

29

Combining Relations

An employee works on at most 1 project

WorksOn

startdate

Employee

ssno

name

salary

Project

proj#

manager

30 of 64

Multiway Relationships

Relation: WorksUsing(ssno,proj#,toolid, startdate)

Key: ssno,toolid, proj#

IND: worksusing[proj#] project[proj#]

worksusing[ssno] employee[ssno]

worksusing[toolid] tools[toolid]

Notes 04

30

WorksOn

startdate

Employee

ssno

name

salary

Project

proj#

manager

Tool

toolid

spec

31 of 64

Multiway Relationships

Relation: WorksUsing(ssno,proj#,toolid, startdate)

Key: ssno,toolid

IND: worksusing[proj#] project[proj#]

worksusing[ssno] employee[ssno]

worksusing[toolid] tools[toolid]

Notes 04

31

WorksOn

startdate

Employee

ssno

name

salary

Project

proj#

manager

Tool

toolid

spec

32 of 64

Multiway Relationships

Relation: WorksUsing(ssno,proj#,toolid, startdate)

Key: ssno,toolid

IND: worksusing[proj#] project[proj#]

worksusing[ssno] employee[ssno]

worksusing[toolid] tools[toolid]

employee[ssno] worksusing[ssno]

Notes 04

32

WorksOn

startdate

Employee

ssno

name

salary

Project

proj#

manager

Tool

toolid

spec

33 of 64

Multiway Relationships

Relation: WorksUsing(ssno,proj#,toolid, startdate)

Key: ssno,toolid OR ssno,proj#

IND: worksusing[proj#] project[proj#]

worksusing[ssno] employee[ssno]

worksusing[toolid] tools[toolid]

employee[ssno] worksusing[ssno]

Notes 04

33

WorksOn

startdate

Employee

ssno

name

salary

Project

proj#

manager

Tool

toolid

spec

34 of 64

Mapping 1:

Relations: (key: ssno for all relations)

staff(ssno, name, salary,position)

faculty(ssno, name, salary, rank)

studentassistant(ssno, name, salary, percentage_time)

Mapping 2:

Relations: (key: ssno for all relations)

Employee(ssno, name, salary)

staff(ssno,position)

faculty(ssno,, rank)

studentassistant(ssno, percentage_time)

  • \

Notes 04

34

Class Subclass Relationships

ISA

Employee

ssno

name

salary

Staff

position

Faculty

rank

Student assistant

percentage-time

35 of 64

Mapping 3 (ssno is key)

Relation: employee(ssno, name, salary, jobtype, position, rank, percentage_time)

Mapping 4: (ssno is key)

employee(ssno, name, salary IsStaff, position, IsFaculty, rank, IsStudentAssistant, percentage_time

)

Notes 04

35

Class Subclass Relationships

ISA

Employee

ssno

name

salary

Staff

position

Faculty

rank

Student assistant

percentage-time

36 of 64

which Mapping to use

36

Total

partial

disjoint

overlapping

cannot use Mapping 1

cannot use Mapping 3

37 of 64

Relations:

staff(ssno, name, salary,position)

faculty(ssno, name, salary, rank)

studentassistant(ssno, name, salary, percentage_time)

Key: ssno for all the relations

Requires a union to construct all employees

  • Cannot use this design if ISA is partial: cannot represent employees who are not staff, faculty, or student assistants!
  • Representing overlapping subclasses can lead to redundancy: E.g., : if staff could also be a student assistant, then redundancy arises

Notes 04

37

Disjoint & Total

ISA

Employee

ssno

name

salary

Staff

position

Faculty

rank

Student assistant

percentage-time

d

38 of 64

Relation: employee(ssno, name, salary, jobtype, position, rank, percentage_time)

Key: ssno

  • jobtype can be used to specify whether an employee is a staff, a faculty, or a student assistant; use NULL jobtype if none of the above
  • Should not be used if subclasses overlap
  • Does not require union to construct the list of employees

ICS122A/EECS116

Notes 04

38

Disjoint & Partial

ISA

Employee

ssno

name

salary

Staff

position

Faculty

rank

Student assistant

percentage-time

d

39 of 64

Relations:

employee(ssno, name, salary)

staff(ssno, position)

faculty(ssno, rank)

studentassistant(ssno, percentage_time)

Key: ssno for all relations

IND: staff[ssno] employee[ssno]

faculty[ssno] employee[ssno]

studentassistant[ssno] employee[ssno]

Note: To model total constraint requires us to state that union of staff, faculty and student assistant covers all employees. Cannot be said using IND but can be specified as a general constraints.

ICS122A/EECS116

Notes 04

39

ISA

staff

faculty

student assistant

Overlapping

employee

ssno

salary

name

position

rank

Percentage

Time

40 of 64

Relation:

employee(ssno, name, salary, IsStaff, position, IsFaculty, rank, IsStudentAssistant, percentage_time

)

Key: ssno

  • IsStaff, IsFaculty, IsStudent_assistant are boolean values (either true or false).

  • To model total, we will need to state that “One of IsStaff, IsFaculty, IsStudentAssistant must be non-NULL.”)

ICS122A/EECS116

Notes 04

40

ISA

Overlapping: Alternative Design

Employee

ssno

name

salary

Staff

position

Faculty

rank

Student assistant

percentage-time

41 of 64

Weak Entity Sets

Relation:

account(acct#,customer, balance)

transaction(acct#,trans#, amount)

Key of Transaction: acct#, trans#

IND: transaction[acct#] account[acct#]

No relation for the weak relationship set “Log.”

ICS122A/EECS116

Notes 04

41

log

Account

acct#

customer

balance

Transaction

trans#

amount

42 of 64

Weak Entity Sets (cont.)

Other relationships of the weak entity set (transaction) should use the complete key as its key

Relation:

ATM(id, address) Key: “id”

use(acct#,trans#, id, time) Key: acct#, trans#, id

IND: use[acct#,trans#] transaction[acct#,trans#]

use[id] ATM[id]

ICS122A/EECS116

Notes 04

42

use

time

log

Account

acct#

customer

balance

Transaction

trans#

amount

ATM

id

address

43 of 64

Aggregation

Relation:

using(id, pid, tid) Key: id, pid, tid

Q: What other relations are there in the overall schema?

A: employees, projects, tools and work (in addition to using)

ICS122A/EECS116

Notes 04

43

work

using

stime

Employees

id

Projects

pid

Tools

tid

44 of 64

Exercise (I) - Question

ICS122A/EECS116

Notes 04

44

Translate the above E/R diagram to relations, using the "E/R" approach to handling ISA hierarchies. (see next slide for answer)

Isa a partial and overlapping relationship

45 of 64

Exercise (I) - Answer

ICS122A/EECS116

Notes 04

45

  • F(c,d): An ordinary enitity-set-to-relation transformation. KEY: c

  • E(a,b,c): A weak entity set's relation includes the keys of all supporting entity sets,

c in this case. KEY: {a,c}

  • G(a,c,e): In the E/R approach, a subclass gets the key of the root (a and c in this case), plus (just) its own attributes, which are e here. KEY:{a,c}

  • H(g,h): An ordinary enitity-set-to-relation transformation. KEY: {g}

  • S(a,c,f,g): A relationship's relation has the keys from all connected entity sets ({a,c} and g in this case), plus it has whatever attributes are attached to the relationship itself (f here). KEY: {c,a,g}

  • Note that R yields no relation, because it supports the weak entity set E.  

  • Complete the referential integrity constraints on your own!

46 of 64

SQL

Preview to help you define schemas and insert/delete data

47 of 64

SQL as Data Definition Language

  • Allows users to
    • Specify the relation schemas
    • Domain of each attribute
    • Integrity constraints
    • Set of indices to be maintained on the relation
      • we will learn what indices are later
    • Security and authorization information for each relation
    • The physical storage structure for each relation on disk.

48 of 64

Domain Types

  • char(n): fixed length char string
  • varchar(n): variable length char string
  • int or integer
  • smallint
  • numeric(p,d): fixed-point number of given precision
  • real, double precision
  • float(n): floats with a given precision
  • date: containing year,month, date
  • time: in hours, minutes and seconds
  • NULL value is part of each domain

49 of 64

Create/Drop Table Command

Create table r (

A1 D1 [not null] [default V1]

A2 D2 [not null] [default V2]

An Dn [not null] [default Vn]

<integrity constraint 1>

<integrity constraint 2>

<integrity constraint k>

)

Drop table r

50 of 64

Example script - Create

CREATE TABLE designer (

id INTEGER,

name VARCHAR(1000),

PRIMARY KEY (id)

);

CREATE TABLE blueprint (

name VARCHAR(255),

designer INTEGER,

PRIMARY KEY (name),

FOREIGN KEY (designer) REFERENCES designer(id)

);

50

51 of 64

Example script - Insert

INSERT INTO designer (id, name) VALUES

(1, 'Art'),

(2, 'Taylor');

-- To see what we inserted:

SELECT * FROM designer;

INSERT INTO blueprint (name, designer) VALUES

('Donald Bren Hall', 1),

('Engineering Hall', 2);

SELECT * FROM blueprint;

51

52 of 64

More Inserts

INSERT INTO designer (id, name) VALUES (1, 'Cecelia');

SELECT * FROM designer;

INSERT INTO designer (id, name) VALUES (NULL, 'Cecelia');

SELECT * FROM designer;

INSERT INTO blueprint (name, designer) VALUES ('Steinhaus Hall', 50);

SELECT * FROM blueprint;

52

53 of 64

Example script - Constraints at work

-- Violates primary key constraint - id 1 already exists

INSERT INTO designer (id, name) VALUES (1, 'Cecelia');

SELECT * FROM designer;

-- Violates entity identity constraint - primary key cannot be NULL

INSERT INTO designer (id, name) VALUES (NULL, 'Cecelia');

SELECT * FROM designer;

-- Violates referential integrity - foreign key designer 50 does not exist yet

INSERT INTO blueprint (name, designer) VALUES ('Steinhaus Hall', 50);

SELECT * FROM blueprint;

53

54 of 64

Deletions

-- Fails since the default action on delete/update is reject (i.e., abort):

DELETE FROM designer WHERE id = 1;

SELECT * FROM designer;

SELECT * FROM blueprint;

Factoid about SQL: there is no way to delete only a single occurrence of a tuple that appears twice in a relation.

55 of 64

Data Modifications and Constraints

  • Modification, insertion, and deletion requests can lead to violations of one or more integrity constraints.

  • DBMS must check for violations at the end of each operation and suitably allow or disallow the operation.

  • Impacts of data modifications on inclusion dependencies can be sometimes tricky to handle!

55

55

56 of 64

Example: Delete

  • Referential integrity constraint:
    • Takes(Quarter, C-ID) ⊆ CourseOfferings(Quarter, C-ID)
  • Deleting a course: “Winter,2009” “ICS122A”
  • What should happen to any tuples in Takes that refer to “Winter,2009 ICS122A”?

56

56

Takes

CourseOfferings

57 of 64

Integrity Maintenance Policies

  • Abort: reject the update (sometimes called Restrict)
  • Cascade: delete tuples from Takes that refer to “Winter,2009 ICS122A”
    • Notice that deleting this Takes record could have a cascading effect, since other records may refer to it
  • Set NULL: set referring values in Takes to NULL

57

57

Takes

CourseOfferings

deleted

58 of 64

Example: Update Referenced Table

  • In CourseOfferings, change “Winter 2009, ICS122A” to “Winter 2009, EECS116”
  • Solutions:
    • Abort (or Restrict): reject the update
    • Cascade: in Takes, also change “Winter 2009, ICS122A” to “Winter 2009, EECS116”
    • Set NULL: set the referencing value(s) to null

58

58

Takes

CourseOfferings

59 of 64

Example: Update Referencing Table

  • In Takes, change “Fall 2004, ICS184” to a new value
  • We need to make sure that the new value belongs to the values of CourseOfferings
  • If it doesn’t, the update should be rejected

59

59

Takes

CourseOfferings

60 of 64

Recall the table and delete that failed…

CREATE TABLE designer (

id INTEGER,

name VARCHAR(1000),

PRIMARY KEY (id)

);

CREATE TABLE blueprint (

name VARCHAR(255),

designer INTEGER,

PRIMARY KEY (name),

FOREIGN KEY (designer) REFERENCES designer(id)

);

-- Fails since the default action on delete/update is reject (i.e., abort):

DELETE FROM designer WHERE id = 1;

SELECT * FROM designer;

SELECT * FROM blueprint;

Factoid about SQL: there is no way to delete only a single occurrence of a tuple that appears twice in a relation.

60

61 of 64

Example script - Change “on delete” action

-- To change the “on delete” action, we need to recreate the foreign key constraint

ALTER TABLE blueprint DROP CONSTRAINT blueprint_ibfk_1;

ALTER TABLE blueprint

ADD CONSTRAINT blueprint_ibfk_1

FOREIGN KEY (designer)

REFERENCES designer(id) ON DELETE CASCADE;

-- Now deletion succeeds and deletes a row in both tables

-- DELETE FROM designer WHERE id = 1

SELECT * FROM designer;

SELECT * FROM blueprint;

61

62 of 64

Example script - drop

-- While dropping tables, blueprint has to be dropped first due to foreign key constraint

-- DROP TABLE blueprint;

-- DROP TABLE designer;

62

63 of 64

Indexes

Suppose we have a relation

Person (name, social security number, age, city)

An index on social security number enables us to fetch a tuple

for a given ssn efficiently (not have to scan the whole relation).

The problem of deciding which indexes to put on the relations is

called : physical database design

64 of 64

Creating Indexes

CREATE INDEX ssnIndex ON Person(social-security-number)

Indexes can be created on more than one attribute:

CREATE INDEX doubleindex ON Person (name, social-security-number)