1 of 108

Relational Model

  • Introduction
  • Basic concepts
    • Relation
    • Attribute
    • Key
    • Integrity constraint
    • Database
  • Relational data manipulation languages
    • Relational Algebra
      • Basic operators
      • Additional operators
      • Extended operations
      • Views and Snapshots
    • Tuple Relational Calculus
    • Domain Relational Calculus

3.1

2 of 108

Introduction to Relational Model

  • Introduced by E.F. Codd in 1970
  • Relational DBMS systems include Oracle, DB2, Sybase, Informix.
  • Main Characteristics:
    • Database is perceived as a set of tables
    • Query results are also tables
  • Main advantages:
    • Simple
    • With solid math base
  • Relational model:
    • Database structure: the database is represented by tables
    • Integrity constraints: the tables satisfy certain constraints
    • Data manipulation: operators are available for data manipulation

3.2

3 of 108

Relation

  • Informally, a relation is a table; a tuple corresponds to a row of such a table and an attribute to a column; the number of tuples is called the cardinality and the number of attributes is called the degree. Order of tuples is irrelevant (tuples may be stored in an arbitrary order); duplicate tuples are not allowed in a relation.
  • Formally, given sets D1, D2, …. Dn a relation r is a subset of �D1 x D2 x … x Dn�Thus a relation is a set of n-tuples (a1, a2, …, an) where �aiDi
  • Example:

Jones

Smith

Curry

Lindsay

customer-name

Main

North

North

Park

customer-street

Harrison

Rye

Rye

Pittsfield

customer-city

attributes

tuples

3.3

4 of 108

Attribute

  • Each attribute of a relation has a name
  • The set of allowed values for each attribute is called the domain of the attribute. A domain is a data type, which is a set of values associated with operators. E.g., the type INTEGER is the set of all integers with operators +,-,*,/, etc. Data types can be system-defined or user-defined.
  • Attribute values are (normally) required to be atomic, that is, indivisible
    • E.g. multivalued attribute values are not atomic
    • E.g. composite attribute values are not atomic
  • The special value null is a member of every domain if allowed
  • The null value causes complications in the definition of many operations
    • we shall ignore the effect of null values in our main presentation and consider their effect later

3.4

5 of 108

Relation Schema and Instance

  • Let A1, A2, …, An be the attributes of a relation. Then
    • R = (A1, A2, …, An ) is the relation schema

E.g. Customer-schema =� (customer-name, customer-street, customer-city)

    • The current value of the relation is the relation instance (specified by a table)

3.5

6 of 108

Keys (1)

Let R be a relation schema, r(R) be any relation on R, and K ⊆ R

  • K is a superkey of R if values for K are sufficient to identify a unique tuple of each possible relation r(R). By “possible r” we mean a relation r that could exist in the enterprise we are modeling.�Example: {customer-name, customer-street} and� {customer-name} �are both superkeys of Customer, if no two customers can possibly have the same name.
  • K is a candidate key if K is minimal superkey�Example: {customer-name} is a candidate key for Customer, since it is a superkey (assuming no two customers can possibly have the same name), and no subset of it is a superkey.

3.6

7 of 108

Keys (2)

  • A primary key is a candidate key which is chosen by the database designer as the principal means of identifying tuples within a relation; the other candidate keys are called alternate keys.
  • A foreign key K is a set of attributes of one relation R1 whose values are required to match values of some candidate key of another relation R2. K is used in R1 for referencing R2.

3.7

8 of 108

Primary Key

A primary key is a column (or a set of columns) in a table that uniquely identifies each row in that table.

🔗 Foreign Key

A foreign key is a column (or a set of columns) in one table that refers to the primary key in another table. It creates a relationship between the two tables.

📘 Example: University Database

1. Students Table

StudentID (PK) Name Major

101 Alice CS

  1. Bob Math

StudentID is the primary key.

2. Enrollments Table

EnrollmentID (PK) StudentID (FK) CourseCode

1 101 CS101

2 102 MATH201

3 101 CS102

StudentID here is a foreign key that references the StudentID in the Students table.

This means each enrollment must be linked to a valid student.

3.8

9 of 108

Integrity Constraints

Integrity constraints are used to ensure the correctness of the data

  • Type constraint: specify the legal values of a type
  • Attribute constraint: declare that a specific attribute is of a specific type
  • Not-Null constraint: null value is not allowed for an attribute if so specified
  • Entity constraints: no primary key value can be null.
  • Key constraints: all tuples in a relation are distinct.
  • Referential integrity constraints: ensure that a value of a foreign key in relation R1 also appears in the corresponding candidate key of the referenced relation R2 (or null)

3.9

10 of 108

✅ 1. Type Constraint

Specifies the legal data type for an attribute.

Example:

CREATE TABLE Employee (EmpID INT, Name VARCHAR(100),

Salary DECIMAL(10, 2));

EmpID must be an integer.

Salary must be a decimal with up to 10 digits, 2 after the decimal point.

3.10

11 of 108

2. Attribute Constraint

Declares that an attribute must conform to a specific type or format.

CREATE TABLE Product ( ProductID INT, Price DECIMAL(8, 2)

CHECK (Price > 0));

Price must be a positive decimal.

3. Not-Null Constraint

Ensures that an attribute cannot be null.

Example:

CREATE TABLE Student (StudentID INT NOT NULL,

Name VARCHAR(50) NOT NULL );

Both StudentID and Name must have values.

3.11

12 of 108

4. Entity Integrity Constraint

Ensures that primary key values are never null.

Example:

CREATE TABLE Department (DeptID INT PRIMARY KEY,

DeptName VARCHAR(100));

DeptID must be unique and not null.

5. Key Constraint

Ensures that all tuples (rows) are distinct based on the primary key.

Example:

CREATE TABLE Course (CourseCode VARCHAR(10) PRIMARY KEY,

Title VARCHAR(100));

No two courses can have the same CourseCode.

3.12

13 of 108

6. Referential Integrity Constraint

Ensures that a foreign key value matches a primary key in another table or is null.

Example:

CREATE TABLE Enrollment

(

StudentID INT, CourseCode VARCHAR(10),

FOREIGN KEY (StudentID) REFERENCES Student(StudentID),

FOREIGN KEY (CourseCode) REFERENCES Course(CourseCode)

);

StudentID must exist in the Student table.

CourseCode must exist in the Course table.

3.13

14 of 108

1. INSERT Operation – Violation of Primary Key Constraint

Table: Students(StudentID, Name, Age)

Constraint: StudentID is a Primary Key (must be unique)

Existing Data:

StudentID | Name | Age

----------|--------|-----

101 | Anita | 20

Invalid Insert:

INSERT INTO Students (StudentID, Name, Age) VALUES (101, 'Ravi', 22);

Violation: Duplicate StudentID (101 already exists)

Error: "Duplicate entry for primary key 'StudentID'"

3.14

15 of 108

2. DELETE Operation – Violation of Referential Integrity

Tables:

Students(StudentID, Name)

Enrollments(StudentID, CourseID)

Constraint: Enrollments.StudentID is a Foreign Key referencing Students.StudentID

Existing Data:

Students:

StudentID | Name

----------|-----

101 | Anita

Enrollments:

StudentID | CourseID

----------|---------

101 | CSE101

3.15

16 of 108

Invalid Delete:

DELETE FROM Students WHERE StudentID = 101;

Violation: Enrollments still references StudentID = 101

Error: "Cannot delete or update a parent row: a foreign key constraint fails"

3. UPDATE Operation – Violation of Domain Constraint

Table: Students(StudentID, Name, Age)

Constraint: Age must be a positive integer

Invalid Update:

UPDATE Students SET Age = -5 WHERE StudentID = 101;

Violation: Age = -5 is outside the valid domain

Error: "Check constraint 'Age > 0' violated"

3.16

17 of 108

Database (1)

  • A database consists of relations and integrity constraints
  • Information about an enterprise is broken up into parts, with each relation storing one part of the information�� E.g.: account : stores information about accounts� depositor : stores information about which customer� owns which account � customer : stores information about customers
  • Relational schemas should be carefully designed
    • E.g. Storing all information as a single relation such as � bank(account-number, balance, customer-name, ..)�results in
      • repetition of information (e.g. two customers own an account)
      • the need for null values (e.g. represent a customer without an account)
  • Normalization theory (discuss later ) deals with how to design relational schemas
  • Basic operations of insert, delete, and update may violate constraints.

3.17

18 of 108

Database (2)

  • The schema (conceptual model) of a database consists of the schemas of the relations in the database
  • The instance of a database is the current values (instances) of the relations in the database
  • Any relation that is not of the conceptual model but is made visible to a user as a “virtual relation” is called a view. Any relation that is not of the conceptual model but is made visible to a user to reflect the status of the database at the moment when the relation is created is called a snapshot.

3.18

19 of 108

E-R Diagram for the Banking Enterprise

3.19

20 of 108

Schema Diagram for the Banking Enterprise

3.20

21 of 108

Relational Data Manipulation Languages

  • Abstract (Mathematical) languages
    1. Relational Algebra --- procedural
    2. Tuple Relational Calculus --- nonprocedural
    3. Domain Relational Calculus --- nonprocedural
  • The three languages are equivalent, I.e., they have same expressing power
  • Commercial languages
    • SQL --- Structured (Standard) Query Language; based on (a)(b)(c)
    • QBE --- Query By Example; based on (c)
    • QUEL --- Query Language; based on (b)
    • ISBL --- Information System Base Language; based on (a)
    • ….

3.21

22 of 108

Relational Algebra

  • Basically, the relational algebra is a set of operators that take relations as their operands and return a relation as their result.
  • Six basic operators
    • Select (called “Restrict” in the textbook)
    • project
    • union
    • set difference
    • Cartesian product
    • rename
  • The operators take two or more relations as inputs and give a new relation as a result.

3.22

23 of 108

Select Operation – Example

  • Relation r

A

B

C

D

α

α

β

β

α

β

β

β

1

5

12

23

7

7

3

10

  • σA=B ^ D > 5 (r)

A

B

C

D

α

β

α

β

1

23

7

10

3.23

24 of 108

Select Operation

  • Notation: σ p(r)
  • p is called the selection predicate
  • Defined as:

σp(r) = {t | tr and p(t)}

Where p is a formula in propositional calculus consisting of terms connected by : ∧ (and), ∨ (or), ¬ (not)�Each term is one of:

<attribute> op <attribute> or <constant>

where op is one of: =, ≠, >, ≥. <. ≤

  • Example of selection:� σ branch-name=“Perryridge(account)
  • Important properties.

3.24

25 of 108

Project Operation – Example

  • Relation r:

A

B

C

α

α

β

β

10

20

30

40

1

1

1

2

A

C

α

α

β

β

1

1

1

2

=

A

C

α

β

β

1

1

2

  • A,C (r)

3.25

26 of 108

Project Operation

  • Notation:�� ∏A1, A2, …, Ak (r)

where A1, A2 are attribute names and r is a relation name.

  • The result is defined as the relation of k columns obtained by erasing the columns that are not listed
  • Duplicate rows removed from result, since relations are sets
  • E.g. To eliminate the branch-name attribute of account� ∏account-number, balance (account)
  • Important properties.

3.26

27 of 108

Union Operation – Example

  • Relations r, s:
  • r ∪ s:

A

B

α

α

β

1

2

1

A

B

α

β

2

3

r

s

A

B

α

α

β

β

1

2

1

3

3.27

28 of 108

Union Operation

  • Notation: rs
  • Defined as:

rs = {t | tr or ts}

  • For rs to be valid.

1. r, s must have the same arity (same number of attributes)

2. The attribute domains must be compatible (e.g., 2nd column � of r deals with the same type of values as does the 2nd � column of s)

  • E.g. to find all customers with either an account or a loan� ∏customer-name (depositor) ∪ ∏customer-name (borrower)
  • Properties.

3.28

29 of 108

Set Difference Operation – Example

  • Relations r, s:

r – s:

A

B

α

α

β

1

2

1

A

B

α

β

2

3

r

s

A

B

α

β

1

1

3.29

30 of 108

Set Difference Operation

  • Notation r – s
  • Defined as:

r – s = {t | tr and t ∉ s}

  • Set differences must be taken between compatible relations.
    • r and s must have the same arity
    • attribute domains of r and s must be compatible
  • Property.

3.30

31 of 108

Cartesian-Product Operation-Example

Relations r, s:

r x s:

A

B

α

β

1

2

A

B

α

α

α

α

β

β

β

β

1

1

1

1

2

2

2

2

C

D

α

β

β

γ

α

β

β

γ

10

10

20

10

10

10

20

10

E

a

a

b

b

a

a

b

b

C

D

α

β

β

γ

10

10

20

10

E

a

a

b

b

r

s

3.31

32 of 108

Cartesian-Product Operation

  • Notation r x s
  • Defined as:

r x s = {t q | t r and q s}

  • Assume that attributes of r(R) and s(S) are disjoint. (That is, �R S = ).
  • If attributes of r(R) and s(S) are not disjoint, then renaming must be used.

3.32

33 of 108

Rename Operation

  • Allows us to name, and therefore to refer to, the results of relational-algebra expressions.
  • Allows us to refer to a relation by more than one name.

Example:

ρ x (E)

returns the expression E under the name X

If a relational-algebra expression E has arity n, then

ρx (A1, A2, …, An) (E)

returns the result of expression E under the name X, and with the

attributes renamed to A1, A2, …., An.

3.33

34 of 108

Composition of Operations

  • Can build expressions using multiple operations
  • Example: σA=C(r x s)
  • r x s

A

B

α

α

α

α

β

β

β

β

1

1

1

1

2

2

2

2

C

D

α

β

β

γ �α

β

β

γ

10

19

20

10

10

10

20

10

E

a

a

b

b

a

a

b

b

A

B

C

D

E

α

β

β

1

2

2

α

β

β

10

20

20

a

a

b

  • σA=C(r x s)

3.34

35 of 108

Expressions

  • A basic expression in the relational algebra consists of either one of the following:
    • A relation in the database
    • A constant relation
  • Let E1 and E2 be relational-algebra expressions; the following are all relational-algebra expressions:
    • E1E2
    • E1 - E2
    • E1 x E2
    • σp (E1), P is a predicate on attributes in E1
    • s(E1), S is a list consisting of some of the attributes in E1
    • ρ x (E1), x is the new name for the result of E1

3.35

36 of 108

Sample Relation: STUDENT

SID Name Age Dept Marks

S1 Alice 20 CSE 85

S2 Bob 21 ECE 78

S3 Charlie 22 CSE 90

S4 David 20 MECH 88

S5 Eva 21 ECE 92

Relational Algebra Operations and Queries

1. Selection (σ) — Select rows based on condition

Query: Students from CSE department

Relational Algebra: σDept=′CSE′(STUDENT)

Output:

SID

Name

Age

Dept

Marks

S1

Alice

20

CSE

85

S3

Charlie

22

CSE

90

3.36

37 of 108

2. Projection (π) — Select specific columns

Query: Names and Marks of all students

Relational Algebra: πName,Marks(STUDENT)

Output:

Name

Marks

Alice

85

Bob

78

Charlie

90

David

88

Eva

92

3.37

38 of 108

3. Union (∪) — Combine two relations

Let’s define another relation:

Relation: TOPPER

SID

Name

Age

Dept

Marks

S3

Charlie

22

CSE

90

S5

Eva

21

ECE

92

Output: (No duplicates)

SID Name Age Dept Marks

S1 Alice 20 CSE 85

S2 Bob 21 ECE 78

S3 Charlie 22 CSE 90

S4 David 20 MECH 88

S5 Eva 21 ECE 92

Query: All students and toppers

Relational Algebra: STUDENT∪TOPPER

Sample Relation: STUDENT

SID Name Age Dept Marks

S1 Alice 20 CSE 85

S2 Bob 21 ECE 78

S3 Charlie 22 CSE 90

S4 David 20 MECH 88

S5 Eva 21 ECE 92

3.38

39 of 108

4. Set Difference (−) — Students who are not toppers

Relational Algebra: STUDENT−TOPPER

Output:

SID

Name

Age

Dept

Marks

S1

Alice

20

CSE

85

S2

Bob

21

ECE

78

S4

David

20

MECH

88

Relation: TOPPER

SID

Name

Age

Dept

Marks

S3

Charlie

22

CSE

90

S5

Eva

21

ECE

92

Sample Relation: STUDENT

SID Name Age Dept Marks

S1 Alice 20 CSE 85

S2 Bob 21 ECE 78

S3 Charlie 22 CSE 90

S4 David 20 MECH 88

S5 Eva 21 ECE 92

3.39

40 of 108

5. Rename (ρ) — Rename relation or attributes

Relational Algebra: ρS(STUDENT)

Renames STUDENT to S

Dept

HOD

CSE

Dr. Rao

ECE

Dr. Meena

MECH

Dr. Kumar

6. Join (⨝) — Combine related tuples from two relations

Let’s define another relation:

Relation: DEPARTMENT

3.40

41 of 108

Query: Join STUDENT with DEPARTMENT

Relational Algebra:STUDENT⋈STUDENT.Dept=DEPARTMENT.Dept DEPARTMENT

Output:

SID

Name

Age

Dept

Marks

HOD

S1

Alice

20

CSE

85

Dr. Rao

S2

Bob

21

ECE

78

Dr. Meena

S3

Charlie

22

CSE

90

Dr. Rao

S4

David

20

MECH

88

Dr. Kumar

S5

Eva

21

ECE

92

Dr. Meena

Relation: STUDENT

SID Name Age Dept Marks

S1 Alice 20 CSE 85

S2 Bob 21 ECE 78

S3 Charlie 22 CSE 90

S4 David 20 MECH 88

S5 Eva 21 ECE 92

Dept

HOD

CSE

Dr. Rao

ECE

Dr. Meena

MECH

Dr. Kumar

Relation: DEPARTMENT

3.41

42 of 108

Natural-Join Operation

  • Notation: r s
  • Let r and s be relations on schemas R and S respectively.The result is a relation on schema R S which is obtained by considering each pair of tuples tr from r and ts from s.
  • If tr and ts have the same value on each of the attributes in RS, a tuple t is added to the result, where
    • t has the same value as tr on r
    • t has the same value as ts on s
  • Example:

R = (A, B, C, D)

S = (E, B, D)

  • Result schema = (A, B, C, D, E)
  • r s is defined as:

r.A, r.B, r.C, r.D, s.E (σr.B = s.B r.D = s.D (r x s))

3.42

43 of 108

Natural Join Operation – Example

  • Relations r, s:

A

B

α

β

γ

α

δ

1

2

4

1

2

C

D

α

γ

β

γ

β

a

a

b

a

b

B

1

3

1

2

3

D

a

a

a

b

b

E

α

β

γ

δ

r

A

B

α

α

α

α

δ

1

1

1

1

2

C

D

α

α

γ

γ

β

a

a

a

a

b

E

α

γ

α

γ

δ

s

r s

Theta join operation

3.43

44 of 108

Types of Join Operations

Theta Join (θ-join)�Combines tuples from two relations based on a condition involving comparison operators like =, <, >, etc.

Equi Join�A special case of theta join where the condition is equality (=).

Natural Join�Automatically joins tables based on all common attributes (same name and domain), eliminating duplicate columns.

Outer Joins (Left, Right, Full)�These are not part of basic relational algebra but are used in extended relational algebra to include unmatched tuples.

3.44

45 of 108

3.45

46 of 108

3.46

47 of 108

Theta Join in Relational Algebra

A theta join (denoted as ⨝θ) is a type of join operation in relational algebra where two relations are joined based on a general condition (θ), which can include any comparison operator like:

• =, ≠, <, >, ≤, ≥

________________________________________

🔹 Syntax

If R and S are two relations, and θ is a condition involving attributes from both:

R ⨝θ S

This returns a relation consisting of all combinations of tuples from R and S that satisfy the condition θ.

3.47

48 of 108

Equi Join in Relational Algebra

An Equi Join is a special case of the theta join where the condition is equality

( = ) between attributes of two relations. It’s used to combine tuples from two relations that have matching values in specified attributes.

🔹 Syntax

If R and S are two relations:

R ⨝_{R.A = S.B} S

This joins R and S where attribute A in R equals attribute B in S.

3.48

49 of 108

Example

Let’s consider two relations:

Employee

EmpID Name DeptID

1 Alice 10

2 Bob 20

3 Carol 30

Department

DeptID DeptName

10 HR

20 IT

40 Finance

🔹 Equi Join: Employee ⨝_{Employee.DeptID = Department.DeptID} Department

This will join the two tables where DeptID matches:

Result:

EmpID Name DeptID DeptName

1 Alice 10 HR

2 Bob 20 IT

Carol is excluded because her DeptID = 30 does not match any in the Department table.

3.49

50 of 108

Outer Join in Relational Algebra (Extended)

An outer join is an extension of the basic join operations in relational algebra. Unlike inner joins (like theta or equi joins), outer joins include unmatched tuples from one or both relations, filling in missing values with NULL.

There are three types:

Left Outer Join (⟕) – Includes all tuples from the left relation and matched tuples from the right.

Right Outer Join (⟖) – Includes all tuples from the right relation and matched tuples from the left.

Full Outer Join (⟗) – Includes all tuples from both relations, matched and unmatched.

3.50

51 of 108

Example

Let’s use the same relations:

Employee

EmpID Name DeptID

1 Alice 10

2 Bob 20

3 Carol 30

Department

DeptID DeptName

10 HR

20 IT

40 Finance

3.51

52 of 108

Left Outer Join: Employee ⟕ Department

Includes all employees, even if their DeptID doesn’t match:

EmpID Name DeptID DeptName

1 Alice 10 HR

2 Bob 20 IT

3 Carol 30 NULL

🔹 Right Outer Join: Employee ⟖ Department

Includes all departments, even if no employee belongs to them:

EmpID Name DeptID DeptName

1 Alice 10 HR

2 Bob 20 IT

NULL NULL 40 Finance

🔹 Full Outer Join: Employee ⟗ Department

Includes all employees and all departments:

EmpID Name DeptID DeptName

1 Alice 10 HR

2 Bob 20 IT

3 Carol 30 NULL

NULL NULL 40 Finance

3.52

53 of 108

Additional Operators

We define additional operators that do not add any power to the relational algebra, but that simplify common queries. That’s to say, the additional operators can be expressed by the six basic operators

  • Set intersection
  • Division
  • Assignment

3.53

54 of 108

Set-Intersection Operation

  • Notation: rs
  • Defined as:
  • rs ={ t | tr and ts }
  • Assume:
    • r, s have the same arity
    • attributes of r and s are compatible
  • Note: rs = r - (r - s)

3.54

55 of 108

Set-Intersection Operation - Example

  • Relation r, s:

A B

α

α

β

1

2

1

A B

α

β

2

3

r

s

A B

α 2

  • Relation r, s:

A B

α 2

  • r ∩ s

3.55

56 of 108

Division Operation

  • Suited to queries that include the phrase “for all”.
  • Let r and s be relations on schemas R and S respectively where
    • R = (A1, …, Am, B1, …, Bn)
    • S = (B1, …, Bn)

The result of r ÷ s is a relation on schema

RS = (A1, …, Am)

r ÷ s = { t | t R-S(r) ∧ ∀ u s ( tu r ) }

r ÷ s

3.56

57 of 108

Division Operation – Example

Relations r, s:

r ÷ s:

A

B

α

β

1

2

A

B

α

α

α

β

γ

δ

δ

δ

β

1

2

3

1

1

1

3

4

6

1

2

r

s

3.57

58 of 108

Another Division Example

A

B

α

α

α

β

β

γ

γ

γ

a

a

a

a

a

a

a

a

C

D

α

γ

γ

γ

γ

γ

γ

β

a

a

b

a

b

a

b

b

E

1

1

1

1

3

1

1

1

Relations r, s:

r ÷ s:

D

a

b

E

1

1

A

B

α

γ

a

a

C

γ

γ

r

s

3.58

59 of 108

Division Operation (Cont.)

  • Property
    • Let q = r ÷ s
    • Then q is the largest relation satisfying q x sr
  • Definition in terms of the basic algebra operation�Let r(R) and s(S) be relations, and let S R

r ÷ s = ∏R-S (r) –∏R-S ( (∏R-S (r) x s) – ∏R-S,S(r))�

To see why

    • R-S,S(r) simply reorders attributes of r�
    • R-S(∏R-S (r) x s) – ∏R-S,S(r)) gives those tuples t in �� ∏R-S (r) such that for some tuple u s, tu r.

3.59

60 of 108

In Relational Algebra, the division operation is used when you want to find tuples in one relation that are associated with all tuples in another relation. It’s particularly useful for queries involving phrases like “for all”.

Definition

If we have two relations:

R(A, B) — a relation with attributes A and B

S(B) — a relation with attribute B

Then the division of R by S, written as R ÷ S, returns a relation T(A) such that:

T contains all values of A from R such that for every B in S, the pair (A, B) is in R.

3.60

61 of 108

Example

Let’s say we have the following relations:

R(Student, Course)

Student Course

Alice DBMS

Alice OS

Bob DBMS

Bob OS

Bob Networks

Charlie DBMS

Charlie OS

S(Course)

Course

DBMS

OS

R ÷ S gives:

We want to find all students who have taken all the courses listed in S

(i.e., DBMS and OS).

"Find students who have enrolled in all mandatory courses."

"Find suppliers who supply all parts required."

3.61

62 of 108

Result:

Student

Alice

Bob

Charlie

All three students have taken both DBMS and OS.

If S had included "Networks", then only Bob would be in the result.

3.62

63 of 108

Assignment Operation

  • The assignment operation (←) provides a convenient way to express complex queries, write query as a sequential program consisting of a series of assignments followed by an expression whose value is displayed as a result of the query.
  • Assignments can also be used to update the database.
  • Example: Write r ÷ s as

temp1 ← ∏R-S (r) � temp2 ← ∏R-S ((temp1 x s) – ∏R-S,S (r))� result = temp1 – temp2

    • The result to the right of the ← is assigned to the relation variable on the left of the ←.
    • May use variable in subsequent expressions.
  • Example: Delete all account records in the Perryridge branch.

account account σ branch-name = “Perryridge” (account)

3.63

64 of 108

Customer_ID

101

Alice

102

Bob

103

Charlie

Account_ID

Customer_ID

Balance

A1

101

5000

A2

102

3000

A3

104

7000

Loan_ID

Customer_ID

Amount

L1

101

10000

L2

103

15000

Sample Relations

CUSTOMER

ACCOUNT

LOAN

3.64

65 of 108

Customer_ID

101

1. Set Intersection (∩)

Find customers who have both an account and a loan.

Result:

π Customer_ID (ACCOUNT) ∩ π Customer_ID (LOAN)

2. Natural Join (⨝)

Join CUSTOMER and ACCOUNT on Customer_ID.

CUSTOMER ⨝ ACCOUNT

Customer_ID

Name

Account_ID

Balance

101

Alice

A1

5000

102

Bob

A2

3000

Result:

3.65

66 of 108

3. Division (÷)

Suppose we have:

ENROLLS

Student

Course

S1

C1

S1

C2

S2

C1

S2

C2

S3

C1

REQUIRED_COURSES

Course

C1

C2

To find students who have taken all required courses:

ENROLLS ÷ REQUIRED_COURSES

Result:

Student

S1

S2

3.66

67 of 108

4. Assignment (←)

Store the result of a join in a temporary relation:

TEMP ← CUSTOMER ⨝ ACCOUNT

Now you can use TEMP in further operations like:

π Name (TEMP)

Result:

Name

Alice

Bob

3.67

68 of 108

Extended Operators

  • Generalized Projection
  • Aggregate Functions

3.68

69 of 108

Generalized Projection

  • Extends the projection operation by allowing arithmetic functions to be used in the projection list.�� ∏ F1, F2, …, Fn(E)
  • E is any relational-algebra expression
  • Each of F1, F2, …, Fn are are arithmetic expressions involving constants and attributes in the schema of E.
  • Given relation credit-info(customer-name, limit, credit-balance), find how much more each person can spend:

customer-name, limit – credit-balance (credit-info)

3.69

70 of 108

Aggregate Functions and Operations

  • Aggregation function takes a collection of values and returns a single value as a result.

avg: average value� min: minimum value� max: maximum value� sum: sum of values� count: number of values

  • Aggregate operation in relational algebra

G1, G2, …, Gn g F1( A1), F2( A2),…, Fn( An) (E)

    • E is any relational-algebra expression
    • G1, G2 …, Gn is a list of attributes on which to group (can be empty)
    • Each Fi is an aggregate function
    • Each Ai is an attribute name

3.70

71 of 108

Aggregate Operation – Example

  • Relation r:

A

B

α

α

β

β

α

β

β

β

C

7

7

3

10

g sum(c) (r)

sum-C

27

3.71

72 of 108

Aggregate Operation – Example

  • Relation account grouped by branch-name:

branch-name g sum(balance) (account)

branch-name

account-number

balance

Perryridge

Perryridge

Brighton

Brighton

Redwood

A-102

A-201

A-217

A-215

A-222

400

900

750

750

700

branch-name

balance

Perryridge

Brighton

Redwood

1300

1500

700

3.72

73 of 108

Aggregate Functions (Cont.)

  • Result of aggregation does not have a name
    • Can use rename operation to give it a name
    • For convenience, we permit renaming as part of aggregate operation�

branch-name g sum(balance) as sum-balance (account)

3.73

74 of 108

Null Values

  • It is possible for tuples to have a null value, denoted by null, for some of their attributes
  • null signifies an unknown value or that a value does not exist.
  • The result of any arithmetic expression involving null is null.
  • Comparisons with null values return false.
  • Aggregate functions simply ignore null values
    • Is an arbitrary decision. Could have returned null as result instead.
    • We follow the semantics of SQL in its handling of null values
  • For duplicate elimination and grouping, null is treated like any other value, and two nulls are assumed to be the same
    • Alternative: assume each null is different from each other
    • Both are arbitrary decisions, so we simply follow SQL

3.74

75 of 108

Tuple Relational Calculus

  • A nonprocedural query language, where each query is of the form

{t | P (t) }

  • It is the set of all tuples t such that predicate P is true for t
  • t is a tuple variable, t[A] denotes the value of tuple t on attribute A
  • tr denotes that tuple t is in relation r
  • P is a formula similar to that of the predicate calculus

3.75

76 of 108

Predicate Calculus Formula

1. Set of attributes and constants

2. Set of comparison operators: (e.g., <, ≤, =, ≠, >, ≥)

3. Set of connectives: and (∧), or (v)‚ not (¬)

4. Implication (⇒): x ⇒ y, if x if true, then y is true

xy ≡ ¬x v y

5. Set of quantifiers:

    • t r (Q(t)) ”there exists” a tuple in t in relation r� such that predicate Q(t) is true
    • t r (Q(t)) ≡ Q is true “for all” tuples t in relation r

3.76

77 of 108

Sample Schema

Let’s assume the following relations:

  • Student(SID, Name, Age, Marks)
  • Course(CID, Title, Credits)
  • Enrolled(SID, CID)

 TRC Query Examples

1. Get names of students who scored more than 80 marks

TRC Expression:

{ t.Name ∣ t∈Student ∧ t.Marks>80 }

2. Find students enrolled in course with CID = 'C101‘

TRC Expression:

{s.Name ∣ s∈Student ∧ ∃e( e∈Enrolled ∧ e.SID=s.SID ∧ e.CID=′C101′) }

3. Get names and ages of students younger than 20

TRC Expression:

{ t ∣ t∈Student ∧ t.Age<20}

3.77

78 of 108

4. Find students who are enrolled in at least one course

TRC Expression:

{ s.Name∣ s∈Student ∧ ∃e(e∈Enrolled ∧ e.SID=s.SID) }

Sample Schema

Let’s assume the following relations:

  • Student(SID, Name, Age, Marks)
  • Course(CID, Title, Credits)
  • Enrolled(SID, CID)

5. Get names of students who are not enrolled in any course

TRC Expression:

{s.Name∣ s∈Student ∧ ¬∃e(e∈Enrolled ∧ e.SID=s.SID)}

3.78

79 of 108

Domain Relational Calculus

  • A nonprocedural query language equivalent in power to the tuple relational calculus
  • Each query is an expression of the form:

{ < x1, x2, …, xn > | P(x1, x2, …, xn)}

    • x1, x2, …, xn represent domain variables
    • P represents a formula similar to that of the predicate calculus

3.79

80 of 108

Sample Schema

  • Student(SID, Name, Age, Marks)
  • Course(CID, Title, Credits)
  • Enrolled(SID, CID)

Domain Relational Calculus Examples

  1. Get names of students who scored more than 80 marks

DRC Expression:

{⟨name⟩∣ ∃sid,age,marks(Student(sid,name,age,marks) ∧ marks>80)}

2. Find names of students enrolled in course with CID = 'C101‘

DRC Expression:

{⟨name⟩∣ ∃sid,age,marks(Student(sid,name,age,marks) ∧ ∃cid(Enrolled(sid,cid) ∧ cid=′C101′))}

3.80

81 of 108

3. Get names and ages of students younger than 20

DRC Expression:

{⟨name,age⟩ ∣ ∃sid,marks(Student(sid,name,age,marks) ∧ age<20)}

4. Find names of students not enrolled in any course

DRC Expression:

{ ⟨name⟩∣ ∃sid,age,marks(Student(sid,name,age,marks) ∧ ¬∃cid(Enrolled(sid,cid)))}

5. Get names of students enrolled in at least one course

DRC Expression:

{⟨name⟩∣ ∃sid,age,marks(Student(sid,name,age,marks) ∧ ∃cid(Enrolled(sid,cid)))}

3.81

82 of 108

Example Queries

  • Find the branch-name, loan-number, and amount for loans of over $1200

{< l, b, a > | < l, b, a > ∈ loana > 1200}

  • Find the names of all customers who have a loan of over $1200�

{< c > | ∃ l, b, a (< c, l > ∈ borrower ∧ < l, b, a > ∈ loana > 1200)}

  • Find the names of all customers who have a loan from the Perryridge branch and the loan amount:

{< c, a > | ∃ l (< c, l > ∈ borrower ∧ ∃b(< l, b, a > ∈ loan

b = “Perryridge”))}

or {< c, a > | ∃ l (< c, l > ∈ borrower ∧ < l, “Perryridge”, a > ∈ loan)}

3.82

83 of 108

Example Queries

  • Find the names of all customers having a loan, an account, or both at the Perryridge branch:

{< c > | ∃ l ({< c, l > ∈ borrower � b,a(< l, b, a > ∈ loanb = “Perryridge”))� ∨ ∃ a(< c, a > ∈ depositor� ∧ ∃ b,n(< a, b, n > ∈ accountb = “Perryridge”))}

  • Find the names of all customers who have an account at all branches located in Brooklyn:

{< c > | ∃ n (< c, s, n > ∈ customer) ∧

x,y,z(< x, y, z > ∈ branchy = “Brooklyn”) ⇒� ∃ a,b(< x, y, z > ∈ account ∧ < c,a > ∈ depositor)}

3.83

84 of 108

Banking Example

  • branch (branch-name, branch-city, assets)
  • customer (customer-name, customer-street, customer-city)
  • account (account-number, branch-name, balance)
  • loan (loan-number, branch-name, amount)
  • depositor (customer-name, account-number)
  • borrower (customer-name, loan-number)

3.84

85 of 108

Example Queries

  • Find the loan-number, branch-name, and amount for loans of over $1200

{t | tloant [amount] > 1200}

or {t | loan (t) t .amount > 1200}

  • Find the loan number for each loan of an amount greater than $1200

{t | ∃ s ∈ loan (t[loan-number] = s[loan-number]� ∧ s [amount] > 1200}

or {t.loan-number | t ∈ loan ∧ t [amount] > 1200}

Notice that a relation on schema [loan-number] is implicitly defined by the query

3.85

86 of 108

Example Queries

  • Find the names of all customers having a loan, an account, or both at the bank

{t | s borrower(t[customer-name] = s[customer-name])� ∨ ∃u depositor(t[customer-name] = u[customer-name])�

  • Find the names of all customers who have a loan and an account at the bank

{t | s borrower(t[customer-name] = s[customer-name])� ∧ ∃u depositor(t[customer-name] = u[customer-name])

3.86

87 of 108

Example Queries

  • Find the names of all customers having a loan at the Perryridge branch

{t | s borrower(t[customer-name] = s[customer-name] � ∧ ∃u loan(u[branch-name] = “Perryridge”� ∧ u[loan-number] = s[loan-number]))}�

  • Find the names of all customers who have a loan at the Perryridge branch, but no account at any branch of the bank

{t | s borrower(t[customer-name] = s[customer-name]� ∧ ∃u loan(u[branch-name] = “Perryridge”� ∧ u[loan-number] = s[loan-number]))� ∧ notv depositor (v[customer-name] = � t[customer-name]) }

3.87

88 of 108

Example Queries

  • Find the names of all customers having a loan from the Perryridge branch, and the cities they live in

{t | s loan(s[branch-name] = “Perryridge”� ∧ ∃u borrower (u[loan-number] = s[loan-number]� ∧ t [customer-name] = u[customer-name])� ∧ ∃ v customer (u[customer-name] = v[customer-name]� ∧ t[customer-city] = v[customer-city])))}

3.88

89 of 108

Example Queries

  • Find the names of all customers who have an account at all branches located in Brooklyn:

{t | ∃ c ∈ customer (t[customer.name] = c[customer-name]) ∧

s branch(s[branch-city] = “Brooklyn” ⇒ � ∃ u account ( s[branch-name] = u[branch-name]� ∧ ∃ m depositor ( t[customer-name] = m[customer-name]� ∧ m[account-number] = u[account-number] )) )}

3.89

90 of 108

Universal and Existential Quantifiers

  • (∀x) ((P(x)) <==> not (∃x)(not (P(x)))
  • (∃ x) ((P(x)) <==> not (∀ x)(not (P(x)))
  • (∀x) ((P(x) and Q(x)) <==> not (∃x)(not (P(x)) or not (Q(x)))
  • (∀x) ((P(x) or Q(x)) <==> not (∃x)(not (P(x)) and not (Q(x)))
  • (∃ x) ((P(x) or Q(x)) <==> not (∀ x)(not (P(x)) and not (Q(x)))
  • (∃ x) ((P(x) and Q(x)) <==> not (∀ x)(not (P(x)) or not (Q(x)))

  • (∀x) ((P(x)) ==> (∃x) (P(x))
  • Not (∃ x) ((P(x)) ==> not (∀ x)(P(x))

3.90

91 of 108

Examples

  • Find the names of employees who have no dependents.

{e.name | e ∈ employee and (not (∃d (d ∈ dependent and

e.ssn = d.essn))}

or

{e.name | e ∈ employee and ((∀ d (not (d ∈ dependent) or

not (e.ssn = d.essn)))}

(notice that:

∀ d (d ∈ dependent) ==> not (e.ssn = d.essn) )

3.91

92 of 108

Banking Example

branch (branch-name, branch-city, assets)�

customer (customer-name, customer-street, customer-only)

account (account-number, branch-name, balance)

loan (loan-number, branch-name, amount)

depositor (customer-name, account-number)

borrower (customer-name, loan-number)

3.92

93 of 108

Example Queries

  • Find all loans of over $1200

σamount > 1200 (loan)

  • Find the loan number for each loan of an amount greater than $1200

loan-numberamount > 1200 (loan))

3.93

94 of 108

Example Queries

  • Find the names of all customers who have a loan, an account, or both, from the bank

customer-name (borrower) ∪ ∏customer-name (depositor)

  • Find the names of all customers who have a loan and an account at bank.

customer-name (borrower) ∩ ∏customer-name (depositor)

3.94

95 of 108

Example Queries

  • Find the names of all customers who have a loan at the Perryridge branch.

customer-name (σbranch-name=“Perryridge

(σborrower.loan-number = loan.loan-number(borrower x loan)))

  • Find the names of all customers who have a loan at the Perryridge branch but do not have an account at any branch of the bank.

customer-name (σbranch-name = “Perryridge”

(σborrower.loan-number = loan.loan-number(borrower x loan)))�� – ∏customer-name(depositor)

3.95

96 of 108

Example Queries

  • Find the names of all customers who have a loan at the Perryridge branch.
    • Query 1� customer-namebranch-name = “Perryridge”

borrower.loan-number = loan.loan-number(borrower x loan)))

− Query 2

customer-nameloan.loan-number = borrower.loan-number(� (σbranch-name = “Perryridge”(loan)) x� borrower)� )

3.96

97 of 108

Example Queries

Find the largest account balance

  • Rename account relation as d
  • The query is:

balance(account) - ∏account.balance

account.balance < d.balance (account x ρd (account)))

3.97

98 of 108

Example Queries

  • Find all customers who have an account from at least the “Downtown” and the Uptown” branches.
    • Query 1

CN(σBN=“Downtown(depositor account)) ∩

CN(σBN=“Uptown(depositor account))

where CN denotes customer-name and BN denotes �branch-name.

    • Query 2

customer-name, branch-name (depositor account)� ÷ ρtemp(branch-name) ({(“Downtown”), (“Uptown”)})

3.98

99 of 108

Example Queries

� ∏customer-name, branch-name (depositor account)� ÷ ∏branch-name branch-city = “Brooklyn” (branch))

  • Find all customers who have an account at all branches located in Brooklyn city.

3.99

100 of 108

Outer Join

  • An extension of the join operation that avoids loss of information.
  • Computes the join and then adds tuples form one relation that does not match tuples in the other relation to the result of the join.
  • Uses null values:
    • null signifies that the value is unknown or does not exist
    • All comparisons involving null are (roughly speaking) false by definition.
      • Will study precise meaning of comparisons with nulls later

3.100

101 of 108

Outer Join – Example

  • Relation loan

loan-number

amount

L-170

L-230

L-260

3000

4000

1700

  • Relation borrower

customer-name

loan-number

Jones

Smith

Hayes

L-170

L-230

L-155

branch-name

Downtown

Redwood

Perryridge

3.101

102 of 108

Outer Join – Example

  • Inner Join�loan Borrower

loan borrower

  • Left Outer Join

loan-number

amount

L-170

L-230

3000

4000

customer-name

Jones

Smith

branch-name

Downtown

Redwood

loan-number

amount

L-170

L-230

L-260

3000

4000

1700

customer-name

Jones

Smith

null

branch-name

Downtown

Redwood

Perryridge

3.102

103 of 108

Outer Join – Example

  • Right Outer Join

loan borrower

loan-number

amount

L-170

L-230

L-155

3000

4000

null

customer-name

Jones

Smith

Hayes

loan-number

amount

L-170

L-230

L-260

L-155

3000

4000

1700

null

customer-name

Jones

Smith

null

Hayes

loan borrower

  • Full Outer Join

branch-name

Downtown

Redwood

null

branch-name

Downtown

Redwood

Perryridge

null

3.103

104 of 108

View Definition

  • A view is defined using the create view statement which has the form

create view v as <query expression>

where <query expression> is any legal relational algebra query expression. The view name is represented by v.

  • Once a view is defined, the view name can be used to refer to the virtual relation that the view generates.
  • View definition is not the same as creating a new relation by evaluating the query expression Rather, a view definition causes the saving of an expression to be substituted into queries using the view.

3.104

105 of 108

Views

  • In some cases, it is not desirable for all users to see the entire logical model (i.e., all the actual relations stored in the database.)
  • Consider a person who needs to know a customer’s loan number but has no need to see the loan amount. This person should see a relation described, in the relational algebra, by

customer-name, loan-number (borrower loan)

  • Any relation that is not of the conceptual model but is made visible to a user as a “virtual relation” is called a view.

3.105

106 of 108

View Examples

  • Consider the view (named all-customer) consisting of branches and their customers.

create view all-customer as

branch-name, customer-name (depositor account)

∪ ∏branch-name, customer-name (borrower loan)

  • We can find all customers of the Perryridge branch by writing:

branch-name

branch-name = “Perryridge” (all-customer))

3.106

107 of 108

Views Defined Using Other Views

  • One view may be used in the expression defining another view
  • A view relation v1 is said to depend directly on a view relation v2 if v2 is used in the expression defining v1
  • A view relation v1 is said to depend on view relation v2 if either v1 depends directly to v2 or there is a path of dependencies from v1 to v2
  • A view relation v is said to be recursive if it depends on itself.

3.107

108 of 108

Snapshots

  • Any relation that is not of the conceptual model but is made visible to a user to reflect the status of the database at the moment when the relation is created is called a snapshot.
  • A view is like a “mirror” always reflecting the current status of the database; it changes when the database is updated. A snapshot reflects just the status when the snapshot is created; once it’s created, it doesn’t change.
  • A snapshot is defined using the create snapshot statement which has the form

create snapshot v as <query expression>

3.108