CS 3200 - Fall 2024
The Relational Model of Data
Mark Fontenot, PhD
Northeastern University
These notes are based upon and adapted via the generosity of Prof. Nate Derbinsky.
Reminders
2
Applications and Data
3
From Kleppmann, Designing Data Intensive Applications, O’Reilly, 2017.
Stores of data?
4
What is a database?
5
Notion
6
Database Management Systems (DBMS)
7
Some Categories of DBMSs
Document-Oriented Databases
- Organizes and queries data based on the concept of a “document” often in JSON
- usually considered semi-structured
Graph Databases
- Organizes data by nodes, edges, labels
- Query about paths between nodes and node relationships
8
Some Categories of DBMSs
Key-Value Databases
- Everything is a key/value pair
- Based on associative array
Spatial Databases
- Stores data related to 2D/3D locations
- Query example: are 2 cars about to collide?
9
Some Categories of DBMSs
Vector Databases
10
The category for this course…
11
Relational Model of Data:
Overview
12
The Relational Database: Relation/Table
13
Image borrowed from this Medium post.
Customers
Relation/Table Name Attributes/Columns
Rows/
Tuples
Relation - the core construct in a relational database.
Relation Schema - represents the attributes and data types for a particular relation
Relation Instance - represents the state of the data in the relation at a particular point in time.
Row/Tuple - values for each relation’s attribute for one element of a relation instance
The Relational Database: Constraints
14
Image borrowed from this Medium post.
cannot be
null
Must be a valid dorm id in the dorm relation
Example
Constraints:
Constraint - conditions that must hold on all valid relation instances
Types:
ID | Name | Phone | Dorm | Age | GPA |
1123141 | Mark | 555-1234 | 1 | 19 | 3.21 |
2323411 | Kim | 555-9876 | 2 | 25 | 3.53 |
17642352 | Sam | 555-6758 | 1 | 19 | 3.25 |
Student
The Relational Database: Relationships
15
Name | ID | Phone | Dorm | Age | GPA |
Mark | 1123141 | 555-1234 | 1 | 19 | 3.21 |
Kim | 2323411 | 555-9876 | 2 | 25 | 3.53 |
Sam | 17642352 | 555-6758 | 1 | 19 | 3.25 |
Dorm | Name |
1 | 555 Huntington |
2 | Baker |
ID | Class |
1123141 | COMP355 |
2323411 | COMP355 |
17642352 | MATH650 |
1123141 | MATH650 |
2323411 | BIOL110 |
Student
Dorm
Class
Some values in one table are related (by design) to values in another table
Referential Integrity Constraint Examples
The Relational Database: Queries
16
Name | ID | Phone | Dorm | Age | GPA |
Mark | 1123141 | 555-1234 | 1 | 19 | 3.21 |
Kim | 2323411 | 555-9876 | 2 | 25 | 3.53 |
Sam | 17642352 | 555-6758 | 1 | 19 | 3.25 |
Dorm | Name |
1 | 555 Huntington |
2 | Baker |
ID | Class |
1123141 | COMP355 |
2323411 | COMP355 |
17642352 | MATH650 |
1123141 | MATH650 |
2323411 | BIOL110 |
Student
Dorm
Class
Questions (Queries):
“Provide a list of student names and IDs with GPA between 3.0 and 3.5.”
The Relational Database: Queries
17
Name | ID | Phone | Dorm | Age | GPA |
Mark | 1123141 | 555-1234 | 1 | 19 | 3.21 |
Kim | 2323411 | 555-9876 | 2 | 25 | 3.53 |
Sam | 17642352 | 555-6758 | 1 | 19 | 3.25 |
Dorm | Name |
1 | 555 Huntington |
2 | Baker |
ID | Class |
1123141 | COMP355 |
2323411 | COMP355 |
17642352 | MATH650 |
1123141 | MATH650 |
2323411 | BIOL110 |
Student
Dorm
Class
Questions (Queries):
“Which students live in Baker Dorm?”
Databases in the Context of a Software System
18
Database
Tier
Application
Tier
Client
Tier
The Relational Model of Data:�Digging In
19
History 101
Codd, Edgar F. "A relational model of data for large shared data banks." Communications of the ACM 13.6 (1970): 377-387.
20
“Future users of large data banks must be protected from having to know how the data is organized in the machine (the internal representation)…”
History 101
21
A Little Set Theory Review…
�
22
A set is a collection of unique objects. The things (objects) inside a set are called elements.
cardinality → the number of unique elements in the set.
What are some examples of sets?
?
A Little Set Theory Review…
23
A = {1, 3, 5} B = {3, 4, 5, 6}
A Little Set Theory Review…
A = {123, 435} B = {Chul, Kev, Sal}
24
{ (123, Chul), (123, Kev), (123, Sal), (435, Chul), (435, Kev), (435, Sal) }
What’s a Relational Database?
25
Relations
26
More formally…
27
Relations - Example
28
Is the following a valid instance of TA?
{
(‘Jared’, 001928374, ‘CS’)� (‘Sakshi’, 001122334, ‘DS’)
}
?
Relations - Example
29
Is the following a valid instance of TA?
{
(‘Sakshi’, 001928374, ‘CS’)� (‘Sakshi’, 001122334, ‘CY’)
}
?
Relations - Example
30
Is the following a valid instance of TA?
{
(‘Sakshi’, 001928374, ‘CS’)� (‘Dylan’, 001122334, ‘DS’)
}
?
Relations - Example
31
Is the following a valid instance of TA?
{
(‘Sakshi’, 001928374, ‘CS’)� (‘Jared’, 001122334)
}
?
Relation Instance
32
Name | ID | Phone | Dorm | Age | GPA |
Mark | 1123141 | 555-1234 | 1 | 19 | 3.21 |
Kim | 2323411 | 555-9876 | 2 | 25 | 3.53 |
Sam | 17642352 | 555-6758 | 1 | 19 | 3.25 |
Student
1 Tuple
Tuples: Theory vs. Practice
33
Null Value
34
Value of an Attr in a Tuple
35
Name | ID | Address | Phone | Dorm | Age | GPA |
Mark | 1123141 | 121 Anystreet�Boston MA 02212 | 555-1234 555-1876 | 1 | 19 | 3.21 |
Kim | 2323411 | 235 Huntington�Boston MA 02215 | 555-9876 | 2 | 25 | 3.53 |
Student
Super and Candidate Keys
36
Keys - Superkey
37
Some Possible Superkeys:
(customerNumber, customerName)
(customerNumber, salesRepEmployeeNumber)
(customerNumber)
Not Minimal
Minimal
Customers
Keys - Candidate Keys
38
Some Possible Superkeys:
(customerNumber, customerName)
(customerNumber, salesRepEmployeeNumber)
(customerNumber)
Not Candidate Keys
Candidate Key
Customers
The Primary Key
39
Keys - Primary Key
40
Some Possible Superkeys:
(customerNumber, customerName)
(customerNumber, salesRepEmployeeNumber)
(customerNumber)
Chosen as Primary Key
If there are 2+ Candidate Keys, DB Designer will choose one as the Primary key
Customers
Foreign Keys
41
Foreign Key Example
42
Students
cID | sID | Grade |
332 | 234 | B |
332 | 332 | A |
535 | 336 | C |
221 | 134 | A |
535 | 332 | C |
221 | 332 | A |
221 | 336 | B |
Grades
sID | Name | Year |
134 | Chul | 1 |
234 | Kev | 3 |
332 | Sam | 2 |
336 | Ashwin | 2 |
Parent Relation Child Relation
(the one being referenced) (the one referencing another relation)
Primary Keys Foreign Key
Note: in this example, Grades.sID cannot contain null values because it is part of the PK of Grades Relation.
FKs - but why???
43
Customers
Notice: empLast, empFirst, and empEmail attributes contain duplicated/repeated data. Opens up the possibility of insert or update anomalies.
So, put them in a separate table along with empNum and refer back to the new table (becomes the parent table)
Relational Algebra
44
Relational Algebra
45
Quick Aside: Predicate Functions
Examples:
46
empID | firstName |
333 | Bob |
143 | Sam |
Employees
The First 6 Basic Operations of RA
47
Relational Algebra: Select Operator
Notation:�
48
Relational Algebra: Select Operator
49
Dept | Class | Enrollment |
CS | 3200 | 40 |
CS | 2500 | 643 |
CS | 1800 | 680 |
DS | 2000 | 412 |
Enrollments
Dept | Class | Enrollment |
CS | 2500 | 643 |
CS | 1800 | 680 |
Enrollments
Result:
Relational Algebra: Select Operator
50
Dept | Class | Taught_by |
CS | 3200 | CS |
CS | 2500 | CS |
CS | 1800 | Math |
DS | 2000 | Math |
Enrollments
Dept | Class | Taught_by |
CS | 3200 | CS |
Enrollments
Result:
Relational Algebra: Project Operator
Project - returns a relation with a subset of attributes (A1… Ak) from R.
�Notation:
Duplicate tuples will be removed from the resulting relation (because relations are sets).
51
Relational Algebra: Project Operator
52
Dept | Class | Enrollment |
CS | 3200 | 40 |
CS | 2500 | 643 |
CS | 1800 | 680 |
DS | 2000 | 412 |
Enrollments
Dept | Class |
CS | 3200 |
CS | 2500 |
CS | 1800 |
DS | 2000 |
Enrollments
Dept |
CS |
DS |
Enrollments
Result:
Result:
Relational Algebra: Cartesian Product
Same operation we know from set theory.
53
Attr_1 | Attr_2 |
123 | abc |
456 | def |
R
Attr_1 | Attr_3 |
123 | CS |
789 | DS |
111 | Cyber |
S
R.Attr_1 | R.Attr_2 | S.Attr_1 | S.Attr_3 |
123 | abc | 123 | CS |
123 | abc | 789 | DS |
123 | abc | 111 | Cyber |
456 | def | 123 | CS |
456 | def | 789 | DS |
456 | def | 111 | Cyber |
Union, Intersection & Difference
54
More Complex RA Expressions
55
Relational Algebra: Temporary Relation Names
For more complex RA queries, you can have:
Example:
TEMP_NAME can then be used as a relation in subsequent steps of the same query.
56
Relational Algebra: Rename Operator
Rename Operator (rho) – Allows us to “rename” a relation, the attributes of a relation, or both.
57
Developing Relational Algebra Expressions
58
Writing RA Queries
59
Writing RA Queries
60
Stu_ID | Name | Year |
134 | Chul | 1 |
234 | Kev | 3 |
332 | Sam | 2 |
336 | Ashwin | 2 |
Cou_ID | Dept | Number | Prof |
332 | CS | 3345 | Lawrimore |
221 | CS | 2341 | Fontenot |
535 | MATH | 2339 | Norris |
Students
Courses
C_ID | S_ID | Grade |
332 | 234 | B |
332 | 332 | A |
535 | 336 | C |
221 | 134 | A |
535 | 332 | C |
221 | 332 | A |
221 | 336 | B |
Grades
Next few examples use this database schema.
Writing RA Queries
61
Stu_ID | Name | Year |
134 | Chul | 1 |
234 | Kev | 3 |
332 | Sam | 2 |
336 | Ashwin | 2 |
Cou_ID | Dept | Number | Prof |
332 | CS | 3345 | Lawrimore |
221 | CS | 2341 | Fontenot |
535 | MATH | 2339 | Norris |
Students
Courses
C_ID | S_ID | Grade |
332 | 234 | B |
332 | 332 | A |
535 | 336 | C |
221 | 134 | A |
535 | 332 | C |
221 | 332 | A |
221 | 336 | B |
Grades
Write a RA query that returns the names of all students.
Writing RA Queries
62
Stu_ID | Name | Year |
134 | Chul | 1 |
234 | Kev | 3 |
332 | Sam | 2 |
336 | Ashwin | 2 |
Cou_ID | Dept | Number | Prof |
332 | CS | 3345 | Lawrimore |
221 | CS | 2341 | Fontenot |
535 | MATH | 2339 | Norris |
Students
Courses
C_ID | S_ID | Grade |
332 | 234 | B |
332 | 332 | A |
535 | 336 | C |
221 | 134 | A |
535 | 332 | C |
221 | 332 | A |
221 | 336 | B |
Grades
Provide a list of all course numbers taught by the CS Department.
Writing RA Queries
63
Stu_ID | Name | Year |
134 | Chul | 1 |
234 | Kev | 3 |
332 | Sam | 2 |
336 | Ashwin | 2 |
Cou_ID | Dept | Number | Prof |
332 | CS | 3345 | Lawrimore |
221 | CS | 2341 | Fontenot |
535 | MATH | 2339 | Norris |
Students
Courses
C_ID | S_ID | Grade |
332 | 234 | B |
332 | 332 | A |
535 | 336 | C |
221 | 134 | A |
535 | 332 | C |
221 | 332 | A |
221 | 336 | B |
Grades
List all 2nd year student names.
Writing RA Queries
64
Stu_ID | Name | Year |
134 | Chul | 1 |
234 | Kev | 3 |
332 | Sam | 2 |
336 | Ashwin | 2 |
Cou_ID | Dept | Number | Prof |
332 | CS | 3345 | Lawrimore |
221 | CS | 2341 | Fontenot |
535 | MATH | 2339 | Norris |
Students
Courses
C_ID | S_ID | Grade |
332 | 234 | B |
332 | 332 | A |
535 | 336 | C |
221 | 134 | A |
535 | 332 | C |
221 | 332 | A |
221 | 336 | B |
Grades
What course or courses (dept and number) are taught by Professor Norris?
Writing RA Queries
65
Stu_ID | Name | Year |
134 | Chul | 1 |
234 | Kev | 3 |
332 | Sam | 2 |
336 | Ashwin | 2 |
Cou_ID | Dept | Number | Prof |
332 | CS | 3345 | Lawrimore |
221 | CS | 2341 | Fontenot |
535 | MATH | 2339 | Norris |
Students
Courses
C_ID | S_ID | Grade |
332 | 234 | B |
332 | 332 | A |
535 | 336 | C |
221 | 134 | A |
535 | 332 | C |
221 | 332 | A |
221 | 336 | B |
Grades
List the letter grades that Sam has earned (don’t need to include course info).
Joining Data from Two Relations
66
Relational Algebra: The Join Operator
67
Natural Join
where attr(R) returns a set containing the attributes of R
68
Example
69
Students
cID | sID | Grade |
332 | 234 | B |
332 | 332 | A |
535 | 336 | C |
221 | 134 | A |
535 | 332 | C |
221 | 332 | A |
221 | 336 | B |
Grades
sID | Name | Year |
134 | Chul | 1 |
234 | Kev | 3 |
332 | Sam | 2 |
336 | Ashwin | 2 |
Query:
cID | sID | Grade | Name | Year |
332 | 234 | B | Kev | 3 |
332 | 332 | A | Sam | 2 |
535 | 336 | C | Ashwin | 2 |
221 | 134 | A | Chul | 1 |
535 | 332 | C | Sam | 2 |
221 | 332 | A | Sam | 2 |
221 | 336 | B | Ashwin | 2 |
Result:
theta-Join (or Join… or Condition(al) Join)
70
theta-Join (or Join… or Condition(al) Join)
71
A
B
1
Cat
2
Dog
3
Bird
C
D
1
Meow
3
Chirp
S
T
A
B
1
Cat
3
Bird
C
D
1
Meow
3
Chirp
New: Entity Relationship Diagram
72
Northwind Traders Data Model
- sample database model that has been around
for years; originally developed by Microsoft
- represents data model for fake Northwind Traders company, an importer/exporter of specialty foods
New: ER Diagram
73
Relation Name
Attributes
Primary Key Attributes
Relationships
Relations in Northwind
74
Practice Time!
75
For each of the following, compose a relational algebra query that satisfies the narrative-form of the question.
Query Time!
76
1. For each supplier, provide its name, contact person, and phone number.
Query Time!
77
2. Which products are supplied by FoodsRUs? Please include the product name and unit price.
Query Time!
78
3. Provide a list of all product names ordered by World Market (a customer’s name).
Query Time!
79
4. Which customers has employee Sam Johnson worked with? Provide the complete customer information.
Further Reading
80