Relational Algebra
Prof. Sharad Mehrotra
University of California, Irvine
What is Relational Algebra
2
Relational Algebra: in 1 slide!
A relational Algebra query is a composition of the above operators
3
3
Sample Schema
Patients(name, age) -- patients who currently have some disease
RecoveredPatients(name, age) -- patients who have recovered from the disease they had. They may still have other diseases in which case they will also be part of the Patients table as well.
Diagnosis(pname, disease, diagnosis_date) -- the diagnosis made about patients on a given date.
ToDiagnose(disease, test) -- tests used to diagnose the disease
Outcomes(pname, test, result, test_date) -- the result of the test when conducted on the patient pid on the given date.
4
Sample Tables
5
name | age |
'Mary' | 45 |
'Jane' | 60 |
'Mehdi' | 33 |
'Alex' | 30 |
'Bob' | 27 |
name | age |
'Mary' | 45 |
'John' | 60 |
'Al' | 33 |
'Barbara' | 30 |
'Alex' | 27 |
pname | disease | diagnosis_date |
'Mary' | 'Flu' | 2021-01-01 |
'Jane' | 'COVID' | 2021-01-01 |
'Mehdi' | 'Mono' | 2021-01-02 |
'Alex' | 'Strep' | 2021-01-03 |
'Bob' | 'Mono' | 2021-02-01 |
'John' | 'COVID' | 2021-02-10 |
'Al' | 'Strep' | 2021-02-16 |
'Mary' | 'Strep' | 2021-02-16 |
'Mehdi' | 'Meningitis' | 2021-02-10 |
disease | test |
'Flu' | 'A' |
'COVID' | 'B' |
'Mono' | 'C' |
'Strep' | 'D' |
'Mono' | 'E' |
'COVID' | 'F' |
'Strep' | 'G' |
'Meningitis' | 'H' |
Patients
RecoveredPatients
Diagnosis
ToDiagnose
select[name= mary]
6
pname | test | result | test_date |
'Mary' | 'A' | 'true' | 2021-01-01 |
'Mary' | 'B' | 'false' | 2021-01-01 |
'Mary' | 'D' | 'true' | 2021-01-01 |
'Jane' | 'B' | 'true' | 2021-01-01 |
'Jane' | 'C' | 'false' | 2021-01-01 |
'Jane' | 'F' | 'false' | 2021-01-01 |
'Mehdi' | 'E' | 'true' | 2021-01-02 |
'Mehdi' | 'E' | 'false' | 2021-01-02 |
'Mehdi' | 'F' | 'true' | 2021-01-02 |
'Alex' | 'G' | 'true' | 2020-01-03 |
'Bob' | 'C' | 'true' | 2021-02-01 |
'Al' | 'G' | 'true' | 2020-02-16 |
'Barbara' | 'G' | 'false' | 2020-02-10 |
'Alex' | 'A' | 'true' | 2020-01-03 |
'Mary' | 'A' | 'true' | 2020-01-01 |
Outcomes
Selection
σ c (R): return tuples in R that satisfy condition C.
7
Patients.name | Patients.age |
'Mary' | 45 |
'Jane' | 60 |
'Mehdi' | 33 |
'Alex' | 30 |
Patients.name | Patients.age |
'Mary' | 45 |
'Jane' | 60 |
'Mehdi' | 33 |
'Alex' | 30 |
'Bob' | 27 |
Patients
σage ≥30 ∧ age < 50(Patients)
Patients.name | Patients.age |
'Mary' | 45 |
'Mehdi' | 33 |
'Alex' | 30 |
σage ≥ 30(Patients)
Projection
ΠA1,…,Ak(R): pick columns of attributes A1,…,Ak of R.
8
πpname,disease (Diagnosis)
pname | disease |
'Mary' | 'Flu' |
'Jane' | 'COVID' |
'Mehdi' | 'Mono' |
'Alex' | 'Strep' |
'Bob' | 'Mono' |
'John' | 'COVID' |
'Al' | 'Strep' |
'Mary' | 'Strep' |
'Mehdi' | 'Meningitis' |
pname | disease | diagnosis_date |
'Mary' | 'Flu' | 2021-01-01 |
'Jane' | 'COVID' | 2021-01-01 |
'Mehdi' | 'Mono' | 2021-01-02 |
'Alex' | 'Strep' | 2021-01-03 |
'Bob' | 'Mono' | 2021-02-01 |
'John' | 'COVID' | 2021-02-10 |
'Al' | 'Strep' | 2021-02-16 |
'Mary' | 'Strep' | 2021-02-16 |
'Mehdi' | 'Meningitis' | 2021-02-10 |
Diagnosis
pname |
'Mary' |
'Jane' |
'Mehdi' |
'Alex' |
'Bob' |
'John' |
'Al' |
πpname (Diagnosis)
Notice: Duplicates eliminated in answer
Union ( ∪)
9
Union
name | age |
'Mary' | 45 |
'Jane' | 60 |
'Mehdi' | 33 |
'Alex' | 30 |
'Bob' | 27 |
name | age |
'Mary' | 45 |
'John' | 60 |
'Al' | 33 |
'Barbara' | 30 |
'Alex' | 27 |
Patients
RecoveredPatients
Query in Relax: Patients ∪ RecoveredPatients
Patients.name | Patients.age |
'Mary' | 45 |
'Jane' | 60 |
'Mehdi' | 33 |
'Alex' | 30 |
'Bob' | 27 |
'John' | 60 |
'Al' | 33 |
'Barbara' | 30 |
'Alex' | 27 |
Set operations must be over relations of the same type.
10
Error: at line 1: schemas are not unifiable: types are different or size is different: [RecoveredPatients.name : string, RecoveredPatients.age : number] and [Diagnosis.pname : string, Diagnosis.disease : string, Diagnosis.diagnosis_date : date]
name | age |
'Mary' | 45 |
'John' | 60 |
'Al' | 33 |
'Barbara' | 30 |
'Alex' | 27 |
Diagnosis
pname | disease | diagnosis_date |
'Mary' | 'Flu' | 2021-01-01 |
'Jane' | 'COVID' | 2021-01-01 |
'Mehdi' | 'Mono' | 2021-01-02 |
'Alex' | 'Strep' | 2021-01-03 |
'Bob' | 'Mono' | 2021-02-01 |
'John' | 'COVID' | 2021-02-10 |
'Al' | 'Strep' | 2021-02-16 |
'Mary' | 'Strep' | 2021-02-16 |
'Mehdi' | 'Meningitis' | 2021-02-10 |
RecoveredPatients
Diagnosis ∪ RecoveredPatients
Intersection
11
Patients.name | Patients.age |
'Mary' | 45 |
name | age |
'Mary' | 45 |
'Jane' | 60 |
'Mehdi' | 33 |
'Alex' | 30 |
'Bob' | 27 |
name | age |
'Mary' | 45 |
'John' | 60 |
'Al' | 33 |
'Barbara' | 30 |
'Alex' | 27 |
Patients
RecoveredPatients
Query in Relax: Patients ∩RecoveredPatients
Set Difference
12
name | age |
'Mary' | 45 |
'Jane' | 60 |
'Mehdi' | 33 |
'Alex' | 30 |
'Bob' | 27 |
name | age |
'Mary' | 45 |
'John' | 60 |
'Al' | 33 |
'Barbara' | 30 |
'Alex' | 27 |
Patients
RecoveredPatients
Query in Relax: Patients - RecoveredPatients
Patients.name | Patients.age |
'Jane' | 60 |
'Mehdi' | 33 |
'Alex' | 30 |
'Bob' | 27 |
Cartesian Product
R × S: pair each tuple r in
R with each tuple s in S.
13
‘Mary' | 'Flu' |
'Mary' | 'COVID' |
'Mary' | 'Mono' |
'Mary' | 'Strep' |
'Mary' | 'Meningitis' |
'Jane' | 'Flu' |
'Jane' | 'COVID' |
'Jane' | 'Mono' |
'Jane' | 'Strep' |
'Jane' | 'Meningitis' |
'Mehdi' | 'Flu' |
'Mehdi' | 'COVID' |
'Mehdi' | 'Mono' |
'Mehdi' | 'Strep' |
'Mehdi' | 'Meningitis' |
'Alex' | 'Flu' |
'Alex' | 'COVID' |
'Alex' | 'Mono' |
'Alex' | 'Strep' |
'Alex' | 'Meningitis' |
'Bob' | 'Flu' |
'Bob' | 'COVID' |
'Bob' | 'Mono' |
'Bob' | 'Strep' |
'Bob' | 'Meningitis' |
π name ( Patients ) ⨯ π disease ( ToDiagnose )
Join
R S = σ c (R × S)
14
14
C
<cond_1> AND <cond_2> AND … AND <cond_k>
Each cond_i is of the form A op B, where:
Theta-Join
(Diagnosis)
⨝diagnosis_date>test_date
(Outcomes)
15
15
Diagnosis × Outcomes
pname | disease | diagnosis_date |
'Mary' | 'Flu' | 2021-01-01 |
'Mehdi' | 'Mono' | 2021-01-02 |
'John' | 'COVID' | 2021-02-10 |
Diagnosis
pname | test | result | test_date |
'Mary' | 'A' | 'true' | 2021-01-01 |
'Mehdi' | 'E' | 'true' | 2021-01-02 |
'Barbara' | 'G' | 'false' | 2021-02-10 |
Outcomes
Diagnosis.pname | Diagnosis.disease | Diagnosis.diagnosis_date | Outcomes.pname | Outcomes.test | Outcomes.result | Outcomes.test_date |
'Mehdi' | 'Mono' | 2021-01-02 | 'Mary' | 'A' | 'true' | 2021-01-01 |
'John' | 'COVID' | 2021-02-10 | 'Mary' | 'A' | 'true' | 2021-01-01 |
'John' | 'COVID' | 2021-02-10 | 'Mehdi' | 'E' | 'true' | 2021-01-02 |
Diagnosis.pname | Diagnosis.disease | Diagnosis.diagnosis_date | Outcomes.pname | Outcomes.test | Outcomes.result | Outcomes.test_date |
'Mary' | 'Flu' | 2021-01-01 | 'Mary' | 'A' | 'true' | 2021-01-01 |
'Mary' | 'Flu' | 2021-01-01 | 'Mehdi' | 'E' | 'true' | 2021-01-02 |
'Mary' | 'Flu' | 2021-01-01 | 'Barbara' | 'G' | 'false' | 2021-02-10 |
'Mehdi' | 'Mono' | 2021-01-02 | 'Mary' | 'A' | 'true' | 2021-01-01 |
'Mehdi' | 'Mono' | 2021-01-02 | 'Mehdi' | 'E' | 'true' | 2021-01-02 |
'Mehdi' | 'Mono' | 2021-01-02 | 'Barbara' | 'G' | 'false' | 2021-02-10 |
'John' | 'COVID' | 2021-02-10 | 'Mary' | 'A' | 'true' | 2021-01-01 |
'John' | 'COVID' | 2021-02-10 | 'Mehdi' | 'E' | 'true' | 2021-01-02 |
'John' | 'COVID' | 2021-02-10 | 'Barbara' | 'G' | 'false' | 2021-02-10 |
Theta-Join
(Diagnosis)
⨝diagnosis_date>=test_date AND Diagnosis.pname ≠ Outcomes.pname
(Outcomes)
16
16
Diagnosis⨯Outcomes
pname | disease | diagnosis_date |
'Mary' | 'Flu' | 2021-01-01 |
'Mehdi' | 'Mono' | 2021-01-02 |
'John' | 'COVID' | 2021-02-10 |
Diagnosis
pname | test | result | test_date |
'Mary' | 'A' | 'true' | 2021-01-01 |
'Mehdi' | 'E' | 'true' | 2021-01-02 |
'Barbara' | 'G' | 'false' | 2021-02-10 |
Outcomes
Diagnosis.pname | Diagnosis.disease | Diagnosis.diagnosis_date | Outcomes.pname | Outcomes.test | Outcomes.result | Outcomes.test_date |
'Mehdi' | 'Mono' | 2021-01-02 | 'Mary' | 'A' | 'true' | 2021-01-01 |
'John' | 'COVID' | 2021-02-10 | 'Mary' | 'A' | 'true' | 2021-01-01 |
'John' | 'COVID' | 2021-02-10 | 'Mehdi' | 'E' | 'true' | 2021-01-02 |
'John' | 'COVID' | 2021-02-10 | 'Barbara' | 'G' | 'false' | 2021-02-10 |
Diagnosis.pname | Diagnosis.disease | Diagnosis.diagnosis_date | Outcomes.pname | Outcomes.test | Outcomes.result | Outcomes.test_date |
'Mary' | 'Flu' | 2021-01-01 | 'Mary' | 'A' | 'true' | 2021-01-01 |
'Mary' | 'Flu' | 2021-01-01 | 'Mehdi' | 'E' | 'true' | 2021-01-02 |
'Mary' | 'Flu' | 2021-01-01 | 'Barbara' | 'G' | 'false' | 2021-02-10 |
'Mehdi' | 'Mono' | 2021-01-02 | 'Mary' | 'A' | 'true' | 2021-01-01 |
'Mehdi' | 'Mono' | 2021-01-02 | 'Mehdi' | 'E' | 'true' | 2021-01-02 |
'Mehdi' | 'Mono' | 2021-01-02 | 'Barbara' | 'G' | 'false' | 2021-02-10 |
'John' | 'COVID' | 2021-02-10 | 'Mary' | 'A' | 'true' | 2021-01-01 |
'John' | 'COVID' | 2021-02-10 | 'Mehdi' | 'E' | 'true' | 2021-01-02 |
'John' | 'COVID' | 2021-02-10 | 'Barbara' | 'G' | 'false' | 2021-02-10 |
Equi-Join
17
17
pname | disease | diagnosis_date |
'Mary' | 'Flu' | 2021-01-01 |
'Mehdi' | 'Mono' | 2021-01-02 |
'John' | 'COVID' | 2021-02-10 |
Diagnosis
pname | test | result | test_date |
'Mary' | 'A' | 'true' | 2021-01-01 |
'Mehdi' | 'E' | 'true' | 2021-01-02 |
'Barbara' | 'G' | 'false' | 2021-02-10 |
Outcomes
Diagnosis.pname | Diagnosis.disease | Diagnosis.diagnosis_date | Outcomes.pname | Outcomes.test | Outcomes.result | Outcomes.test_date |
'Mary' | 'Flu' | 2021-01-01 | 'Mary' | 'A' | 'true' | 2021-01-01 |
'Mehdi' | 'Mono' | 2021-01-02 | 'Mehdi' | 'E' | 'true' | 2021-01-02 |
(Diagnosis) ⨝Diagnosis.pname=Outcomes.pname (Outcomes)
Natural-Join
18
18
(Diagnosis) ⨝ (Outcomes)
R S = Π L (R S)
R.A1=S.A1 AND … AND R.Ak=S.Ak
pname | disease | diagnosis_date |
'Mary' | 'Flu' | 2021-01-01 |
'Mehdi' | 'Mono' | 2021-01-02 |
'John' | 'COVID' | 2021-02-10 |
Diagnosis
pname | test | result | test_date |
'Mary' | 'A' | 'true' | 2021-01-01 |
'Mehdi' | 'E' | 'true' | 2021-01-02 |
'Barbara' | 'G' | 'false' | 2021-02-10 |
Outcomes
Diagnosis.pname | Diagnosis.disease | Diagnosis.diagnosis_date | Outcomes.test | Outcomes.result | Outcomes.test_date |
'Mary' | 'Flu' | 2021-01-01 | 'A' | 'true' | 2021-01-01 |
'Mehdi' | 'Mono' | 2021-01-02 | 'E' | 'true' | 2021-01-02 |
Renaming ρ
19
19
ρS(B1,…,Bn)(R)�
A relation identical to R, with new attributes B1,…,Bn.
name | age |
'Mary' | 45 |
'Jane' | 60 |
'Mehdi' | 33 |
'Alex' | 30 |
'Bob' | 27 |
Patients (name, age)
name1 | age1 |
'Mary' | 45 |
'Jane' | 60 |
'Mehdi' | 33 |
'Alex' | 30 |
'Bob' | 27 |
Patients1 (name1, age1)
ρ Patients1 (ρ name1←name, age1←age (Patients))
Combining Operations
20
20
C
Query 0: Who has what disease
Query: Diagnosis
21
Diagnosis.pname | Diagnosis.disease | Diagnosis.diagnosis_date |
'Mary' | 'Flu' | 2021-01-01 |
'Jane' | 'COVID' | 2021-01-01 |
'Mehdi' | 'Mono' | 2021-01-02 |
'Alex' | 'Strep' | 2021-01-03 |
'Bob' | 'Mono' | 2021-02-01 |
'John' | 'COVID' | 2021-02-10 |
'Al' | 'Strep' | 2021-02-16 |
'Mary' | 'Strep' | 2021-02-16 |
'Mehdi' | 'Meningitis' | 2021-02-10 |
Query 1: Which Patients are younger compared to Mary
22
22
Patients.name | Patients.age |
'Mary' | 45 |
'Jane' | 60 |
'Mehdi' | 33 |
'Alex' | 30 |
'Bob' | 27 |
Patients1.name1 |
'Mehdi' |
'Alex' |
'Bob' |
Query 1: Which Patients are younger compared to Mary
23
23
ρPatients1(Patients) ⨝Patients1.age < Patients.age σname = 'Mary' (Patients)
Diagonsis | Patient | Disease |
| Mary | Strep |
ToDiagnose | Disease | Test |
| Strep | A |
Patient | name | age |
| Mary | 45 |
name | age |
'Mary' | 45 |
'Jane' | 60 |
'Mehdi' | 33 |
'Alex' | 30 |
'Bob' | 27 |
name | age |
'Mary' | 45 |
'Jane' | 60 |
'Mehdi' | 33 |
'Alex' | 30 |
'Bob' | 27 |
Query 1: Which Patients are younger compared to Mary
24
24
ρPatients1(Patients) ⨝Patients1.age < Patients.age σname = 'Mary' (Patients)
project_p1.name (select[name= mary] Patients join[patient.age:> p1.age rename[p1](patient))
project (select[name= mary] (Patients join[patient.age:> p1.age rename[p1](patient)))
Query 2: Who has a disease they have been tested for?
25
Diagonsis | Patient | Disease |
| Mary | Strep |
ToDiagnose | Disease | Test |
| Strep | A |
Outcome | Patient | Test | Outc |
| Mary | A | T |
Mary has been tested for A
A is a test for Strep
Mary has Strep
Query 2: Who has a disease they have been tested for?
πpname (Outcomes ⨝ToDiagnose⨝Diagnosis)
26
'Mary' |
'Jane' |
'Mehdi' |
'Alex' |
'Bob' |
'Al' |
Query 2: Who has a disease they have been tested for?
27
Several relational algebra expressions are equivalent.
πpname ((Diagnosis ⨝ ToDiagnose) ⨝ Outcomes)
Mary' |
'Jane' |
'Mehdi' |
'Alex' |
'Bob' |
'Al' |
Query 2: Who has a disease they have been tested for?
28
πpname ((πdisease,pname Diagnosis ⨝ ToDiagnose) ⨝ πpname, test Outcomes)
Several relational algebra expressions are equivalent.
Mary' |
'Jane' |
'Mehdi' |
'Alex' |
'Bob' |
'Al' |
Query 3: Who has a disease they have been tested Positive for?
29
29
πpname((Diagnosis ⨝ ToDiagnose) ⨝ σ result='true' (Outcomes))
Diagonsis | Patient | Disease |
| Mary | Strep |
ToDiagnose | Disease | Test |
| Strep | A |
Outcome | Patient | Test | Outc |
| Mary | A | T |
Mary has been tested for A
A is a test for Strep
Mary has Strep
outcome is true
Query 3: Who has a disease they have been tested Positive for?
30
πpname((Diagnosis ⨝ ToDiagnose) ⨝ σ result='true' (Outcomes))
'Mary' |
'Jane' |
'Mehdi' |
'Alex' |
'Bob' |
'Al' |
Query 3’’ : Who has a disease they have been tested Negative for?
31
πpname ((Diagnosis ⨝ ToDiagnose) ⨝ σresult='false' (Outcomes))
Jane' |
'Mehdi' |
Query 4: Who has a disease they have tested both positively & negatively for?
32
(πpname ((Diagnosis ⨝ ToDiagnose) ⨝ σresult='true' Outcomes)) ∩ (πpname ((Diagnosis ⨝ ToDiagnose) ⨝ σresult='false' Outcomes))
Jane' |
'Mehdi' |
Why will simply taking intersection of the previous two queries result in wrong results?
Query 4: Who has a disease they have tested both positively & negatively for?
33
(πpname ((Diagnosis ⨝ ToDiagnose) ⨝ σresult='true' Outcomes)) ∩ (πpname ((Diagnosis ⨝ ToDiagnose) ⨝ σresult='false' Outcomes))
Jane' |
'Mehdi' |
Why will simply taking intersection of the previous two queries result in wrong results?
A person could be in the answer if he/she tested positively for one disease and negatively for another!
Query 4: Who has a disease they have tested both positively & negatively for? - Take 2!
34
Jane' |
'Mehdi' |
pos = πpname,disease ((Diagnosis ⨝ ToDiagnose) ⨝ σresult='true' Outcomes)
neg = (πpname,disease ((Diagnosis ⨝ ToDiagnose) ⨝ σresult='false' Outcomes))
πpname (pos ∩ neg)
Turns out the results did not change between the wrong & right version of the query. But they could have. Play around with the database by inserting perhaps more people into diagnosis until the answers change
Query 5: Who tested both positively and negatively for a disease, whether or not they have it…..
35
35
ToDiagnose | Disease | Test |
| Strep Sterp | A B |
Outcome | Patient | Test | Outc |
| Mary | A | T |
Mary has been tested for strep positive outcome
A is a test for Strep
Mary tested for Strep negative outcome
B is a test for strep
Notice: Query ONLY needs Outcome and not Diagnosis Table!!
Outcome | Patient | Test | Outc |
| Mary | B | F |
ToDiagnose | Disease | Test |
| Strep Sterp | A B |
intersection
Query 5: Who tested both positively and negatively for a disease, whether or not they have it…..
36
pos = π pname,disease (ToDiagnose ⨝ σ result='true' Outcomes)
neg = π pname,disease (ToDiagnose ⨝ σ result='false' Outcomes)
πpname (pos ∩ neg)
Notice: Query ONLY needs Outcome and not Diagnosis Table!!
'Jane' |
'Mehdi' |
Query 6: Who tested both positively and negatively for the same test….
We need to join Outcomes table with itself and check for the same person having the same test once with result true and another time with false.
37
Outcome | Patient | Test | Outc |
| Mary | A | T |
Outcome | Patient | Test | Outc |
| Mary | A | F |
Query 6: Who tested both positively and negatively for the same test….
38
π Outcomes.pname ((Outcomes) ⨝Outcomes.pname = Outcomes1.pname AND Outcomes.test = Outcomes1.test AND Outcomes.result ≠ Outcomes1.result (ρ Outcomes1 (Outcomes)))
'Mehdi'
Query 7: What testable disease does no one have?
testable disease does no one have?
39
HINT HINT……think set difference…
Let us find
Then subtract b) from a)
Query 7: What testable disease does no one have?
40
πdisease(ToDiagnose) - πdisease(Diagnosis)
Query 8: What disease more than one person have?
41
Diagnosis | Patient | disease | date |
| Mary | flu | …. |
Diagnosis | Patient | disease | date |
| John | flu | …. |
Same disease
Different names
Query 8: What disease more than one person have?
42
πDiagnosis.disease ((Diagnosis) ⨝Diagnosis.pname≠Diagnosis2.pname AND Diagnosis.disease=Diagnosis2.disease (ρDiagnosis2 Diagnosis))
Diagnosis.disease |
'COVID' |
'Mono' |
'Strep' |
Query 9: What disease does everyone have?
Difficult to write query to answer directly! So think opposite….
Can we find diseases that not everyone has?
Let P be the set of people, and D the set of diseases……
Then P x D is the set of ALL possible people disease pairs.
Now we subtract from this set the table Diagnosis
- the result is a set of tuples, where each tuple represents a name of a person and the disease the person does NOT have.
NOTE: any disease that everyone has will NOT be in the table...
So if we subtract this table from the set of diseases, the result will be the set of disease EVERYONE has….
OK… we are now ready to express the above in relational algebra...
43
43
Query 9: What disease does everyone have?
AllDiseases = πdisease Diagnosis
AllPatients = πpname Diagnosis
DiseasesNotEveryoneHas= πdisease ((AllPatients x AllDiseases) - (πpname,disease Diagnosis))
DiseaseEveryonehas = AllDiseases - DiseasesNotEveryoneHas
DiseaseEveryonehas
44
Diagnosis
Diagnosis.disease |
'Mono' |
Result
Diagnosis.pname | Diagnosis.disease | Diagnosis.diagnosis_date |
'Mary' | 'Flu' | 2021-01-01 |
'Jane' | 'COVID' | 2021-01-01 |
'Mehdi' | 'Mono' | 2021-01-02 |
'Alex' | 'Strep' | 2021-01-03 |
'Bob' | 'Mono' | 2021-02-01 |
'Mary' | 'Strep' | 2021-02-16 |
'Mehdi' | 'Meningitis' | 2021-02-10 |
'Mary' | 'Mono' | 2021-01-01 |
'Jane' | 'Mono' | 2021-01-01 |
'Alex' | 'Mono' | 2021-01-03 |
Division Operator
45
Division
46
pname | disease | diagnosis_date |
'Mary' | 'Flu' | 2021-01-01 |
'Jane' | 'COVID' | 2021-01-01 |
'Mehdi' | 'Mono' | 2021-01-02 |
'Alex' | 'Strep' | 2021-01-03 |
'Bob' | 'Mono' | 2021-02-01 |
'Mary' | 'Mono' | 2021-01-01 |
'Jane' | 'Mono' | 2021-01-01 |
'Alex' | 'Mono' | 2021-01-03 |
Diagnosis
(πdisease, pname Diagnosis) ÷ (πpname Diagnosis)
Diagnosis.disease |
'Mono' |
Result
47
47
Outer Joins
Patients
name | age |
'Mary' | 45 |
'Bob' | 27 |
Diagnosis
pname | disease | diagnosis_date |
'Mary' | 'Flu' | 2021-01-01 |
'Mehdi' | 'Mono' | 2021-01-02 |
'Alex' | 'Strep' | 2021-01-03 |
Right Outer Join
48
48
Patients
Diagnosis
Pad null value for right dangling tuples
Patients × Diagnosis
name | age |
'Mary' | 45 |
'Bob' | 27 |
pname | disease | diagnosis_date |
'Mary' | 'Flu' | 2021-01-01 |
'Mehdi' | 'Mono' | 2021-01-02 |
'Alex' | 'Strep' | 2021-01-03 |
Right Outer Join: Patients ⟖Patients.name=Diagnosis.pname Diagnosis
Patients.name | Patients.age | Diagnosis.pname | Diagnosis.disease | Diagnosis.diagnosis_date |
'Mary' | 45 | 'Mary' | 'Flu' | 2021-01-01 |
'Mary' | 45 | 'Mehdi' | 'Mono' | 2021-01-02 |
'Mary' | 45 | 'Alex' | 'Strep' | 2021-01-03 |
'Bob' | 27 | 'Mary' | 'Flu' | 2021-01-01 |
'Bob' | 27 | 'Mehdi' | 'Mono' | 2021-01-02 |
'Bob' | 27 | 'Alex' | 'Strep' | 2021-01-03 |
Patients.name | Patients.age | Diagnosis.pname | Diagnosis.disease | Diagnosis.diagnosis_date |
'Mary' | 45 | 'Mary' | 'Flu' | 2021-01-01 |
NULL | NULL | 'Mehdi' | 'Mono' | 2021-01-02 |
NULL | NULL | 'Alex' | 'Strep' | 2021-01-03 |
Left Outer Join
49
49
Patients
Diagnosis
Pad null value for left dangling tuples
Patients × Diagnosis
name | age |
'Mary' | 45 |
'Bob' | 27 |
pname | disease | diagnosis_date |
'Mary' | 'Flu' | 2021-01-01 |
'Mehdi' | 'Mono' | 2021-01-02 |
'Alex' | 'Strep' | 2021-01-03 |
Patients.name | Patients.age | Diagnosis.pname | Diagnosis.disease | Diagnosis.diagnosis_date |
'Mary' | 45 | 'Mary' | 'Flu' | 2021-01-01 |
'Bob' | 27 | NULL | NULL | NULL |
Left Outer Join: Patients ⟕Patients.name=Diagnosis.pname Diagnosis
Patients.name | Patients.age | Diagnosis.pname | Diagnosis.disease | Diagnosis.diagnosis_date |
'Mary' | 45 | 'Mary' | 'Flu' | 2021-01-01 |
'Mary' | 45 | 'Mehdi' | 'Mono' | 2021-01-02 |
'Mary' | 45 | 'Alex' | 'Strep' | 2021-01-03 |
'Bob' | 27 | 'Mary' | 'Flu' | 2021-01-01 |
'Bob' | 27 | 'Mehdi' | 'Mono' | 2021-01-02 |
'Bob' | 27 | 'Alex' | 'Strep' | 2021-01-03 |
Full Outer Join
50
50
Patients
Diagnosis
Pad null values for both left and right dangling tuples
Patients × Diagnosis
name | age |
'Mary' | 45 |
'Bob' | 27 |
pname | disease | diagnosis_date |
'Mary' | 'Flu' | 2021-01-01 |
'Mehdi' | 'Mono' | 2021-01-02 |
'Alex' | 'Strep' | 2021-01-03 |
Full Outer Join: Patients ⟗Patients.name=Diagnosis.pname Diagnosis
Patients.name | Patients.age | Diagnosis.pname | Diagnosis.disease | Diagnosis.diagnosis_date |
'Mary' | 45 | 'Mary' | 'Flu' | 2021-01-01 |
'Mary' | 45 | 'Mehdi' | 'Mono' | 2021-01-02 |
'Mary' | 45 | 'Alex' | 'Strep' | 2021-01-03 |
'Bob' | 27 | 'Mary' | 'Flu' | 2021-01-01 |
'Bob' | 27 | 'Mehdi' | 'Mono' | 2021-01-02 |
'Bob' | 27 | 'Alex' | 'Strep' | 2021-01-03 |
'Alex' | 'Strep' | 2021-01-03 |
Patients.name | Patients.age | Diagnosis.pname | Diagnosis.disease | Diagnosis.diagnosis_date |
'Mary' | 45 | 'Mary' | 'Flu' | 2021-01-01 |
'Bob' | 27 | NULL | NULL | NULL |
NULL | NULL | 'Mehdi' | 'Mono' | 2021-01-02 |
NULL | NULL | 'Alex' | 'Strep' | 2021-01-03 |
Might not work on Relax
Bag Semantics
51
51
A bag.
name | age |
'Mary' | 45 |
'Jane' | 60 |
'Alex' | 30 |
'Alex' | 30 |
'Bob' | 27 |
Patients
Bag Operations
52
52
Laws for Bags
53
53
Extended Relational Algebra
54
54
Duplicate elimination δ
55
55
name | age |
'Mary' | 45 |
'Jane' | 60 |
'Alex' | 30 |
'Alex' | 30 |
'Bob' | 27 |
Patients
name | age |
'Mary' | 45 |
'Jane' | 60 |
'Alex' | 30 |
'Bob' | 27 |
δ (Patients)
Sorting τ
56
56
τname (Patients)
name | age |
'Mary' | 45 |
'Jane' | 60 |
'Mehdi' | 33 |
'Alex' | 30 |
'Bob' | 27 |
Patients
name | age |
'Alex' | 30 |
'Bob' | 27 |
'Jane' | 60 |
'Mary' | 45 |
'Mehdi' | 33 |
Aggregation γ
57
57
γ SUM(age)→sum_age (Patients)
γ AVG(age)→avg_age (Patients)
γ COUNT(*)→count (Patients)
γ MAX(age)→max_age (Patients)
γ MIN(age)→min_age (Patients)
name | age |
'Mary' | 45 |
'Jane' | 60 |
'Mehdi' | 33 |
'Alex' | 30 |
'Bob' | 27 |
Patients
5
195
39
60
27
Grouping and aggregation γ
58
58
πdisease,avg_age (γdisease; AVG(age)→avg_age (Patients ⨝name=pname Diagnosis))
Group tuples according to “disease”
Then do the aggregation within each group.
name | age |
'Mary' | 45 |
'Jane' | 60 |
'Mehdi' | 33 |
'Alex' | 30 |
'Bob' | 27 |
pname | disease | diagnosis_date |
'Mary' | 'Flu' | 2021-01-01 |
'Jane' | 'COVID' | 2021-01-01 |
'Mehdi' | 'Mono' | 2021-01-02 |
'Alex' | 'Strep' | 2021-01-03 |
'Bob' | 'Mono' | 2021-02-01 |
'John' | 'COVID' | 2021-02-10 |
'Al' | 'Strep' | 2021-02-16 |
'Mary' | 'Strep' | 2021-02-16 |
'Mehdi' | 'Meningitis' | 2021-02-10 |
Patients
Diagnosis
name | age | disease | diagnosis_date |
'Mary' | 45 | 'Flu' | 2021-01-01 |
'Alex' 'Mary' | 30 45 | 'Strep' 'Strep' | 2021-01-03 2021-02-16 |
'Jane' 'John' | 60 60 | 'COVID' 'COVID' | 2021-01-01 2021-02-10 |
'Bob' 'Mehdi' | 27 33 | 'Mono' 'Mono' | 2021-02-01 2021-01-02 |
'Mehdi' | 33 | 'Meningitis' | 2021-02-10 |
Diagnosis.disease | avg_age |
'Flu' | 45 |
'Strep' | 37.5 |
'COVID' | 60 |
'Mono' | 30 |
'Meningitis' | 33 |
Limitation of Relational Algebra
59
59