1 of 27

Relational Database

Project 3

2 of 27

Part A - Creating the Database

Only making classes

(data members, constructors, toString)

Functionality will be added in Part B

3 of 27

Schemes = Relations (or Tables)

Schemes:� snap(S,N,A,P)� csg(C,S,G)

S	N	A	P

snap

C	S	G

csg

Scheme

Name

Attribute

4 of 27

Facts = Tuples (or Rows)

Facts:� snap('12345','Charlie','12 Apple St.','555-1234').� snap('67890','Lucy','34 Pear Ave.','555-5678').� snap('33333','Snoopy','12 Apple St.','555-1234').� csg('CS101','12345','A').� csg('CS101','67890','B').� csg('EE200','12345','C').� csg('EE200','33333','B').� csg('CS101','33333','A-').

snap

C	S	G
'CS101'	'12345'	'A'
'CS101'	'67890'	'B'
'EE200'	'12345'	'C'
'EE200'	'33333'	'B'
'CS101'	'33333'	'A-'

csg

S	N	A	P
'12345'	'Charlie'	'12 Apple St.'	'555-1234'
'67890'	'Lucy'	'34 Pear Ave.'	'555-5678'
'33333'	'Snoopy'	'12 Apple St.'	'555-1234'

Collection of Tuples

Tuple

Values

5 of 27

Relation as a Whole

snap

S	N	A	P
'12345'	'Charlie'	'12 Apple St.'	'555-1234'
'67890'	'Lucy'	'34 Pear Ave.'	'555-5678'
'33333'	'Snoopy'	'12 Apple St.'	'555-1234'

Set of Tuples

Scheme

Name

6 of 27

Queries and Rules

Questions about the Database structure?

Save for later
They will stay the same (no new classes for them!)
We will implement Queries in Part B
We will implement Rules in Project 4

Rules:� �Queries:� snap(Id,'Snoopy',A,P)?� csg(Course,'33333',Grade)?

7 of 27

Details about Tuples

The collection of tuples in a relation needs to be sorted alphabetically and have no duplicates; a set is perfect for this.
A tuple is simply a vector of strings, but so is the scheme

Prevent errors by making a Tuple class and a Scheme class
These classes each contain a vector<string>
We need to connect the tuple's less-than operator to the vector<string> less-than operator

Why the less-than operator?

Sets need to know how to store items efficiently.
The less-than operator is used automatically.
Sets know how to store vector<string> to detect duplicates and sort alphabetically.
We don't need any fancy custom sorting code.

8 of 27

Details about Relations

Data members:

The name (string)
The Scheme
The collection of Tuples (set<Tuple>)

What defines a Relation?

The name and the Scheme
Therefore the constructor should take a name and a Scheme
The initial Relation can simply have an empty set of Tuples

Make an addTuple function

Receives a Tuple
Adds it to the set

snap

S	N	A	P
'12345'	'Charlie'	'12 Apple St.'	'555-1234'
'67890'	'Lucy'	'34 Pear Ave.'	'555-5678'
'33333'	'Snoopy'	'12 Apple St.'	'555-1234'

9 of 27

Printing out Relations

For each tuple, print out each value

and its associated attribute in the

Scheme

"For each tuple"

sets work differently than vectors
You'll need to use either an

iterator or a for-each loop

Relation toString() results:

S='12345', N='Charlie', A='12 Apple St.', P='555-1234'

S='33333', N='Snoopy', A='12 Apple St.', P='555-1234'�S='67890', N='Lucy', A='34 Pear Ave.', P='555-5678'

snap

S	N	A	P
'12345'	'Charlie'	'12 Apple St.'	'555-1234'
'33333'	'Snoopy'	'12 Apple St.'	'555-1234'
	'Lucy'	'34 Pear Ave.'	'555-5678'

for (Tuple t : mySet) {

// print each attribute & value

}

'67890'

10 of 27

Details about the Database

You need to be able to look up a relation when you are given the relation's name

Have your Database class use the standard Map class (#include<map>)

Nothing else for now
More will be added in Project 4

11 of 27

An Interpreter that manages the Database

Databases are independent of DatalogPrograms

The Database (or any of the classes it uses) shouldn't even know that the DatalogProgram classes exist (decoupling)
Another class should be the manager of the Database; it does all the connecting between a DatalogProgram and a Database

Make an Interpreter class that

Takes a DatalogProgram (the vectors of schemes, facts, rules, and queries)
Stores the DatalogProgram as a data member
Makes a Database using the schemes and the facts, and stores it as a data member
Will evaluate the queries in Part B and the rules in project 4

12 of 27

Overall Structure

Token

Scanner

Parser

Datalog

Program

Predicate

Rule

Parameter

Interpreter

Database

Relation

Scheme

Tuple

Proj 4

Proj 5

13 of 27

Interpreter Pseudocode

Input: DatalogProgram

Store the DatalogProgram as a data member
Make a Relation for each scheme-Predicate, and put that Relation in the Database data member
Make a Tuple for each fact-Predicate, and put that Tuple in the appropriate Relation in the Database

14 of 27

TODO

Make the Tuple class
Make the Scheme class
Make the Relation class

toString that prints out each tuple
Use a for-each loop on the set

Make the Database class
Make the Interpreter class

15 of 27

Part B - Evaluating Queries

select, project, and rename

16 of 27

Assumptions You Get to Make

You may assume the following about the Datalog program:

All schemes, facts, rules, and queries with the same name will have the same number of parameters.
No two attributes in the same scheme will have the same value.
No two schemes in the scheme list will have the same name.
Every relation referenced in a fact, rule, or query will have been defined in the scheme section.

None of the pass-off driver tests will violate these assumptions!�(We're not going to give it any funny business)

17 of 27

Evaluating Queries

Each query is searching for certain tuples and needs to be formatted correctly
Variables and constants

Modify Parameter class to contain a bool (isConstant)
Strings are constants, IDs are variables

Make a function in the Interpreter that evaluates one query predicate and returns a relation

Relation Interpreter::evaluateQuery(Predicate p)

Make a function in the Interpreter that evaluates all queries

Queries:

snap(Id,'Snoopy',A,P)?

csg(Course,'33333',Grade)?

Variable

Constant

18 of 27

Query Examples

father(same,same)? Yes(4)� same='Alma'� same='Helaman'� same='Nephi'

same='Pahoran'

father('Pahoran',kid)? Yes(3)

kid='Pahoran'

kid='Paanchi'

kid='Pacumeni'

Schemes:

father(name, son)

Facts:

father('Alma', 'Alma')

father('Alma', 'Helaman')

father('Helaman', 'Helaman')

father('Helaman', 'Nephi')

father('Helaman', 'Lehi')

father('Nephi', 'Nephi')

father('Pahoran','Pahoran')

father('Pahoran','Paanchi')

father('Pahoran','Pacumeni')

19 of 27

Evaluate Query

Grab the relation from the database
Go through the parameters: (see flowchart on next slide)

Do a select for each constant
Do a select for each duplicate variable
Mark a variable as a column and an attribute you want to keep if it's the first time you've seen the variable

Using a map<string, int> is one idea, representing the name of the variable and the index where you first found it

Do one whole project
Do one whole rename

father('Pahoran',kid)? Yes(3)

kid='Pahoran'

kid='Paanchi'

kid='Pacumeni'

Project to only keep the last column

Rename the column to say "kid" instead of "son"

20 of 27

Query Flowchart

What type of parameter is this?

Do a select

(Type 1)

Have we seen this variable before?

Variable

Constant

Do a select

(Type 2)

Mark it to keep for the project and rename

Yes

No

For each parameter:

21 of 27

Relational Operations

Functions of the Relation class

select
project
rename

When called, each function will:

make an empty relation
fill the relation
return the relation

22 of 27

Select

σ_{Editor='Smith'}Books

σ_{Author=Editor}Books

What changes between Books and σBooks?
What do we need to know to do a σ?
How should we turn this algorithm into code?

Title	Author	Editor
'Home Renovations'	'Fortt'	'Smith'
'Scuba Diving'	'Jones'	'Brown'
'Italian Cooking'	'Smith'	'Smith'
'Travel the World'	'Barker'	'Hales'

Books

23 of 27

Project

π_TitleBooks

π_{Author,Editor}Books

What changes between Books and πBooks?
What do we need to know to do a π?
How should we turn this algorithm into code?

Title	Author	Editor
'Home Renovations'	'Fortt'	'Smith'
'Scuba Diving'	'Jones'	'Brown'
'Italian Cooking'	'Smith'	'Smith'
'Travel the World'	'Barker'	'Hales'

Books

24 of 27

Rename

ρ_{Title ← headline}Books

ρ_{Author ← writer}Books

What changes between Books and ρBooks?
What do we need to know to do a ρ?
How should we turn this algorithm into code?***

Title	Author	Editor
'Home Renovations'	'Fortt'	'Smith'
'Scuba Diving'	'Jones'	'Brown'
'Italian Cooking'	'Smith'	'Smith'
'Travel the World'	'Barker'	'Hales'

Books

25 of 27

Query Flowchart

snap(X,'Snoopy',A,X)?

-This query will do type 2 select on the X variables

-Also type 1 select on the string 'Snoopy'

What type of parameter is this?

Do a select

(Type 1)

Have we seen this variable before?

Variable

Constant

Do a select

(Type 2)

Mark it to keep for the project and rename

Yes

No

For each parameter:

26 of 27

Evaluate Predicate -- Mark for Project and Rename

Go through and do all the selects

How do you keep track of the variables you've already seen and what columns you saw them in?
map<string, int> is a good idea, but this alphabetizes them
Also keep a vector<string> to retain the order we saw them

Keeping track of the first time we see each variable prepares us for Project and Rename, plus type-2 selects.
Project each column that we have saved (not in alphabetical order)
Rename columns to their corresponding variable names

27 of 27

Example

Schemes:

abcde(a,b,c,d,e)

Facts:

abcde('a','b','c','d','e').

abcde('s','t','r','r','s').

abcde('t','t','t','t','t').

abcde('g','t','g','h','g').

abcde('m','n','m','n','m').

Rules:

Queries:

abcde(X,'t',X,Y,X)?

0	1	2	3	4
X	't'	X	Y	X

Var	Col
X	0
Y	3

For selects:

0	3

X	Y

For project:

For rename:

Query parameters:

abcde(X,'t',X,Y,X)? Yes(2)

X='g', Y='h'

X='t', Y='t'