1 of 27

Relational Database

Project 3

2 of 27

Part A - Creating the Database

Only making classes

(data members, constructors, toString)

Functionality will be added in Part B

3 of 27

Schemes = Relations (or Tables)

Schemes:� snap(S,N,A,P)� csg(C,S,G)

S

N

A

P

snap

C

S

G

csg

Scheme

Name

Attribute

4 of 27

Facts = Tuples (or Rows)

Facts:� snap('12345','Charlie','12 Apple St.','555-1234').� snap('67890','Lucy','34 Pear Ave.','555-5678').� snap('33333','Snoopy','12 Apple St.','555-1234').� csg('CS101','12345','A').� csg('CS101','67890','B').� csg('EE200','12345','C').� csg('EE200','33333','B').� csg('CS101','33333','A-').

snap

C

S

G

'CS101'

'12345'

'A'

'CS101'

'67890'

'B'

'EE200'

'12345'

'C'

'EE200'

'33333'

'B'

'CS101'

'33333'

'A-'

csg

S

N

A

P

'12345'

'Charlie'

'12 Apple St.'

'555-1234'

'67890'

'Lucy'

'34 Pear Ave.'

'555-5678'

'33333'

'Snoopy'

'12 Apple St.'

'555-1234'

Collection of Tuples

Tuple

Values

5 of 27

Relation as a Whole

snap

S

N

A

P

'12345'

'Charlie'

'12 Apple St.'

'555-1234'

'67890'

'Lucy'

'34 Pear Ave.'

'555-5678'

'33333'

'Snoopy'

'12 Apple St.'

'555-1234'

Set of Tuples

Scheme

Name

6 of 27

Queries and Rules

Questions about the Database structure?

  • Save for later
  • They will stay the same (no new classes for them!)
  • We will implement Queries in Part B
  • We will implement Rules in Project 4

Rules:� �Queries:� snap(Id,'Snoopy',A,P)?� csg(Course,'33333',Grade)?

7 of 27

Details about Tuples

  • The collection of tuples in a relation needs to be sorted alphabetically and have no duplicates; a set is perfect for this.
  • A tuple is simply a vector of strings, but so is the scheme
    • Prevent errors by making a Tuple class and a Scheme class
    • These classes each contain a vector<string>
    • We need to connect the tuple's less-than operator to the vector<string> less-than operator
  • Why the less-than operator?
    • Sets need to know how to store items efficiently.
    • The less-than operator is used automatically.
    • Sets know how to store vector<string> to detect duplicates and sort alphabetically.
    • We don't need any fancy custom sorting code.

8 of 27

Details about Relations

  • Data members:
    • The name (string)
    • The Scheme
    • The collection of Tuples (set<Tuple>)
  • What defines a Relation?
    • The name and the Scheme
    • Therefore the constructor should take a name and a Scheme
    • The initial Relation can simply have an empty set of Tuples
  • Make an addTuple function
    • Receives a Tuple
    • Adds it to the set

snap

S

N

A

P

'12345'

'Charlie'

'12 Apple St.'

'555-1234'

'67890'

'Lucy'

'34 Pear Ave.'

'555-5678'

'33333'

'Snoopy'

'12 Apple St.'

'555-1234'

9 of 27

Printing out Relations

  • For each tuple, print out each value

and its associated attribute in the

Scheme

  • "For each tuple"
    • sets work differently than vectors
    • You'll need to use either an

iterator or a for-each loop

Relation toString() results:

S='12345', N='Charlie', A='12 Apple St.', P='555-1234'

S='33333', N='Snoopy', A='12 Apple St.', P='555-1234'�S='67890', N='Lucy', A='34 Pear Ave.', P='555-5678'

snap

S

N

A

P

'12345'

'Charlie'

'12 Apple St.'

'555-1234'

'33333'

'Snoopy'

'12 Apple St.'

'555-1234'

'Lucy'

'34 Pear Ave.'

'555-5678'

for (Tuple t : mySet) {

// print each attribute & value

}

'67890'

10 of 27

Details about the Database

  • You need to be able to look up a relation when you are given the relation's name
    • Have your Database class use the standard Map class (#include<map>)
  • Nothing else for now
  • More will be added in Project 4

11 of 27

An Interpreter that manages the Database

  • Databases are independent of DatalogPrograms
    • The Database (or any of the classes it uses) shouldn't even know that the DatalogProgram classes exist (decoupling)
    • Another class should be the manager of the Database; it does all the connecting between a DatalogProgram and a Database
  • Make an Interpreter class that
    • Takes a DatalogProgram (the vectors of schemes, facts, rules, and queries)
    • Stores the DatalogProgram as a data member
    • Makes a Database using the schemes and the facts, and stores it as a data member
    • Will evaluate the queries in Part B and the rules in project 4

12 of 27

Overall Structure

Token

Scanner

Parser

Datalog

Program

Predicate

Rule

Parameter

Interpreter

Database

Relation

Scheme

Tuple

Proj 4

Proj 5

13 of 27

Interpreter Pseudocode

Input: DatalogProgram

  • Store the DatalogProgram as a data member
  • Make a Relation for each scheme-Predicate, and put that Relation in the Database data member
  • Make a Tuple for each fact-Predicate, and put that Tuple in the appropriate Relation in the Database

14 of 27

TODO

  1. Make the Tuple class
  2. Make the Scheme class
  3. Make the Relation class
    1. toString that prints out each tuple
    2. Use a for-each loop on the set
  4. Make the Database class
  5. Make the Interpreter class

15 of 27

Part B - Evaluating Queries

select, project, and rename

16 of 27

Assumptions You Get to Make

You may assume the following about the Datalog program:

  • All schemes, facts, rules, and queries with the same name will have the same number of parameters.
  • No two attributes in the same scheme will have the same value.
  • No two schemes in the scheme list will have the same name.
  • Every relation referenced in a fact, rule, or query will have been defined in the scheme section.

None of the pass-off driver tests will violate these assumptions!�(We're not going to give it any funny business)

17 of 27

Evaluating Queries

  • Each query is searching for certain tuples and needs to be formatted correctly
  • Variables and constants
    • Modify Parameter class to contain a bool (isConstant)
    • Strings are constants, IDs are variables
  • Make a function in the Interpreter that evaluates one query predicate and returns a relation
    • Relation Interpreter::evaluateQuery(Predicate p)
  • Make a function in the Interpreter that evaluates all queries

Queries:

snap(Id,'Snoopy',A,P)?

csg(Course,'33333',Grade)?

Variable

Constant

18 of 27

Query Examples

father(same,same)? Yes(4)� same='Alma'� same='Helaman'� same='Nephi'

same='Pahoran'

father('Pahoran',kid)? Yes(3)

kid='Pahoran'

kid='Paanchi'

kid='Pacumeni'

Schemes:

father(name, son)

Facts:

father('Alma', 'Alma')

father('Alma', 'Helaman')

father('Helaman', 'Helaman')

father('Helaman', 'Nephi')

father('Helaman', 'Lehi')

father('Nephi', 'Nephi')

father('Pahoran','Pahoran')

father('Pahoran','Paanchi')

father('Pahoran','Pacumeni')

19 of 27

Evaluate Query

  • Grab the relation from the database
  • Go through the parameters: (see flowchart on next slide)
    • Do a select for each constant
    • Do a select for each duplicate variable
    • Mark a variable as a column and an attribute you want to keep if it's the first time you've seen the variable
      • Using a map<string, int> is one idea, representing the name of the variable and the index where you first found it
  • Do one whole project
  • Do one whole rename

father('Pahoran',kid)? Yes(3)

kid='Pahoran'

kid='Paanchi'

kid='Pacumeni'

Project to only keep the last column

Rename the column to say "kid" instead of "son"

20 of 27

Query Flowchart

What type of parameter is this?

Do a select

(Type 1)

Have we seen this variable before?

Variable

Constant

Do a select

(Type 2)

Mark it to keep for the project and rename

Yes

No

For each parameter:

21 of 27

Relational Operations

  • Functions of the Relation class
    • select
    • project
    • rename
  • When called, each function will:
    • make an empty relation
    • fill the relation
    • return the relation

22 of 27

Select

σEditor='Smith' Books

σAuthor=Editor Books

  • What changes between Books and σBooks?
  • What do we need to know to do a σ?
  • How should we turn this algorithm into code?

Title

Author

Editor

'Home Renovations'

'Fortt'

'Smith'

'Scuba Diving'

'Jones'

'Brown'

'Italian Cooking'

'Smith'

'Smith'

'Travel the World'

'Barker'

'Hales'

Books

23 of 27

Project

πTitle Books

πAuthor,Editor Books

  • What changes between Books and πBooks?
  • What do we need to know to do a π?
  • How should we turn this algorithm into code?

Title

Author

Editor

'Home Renovations'

'Fortt'

'Smith'

'Scuba Diving'

'Jones'

'Brown'

'Italian Cooking'

'Smith'

'Smith'

'Travel the World'

'Barker'

'Hales'

Books

24 of 27

Rename

ρTitle ← headline Books

ρAuthor ← writer Books

  • What changes between Books and ρBooks?
  • What do we need to know to do a ρ?
  • How should we turn this algorithm into code?***

Title

Author

Editor

'Home Renovations'

'Fortt'

'Smith'

'Scuba Diving'

'Jones'

'Brown'

'Italian Cooking'

'Smith'

'Smith'

'Travel the World'

'Barker'

'Hales'

Books

25 of 27

Query Flowchart

snap(X,'Snoopy',A,X)?

-This query will do type 2 select on the X variables

-Also type 1 select on the string 'Snoopy'

What type of parameter is this?

Do a select

(Type 1)

Have we seen this variable before?

Variable

Constant

Do a select

(Type 2)

Mark it to keep for the project and rename

Yes

No

For each parameter:

26 of 27

Evaluate Predicate -- Mark for Project and Rename

  • Go through and do all the selects
    • How do you keep track of the variables you've already seen and what columns you saw them in?
    • map<string, int> is a good idea, but this alphabetizes them
    • Also keep a vector<string> to retain the order we saw them
  • Keeping track of the first time we see each variable prepares us for Project and Rename, plus type-2 selects.
  • Project each column that we have saved (not in alphabetical order)
  • Rename columns to their corresponding variable names

27 of 27

Example

Schemes:

abcde(a,b,c,d,e)

Facts:

abcde('a','b','c','d','e').

abcde('s','t','r','r','s').

abcde('t','t','t','t','t').

abcde('g','t','g','h','g').

abcde('m','n','m','n','m').

Rules:

Queries:

abcde(X,'t',X,Y,X)?

0

1

2

3

4

X

't'

X

Y

X

Var

Col

X

0

Y

3

For selects:

0

3

X

Y

For project:

For rename:

Query parameters:

abcde(X,'t',X,Y,X)? Yes(2)

X='g', Y='h'

X='t', Y='t'