1 of 14

CSE 344/414

NoSQL, SQL++

2 of 14

JSON and Semi-Structured Data

JSON, XML, Protobuf

Used in HTTP request/response

  • Good for data exchange
  • Maps to OOP paradigm

Also as a database file

  • Flexible tree-structured model
  • Query langs: XQuery, XPath, etc.

3 of 14

AsterixDB, SQL++

  • A semi-structured NoSQL style data model (ADM)
  • Extends JSON with object database ideas

Terms

  • DDL: type (open vs. closed), data types (e.g. multiset). Creating an index.
  • DML: Heterogeneous Collections, Nesting / Unnesting.

4 of 14

What is SQL++?

Just like SQL but parsed for processing JSON data

SQL++ has keywords to handle collections of data (i.e. non-flat data)

Why design SQL++ like SQL? Why not some other query language?

People are used to/like specifying data through SQL syntax

SQL-like language enforces idea of physical data independence

5 of 14

Output with Structure

If we wanted the name and a list of orders:

SELECT P.name, P.orders

FROM Person P;

{“name”: “Dan”, “orders”: [ { “date”: 1997,

“product”: “Furby”} ]}

{“name”: “Alvin”, “orders”: [ {date”: 2000,

“product”: “Furby”},

{date”: 2012,

“product”: “Magic8”} ]}

6 of 14

Unnesting

If we wanted the name and each product separately:

SELECT P.name, O.product as product

FROM Person P, P.orders O;

{“name”: “Dan”, “product”: “Furby”}

{“name”: “Alvin”, “product”: “Furby”}

{“name”: “Alvin”, “product”: “Magic8”}

7 of 14

Nesting

If we wanted the name and all products as a list:

SELECT P.name, products

FROM Person P

LET products = (SELECT O.product FROM P.order O);

{“name”: “Dan”, “products”: [“Furby”] }

{“name”: “Alvin”, “products”: [“Furby”, “Magic8”]}

8 of 14

Nesting (Same)

If we wanted the name and all products as a list:

SELECT P.name, (SELECT O.product FROM P.order O)� AS products

FROM Person P;

{“name”: “Dan”, “products”: [“Furby”] }

{“name”: “Alvin”, “products”: [“Furby”, “Magic8”]}

9 of 14

Compare

If we wanted the name and all products as a list:

SELECT P.name, products

FROM Person P

LET products = (SELECT O.product

FROM P.order O);

{“name”: “Dan”, “product”: [“Furby”] }

{“name”: “Alvin”, “product”: [“Furby”, “Magic8”]}

If we wanted the name and each product separately:

SELECT P.name, O.product as product

FROM Person P, P.orders O;

{“name”: “Dan”, “product”: “Furby”}

{“name”: “Alvin”, “product”: “Furby”}

{“name”: “Alvin”, “product”: “Magic8”}

10 of 14

HW SQL++ Example

SELECT y.`-car_code` as code,

y.name as name

FROM geo.world x, x.mondial.country y

ORDER BY y.name;

11 of 14

Casing

SELECT y.name, z.`-continent` as continent

FROM geo.world x, x.mondial.country y,

case when is_array(y.encompassed)

then y.encompassed

else [y.encompassed] end z

ORDER BY y.name;

“Find the encompassing continents for each country”

12 of 14

Useful Keywords/Syntax for HW

is_array( … ) -----> checks if value is an array

split(s, d) -----> splits string s on delimiter d

[ … ] -----> explicitly construct array

(CASE WHEN … THEN … ELSE … END) -----> combine with “is_array(…)

MISSING -----> reserved keyword like “NULL

` … ` -----> backtick needed for accessing keys with names containing “-

13 of 14

14 of 14

Demo