CSE 344/414
NoSQL, SQL++
JSON and Semi-Structured Data
JSON, XML, Protobuf
Used in HTTP request/response
Also as a database file
AsterixDB, SQL++
Terms
What is SQL++?
Just like SQL but parsed for processing JSON data
SQL++ has keywords to handle collections of data (i.e. non-flat data)
Why design SQL++ like SQL? Why not some other query language?
People are used to/like specifying data through SQL syntax
SQL-like language enforces idea of physical data independence
Output with Structure
If we wanted the name and a list of orders:
SELECT P.name, P.orders
FROM Person P;
{“name”: “Dan”, “orders”: [ { “date”: 1997,
“product”: “Furby”} ]}
{“name”: “Alvin”, “orders”: [ {date”: 2000,
“product”: “Furby”},
{date”: 2012,
“product”: “Magic8”} ]}
Unnesting
If we wanted the name and each product separately:
SELECT P.name, O.product as product
FROM Person P, P.orders O;
{“name”: “Dan”, “product”: “Furby”}
{“name”: “Alvin”, “product”: “Furby”}
{“name”: “Alvin”, “product”: “Magic8”}
Nesting
If we wanted the name and all products as a list:
SELECT P.name, products
FROM Person P
LET products = (SELECT O.product FROM P.order O);
{“name”: “Dan”, “products”: [“Furby”] }
{“name”: “Alvin”, “products”: [“Furby”, “Magic8”]}
Nesting (Same)
If we wanted the name and all products as a list:
SELECT P.name, (SELECT O.product FROM P.order O)� AS products
FROM Person P;
{“name”: “Dan”, “products”: [“Furby”] }
{“name”: “Alvin”, “products”: [“Furby”, “Magic8”]}
Compare
If we wanted the name and all products as a list:
SELECT P.name, products
FROM Person P
LET products = (SELECT O.product
FROM P.order O);
{“name”: “Dan”, “product”: [“Furby”] }
{“name”: “Alvin”, “product”: [“Furby”, “Magic8”]}
If we wanted the name and each product separately:
SELECT P.name, O.product as product
FROM Person P, P.orders O;
{“name”: “Dan”, “product”: “Furby”}
{“name”: “Alvin”, “product”: “Furby”}
{“name”: “Alvin”, “product”: “Magic8”}
HW SQL++ Example
SELECT y.`-car_code` as code,
y.name as name
FROM geo.world x, x.mondial.country y
ORDER BY y.name;
Casing
SELECT y.name, z.`-continent` as continent
FROM geo.world x, x.mondial.country y,
case when is_array(y.encompassed)
then y.encompassed
else [y.encompassed] end z
ORDER BY y.name;
“Find the encompassing continents for each country”
Useful Keywords/Syntax for HW
is_array( … ) -----> checks if value is an array
split(s, d) -----> splits string s on delimiter d
[ … ] -----> explicitly construct array
(CASE WHEN … THEN … ELSE … END) -----> combine with “is_array(…)”
MISSING -----> reserved keyword like “NULL”
` … ` -----> backtick needed for accessing keys with names containing “-”
Demo