1 of 45

CS 3200

SQL - Part 1

Mark Fontenot, PhD

Northeastern University

1

2 of 45

Setting the Stage…

  • Relational Algebra → query language for set of relations based on Relational Data Model
  • The relational data model forms the foundation for modern relational database management systems (RDBMS) such as MySql, Postgres, Oracle DB, MS SQL Server, etc.
  • Each RDBMS provides a query language that is a “flavor” of the Structured Query Language (SQL) standard.

3 of 45

SQL: Structured Query Language

  • SQL is a standardized language* used for managing and manipulating relational databases.
  • SQL is mostly a declarative language → say what you want rather than how to get it
  • but … many RDBMS vendors add procedural extensions to their SQL implementations
    • PL/SQL from Oracle
    • T-SQL from Microsoft SQL Server

3

* See the Standards Overview section of the SQL Wikipedia Article

4 of 45

Categories of SQL Commands

  • Major Categories of SQL Commands:
    • DQL (Data Query Language)
      • SELECT Statements
    • DDL (Data Definition Language)
      • CREATE, ALTER, and DROP Statements
    • DML (Data Manipulation Language)
      • INSERT, UPDATE, and DELETE Statements
    • some resources break it down even further

5 of 45

SQL vs. Relational Algebra

  • SQL is based on multisets where RA is based on sets.
  • Data is always in some order in SQL where order doesn’t matter in RA.
  • SQL has implementation-specific limits on max size of a row, max size of a table, etc. where RA has no such limits
    • details are RDBMS-product specific

6 of 45

SQL vs. Relational Algebra

  • Terminology Differences

Relational Algebra

SQL

Relation (Relation Instance)

Table

Relation Schema

Table Schema

Attribute

Attribute or Column

Tuple

Row

Primary Key

Primary Key

Foreign Key

Foreign Key

7 of 45

NorthWind DB Schema

7

8 of 45

It’s SQL Time:

The SELECT Statement

9 of 45

Aside: Generic Syntax Statements

  • Explanation of Generic Syntax in upcoming Slides
    • <...> - Placeholder
      • Indication that you should replace the placeholder (including the <>s) with something specific
      • Example: SELECT <value>; indicates that <value> should be replaced with an actual value, as in SELECT 123;
    • [...] - Optional
      • Indication that the component(s) enclosed in [...] are not always needed for a statement to be valid.
      • Example: SELECT <value> [FROM <table>]; indicates either of the following are syntactically valid:
        • SELECT 123;
        • SELECT 123 FROM categories;

10 of 45

Intro to SELECTs

  • SELECT statement is the querying “powerhouse” of SQL.
    • includes ways to filter horizontally and vertically, combine tables, aggregate by groups and more!�
  • The (almost) simplest SELECT statement

  • Retrieves all FirstName values from table Employees.�IOW - it returns the FirstName column.

10

SELECT FirstName

FROM Employees;

11 of 45

Intro to SELECTs

  • General Form of a Simple SELECT Statement:

What this does: retrieves a table with attributes <col1>, <col2>, … from the identified table.

Note: I will capitalize keywords in SQL statements, but not strictly required by SQL.

11

SELECT <col1>[, <col2>, …]

FROM <table name>;

12 of 45

SELECT Specific

Columns

Returns a table containing the ProductID and ProductName columns from the Products table.

12

SELECT ProductID, ProductName

FROM Products;

UnitPrice

UnitsInStock

13 of 45

Retrieve All Columns

  • Returns all columns and rows from the Products table
    • Basically, returns the complete Products table
  • * is a wildcard character… in this case, it means “all columns”

13

SELECT *

FROM Products;

UnitPrice

UnitsInStock

14 of 45

Renaming/Aliasing �Columns

14

SELECT ProductID AS p_id, � ProductName AS p_name

FROM Products;

SELECT <column> AS <newName>, …

FROM <table name>;

General�Form

Example

UnitPrice

UnitsInStock

15 of 45

Renaming/Aliasing �Columns

Returns a table with columns named p_id and p_name.

15

SELECT ProductID AS p_id, � ProductName AS p_name

FROM Products;

UnitPrice

UnitsInStock

16 of 45

Basic Calculations

  • You can perform mathematical operations between existing attributes in the table(s).
    • Note that functions available depend on the RDBMS
    • You should alias the resulting calculated attribute with AS.

16

SELECT OrderID, ProductID, � (UnitPrice * Quantity) AS cost

FROM OrderDetails;

UnitPrice

UnitsInStock

17 of 45

Sample Query 1

For all customers, provide a list containing the customer name, contact person’s name, their city, and postal code.

17

UnitPrice

UnitsInStock

18 of 45

Sample Query 2

Your boss sends you the following email… “Hi! Could you get me a list of all products with our current stock levels? Thanks!”

18

UnitPrice

UnitsInStock

19 of 45

Sample Query 3

The stockroom manager needs to know the current value of each product we currently have in stock. So, if we have 10 units of Sharp Cheddar, what is the total value for those 10 units. Can you help them?

19

UnitPrice

UnitsInStock

20 of 45

Selecting Specific Rows with WHERE

21 of 45

WHERE -

  • WHERE clause - includes predicates applied to every row to determine if it is returned or not.

    • Each condition should evaluate to true or false.

21

SELECT <list of columns>

FROM <table name>

[WHERE <condition list>]

General�Form

Why condition list?

22 of 45

WHERE Examples

22

SELECT ProductName, UnitPrice

FROM Products

WHERE UnitPrice < 15;

- Give the names and unit prices of products that are less than $15.

UnitPrice

UnitsInStock

23 of 45

WHERE Examples

23

SELECT LastName, FirstName, HireDate

FROM Employees

WHERE HireDate >= ‘2014-01-01’;

- Give the names of any employees hired on or after �Jan 1 2014.

Notice the way we format a date literal.

UnitPrice

UnitsInStock

24 of 45

Boolean Expression Operators

24

Operator

Meaning

Example

=

Equal to

InvoiceId = 2

<>

Not equal to

Name <> 'U2'

< or >

Less/Greater than

UnitPrice < 5

<= or >=

Less/Greater than or equal to

UnitPrice >= 0.99

LIKE

Matches pattern

PostalCode LIKE 'T2%'

IN

Within a set

City IN ('Calgary', 'Edmonton')

IS or IS NOT

Compare to NULL

ReportsTo IS NULL

BETWEEN

Inclusive range (esp. dates)

UnitPrice BETWEEN 0.99 AND 1.99

  • Boolean expressions can be combined with logical AND or OR.

25 of 45

WHERE Examples

25

SELECT ProductName, UnitPrice

FROM Products

WHERE SupplierID = 7 AND UnitPrice < 15 ;

- Give the names and unit prices of products from supplier with id 7 that are less than $15..

UnitPrice

UnitsInStock

26 of 45

WHERE Examples

26

SELECT OrderID

FROM Orders

WHERE CustomerID = 5 OR CustomerID = 12;

- Give a list of all orders placed by customer with id 5 or customer with id 12. Only return OrderID.

UnitPrice

UnitsInStock

UnitPrice

UnitsInStock

27 of 45

WHERE - IN operator

  • Check if attribute value is within an enumerated list of values

27

SELECT CompanyName, ContactName

FROM Customers

WHERE Region IN (‘British Isles’, ‘Central America’);

- The boss wants a list of our customers’ company names and contact names for all customers in the British Isles or Central America.

28 of 45

WHERE - LIKE Operator

  • LIKE operator is used for pattern matching in strings
    • <column|expression> LIKE ‘<pattern>’
    • <pattern> may contain wildcard characters
      • % - matches any sequence of zero or more characters in the string
      • _ (underscore) - matches any single character in the string
    • Any other character in <pattern> matches itself or its lower/upper case equivalent (i.e. case-insensitive matching)

SELECT CustomerID, CompanyName

FROM Customers

WHERE CompanyName LIKE ‘fo%’;

29 of 45

WHERE - Your Turn

1. Provide a list of products from the supplier with id of 12 that cost more than $25.

29

UnitPrice

UnitsInStock

30 of 45

WHERE - Your Turn

2. Customer relations wants a list of all customer info for any customer in Boston or New York (cities).

30

31 of 45

Adding the ORDER BY Clause

32 of 45

Sorting the Results

Specify the ordering of the rows in the output.

For each attribute in order by clause, ASC (default) or DESC can be specified.

32

SELECT <list of columns>

FROM <table name>

[WHERE <condition list>]

[ORDER BY <column-order list>];

General Form

33 of 45

Sorting the Results

Specify the ordering of the rows in the output.

If <column-order list> contains more than one column, the first column listed is the primary, and the next column is used to break ties when the value in the primary column are equal.

33

SELECT <list of columns>

FROM <table name>

[WHERE <condition list>]

[ORDER BY <column-order list>];

General Form

34 of 45

Output Order Example

34

SELECT ProductName, UnitPrice, UnitsOnOrder

FROM Products

WHERE UnitsOnOrder > 0

ORDER BY ProductName DESC;

- Give the names, unit prices, and number of units in stock for any products we are currently waiting to receive. Sort the list by product name in reverse alphabetical order.

UnitPrice

UnitsInStock

UnitsOnOrder

UnitPrice

UnitsInStock

35 of 45

Output Order Example

35

SELECT CompanyName, City, Country, Region

FROM Customers

ORDER BY Region, CompanyName;

- Provide a list of our customers’ company names with city, country, and region sorted by company name within each region.

Notice…. you aren’t required to have a WHERE clause.

UnitPrice

UnitsInStock

36 of 45

Removing Duplicates

  • Use the SELECT DISTINCT to remove duplicate rows from the result
    • Note: Only put DISTINCT once, not for each attribute

SELECT DISTINCT Country

FROM Customers

ORDER BY Country ASC;

- Provide a list of all Countries where Northwind currently has customers. Sort the result alphabetically.

UnitPrice

UnitsInStock

37 of 45

What do you think?

What do you think the potential difference between the output of these two queries?

37

SELECT DISTINCT ShipCity, ShipCountry

FROM Orders

WHERE shipregion = 'Western Europe'

ORDER BY shipcity, shipcountry;

SELECT ShipCity, ShipCountry

from Orders

where shipregion = 'Western Europe'

order by shipcity, shipcountry;

versus

38 of 45

Aggregate Functions

39 of 45

Aggregate Functions

- Allows us to apply some function to all rows in the result set of a query

- When used alone, resultset will be a single row

- Common aggregate functions:

- MAX

- MIN

- SUM

- AVG

- COUNT

39

SELECT MAX(UnitPrice)

FROM Products;

SELECT COUNT(*)

FROM Products;

Don’t forget… you can alias these to more readable column names for the output.

40 of 45

Aggregate Function Practice

What’s the average price of all products we sell?

What’s the average price of all non-discontinued products we sell?

40

41 of 45

Aggregate Function Practice

What is the date of the first order we ever processed?

How many orders has “Alfreds Futterkiste” placed with us?

41

42 of 45

SET Operations

43 of 45

Union

43

SELECT City

FROM Employees

WHERE Region = 'Western Europe'

UNION ALL

SELECT City

FROM Customers

WHERE Region = 'Western Europe';

Provide a list of all cities in Western Europe where either employees live or customers are based.

Includes duplicates

Use UNION (without ALL) to remove dupes

44 of 45

Intersection

44

SELECT City

FROM Employees

WHERE Country = 'USA'

INTERSECT

SELECT City

FROM Customers

WHERE Country = 'USA';

Provide a list of all cities in the USA where we have BOTH employees and customers.

There’s no ALL version with Intersect

45 of 45

Except

45

SELECT City

FROM Employees

WHERE Country = 'USA'

EXCEPT

SELECT City

FROM Customers

WHERE Country = 'USA';

Provide a list of all cities in the USA where Employees live but where we have no customers.