1 of 372

DATABASE ORGANIZATION

Lecturer: Ph.D. , Associate Professor, Associate Professor of Department 105

National Aerospace University

"Kharkiv Aviation Institute"

Karatanov Oleksandr Volodymyrovych

2 of 372

What will you learn in this course?

Conceptual database model

Database data model

Physical model of the database

SQL

Database design

DDL

DML

DQL

View

3 of 372

What will you learn in this course?

Development of a database for the client

in Visual Studio . C#.

User interface

Database connection

Database navigation and search

Delegates

4 of 372

Who can you become after studying the course?

  • database analyst;
  • database developer;
  • database system administrator;
  • someone else in the IT field.

5 of 372

Most popular languages

Programming, scripting, and markup languages

6 of 372

Most popular languages

© Zecomms Agency , 2023

7 of 372

Most popular languages (Djinni)

8 of 372

Database administrator salaries abroad

https://money.usnews.com/careers/best-jobs/database-administrator/salary

9 of 372

  • Bachelors degree in Computer Science , Computer or Electrical Engineering, Mathematics, or a related field

  • Minimum of 5 years Database Administration experience

  • Minimum of 5 years data modeling experience

  • Excellent SQL and DB performance tuning skills

  • Minimum of 5 years troubleshooting and problem-solving experience.

Database Engineer – FinTech

10 of 372

11 of 372

Database​​​

  1. a set of interrelated data organized according to a database schema in such a way that a user can work with them [GOST 34.321-96, DSTU ISO/IEC 2382:2017 Information technologies. Glossary of terms ].

  • a named structured collection of data related to a specific subject area [ DSTU 2226-93 ].

12 of 372

Data models

  • Database schema ( Database schema ) – a formal description of data according to a specific data schema [GOST 34.321-96].

  • Data schema ( Data schema ) or Model data – logical representation of data organization [GOST 34.321-96].

13 of 372

Data models

Data models

Infological

Datalogical

Documentary

Factual

Graph-theoretic

Network

Hierarchical

Set-theoretic

Relational

Object-oriented

Physical

14 of 372

Types of data models

Hierarchical

    • Relationships between data are firmly fixed (ordered graph)
    • Changing the connection leads to a reorganization of the structure
    • The number of connections is limited.

Network

    • The nature of the connections is more diverse (arbitrary graph)
    • It is difficult to introduce changes.

Relational

    • Tables are independent
    • The connections are completely interchangeable.
    • Ease of expansion

15 of 372

Hierarchical data model

  • This is a data model that uses a database representation in the form a tree-like (hierarchical) structure consisting of objects (data) of different levels.

16 of 372

  • There are relationships between objects, each object can include several lower-level objects.
  • Such objects:
    • ancestor (object closer to the root);
    • descendant (lower-level object).
  • An ancestor object may have no descendants or may have several.
  • In a child object necessarily only one ancestor .
  • Objects that share a common ancestor are called twins .

Root

Descendant

Ancestor

Bypass procedure

17 of 372

Network data model

  • The difference between a hierarchical data model and a network data model is that in hierarchical structures, a child record must have exactly one ancestor , and the network data structure of the descendant can be any number of ancestors .

In a network model, connections between all information objects and all others are possible.

For example, each teacher can teach many students and each student can study with many teachers.

Using hierarchical or network models speeds up access to information, but requires significant memory resources .

Teacher 1

Teacher 2

Teacher 3

Student 1

Student 2

Student 3

Student 4

18 of 372

Semantic network

  • Semantic network – an information model of a subject area, which has the form of a directed graph, the vertices of which correspond to objects of the subject area, and the arcs (edges) define the relationships between them.

19 of 372

Relational data model

  • relational data model aims to organize data as relationships .

  • Developed in the early 1970s by Edgar F. Codd .

20 of 372

Subject area model

When designing a database, the subject area is in the form of three models:

  1. representation of the subject area as it really exists infological (informational, semantic, conceptual) model;
  2. the data model as it is presented by the database designer data logical (logical) model;
  3. a data model in a form suitable for storage in computer memory a physical model.

The data logical model is designed based on the information model, by normalizing it and reducing it to third normal form.

21 of 372

Database design

0. Analysis of the subject area.

  1. Designing an information model.
  2. Development of a data logical model.
  3. Physical database design.

database design process is a sequential transition from an informal verbal description of the information structure of a subject area to a formalized description in terms of a certain model and, in turn, includes several stages.

22 of 372

DEVELOPMENT OF AN INFOLOGICAL MODEL

23 of 372

0 Domain Analysis

  1. What data will the database work with?
  2. Who will work with the database?
  3. How is the database work?

As a result, we should get a verbal description of the information objects of the subject area.

Designing an information model

Development of a data logical model

Physical database design

24 of 372

Subject area

  • At the heart of any database is the concept of a subject area.
  • A subject area is a part of the real world, information about which should be contained in the database.

25 of 372

Example of a subject area description

  • The company repairs computers.

  • It is necessary to develop a database to store information about the performance of repair work by the company's employees.

  • When placing an order , the date of order execution , type of work performed , and the person performing the work are recorded .

26 of 372

1 Designing an infological model

  • An infological model allows you to represent a subject area in a formalized form.

  • To build an infological data model, various types of semantic models are used : semantic networks , the entity-relationship model ( ER model ), the IDEF1X model, etc.

Designing an information model

Development of a data logical model

Physical database design

27 of 372

Designing an infological model

  1. After performing a subject area analysis, identify entities (table names) to represent them in the database.

  • Reveal attributes (column names) of entities.

  • Enter (select) a unique identifier attribute. It is called the primary key (or simply "key" ) .

  • Identify the type of dependencies ( relationships ) of entities.

28 of 372

Entity- relationship model ( ER model ) ERM) – a data model that allows you to describe the conceptual schemes of a subject area.

29 of 372

IDEF1X (IDEF1 Extended ) Data Modeling database modeling methodology based on the entity-relationship model.

It is used to build an information model that represents the structure of information necessary to support the functions of a production system or environment.

30 of 372

Basic concepts of ER diagrams

  1. Essence
  2. Entity instance
  3. Entity attribute
  4. Entity key
  5. Communication

31 of 372

Essence

  • An entity is a class of objects of the same type, information about which must be contained in the model.
    • Each entity must have a name expressed by a singular noun .
    • Examples of entities can be such object classes as "Student", "Faculty", "Teacher". Each entity of the model is depicted as a rectangle with a name.

32 of 372

Entity instance

  • An entity instance is a specific representative of that entity.
    • For example, the representative of the entity "Student" could be "Student Bozhenko".
    • Instances of entities must be distinct, meaning entities must have some properties that are unique to each instance of that entity.
    • Entity instances are not necessarily reflected in the database model, but we should keep them in mind when designing.

No.

Full name

Specialty

Group

1

Bozhenko Liliya Serhiivna

123 Computer Engineering

535-in

33 of 372

Entity attribute

  • An entity attribute is a named characteristic that is some property of an entity.
    • The name of the attribute should be expressed as a singular noun (possibly with characterizing adjectives).
    • Examples of attributes of the "Student" entity can be attributes such as "Student ID Number", "Last Name", "First Name", "Patronymic", "Group", "Specialty", etc.
    • Attributes are depicted within a rectangle that defines the entity.

34 of 372

Entity key

  • An entity key is a non-redundant set of attributes whose values are collectively unique for each instance of the entity.
    • This is the minimum set of attributes whose values can be used to uniquely find the required entity instance.
    • Key attributes are displayed in the diagram by underlining and/or by the abbreviation PK ( Primary Key ) .

35 of 372

Simple and compound key

  • A composite key is a key that consists of two or more attributes ( table columns ) that together uniquely identify an entity (table row).

Compound key

Simple key

36 of 372

Natural vs Surrogate Key

  • A natural key (also known as a business key or domain key) is a type of unique key in a database, formed from attributes that exist and are used in the external world outside the database .
  • Advantages of using a natural key:
    • less disk space usage;
    • facilitates database model design ;
    • simplifies its understanding.

37 of 372

Communication

  • A connection is some association between two entities.
    • One entity can be related to another entity or to itself.
    • For example, relationships between entities can be expressed in the following phrases:

"A teacher can have multiple Disciplines",

"Every student must study at least one specialty."

    • Graphically, a relationship is represented by a line connecting two entities.

38 of 372

Types of relationships between entities

  • "one to one" - each instance of the first entity corresponds to only one instance of the second entity.
  • Such connections mostly don't make sense, because these two entities can usually be combined into one.
    • Example: each company has one director.

  • "one-to-many" - each instance of the first entity can correspond to multiple instances of the second entity, but not vice versa.
    • Example : One company has many board members.

  • "many to many" - each instance of the first entity can correspond to several instances of the second entity and vice versa.
  • It is best to avoid such connections. They are not supported by the DBMS.
    • Example: each company can have several lawyers, while each lawyer can serve several companies.

39 of 372

Overview of software for designing database architecture

MySQL Workbench

MS Visio

ErWin

ARIS

draw.io

40 of 372

Crow's Foot Notation� (Crow's foot)

  • Three symbols are used to represent relationships:
    1. the circle is "zero";
    2. a dash represents "one";
    3. crow's foot " many" or "infinity".

41 of 372

Types of connections in �Crow's Foot notation

Description

Symbol

Circle and dash: minimum 0, maximum 1 (optional)

Two dashes: minimum 1, maximum 1 (required)

Circle and crow's foot: minimum 0, maximum many (optional)

Dash and crosier: minimum 1, maximum many (required)

The inner element of the notation represents the minimum value, and the outer element (closest to the essence) represents the maximum.

42 of 372

Subject area

  • Let's consider the dean's office data model.

ID

Full name

Faculty

Chair

Group

Rating for "OD"

OBD assessment

1

Duraev Andriy Sergeevich

1

105

126

failure to complete

unsatisfactorily

2

Barely

Stanislav

Yuriyovych

1

105

126

test

perfectly

3

Shifershtein Eduard Volodymyrovych

1

105

126

test

satisfactorily

43 of 372

Data model

44 of 372

DEVELOPMENT OF A DATA LOGICAL MODEL

45 of 372

Normalization of relations

Normalization is a sequential process of partitioning and transforming some small source set of tables to construct a set of interdependent tables in normal forms.

Definition for the alternatively gifted

Normalization – this is when we had a lot of tables, and we would make even more so that they would comply with some rules.

Designing an information model

Development of a data logical model

Physical database design

46 of 372

Normal form

Normal form (NF) is a requirement for the structure of tables in theory relational databases to remove redundant data from the database functional dependencies between attributes (table fields).

eight NFs in total :

1-6 NF;

NF Boyce-Codd;

domain-key NF.

47 of 372

Functional dependency

A set of attributes Y is functionally dependent on X if and only if at any point in time for each of the different values of Y there is only one of the different values of X.

There is also an equivalent term: plural X determines Y .

Designation –

XY .

Each of the normalization levels restricts the types of allowable functional dependencies of a relation.

determinant

dependent part

48 of 372

Functional dependency. Example

  • Consider the relationship given by the following scheme:

schedule (Pilot, Flight, Date, Time).

  • Their dependence is given by the following constraints :
    • only the departure time is specified for each flight;
    • for the attributes ( Pilot, Date, Time ) only one flight is defined;
    • for attributes ( Flight, Date ) a single pilot is defined.

  • Thus, the set is given functional dependencies :
    • Flight → Time
    • (Pilot, Date, Time)→ Flight
    • (Flight, Date)→ Pilot

49 of 372

1 normal form

  • Conditions of the first normal form:
    1. each entity corresponds to a separate table;
    2. each set of related data is identified by a primary key;
    3. fields do not have duplicates in each record;
    4. each field contains only one value ( atomically ).

Atomicity ( indivisibility ) of a field means that the values it contains must not be divided into smaller ones.

50 of 372

1 normal form. Example

  • Let's say for a relationship with the Flight schema (Number, Destination, Departure), the Departure attribute is defined as the pair (Day, Time) .
  • In this case, it is easy to implement queries like “Return all flights to Kharkiv,” unlike the query “Return all flights departing on Monday.” From the perspective of the second query, the relation is not in 1NF.

Flight

Number

Destination

Departure

1

Kharkiv

Mon , 9:40

1

Kharkiv

Wed, 10:20

632

Kyiv

Mon , 6:00

632

Kyiv

Tue , 6:00

Duplication of information

Lack of atomicity

51 of 372

1 normal form. Example

  • The transformation is obvious: the relation is replaced by another one with the scheme:

Flight (Number, Destination, Departure day, Departure time).

  • This will allow for atomicity .

Flight

No.

Destination

Departure

1

Kharkiv

Mon , 9:40

1

Kharkiv

Wed, 10:20

632

Kyiv

Mon , 6:00

632

Kyiv

Tue , 6:00

Allocating destinations to a separate table will eliminate duplication of information.

Flight

No.

Destination

Departure day

Departure time

1

Kharkiv

Mon

9:40

1

Kharkiv

Wed

10:20

632

Kyiv

Mon

6:00

632

Kyiv

Tue

6:00

Flight

No.

Destination ID

Departure day

Departure time

1

78

Mon

9:40

1

78

Wed

10:20

632

193

Mon

6:00

632

193

Tue

6:00

Destination

Destination ID

Destination

78

Kharkiv

193

Kyiv

52 of 372

1 normal form. Example 2

  • If the "Division" field contains the name of the faculty and department, the requirement of indivisibility is not met and it is necessary to separate the name of the faculty or department into a separate field.

Teachers

No.

Full name

Unit

1

Karatanov O. V.

1, 105

2

Krytskyi D. M.

1, 105

3

Brega D. A.

1, 101

Teachers

No.

Full name

Faculty

Chair

1

Karatanov O. V.

1

105

2

Krytskyi D. M.

1

105

3

Brega D. A.

1

101

53 of 372

1 normal form. Redundancy

Employee number

Name

Specialty

Manager number

Building No.

1235

Kameneva

Electrician

1311

312

1235

Kameneva

Electrician

1311

515

1412

Wet

Plasterer

312

1412

Wet

Plasterer

460

1412

Wet

Plasterer

435

1412

Wet

Plasterer

515

1311

Dosya

Electrician

435

Employee

Name

Specialty

Manager number

No.

buildings

1235

Kameneva

Electrician

1311

{312, 515}

1412

Wet

Plasterer

{312, 460, 435, 515}

1311

Dosya

Electrician

435

Data redundancy data duplication in the database.

Reduction to first normal form provokes redundancy

Example 3

54 of 372

1 normal form. Integrity

  • Data redundancy or duplication not only wastes space; it can also cause data integrity issues (inconsistencies) in the database.

Data integrity consistency of data in the database.

55 of 372

Anomalies

  • To achieve integrity, anomalies must be eliminated.

    • Data inconsistency caused by data redundancy and partial updates

Update anomaly

    • Unintentional data loss caused by deleting other data

Removal anomaly

    • The inability to enter data into the table is caused by the lack of other data.

Input anomaly

56 of 372

1st normal form. Example 4

  • Let's assume that during logical modeling, in the first step, it is proposed to store data in a single relation that has the following attributes:

  • The potential key of the relation should be a pair of attributes (Employee's timesheet number , Project number), since each employee in each project performs exactly one task.
  • From the table EMPLOYEES_DEPARTMENTS_PROJECTS It is clear that these relations are preserved in it with great excess .
  • Many lines contain repeated employee names, phone numbers, and project names. In addition, in this regard, independent data is stored together - data about employees, departments, projects, and project work.

Employee's record number

Name

Department number

Department

Project number

Project name

Task number

1

Karpenko

1

11-22-33

1

Space

1

1

Karpenko

1

11-22-33

2

Climate

1

2

Yen

1

11-22-33

1

Space

2

3

Konovalova

2

33-22-11

1

Space

3

3

Konovalova

2

33-22-11

2

Climate

2

EMPLOYEES_DEPARTMENTS_PROJECTS

57 of 372

1st normal form. Example 4

  • The relationship EMPLOYEES_DEPARTMENTS_PROJECTS is decomposed into three relationships – COLLABORATORS_DEPARTMENTS, PROJECTS, TASKS.

Employee's record number

Name

Department number

Department

1

Karpenko

1

11-22-33

2

Yen

1

11-22-33

3

Konovalova

2

33-22-11

Table 2. Relationship EMPLOYEES_DEPARTMENTS

Project number

Project name

1

Space

2

Climate

Table 3. Relationship PROJECTS

Functional dependencies:

Project number

→ Project name

Employee's record number

Project number

Task number

1

1

1

1

2

1

2

1

2

3

1

3

3

2

2

Table 4. TASKS relationships

Functional dependencies:

( Employee's record number , Project number ) → Task number

Functional dependencies:

Dependence of attributes characterizing an employee on the employee's personnel number:

Registration number → Last name

Registration number → Department number

Dependence of phone number on department number:

Department number → Department phone number

58 of 372

2 normal form

  • Conditions of the second normal form:
    1. the model meets the conditions of 1 NF;
    2. all fields of each table that are not part of the primary key are linked by a full functional dependency to the primary key.
    3. Note: If the key of a relation is simple, then the relation is automatically in 2NF.

A simple key is a key with a single attribute.

A composite key is a key made up of two or more attributes.

59 of 372

Complete and partial functional dependence

  • A full functional dependency is a dependency in which a non-key field depends on all attributes of the composite primary key.

  • A partial functional dependency is a dependency in which a non-key field depends on some of the attributes of a composite primary key.

60 of 372

2nd normal form. Example

  • The entity EMPLOYEES_DEPARTMENTS is not in 2NF, because there are attributes that depend on part of the composite key:
    • The employee's last name does not depend on the department number.
    • In order to eliminate the dependency of attributes on part of the composite key, it is necessary to decompose the relationship into several relations. In this case, the attributes that depend on part of the composite key are transferred to a separate relation.

Employee's record number

Name

Department number

Department

1

Karpenko

1

11-22-33

2

Yen

1

11-22-33

3

Konovalova

2

33-22-11

Table 2. Relationships DEPARTMENT EMPLOYEES

Department number

Department

1

11-22-33

2

33-22-11

Employee's record number

Name

Department number

1

Karpenko

1

2

Yen

1

3

Konovalova

2

61 of 372

3 normal form

  • Conditions of the third normal form:
    1. the model satisfies the conditions of 2 NF;
    2. all non-key fields are independent of each other.

A non-key attribute is an attribute that is not part of the primary key.

Two or more attributes are called mutually independent if none of them is functionally dependent on any combination of the other attributes.

62 of 372

3rd normal form. �Alternative definition

  • Conditions of the third normal form:
    1. the model satisfies the conditions of 2 NF;
    2. there is no transitive functional dependence of key fields on the primary key.

A transitive functional dependency is a dependency where two attributes are related to each other through a third attribute.

63 of 372

+ and – normalizations

  • As can be seen from the table, more highly normalized relationships appear to be better designed ( three pluses , one minus ).
  • They are more relevant to the subject area, easier to develop, and database modification operations are faster for them.
  • However, this is achieved at the cost of some slowdown in data retrieval operations .

Criterion

The relations are weakly normalized (1NF, 2NF)

The relationship is strongly normalized (3NF)

Adequacy of the subject area database

WORSE (-)

BETTER (+)

Ease of database development and maintenance

MORE DIFFICULT (-)

EASIER (+)

Insert, update, delete execution speed

SLOWER (-)

FASTER (+)

Data sampling speed

FASTER (+)

SLOWER (-)

64 of 372

HIGHER NORMAL FORMS

65 of 372

BCNF ( Boyce -Codd Normal Form)

Boyce-Codd normal form conditions :

  1. the model satisfies the conditions of 3 NF;
  2. the determinants of all functional dependencies are potential keys.

Alternative item 2

  1. a relation must not have two (or more) possible keys that are complex and share a common attribute.

66 of 372

Potential key

  • There can be several potential keys at the same time.

  • One of them can be chosen as the primary key of the relation, then the other potential keys are called alternate keys.

Potential key in the relational data model – a subset of the attributes of a relationship that satisfies the requirements of uniqueness and minimality (irreducibility/ atomicity ).

Uniqueness means that there are no two tuples of this relation in which the values of this subset of attributes are the same (equal).

67 of 372

BCNF ( Boyce -Codd Normal Form)

  • There is a functional dependency Tariff Court Number in which the left-hand side (determinant) is not a potential key of the relation, i.e. the relation is not in Boyce-Codd normal form .

N court

Start time

End time

Tariff

1

09:30

10:30

"Frugal"

1

11:00

12:00

"Frugal"

1

14:00

15:30

"Standard"

2

10:00

11:30

"Premium- B"

2

11:30

13:30

"Premium- B"

2

15:00

16:30

"Premium-A"

Tariff

N court

Clubman

"Frugal"

1

Yes

"Standard"

1

No

"Premium-A"

2

Yes

"Premium- B"

2

No

Tariff

Start time

End time

"Frugal"

09:30

10:30

"Frugal"

11:00

12:00

"Standard"

14:00

15:30

"Premium- B"

10:00

11:30

"Premium- B"

11:30

13:00

"Premium-A"

15:00

16:30

The disadvantage of this structure is that, for example, the "Economical" rate can be mistakenly attributed to a reservation of 2 courts, although it may only apply to 1 court.

68 of 372

Fourth Normal Form (4 NF)

Conditions of the fourth normal form :

  1. the model complies with the BCNF conditions;
  2. all non-trivial multivalued dependencies are practically functional dependencies on its potential keys.

Alternative item 2

  1. all rows of the table must be independent of each other .

69 of 372

Fourth normal form

  • In other words, all rows of the table must be independent of each other .
  • In the sense that the presence of some row X should not mean that row Y is also somewhere in this table.

Relations with non-trivial multivalued dependencies typically arise from the natural combination of two relations on a common field that is not a key in either relation. This effectively results in an attempt to store information about two independent entities in a single relation.

For example, an employee may have many jobs and many children. Storing information about jobs and children in a single relation results in a non-trivial multivalued dependency Employee - Job - Children.

70 of 372

Fourth normal form

  • The decomposition of the relationship "Applicants-Faculties-Subjects" cannot be performed taking into account functional dependencies , since this relationship does not contain any functional dependencies . This relationship is completely key, that is, the key of the relationship is the entire set of attributes. But it is clear that there is some kind of relationship between the attributes. This relationship is described by the concept of multivalued dependency .

Entrant

Faculty

Subject

Fedorenko

Aircraft construction

Mathematics

Fedorenko

Aircraft construction

Informatics

Fedorenko

Sula

Mathematics

Fedorenko

Sula

Physics

Shelepova

Aircraft construction

Mathematics

Shelepova

Aircraft construction

Informatics

Entrant

Faculty

Fedorenko

Aircraft construction

Fedorenko

Sula

Shelepova

Aircraft construction

Faculty

Subject

Aircraft construction

Mathematics

Aircraft construction

Informatics

Sula

Mathematics

Sula

Physics

"Applicants-Faculties-Subjects" has a non-trivial multivalued dependency:

Faculty→→ Applicant | Subject.

71 of 372

Fourth Normal Form (4 NF)

There seems to be an update anomaly related to the fact that the names of applicants, names of faculties, and names of subjects are duplicated.

Elimination: by moving all names into separate relations, leaving only the corresponding numbers in the original relation.

Insertion anomaly . When trying to add a new tuple to the relationship "Applicants-Faculties-Subjects", for example ( Star , Aircraft Construction, Mathematics), we must also add a tuple ( Star , Aircraft Engineering, Computer Science), since all applicants to the Aircraft Engineering Faculty are required have the same list of items to be given. Accordingly, when trying to insert the tuple (3, 1, 1) into the modified relation, we must also insert the tuple (3, 1, 2).

Entrant

Faculty

Subject

Fedorenko

Aircraft construction

Mathematics

Fedorenko

Aircraft construction

Informatics

Fedorenko

Sula

Mathematics

Fedorenko

Sula

Physics

Shelepova

Aircraft construction

Mathematics

Shelepova

Aircraft construction

Informatics

Item No.

Subject

1

Mathematics

2

Informatics

3

Physics

Applicant number

Faculty No.

Item No.

1

1

1

1

1

2

1

3

1

1

3

3

2

1

1

2

1

2

Applicant number

Entrant

1

Fedorenko

2

Shelepova

Faculty No.

Faculty

1

Aircraft construction

2

Aviation engines

3

Sula

4

RCT

5

RTS LA

6

Economic

7

Humanitarian

Applicant number

Faculty No.

1

1

2

1

Faculty No.

Item No.

1

1

1

2

3

1

3

3

Removal anomaly.

Insertion and deletion of tuples cannot be performed independently of other tuples of the relation.

In addition, if we delete these tuples, information about the subjects that must be taken at the SULA faculty will be lost.

72 of 372

Fifth Normal Form (5NF)

Conditions of the fifth normal form ( projective -adjoint normal form) :

  1. the model satisfies the conditions of 4 NF;
  2. each non-trivial join dependency in it is defined by the potential key(s) of this relation.

Alternative item 2

  1. The model complies with certain constraints on the combination of attributes .

73 of 372

Fifth Normal Form (5 NF)

Suppose that it is necessary to take into account the following restriction : each seller has in his assortment a limited list of firms and a limited list of types of good goods and offers goods from the list of goods, produced by firms from the list of firms.

In the example, in particular, it is assumed that the seller Bova has the right to sell goods only from the company "Rogy ta Kopyta", the seller Oliynyk - goods only from the company "Kalashnikov", however, the seller Karpenko does not have the right to sell barbiturates and steroids, etc.

Seller

Firm

Goods

Bova

Horns and hooves

Amphetamine

Bova

Horns and hooves

Barbiturates

Oil varnish

Kalashnikov

AK-74

Oil varnish

Kalashnikov

Steroids

Karpenko

Horns and hooves

Amphetamine

Karpenko

Horns and hooves

Heroin

Karpenko

Kalashnikov

AKS-74U

Seller

Goods

Bova

Amphetamine

Bova

Barbiturates

Oil varnish

AK-74

Oil varnish

Steroids

Karpenko

Amphetamine

Karpenko

Heroin

Karpenko

AKS-74U

Sellers' goods

Seller

Firm

Bova

Horns and hooves

Oil varnish

Kalashnikov

Karpenko

Horns and hooves

Karpenko

Kalashnikov

Firm

Goods

Horns and hooves

Amphetamine

Horns and hooves

Barbiturates

Kalashnikov

AK-74

Kalashnikov

Steroids

Horns and hooves

Heroin

Kalashnikov

AKS-74U

Companies sellers

Goods firms

74 of 372

Non-trivial connection dependency

  • A compound dependence X is called a nontrivial compound dependence if two conditions are met:
    • one of the attribute sets X does not contain a potential key of relation Y.
    • none of the attribute sets of X coincides with the entire attribute set of relation Y.

  • For convenience, we will formulate the definition in the same way in negative form.

  • A connection dependency X is called a trivial connection dependency if one of the following conditions is met:
    • or all attribute sets X contain a potential relation key Y;
    • or one of the attribute sets of X coincides with the entire attribute set of the relation Y.

75 of 372

Sixth normal form

  • A model is in sixth normal form if and only if it satisfies all nontrivial connection dependencies.

  • This model is final and cannot be further decomposed without loss.

  • Practically not applicable. For chronological databases, the maximum possible decomposition allows you to combat redundancy and simplifies maintaining database integrity.

Time

Position

2020

schoolboy

2023

student

Time

Position

Marital status

2020

schoolboy

In active search

2022

schoolboy

I meet you.

2023

schoolboy

In love

2023

student

In love

2024

student

Everything is complicated.

2025

student

Unmarried

Time

Marital status

2020

In active search

2022

I meet you.

2023

In love

2024

Everything is complicated.

2025

Unmarried

76 of 372

Domain- key normal form

Alternative definition: a relation is in DCNF if it has no modification anomalies.

This is achieved by adding additional constraints on the values of the record field. Data integrity maintenance relies on the mechanisms of a specific DBMS . For example, default field values, data range restrictions, triggers that fire when creating, modifying, or deleting records, and so on. In short, anything that helps maintain data integrity. It was proposed by Ronald Fagin in 1981.

Domain -key normal form (DKNF) is a form where any operation of adding or deleting a record cannot lead to a violation of data integrity.

Domain restrictions – a constraint that requires that only values from a specific domain be used for a particular attribute. A constraint is essentially a list (or logical equivalent of a list) of permissible values of a type and a declaration that the specified attribute has that type.

Key constraint – a constraint that states that a particular attribute or combination of attributes is a potential key.

77 of 372

DEVELOPMENT OF �A PHYSICAL MODEL

78 of 372

DBMS

Data Management System ( DBMS ) (DBMS ) – a set of language and software tools that provide database management [GOST 34.321-96].

Alternative definition

A DBMS is a program, or a set of programs, designed for full-featured work with data.

Typically, it includes tools for creating and modifying the storage structure of data sets, as well as means of accessing stored data with the ability to read, add, modify, and delete it.

The DBMS is designed to play the role of an intermediary between the user (or rather, his direct query or the application in which the query is formulated) and the database.

Information model design

Development of data logic model

Physical database design

79 of 372

80 of 372

Classification of DBMS. �By method of accessing the database

  • There are the following types of DBMS:
    1. file server;
    2. client-server;
    3. built-in.

81 of 372

Database classification. �By data model

  • There are the following types of DBMS:
    1. hierarchical;
    2. network;
    3. relational;
    4. object-oriented.

82 of 372

Classification of DBMS. �By level of distribution

  • There are the following types of DBMS:
    1. local DBMS (all parts of a local DBMS are located on one computer);
    2. distributed DBMS (parts of the DBMS can be located on two or more computers).

83 of 372

File server

  • In file server DBMS files data is placed centrally on a file server .
  • The DBMS is located on each client computer (workstation).
  • DBMS access to data is carried out through local Network . Synchronization of reads and updates is done using file locks .

  • the advantage of this architecture is the low load on the file server processor

  • potentially high load on the local network ;
  • the difficulty or impossibility of centralized management;
  • the difficulty or impossibility of ensuring such important characteristics as high reliability, high availability and high security;
  • is considered obsolete , and its use in large information systems is a disadvantage

Microsoft Access

Paradox

dBase

FoxPro

84 of 372

Client-server

  • The client-server DBMS is located on server together with DB and accesses the DB directly, in exclusive mode.
  • All client requests for data processing are handled centrally by a client-server DBMS.

  • potentially lower loading local network ;
  • convenience of centralized management;
  • convenience in providing such important characteristics as high reliability, high availability and high security

  • increased server requirements;
  • cost

Oracle

MySQL

MS SQL Server

PostgreSQL

85 of 372

File server vs. Client -server

- user interface,

- processing logic ,

- data management

- user interface,

- processing logic

- data storage

- data storage

- processing logic ,

- data management

File server

Client-server

Server

Clients

High network load – large amounts of data are being transferred

Low network load – requests and results are transmitted

86 of 372

Built-in

  • Embedded DBMS – a DBMS that can be delivered as a component of a software product, without requiring a separate installation procedure.
  • A physically embedded DBMS is most often implemented as a plug-in library. Access to data from the application side can be via SQL or through special programming interfaces.

  • high speed and low memory consumption;
  • does not require installation;
  • does not create a separate process, thereby not overloading the computer's processor

  • intended only for local storage of your application data and is not intended for collective use on the network;
  • the maximum database size is often small by DB standards;
  • inferior in reliability to server counterparts

SQLite

Firebird Embedded

Built-in H2

Built-in version Interbase

87 of 372

Database rating

All users Beginners

88 of 372

Database rating

https://db-engines.com/en/ranking_trend

89 of 372

MySQL Workbench

  1. Visual database design
  2. Ability to edit data in the table in visual mode
  3. Full-featured SQL editor
  4. Schema and database synchronization

How to install MySQL Workbench on Windows?

MySQL Workbench – a tool for visual database design.

MySQL Workbench is NOT a DBMS!

90 of 372

MySQL Workbench

91 of 372

MySQL Workbench

92 of 372

MySQL Workbench . �Visual design

  • The first thing worth noting is that Workbench allows you to visually design database, that is, to create a database schema.
  • A visual representation of your database always provides much more information than a dry list of tables.
  • In this version, you immediately see how the tables are related to each other, you can group the tables by any parameters and reflect this in the diagram.

93 of 372

MySQL Workbench

  • The program has a built-in SQL code editor, which allows you to quickly make any corrections to SQL queries.
  • At the same time, you can build queries of any complexity:
    1. get different samples from tables,
    2. to tie them,
    3. create new tables and edit existing ones,
    4. work with keys, fields, relationships.

94 of 372

MySQL Workbench

  • MySQL Workbench allows you to synchronize your local database schema with the real database on your local or production server.

  • Thanks to this, after designing , you do not need to manually create tables in the database on your server; just perform a few simple actions in the program, after which a full-fledged database with all the specified relationships and parameters will be created on the working server.

95 of 372

MySQL Workbench. �Creating a new data model

  • To create a new model, select File -> New Model from the menu or click on the plus sign at the bottom of the list of all models.
  • A window will appear for building tables, views, schemas, roles, scripts, etc.

.

To create a new table on the Physical schema tab ( Physical Schemas ) select " Add Table » and fill in the fields

96 of 372

MySQL Workbench

Next, go to the " Columns " tab and create table fields.

Enter the field name, data type, parameters, and comment for it.

Each column has:

1 name (do not use Cyrillic characters in the name!),

Data type 2. The most common data types:

  • INT – integer;
  • VARCHAR (size) – variable-length character data, the maximum size is indicated in parentheses;
  • DECIMAL (size, decimal_signs ) – decimal number;
  • DATE – date:
  • DATETIME – date and time.

3 PK ( primary key ) – primary key;

4 NN ( not null ) – the cell does not allow empty values;

5 U Q ( unique ) – the value must be unique within the column;

6 AI ( auto incremental) – this property is useful for a simple primary key, it means that the primary key will be automatically filled with natural numbers 1, 2, 3, etc.;

7 DEFAULT – default value, that is, the value that is automatically inserted into the cell by the server when adding a new row to the table if the user left the cell blank.

97 of 372

98 of 372

Data types

  1. numeric types;
  2. character types;
  3. temporal types (date and/or time);
  4. other data types.

BOOL or BOOLEAN

Size

1 bytes

Minimum value

0

Maximum value

1

99 of 372

Numeric integer types

TINYINT

SMALLINT

MEDIUMINT

INT

BIGINT

Size

1 byte

2 bytes

3 bytes

4 bytes

8 byte

Minimum value

-128

-32768

-8388608

-2 32 =

-2147483648

-2 63 =

-9223372036854775808

Maximum value

127

32767

8388607

2 32 -1 = 2147483647

2 63 - 1 =

9223372036854775807

Minimum unsigned value

0

0

0

0

0

Maximum unsigned value

255

65535

16777215

4294967295

18446744073709551615

  • INT (4) – values are assumed to be four-digit, but in fact will store the maximum possible.
  • INT (5) ZEROFILL – fills the empty positions on the left with zeros. For example, the value 2 will be displayed as 00002.

100 of 372

Numerical thing types

FLOAT (M, D)

DOUBLE (M,D )

DECIMAL (M,D) or DEC (M, D) or NUMERIC (M, D)

Size

4 bytes

8 bytes

M + 2 bytes

Minimum value

-1.175494351 × 10 -39

-2.2250738585072015 × 10 -308

depend on the parameters M and D

Maximum value

3. 402823466 × 10 38

1.797693134862315 × 10 308

  • UNSIGNED disallows negative numbers, but the range of values changes.
  • FLOAT (5,2) – will store numbers with 5 characters, 2 of which will be after the decimal point (for example: 46,58).
  • DECIMAL (5,2) – will store numbers from -99.99 to 99.99. M – the number of characters to be allocated (maximum value – 64). D – the number of decimal places (maximum value – 30).

101 of 372

Character data types

character

varchar

nchar

nvarchar

Coding

Non- Unicode

Non- Unicode

Unicode

Unicode

Maximum length

8000

8000

4000

4000

Character size

1 byte

1 byte

2 bytes

2 bytes

Storage size

n bytes

n + 2 bytes

2 × n bytes

2 × n + 2 bytes

Using

If the length of the string is strictly fixed

For variable length strings

Use only if you need support for Unicode characters, such as Japanese ( Kanji ) or Korean ( Hangul ) characters.

102 of 372

Calendar data types

DATE

TIME

DATATIME

TIMESTAMP

YEAR ( 2 )

YEAR ( 4 )

Sense

Date

Time of day

Date and time of day

Date and time of day as seconds since midnight on January 1, 1970 (the beginning of the UNIX era)

Year (two-digit)

Year

Size

3 bytes

3 bytes

8 bytes

4 bytes

1 byte

1 byte

Format

YYYY-MM-DD

hh:mm:ss

YYYY-MM-DD hh:mm:ss

YYYY-MM-DD hh:mm:ss

YY

YYYY

Minimum value

1000-01-01

-838:59:59

1000-01-01 00:00:00

1970-01-01 00:00:00

1970

1901

Maximum value

9999-12-31

838:59:59

9999-12-31 23:59:59

2037-12-31 23:59:59

2069

2155

  • DATATIME- Time zone independent.
  • TIMESTAMP - when received from the database, it is displayed taking into account the time zone.

103 of 372

Calendar data types

  • A "light" syntax is allowed - you can use any punctuation mark as a separator between parts of date or time sections.
  • For example, like this:
    1. '87-12-31 11:30:45',
    2. '87.12.31 11+30+45',
    3. '87/12/31 11*30*45',
    4. ' 1987 @ 12 @ 31 11 ^ 30 ^ 45'.

104 of 372

Other types

BLOB, TEXT

MEDIUMBLOB, MEDIUMTEXT

LONGBLOB, LONGTEXT

ENUM ('value1', 'value2', ...,' valueN ')

SET ('value1', 'value2', ...,' valueN ')

Size

L+2 characters

L+3 characters

L+4 characters

1 or 2 bytes

up to 8 bytes

Maximum value

2 16 -1 characters

2 24 -1 characters

2 32 -1 characters

65535 items

64 items

  • The TEXT type is used to store text, and the BLOB type is used to store images, sound, electronic documents, etc.
  • ENUM ('yes', 'no') – a column with this type can store only one of the available values (+or null , or an empty string).
  • SET ('first', 'second') – a column with this type can store one of the listed values, both at once, or the value may be completely absent ( null ).

http://www.mysql.ru/docs/man/ENUM.html

105 of 372

MySQL

106 of 372

MySQL Workbench

After creating several tables, you can proceed to create relationships between them. You can do this either through the Foreign tab Keys in the required table, or directly when drawing a diagram of your database.

If you do it through the " Foreign" tab Keys ", then go to this tab, enter the name of your key and select the table for the relationship. Then, in the right part of the form, where the table with fields is indicated, select the fields of the previously specified table for the desired field, that is, this is where the relationship between the fields is indicated.

Here is the simplest example of a key:

107 of 372

ON DELETE

  • CASCADE (to spread, cascade )

  • SET NULL (set empty value)

  • SET DEFAULT (set default value)

DDL

108 of 372

Integrity by reference

  • The referential integrity requirement , or foreign key requirement , is that for each foreign key value that appears in a referenced relation, there must be a tuple with the same primary key value in the referenced table, or the foreign key value must be undefined, i.e., point to nothing.

109 of 372

Integrity by reference

  • What happens when a tuple is deleted from the relation it references?
  • Operation cascaded that is, deleting tuples results in deleting the corresponding tuples in the related relation.
  • Operation is restricted that is, those tuples that do not have related information in another respect are deleted. If such information exists, deletion cannot be performed. In this case, you must first either delete the referenced tuples or change their foreign key values accordingly.

110 of 372

Integrity by reference

  • What happens when you try updating the primary key of a relation referenced by a specific foreign key?
  • Operation cascaded that is, when the primary key is updated, the foreign key in the related relationship is updated.

  1. Operation is limited that is, only those primary keys for which there is no related information in another respect are updated.

If such information exists, the update cannot be made.

111 of 372

MySQL Workbench

Creating a database diagram

To create a diagram, use the Model -> Add menu. Diagram ( Ctrl+T ) or on the " EER Diagrams " tab, click on the add new diagram button. A checkered drawing field will open. Select the required tables on the left in the table list and drag them with the mouse onto this checkered area. As a result, a complete diagram of these tables will be immediately built, taking into account the connections we created earlier .

That's all the drawing. Nothing complicated.

As you may have noticed here, there is a toolbar where you can choose to create a new table, images, comment blocks, blocks for merging tables, and various options for relationships between tables (one-to-one, one-to-many, etc. ). By the way, if you create a new table visually here, it will immediately be added to the model itself, i.e. you don't need to duplicate anything.

EER diagram ( EER = Enhanced Entity Relationship)

Creating an SQL CREATE script

MySQL database . To do this, you need to create an SQL script. Go to the menu. Select the menu item File -> Export -> Forward Engineer SQL CREATE Script .

In the window that appears in the Output field file using the Browse button Select the path to save the script and specify its name. Click the Next button .

In the next window, click the Finish button .

112 of 372

MySQL Workbench

Synchronization with the server

Select Database -> Synchronize Model… from the menu and enter your server parameters in the window.

After that, go further, the connection will be checked and, if everything is ok , you can proceed to selecting a model for synchronization. Here you select your model, and below you select the database on the server. Go further again, where all your data will be checked again.

After that, you will already be in the list of tables in your model in the program and in the database on the server. It is important that this synchronization method completely synchronizes data, i.e. not only does it upload new tables to the server, but it can also upload tables from the server to the program . To do this, just select the tables in the database and click the " Update Model " button so that they are uploaded to the model during synchronization.

113 of 372

Identifying and non-identifying relationships

  • An identifying relationship is shown on the diagram as a solid line with a bold dot at the child end of the relationship, a non-identifying relationship is shown as a dotted line .

  • When establishing a non-identifying relationship (dotted line), the child entity remains independent , and the primary key attributes of the parent entity migrate to the non-key components of the parent entity.
  • A non-identifying relationship serves to link independent entities.
  • An instance of the "Employee" entity can exist independently of any instance of the "Department" entity, meaning an employee can work in an organization and not be listed in any department.

There cannot be an order item number without an order.

114 of 372

Lecture 8. SQL language

115 of 372

SQL

116 of 372

SQL

  • SQL ( language structured Structured Query Language (pronounced "s-q-el") is a declarative language programming for interactions user with bases data , used for querying, updating and managing relational databases, creating and modifying database schemas, and database access control systems [ SQL:2016 or ISO/IEC 9075:2016 ] .

Differences between SQL and relational theory

The SQL language uses terms that are somewhat different from relational theory terms, for example,

  • instead of " relationships " " tables " are used,
  • instead of " tuples " – " strings ",
  • instead of " attributes " – "columns" or " columns ".

The SQL language standard, although based on relational theory, deviates from it in many places.

117 of 372

History of SQL

  • In the early 1970s, an experimental relational database system, IBM System R, was developed in one of IBM 's research laboratories, for which a special language, SEQUEL ( Structured Query Language) , was later created. EnglishQUERYLanguage ) . Later language SEQUEL was renamed to SQL.
  • The first DBMS to support the new language was Oracle V2 from Relational in 1979 . Software Inc. (which later became Oracle ) and IBM's System /38, based on System /R.
  • In 1986, the first SQL language standard was adopted by ANSI ( American National Standards Institute ).
  • A year later, the International Organization for Standardization ( International Organization for Standardization , ISO) published the standard ISO 9075-1987 « Database Language SQL» (SQL database language).
  • The latest revision at the moment is: ISO/IEC 9075:2016 "Information technology. Database languages. Structured query language (SQL)", or SQL:2016 .

118 of 372

Most popular languages

Programming, scripting, and markup languages

119 of 372

Advantages of SQL

  • independence from specific DBMSs
  • SQL language standardization
  • portability from one hardware environment to another
  • relational basis of language

Bonus

  • the ability to create interactive queries
  • possibility of programmatic access to the database
  • providing different data representations
  • the ability to dynamically change and expand the database structure
  • support for client/server architecture

120 of 372

Dialects SQL

  • Despite the existence of an international SQL standard, many companies involved in DBMS development make changes to the SQL language.
  • Each implementation of the SQL language in a particular DBMS is called a dialect.
  • PL/SQL – in the Oracle DBMS;
  • Transact-SQL – in the Microsoft SQL DBMS;
  • Informix SQL – in the Informix DBMS;
  • MySQL – in the MySQL database ;
  • JetSQL – Microsoft Access.

121 of 372

How to use SQL?

127.0.0.1 (aka localhost ) is an IP address that a computer can use to contact itself over the network, regardless of whether it is connected to the network, the type, and the address of the computer on it.

122 of 372

How to use SQL?

123 of 372

How to study SQL ?

  • Writing SQL queries online: http://sqlfiddle.com/

  • http://www.sql-ex.ru ( VPN ) – a site with SQL exercises

124 of 372

How to study SQL ?

  1. HackerRank : https://www.hackerrank.com/domains/sql
  2. SQL simulator ( VPN ): https://stepik.org/course/63054/syllabus
  3. PostgreSQL Exercises : https://pgexercises.com/
  4. Live SQL: https://livesql.oracle.com/apex/f?p=590:1000
  5. Codility: https://app.codility.com/programmers/trainings/6/
  6. CodeSignal : https://app.codesignal.com/arcade/db

125 of 372

CodePad.SQL​

SELECT Product_ID, Name , Price

FROM product

WHERE price > 100

ORDER BY price Description

126 of 372

Useful links

  • https://habrahabr.ru/post/181033/ – testing for SQL knowledge;

127 of 372

Useful links

  • SQL Academy a simulator for writing SQL queries that is pleasant to use (with VPN): https://sql-academy.org/

  • Step-by-step interactive SQL tutorial (in English): https://sqlbolt.com/

  • Interactive online courses on SQL DBMS PostgreSQL (with VPN): https://learndb.ru/

  • Course "Interactive SQL Trainer": https://stepik.org/course/63054/

128 of 372

SQL language components

  • The basis of the SQL language is made up of operators, conventionally divided into several groups according to the functions performed:

  1. DDL ( Data Definition Language ) – definition operators database objects

  • DML ( Data Manipulation Language ) – manipulation operators data.

  • DQL ( Data Query Language ) is a data selection operator .

  • DCL ( Data Control Language ) – operators for working with access rights .

  • TCL ( Transaction Control Language ) – transaction control operators .

129 of 372

Operators SQL

DDL

CREATE

DROP

ALTER

DML

UPDATE

DELETE

INSERT

DQL

SELECT

DCL

GRANT

REVOKE

TCL

COMMIT

ROLLBACK

130 of 372

Operators SQL

DDL (Data Definition Language) – statements defining database objects

CREATE SCHEMA – create a database schema DROP SHEMA – delete a database schema

CREATE TABLE – create a table ALTER TABLE – alter a table DROP TABLE – delete a table

CREATE DOMAIN – create a domain ALTER DOMAIN – change a domain DROP DOMAIN – delete a domain

CREATE COLLATION – create a sequence DROP COLLATION – delete a sequence

CREATE VIEW – create a data view DROP VIEW – delete a data view

DML ( Data Manipulation Language ) – data manipulation operators

SELECT – select rows from tables

INSERT – add rows to a table

UPDATE – change rows in a table

DELETE – delete rows in a table

D C L ( Data Control Language ) – operators for working with access rights

CREATE ASSERTION – create a constraint DROP ASSERTION – remove constraint

GRANT – grant privileges to a user or program to manipulate objects

REVOKE – revoke user or program privileges

TCL ( Transaction Control Language ) – statements transaction management

COMMIT – commit changes

ROLLBACK – roll back changes made

131 of 372

DDL data definition statements

CREATE

DROP

ALTER

TABLE

VIEW

+

Name

tables /

kind

+

132 of 372

DDL data definition statements

Operator

Value

Action

CREATE TABLE

Create a table

Creates a new table in the database

DROP TABLE

Delete table

Deletes a table from the database.

ALTER TABLE

Change table

Changes the structure of an existing table or the integrity constraints set for that table

CREATE VIEW

Create a data slice

Creates a virtual table that corresponds to some SQL query

DROP VIEW

Delete data slice

Deletes a previously created data slice

ALTER VIEW

Change data slice

Changes a previously created data slice

CREATE INDEX

Create index

Creating a table index for quick access to attributes included in the index

DROP INDEX

Delete index

Deletes a previously created index

133 of 372

Loading data using a text file

name

owner

species

sex

birth

death

Fluffy

Harold

cat

f

1993-02-04

Claws

Gwen

cat

m

1994-03-17

Buffy

Harold

dog

f

1989-05-13

Fang

Benny

dog

m

1990-08-27

Bowser

Diane

dog

m

1998-08-31

1995-07-29

Chirpy

Gwen

bird

f

1998-09-11

Whistler

Gwen

bird

1997-12-09

Slim

Benny

snake

m

1996-04-29

name

owner

species

sex

birth

death

Whistler

Gwen

bird

\N

1997-12-09

\N

Let's say your records match the ones in this table (note: MySQL accepts dates in the format YYYY-MM-DD ; you may not be used to this notation).

Since you are starting with a blank table, the easiest way to fill it in is to create a text file containing a row for each animal, and then load its contents into the table with a single command.

Blank fields (such as unknown sex or death dates for animals alive today) can be given a NULL value. In a text file, this value is the characters \N.

For example, the entry for the bird Whistler should look something like this (single tabs should be placed between values):

The end-of-line marker and column-value separator can be specified in the LOAD DATA command, but the defaults are tab and newline. Assuming these, the command will be able to read the pet.txt file

You can load the pet.txt file into a table using the following command:

mysql > LOAD DATA LOCAL INFILE "pet.txt" INTO TABLEpet ;

CREATE TABLE pet ( name VARCHAR(20), owner VARCHAR(20), species VARCHAR(20), sex CHAR(1), birth DATE,death DATE);

134 of 372

SQL not SQL

135 of 372

CREATE DATABASE

CREATE DATABASE database_name [ ON

{[PRIMARY] (

    NAME = logical_file_name,

    FILENAME = 'os_file_name'

[, SIZE = size]

[, MAXSIZE = max_size]

[, FILEGROWTH = growth_increment] )

} [,...n]

]

[ LOG ON

{ ( NAME = logical_file_name,

FILENAME = 'os_file_name'

[, SIZE = size] )

} [,...n] ] [FOR RESTORE]

FILENAME – the full path and file name for the database location, must point to the local disk of the computer on which SQL Server is installed.

SIZE – initial size of each file in MB .

MAXSIZE – maximum file size in MB , if not specified, the size is not limited.

FILEGROWTH – the unit of volume added to a file each time it needs to be increased, specified in MB (by default) or as a percentage (i.e., 0 is added to the number); the value 0 prohibits the file from increasing.

1 MB – default

DCL

136 of 372

DROP DATABASE

  • Completely deletes the existing database:

  • You must have the appropriate rights to delete.

DROP DATABASE databasename;

DCL

137 of 372

CREATE TABLE

  • CREATE TABLE statement is used to create tables .
  • Here is the simplified syntax of this operator:

CREATE TABLE table_name

(

column_name data_type [NULL | NOT NULL]

[,...n]

)

CREATE TABLE table (

column1 type1 [(size1)][CONSTRAINT _ column-constraint1]

[, column2 type2 [(size2)][CONSTRAINT _ column-constraint2]

[, ...]]

[CONSTRAINT table-constraint1 _ [,table-constraint2 [, ...]]]

);

DDL

138 of 372

CREATE TABLE

  • This statement creates a table member consisting of four columns:
    • member _ no – has type int , NULL values are not allowed;
    • last name – has type char ( 50 ) – 50 characters, NULL values are not allowed;
    • first name – similar to lastname ;
    • photo – has the type image (image), NULL is allowed .

CREATE TABLE member (

member_no int       NOT NULL ,

lastname char ( 50 ) NOT NULL ,

firstname char ( 50 ) NOT NULL ,

photo image

)

-- MySQL:

id INTEGER PRIMARY KEY AUTO_INCREMENT

-- MS SQL Server:

id INT identity ( 1 , 1 ) PRIMARY KEY

DDL

139 of 372

CREATE TABLE

  • In this statement, you must specify the field name, its data type (the data type must be supported by the given DBMS), length (for some field types), and, if necessary, server constraints (using the CONSTRAINT keyword).
  • For example, the following query creates a table named Simple with four columns – LastName , FirstName , Email and HomePage :

CREATE TABLE Simple (

FirstName varchar ( 50 ) NOT NULL ,

LastName varchar ( 50 ) NOT NULL ,

Email varchar ( 50 ),

HomePage varchar ( 255 )

)

DDL

140 of 372

CREATE TABLE

  • We can extend this table by adding a PersonID field , which will be used as a primary key :

  • and specify that the combination of the LastName and FirstName fields must be unique .

CREATE TABLE Simple (

PersonID Integer     NOT NULL PRIMARY KEY ,

FirstName varchar ( 50 ) NOT NULL ,

LastName varchar ( 50 ) NOT NULL ,

Email varchar ( 50 ),

HomePage varchar ( 255 )

)

DDL

CREATE TABLE Simple (

PersonID Integer NOT NULL PRIMARY KEY ,

FirstName varchar ( 50 ) NOT NULL ,

LastName varchar ( 50 ) NOT NULL ,

Email varchar ( 50 ),

HomePage varchar ( 255 ),

  CONSTRAINT SimpleConstraint UNIQUE (FirstName, LastName)

)

141 of 372

DROP TABLE

  • To be able to delete a table, the user must have owner rights (i.e. be the creator) of that table.

  • Before deleting a table from a SQL database, you will need to clear it of data.

DROP TABLE <table name>;

DDL

142 of 372

ALTER TABLE

  • You can use the ALTER TABLE statement to change the structure of an existing table.
  • Using it, you can add or remove a field or server restriction.
  • There are four varieties of the ALTER TABLE statement:
  • for adding columns to tables ;
  • for adding server restrictions to tables ;
  • to remove fields from tables ;
  • to remove about indexed​ fields .

DDL

143 of 372

ALTER TABLE

ALTER TABLE table name

�,

DROP column name

�ADD column name data type

 

(size) NOT NULL

NOT NULL WITH DEFAULT

,

RENAME column name new name

TABLE new name

,

MODIFY column name

 

data type ( length ) NULL

NOT NULL

NOT NULL WITH DEFAULT

Changes the structure of columns in a table.

144 of 372

ALTER TABLE �( Referential Integrity )

ALTER TABLE table name

,

PRIMARY KEY (column name)

DROP

,

,

FOREIGN KEY (column name) REFERENCES parent table name

DROP foreign key

name

ON DELETE RESTRICT

CASCADE

SET NULL

Changes the PRIMARY KEY/FOREIGN KEY structure in a table.

145 of 372

ALTER TABLE

  1. The first variant of this operator is used to add a column to a table , and its syntax is as follows:

    • In queries of this type, the table name, the name of the new field, its data type, and, if necessary, the size are specified.
    • Additionally, you can specify a server restriction associated with this field.
    • For example, to add a Phone field to the Simple table , you can execute the following query:

ALTER TABLE table ADD [COLUMN] column datatype [(size)]

[CONSTRAINT sinlge-column-constraint]

ALTER TABLE Simple ADD Phone varchar ( 30 )

DDL

146 of 372

ALTER TABLE

  • The column will be added with a NULL value. for all rows of the table.

  • The new column will be the last column in the table.

  • You can add multiple new columns at once, separated by commas, in one command.

DDL

147 of 372

ALTER TABLE

  • It is also possible to add a column with a specific condition:

DDL

ALTER TABLE Departments

ADD PhoneDepartment VARCHAR ( 9 )

CHECK (PhoneDepartment LIKE '[2.5][0-9][0-9]-[0-9][0-9]-[0-9][0-9]' )

ALTER TABLE Results ADD CONSTRAINT ch_res_marks

CHECK (Mark IN ( 3 , 4 , 5 )

AND ((Mark = 3 AND Balls BETWEEN 60 AND 75 )

OR ( Mark = 4 AND Balls BETWEEN 75 AND 90 )

OR (Mark = 5 AND Balls BETWEEN 90 AND 100 )));

148 of 372

ALTER TABLE

  1. The second type of ALTER TABLE statement is used for addition server constraints to the table , and its syntax is as follows:

  • Such queries allow you to add various field properties, such as the uniqueness property:

ALTER TABLE table ADD CONSTRAINT constraint

DDL

ALTER TABLE Departments

ADD CONSTRAINT un_title

UNIQUE (TitleDepartment);

149 of 372

ALTER TABLE

  1. The third type of ALTER TABLE statement is used to delete a field from a table :

  • The COLUMN keyword is optional. For example:

ALTER TABLE table DROP [COLUMN] column

DDL

ALTER TABLE Simple DROP Phone

150 of 372

ALTER TABLE

  1. Please note that to remove indexed fields you must first drop the index. This can be done using the fourth variant of the ALTER TABLE statement:

  • Below is an example of such a query:

 ALTER TABLE table DROP CONSTRAINT index

DDL

ALTER TABLE Simple DROP CONSTRAINT PrimaryKey

151 of 372

Data types in SQL

Data type

Description

INTEGER, INT

Whole numbers

SMALLINT

Small integers

DECIMAL(p, q), DEC(p, q)

(0 < p < 16 , q < p)

Decimal numbers

FLOAT( n ) (n is the number of bytes)

Floating point substance

REAL

Low precision floating point substance

DOUBLE PRECISION

High precision floating point substance

152 of 372

Data types in SQL

Data type

Description

CHARACTER(n), CHAR( n )

(n – number of characters 0 < n < 254)

Constant-length character strings

VARCHAR(n)

Variable-length character strings

DATE

Date in a format that you specify

special team

(default ) mm / dd / yyy )

TIME

time in a format that you specify

special team

(default ) hh.mm.ss )

DATETIME

date and time combination

MONEY

money

153 of 372

Information entities of the " Rembrigada " database

154 of 372

Example of defining �tables for the Rembrigade database

  • Employees:

  • Objects :

CREATE TABLE WORKERS (

REGISTER_NUMBER INTEGER NOT NULL PRIMARY KEY ,

    Full name CHARACTER ( 15 ),

    SPECIALTY CHARACTER ( 15 ),

    TIME_RATE DECIMAL ( 6 , 1 ),

    BRIGADIER INTEGER

)

CREATE TABLE OBJECTS (

OBJECT_NUMBER INTEGER NOT NULL PRIMARY KEY ,

OBJECT_ADDRESS CHARACTER ( 15 ),

OBJECT_TYPE CHARACTER ( 9 ) DEFAULT 'office' CHECK (

        TYPE IN ( 'office' , 'warehouse' , 'store' , 'residential building' )

),

DIFFICULTY_LEVEL INTEGER DEFAULT 1 CHECK (

DIFFICULTY LEVEL > 0

        AND DIFFICULTY_LEVEL < 5

)

)

DDL

155 of 372

Example of defining �tables for the Rembrigada database

  • Work:

CREATE TABLE WORK (

REGISTER_NUMBER INTEGER NOT NULL ,

    OBJECT_NUMBER INTEGER NOT NULL ,

    START_DATE DATE ,

    NUMBER_DAYS INTEGER ,

    PRIMARY KEY ( TABLE_NUMBER, OBJECT_NUMBER),

    FOREIGN KEY REFERENCES REFERENCES ON DELETE CASCADE WORKERS ,

    FOREIGN KEY OBJECT_NUMBER REFERENCES ON DELETE CASCADE

)

DDL

156 of 372

Bonus operators for MySQL

  • SELECT VERSION(), CURRENT_DATE; – a command that requests information from the server about its version and current date.
  • SELECT NOW(); – a command that requests information about the current time from the server.
  • SELECT SIN(PI()/4), (4+1)*5; – demonstrated using mysql as a calculator .
  • SELECT USER(); – displays the username.
  • SHOW DATABASES; – shows which currently exist on the server.
  • SHOW TABLES; – indicates which tables exist in the current database.
  • USE test; – use a database with the name test (this is a specific command – it must be written on one line).
  • GRANT ALL ON menagerie.* TO your_mysql_name ; – grant unlimited rights to the user named your_mysql_name for working with the menagerie database .
  • DESCRIBE pet; – displays all information about the structure of the pet table.

157 of 372

SQL .DML​

158 of 372

Data modification and sampling

DML

INSERT

DELETE

UPDATE

DQL

SELECT

159 of 372

Data manipulation operators�

Operator

Sense

Action

DELETE

Delete rows

Deletes one or more rows that match the filter condition from the table.

The result depends on the chosen strategy for maintaining referential integrity.

INSERT

Insert line

Inserts a row into the table.

are allowed in which several rows can be transferred from one table to the base table at once.

UPDATE

Update row

Updates the values of one or more columns in one or more rows that meet the filter conditions.

DML

160 of 372

INSERT statement

  • To add records into a table, you should use the INSERT statement , the syntax of which is as follows:

  • For example, to add a new customer to the Customers table , you could use the following query:

DML

INSERT [INTO] table ([column_list]

{ VALUES ( { DEFAULT | NULL | expression } } [, …] )

INSERT INTO Customers(CustomerID, CompanyName)

VALUES ( 'XYZFO' , 'XYZ Deli' )

161 of 372

DELETE operator

  • To delete rows from tables, you should use the DELETE operator , the syntax of which is as follows:

  • Item WHERE is not required, but if it is not added, the table will all records deleted.

  • For example, to remove from the list all products that are no longer available, you can run the following query:

DML

DELETE FROM table [WHERE criteria]

DELETE FROM Product WHERE Discontinued = 1

162 of 372

When I execute a DELETE query and realize I forgot to add a WHERE

DML

163 of 372

UPDATE operator

  • To change values in one or more columns of a table, the UPDATE statement is used :

  • The expression in a SET clause can be a constant or the result of a calculation. For example, to increase the prices of all products that cost less than $10, you could run the following query:

DML

UPDATE table

SET column1 = expression1 [,

column2 = expression2][,…]

[WHERE criteria]

UPDATE Product

SET UnitPrice = UnitPrice * 1 . 1

WHERE Unit Price < 10

164 of 372

Test

  • This table was created using the SQL statement:

  • What queries will allow you to add a record to this table?
  • INSERT INTO students(id, first_name, last_name) VALUES (1, 'Name', 'Surname');
  • INSERT INTO students(id, first_name) VALUES (2 , 'Name');
  • INSERT INTO students VALUES (3, 'Name', 'Surname', NULL);
  • INSERT INTO students VALUES (4, NULL, 'Surname', NULL);

CREATE TABLE STUDENTS (

ID INTEGER PRIMARY KEY,

FIRST_NAME VARCHAR (50) NOT NULL,

LAST_NAME VARCHAR (50) NOT NULL,

  ADDRESS    VARCHAR (100)

)

165 of 372

SQL .DQL​

166 of 372

Data Query Language (DQL)

  • Selecting data is the most common operation performed using SQL.

  • SELECT statement — one of the most important operators in this language, used for data selection.

  • SELECT statements are the only DQL statement.

  • DQL is often combined with DML.

DQL

167 of 372

SELECT

  • The SELECT statement must contain the words SELECT and FROM ; other keywords such as WHERE or ORDER BY are optional.

  • Elementary version request SELECT , which will display all the data from the table:

DQL

SELECT column-list

FROM table-list

[ WHERE where-clause]

[ ORDER BY order-by-clause] ;

SELECT * FROM Table_name;

168 of 372

SELECT

  • The general syntax of the operator is a bit more complicated:

DQL

SELECT

[DISTINCT | DISTINCTROW | ALL]

select_expression,...

FROM table_references

[ WHERE where_definition]

[ GROUP BY {unsigned_integer | col_name | formula}]

[ HAVING where_definition]

[ ORDER BY {unsigned_integer | col_name | formula} [ASC | DESC], ...] ;

169 of 372

FROM

  • To specify the names of the tables from which records are selected, use the FROM keyword , for example , this query will return all fields from the table Customers :

  • The SELECT keyword is followed by information about which fields should be included in the resulting data set.
  • An asterisk ( * ) indicates all fields in the table.

DQL

SELECT * FROM Customers;

170 of 372

SELECT

  • To select a single column, the following syntax is used:

  • An example of selecting multiple columns looks like this:

  • If choice data is carried out from several tables and at the same time fields with the same name are selected from different tables, table names should be referenced to fully identify the fields included in the resulting dataset, for example:

DQL

SELECT CompanyName FROM Customers;

SELECT CompanyName, ContactName FROM Customers;

SELECT Customers.CompanyName, Shippers.CompanyName

FROM Customers, Shippers;

171 of 372

WHERE

  • To filter the results returned by a SELECT statement, you can use the WHERE clause , the syntax of which is as follows:

  • For example, instead of getting a full list of products, you can limit yourself to only those whose CategoryID field value is 4:

DQL

WHERE expression1 [{AND | OR} expression2 […]]

SELECT * FROM Product WHERE CategoryID = 4 ;

172 of 372

WHERE

  • You can use various expressions in the WHERE clause, for example:

DQL

SELECT *

FROM Product

WHERE CategoryID = 2 AND SupplierID > 10 ;

SELECT ProductName, UnitPrice

FROM Product

WHERE CategoryID = 3 OR UnitPrice < 50 ;

 

SELECT ProductName, UnitPrice

FROM Product

WHERE Discontinued IS NOT NULL ;

Comparison operators

<

Less

<=

Less than or equal to

<>

Not equal to

=

Exactly

>

More

>=

Greater than or equal to

173 of 372

Examples of simple queries

  • Who works as a plasterer?

  • Provide all data about objects of the "office" type:

SELECT Full name

FROM ROB AND TNIKS

WHERE SPECIALTY = 'plasterer' ;

SELECT *

FROM OBJECTS

WHERE OBJECT_TYPE = 'office' ;

DQL

174 of 372

WHERE . �Special comparison operators

DQL

    • Used in conjunction with comparison operators when comparing to a list of values

ALL

    • Used in conjunction with comparison operators when comparing to a list of values

ANY

    • Used when checking whether a value is within a given interval (including its boundaries)

BETWEEN

    • Used to check if a value exists in a list

IN

    • Used when checking the value of a given mask

LIKE

175 of 372

WHERE. �Special comparison operators

  • To match data with a mask, the LIKE keyword is used :

  • In this mask, the ' % ' ( percent ) symbol replaces any sequence characters, and the '_' ( underscore ) character is one any character .

DQL

SELECT CompanyName, ContactName

FROM Customers

WHERE CompanyName LIKE 'M%'

176 of 372

WHERE . �BETWEEN

  • When searching for companies with names starting with the letters A through C, you can execute the following SELECT statement:

  • The BETWEEN operator in SQL includes range boundaries.

  • There is also the opposite predicate NOT BETWEEN, which is true when the compared value does not fall within the specified interval, including its boundaries.

DQL

SELECT CompanyName, ContactName

FROM Customers

WHERE CompanyName BETWEEN 'A' AND 'D'

177 of 372

WHERE. �Special comparison operators

  • Using the LIKE operator , we can narrow the search range by applying a more complex comparison mask.
  • For example, to find companies that contain the substring " bl " in their name (the mask '% bl %' indicates that there can be any number of arbitrary characters before and after the searched substring ), you can use the following query:

  • Using the operator IN , you can specify a list of values that should contain the field value:

DQL

SELECT CompanyName, ContactName FROM Customers

WHERE CompanyName LIKE '%bl%'

SELECT CompanyName, ContactName FROM Customers

WHERE CustomerID IN ( 'ALFKI' , 'BERGS' , 'VINET' )

178 of 372

Examples of simple queries

  • List plasterers and roofers:

  • At the same time, there is an opposite predicate NOT IN , which is true when the compared value is not included in the given set.

DQL

SELECT *

FROM WORKERS

WHERE Specialty IN ( 'plasterer' , 'roofer' )

179 of 372

Examples of simple queries

  • List employees whose specialty name begins with the phrase " elec ."

DQL

SELECT *

FROM Employees

WHERE Specialty LIKE ( 'elec%' )

180 of 372

WHERE + AND, OR, NOT

  • The result of this query will be a list of customers located in the United States whose names begin with the letter S.

  • As a result of executing this query, we will receive a list of customers from all countries except the USA and the UK.

DQL

SELECT CompanyName, ContactName

FROM Customers

WHERE CompanyName LIKE 'S%' AND Country = 'USA'

SELECT CompanyName, ContactName

FROM Customers

WHERE Country NOT IN ( 'USA' , 'UK' )

181 of 372

Truth table of logical expressions

A

B

NOT A

A AND B

A OR B

True

True

False

True

True

True

False

False

False

True

True

Null

False

Null

True

False

True

True

False

True

False

False

True

False

False

False

Null

True

False

Null

Null

True

Null

Null

True

Null

False

Null

False

Null

Null

Null

Null

Null

Null

182 of 372

IS NULL

  • IS NULL – comparison operator with an undefined value (unknown at this point in time)

  • When comparing undefined values, standard comparison rules do not apply: one undefined value is never considered equal to another undefined value.

NULL 1 ≠ NULL 2

DQL

183 of 372

Examples of simple queries

  • Who has an hourly rate of 60 to 70 UAH?

DQL

SELECT *

FROM WORKERS

WHERE Hourly_rate >= 60

AND Hourly_rate <= 70

SELECT *

FROM WORKERS

WHERE Hourly_rate BETWEEN 60 AND 70

184 of 372

ORDER BY

  • ORDER BY expression (optional) is used for sorting the resulting data set by one or more columns.

  • To specify the sort order, the keywords ASC ( ascending ) or DESC ( descending ) are used .

  • By default, data is sorted in ascending order.

DQL

ORDER BY column1 [{ASC | DESC}] [,

column2 [{ASC | DESC}] [,…]

185 of 372

ORDER BY

  • For example, to sort employees by last name and then by first name, you would use the following SQL query:

  • To sort in descending order (for example, for a list of products in descending price order), use the DESC keyword :

DQL

SELECT LastName, FirstName, Title

FROM Employees

ORDER BY LastName, FirstName

SELECT ProductName, UnitPrice

FROM Product

ORDER BY UnitPrice DESC

186 of 372

Examples of simple queries

  • What is the weekly salary of each electrician (40-hour work week)?

SELECT Full name,

  'Weekly salary = ' ,

40 * Hourly_rate

FROM WORKERS

WHERE SPECIALTY = 'electrician'

ORDER BY NAME

DQL

187 of 372

A short list of relational algebra operations

Operation

Marking

Application area

Sample

Defines a resulting relation that contains only those tuples (rows) from relation R that satisfy a given condition (predicate)

Projection

Defines a new relation containing a vertical subset of relation R, created by extracting the values of the specified attributes and excluding duplicate rows from the result.

Association

Defines a new relation that includes all tuples contained only in R, only in S, both in R and S, with all duplicate tuples excluded. The relations R and S must be union compatible.

Difference

The difference of two relations R and S consists of tuples that are present with respect to R but absent with respect to S. Moreover, the relations R and s must be compatible with respect to the union

Crossing

Defines a relation that contains tuples present in both R and S. The relations R and S must be union compatible.

Cross connection

Defines a new relation that results from concatenating (i.e., joining) each tuple from relation R with each tuple from relation S

Theta connection

Defines a relation that contains tuples from the Cartesian product of relations R and S that satisfy the predicate F

Connection by equivalence

Defines a relation that contains tuples from the Cartesian product of relations R and S that satisfy the predicate F (the predicate must only assume equality comparisons)

Natural combination

A natural join is a join by equivalence of two relations R and S, performed on all common attributes x, from the results of which one instance of each common attribute is excluded.

(Left) outer join

A join in which tuples of a relation R that do not have matching values in the common columns of a relation S are also included in the resulting relation

Semi-union

Defines a relation containing those tuples of the relation R that are included in the union of the relations R and S

Divide

Defines a relation consisting of a set of relation tuples R defined on an attribute C that corresponds to the combination of all tuples of a relation S, where C is the set of attributes present in R but absent in S.

188 of 372

Sample

Name

Weight

Volkov

67

Dorohobid

70

Vystoropskaya

48

Persons

σ Weight ≥ 66 (Persons)

Name

Weight

Volkov

67

Dorohobid

70

189 of 372

Projection

Name

Name

Patronymic

Litvinov

Yaroslav

Oleksandrovich

Lukashiv

Dmitry

Serhiyovych

Ryabtsev

Nikita

Serhiyovych

Persons

π Surname, First Name (Persons)

Name

Name

Litvinov

Yaroslav

Lukashiv

Dmitry

Ryabtsev

Nikita

190 of 372

Subtraction (difference)

Name

Guobadia

Yen

Karpenko

Name

Guobadia

Star

Tkachenko

Bank A clients ( R1 )

Bank B clients ( R2 )

Name

Yen

Karpenko

R = R1 – R2

Customers of Bank A who are not customers of Bank B ( R )

191 of 372

Association

Name

Guobadia

Yen

Karpenko

Name

Guobadia

Star

Tkachenko

Bank A clients ( R1 )

Bank B clients ( R2 )

Name

Guobadia

Yen

Karpenko

Star

Tkachenko

R = R1 U R2

Customers ( R )

all duplicate tuples are excluded

192 of 372

Crossing

Name

Guobadia

Yen

Karpenko

Name

Guobadia

Star

Tkachenko

Bank A clients ( R1 )

Bank B clients ( R2 )

Name

Guobadia

R = R1 ∩ R2

Customers of two banks at the same time ( R )

193 of 372

Cross connection

Name

Guobadia

Yen

Karpenko

Name

Guobadia

Star

Tkachenko

Bank A clients ( R1 )

Bank B clients ( R2 )

Name

Name

Guobadia

Guobadia

Yen

Guobadia

Karpenko

Guobadia

Guobadia

Star

Yen

Star

R = R1 X R2

All variants of their combinations ( R )

Name

Name

Karpenko

Star

Guobadia

Tkachenko

Yen

Tkachenko

Karpenko

Tkachenko

194 of 372

Divide

Name

Quality

Bondarev

Humor

Ryabtsev

Persistence

Forelock

Self-criticism

Quality

Humor

Alcoholism

Sociability

Persons( R1 )

Qualities ( R2 )

Name

Bondarev

R = R1 ÷ R2

From table R1, the values of the rows are taken for which all combinations of values from table R2(R) are present.

195 of 372

Table joins

  • As we have already seen, you can create queries that allow you to retrieve data from multiple tables.
  • Note that without table joins , the query will return a dataset containing all possible combinations of rows from each of the source tables (also known as a Cartesian product ):

  • while the query below returns a list of products with an indication of which category the product belongs to:

DQL

SELECT ProductName, CategoryName FROM Product, Categories

SELECT ProductName, CategoryName

FROM Product, Categories

WHERE Product.CategoryID = Categories.CategoryID

196 of 372

197 of 372

INNER JOIN

  • In the case of an inner join of tables , the resulting dataset contains records in which the values of the related fields match:

  • INNER JOIN is equivalent to just JOIN

DQL

SELECT ProductName, CategoryName

FROM Product

INNER JOIN Categories

ON Product.CategoryID = Categories.CategoryID

This is useful: in principle, there is no critical difference between using WHERE or JOIN to join tables, but it is believed that writing a query using JOIN is more correct due to readability.

198 of 372

LEFT JOIN

  • So-called outer joins allow us to include all rows from one table and corresponding rows from another table in the query result.
  • Left outer join:

  • LEFT OUTER JOIN is equivalent to LEFT JOIN

DQL

SELECT ProductName, CategoryName

FROM Product

LEFT OUTER JOIN Categories

ON Product.CategoryID = Categories.CategoryID

199 of 372

RIGHT JOIN

  • There are also right outer joins that return all rows from the second (i.e., right) table and their corresponding rows from the other table:

  • RIGHT OUTER JOIN is equivalent to RIGHT JOIN

DQL

SELECT ProductName, CategoryName

FROM Product

RIGHT OUTER JOIN Categories

ON Product.CategoryID = Categories.CategoryID

200 of 372

FULL JOIN

  • By combining a left and right outer join, you can get a full outer join that returns all data from both tables:

  • FULL OUTER JOIN is equivalent to FULL JOIN

DQL

SELECT ProductName, CategoryName

FROM Product

FULL OUTER JOIN

ON Product.CategoryID = Categories.CategoryID

201 of 372

Joins

202 of 372

CROSS JOIN

  • CROSS JOIN keyword without specifying the fields to be joined:

  • CROSS JOIN is equivalent to "comi":

DQL

SELECT ProductName, CategoryName

FROM Product

CROSS JOIN Categories

SELECT ProductName, CategoryName

FROM Product,

Categories

203 of 372

Joining three tables

  • Pseudocode of a query that joins 3 tables

SELECT (fields) field_1 , field_2 , field_3

FROM (tables) table_1

JOIN (with table) table_2

Software (conditions) (foreign_key_field_of_table_1 = primary_key_field_of_table_2 )

JOIN (with table) table_3

Software (conditions) (foreign_key_field_of_table_2 = primary_key_field_of_table_3 );

Example of a query that joins 3 tables

SELECT hst_name, sit_name, vis_timestamp

FROM hosts

JOIN visits ON (hst_pcode = vis_hstcode)

JOIN websites ON (vis_sitcode = sit_pcode);

DQL

204 of 372

Natural joins �(NATURAL JOIN)

  • A natural join is a join between two tables in which the DBMS connects the tables by a column(s) with the same name in both tables (naturally!).

  • A natural join is performed if the NATURAL keyword is specified .

DQL

Similarly

The only one

a matching column

for tables emp and dept

there is a column

department no .

SELECT ename, deptno, dname

FROM emp NATURAL JOIN dept;

SELECT ename,

emp.deptno, dname

FROM emp

JOIN department

ON emp.deptno = dept.deptno

205 of 372

Examples of complex queries

  • What specialties are workers assigned to facility No. 435?

DQL

SELECT SPECIALTY

FROM WORKERS,

WORK

WHERE WORKERS.SCHEDULE_NUMBER = JOB.SCHEDULE_NUMBER

  AND OBJECT_NUMBER = 435

206 of 372

ALL and DISTINCT keywords

  • To control the output of duplicate rows in the result set, you can use the ALL or DISTINCT keywords in the SELECT clause.

  • DISTINCT keyword specifies that the rows in the resulting dataset must be unique , while the ALL keyword specifies that all rows should be returned.

  • For example, to get the names of countries that have customers, you can use the following query:

  • Note that the ALL keyword is used by definition.

DQL

SELECT DISTINCT Country FROM Customers

207 of 372

Examples of complex queries

  • Show employees assigned to objects of type "office".

DQL

SELECT Full name

FROM WORKERS,

WORK,

OBJECTS

WHERE WORKERS.SCHEDULE_NUMBER = JOB.SCHEDULE_NUMBER

    AND WORK.OBJECT_NUMBER = OBJECTS.OBJECT_NUMBER

    AND OBJECT_TYPE = ' office '

Example of duplicate surnames

DISTINCT

There are no duplicate rows.

208 of 372

DATA GROUPING AND COMPLEX QUERIES

209 of 372

GROUP BY

  • To calculate total values based on data from one or more tables, you can use the GROUP BY clause , which has the following syntax:

  • CustomerID field, creates one row for each CustomerID value in the resulting dataset, and counts the number of values in the field. OrderID for each CustomerID value (i.e. the number of customer orders):

SELECT Customers.CustomerID, COUNT (Orders.OrderID)

FROM Customers

INNER JOIN Orders

ON Customers.CustomerID = Orders.CustomerID

GROUP BY Customers.CustomerID

GROUP BY {column1} [, …]

DQL

210 of 372

HAVING

  • HAVING clause has a purpose similar to the WHERE clause, but is used with aggregate data:

  • This query is similar to the previous one, but the resulting dataset includes only customers who placed ten or more orders.

SELECT Customers.CustomerID, COUNT (Orders.OrderID)

FROM Customers

INNER JOIN Orders

ON Customers.CustomerID = Orders.CustomerID

GROUP BY Customers.CustomerID

HAVING COUNT (Orders.OrderID) >= 10

DQL

211 of 372

HAVING

  • For each type of facility where more than one worker works , calculate the maximum duration of work. Consider only those facilities where work began after 10/20/2001.

SELECT OBJECT_NUMBER, MAX ( NUMBER_DAYS)

FROM WORK

WHERE START_DATE >= 20 / 10 / 01

GROUP BY OBJECT_NUMBER

HAVING COUNT ( TABLE_NUMBER) >= 2

212 of 372

HAVING

  • Output all honors students (average score > 90):

SELECT Student_Surname, Student_name, avg (Mark) AS AVG_Ball

FROM student NATURAL JOIN mark

GROUP BY student.idStudent

HAVING AVG_Ball > 90 ;

213 of 372

TOP

  • The TOP keyword can be used to return the first n rows or the first n percent of a table.
  • Get the first 10 products from the table:

  • Get the first quarter of the table entries.

SELECT TOP 10 * FROM Product ORDER BY ProductName

SELECT TOP 25 PERCENT * FROM Product ORDER BY ProductName

DQL

Only in Transact-SQL

214 of 372

LIMIT

  • LIMIT keyword can be used to return the first n rows or the first n percent of a table.
  • Get the first 10 students from the table:

  • Offset parameter (offset) specifies the number of records to skip. For example, you can output 5 students, starting with the third:

DQL

Only in MySQL

SELECT * FROM student LIMIT 10 ;

SELECT * FROM student LIMIT 5 offset 2 ;

215 of 372

Aliases

  • Full names containing table and column names are often very cumbersome.

  • Aliases, which are usually shorter and more expressive, can be used instead of long names within a single SQL command, for example:

SELECT long_field_names_are_annoying AS myfield # Column alias

FROM table_name

WHERE myfield = 'Joe'

SELECT people.names, tests.score

FROM tests, really_long_people_table_name

# Table alias in MySQL

DQL

216 of 372

217 of 372

AGGREGATE FUNCTIONS

218 of 372

Aggregate functions

  • When statistically analyzing databases, it is necessary to obtain information such as:
  • total number of records,
  • the largest or smallest value of a given record field,
  • averaged field value.
  • This is done using queries containing so-called aggregate functions.

DQL

219 of 372

Aggregate functions

Function

Appointment

AVG

Calculates the average

COUNT

Counts the number of non-empty values in this column

MAX

Calculates the largest value in a column

MIN

Calculates the smallest value in a column

SUM

Calculates the sum of values in a column

ABS

Returns the absolute value of a number

CEIL

Rounds a fractional number.

FLOOR

Removes the fractional part of a number

MOD

Returns the remainder when dividing one number by another

POWER

Returns a value equal to one number raised to the power of another number

ROUND

Rounds a number to the specified decimal place.

DQL

220 of 372

Aggregate functions

Function

Appointment

SIGN

Returns –1 if the number is negative, and 1 if it is positive

SQRT

Calculates the square root of a number

UPPER

Replaces all letters in a string with uppercase.

LOWER

Replaces all letters in a string with lowercase.

INITCAP

Capitalizes the beginning of words in a string.

LENGTH

Calculates the number of characters in a string

LPAD

Adds the specified character to the left of a string as many times as necessary to make the string the specified length.

RPAD

Adds the specified character to the right of the string as many times as necessary to make the string the specified length.

SUBSTR

Extracts a substring of the desired length from a string, starting at the specified position

DQL

221 of 372

Counting lines

  • To determine the total number of records in the Product table, we use the following query:

  • To determine the number of records in the ProductName field of the Product table , we use the following query:

  • count (*) – counts all rows.
  • count (<Field name>) – ignores rows with NULL.

DQL

SELECT count(*) FROM Product ;

SELECT count(ProductName) FROM Product ;

222 of 372

count(*) vs count(1)

  • These options should be equivalent:

  • However, it is recommended to use count(*) .

  • In PostgreSQL, count(*) is 10% faster.

SELECT count ( * ) FROM Table_name;

SELECT count ( 1 ) FROM Table_name;

223 of 372

Examples of simple queries

  • The GROUP BY clause means that rows should be divided into groups with common values for the specified column(s).
  • The GROUP BY clause is used in practice when statistical information is needed not about a single object, but about each group.
  • For employees of each specialty, calculate the maximum hourly rate.

SELECT SPECIALTY, MAX ( HOURLY_RATE)

FROM WORKERS​

GROUP BY SPECIALTY

DQL

224 of 372

Examples of simple queries

  • For each type of facility, calculate the average duration of work, starting from 10/20/2001.

SELECT OBJECT_NUMBER, AVG ( NUMBER_DAYS)

FROM WORK

WHERE START_DATE >= 20 / 10 / 01

GROUP BY OBJECT_NUMBER

DQL

225 of 372

Examples of simple queries

  • For each type of facility where more than one worker is employed, calculate the maximum duration of work. Consider only those facilities where work began after 10/20/2001.

SELECT OBJECT_NUMBER,

    MAX ( NUMBER_DAYS)

FROM WORK

WHERE START_DATE >= 20 / 10 / 01

GROUP BY OBJECT_NUMBER

HAVING COUNT ( TABLE NUMBER ) >= 2

DQL

226 of 372

Test

  • Is the following query logically correct?

  1. Yes;
  2. No.

SELECT EMP_NAME,

       SUM (SAL)

FROM EMPLOYEE;

227 of 372

Age calculation

  • MySQL provides several functions for retrieving parts of dates :
  • YEAR(),
  • MONTH(),
  • DAYOFMONTH().

  • You can determine your age by calculating the difference between the current year and the year of birth, and subtracting one from the result if the current day is closer to the beginning of the calendar than the birthday.

DQL

228 of 372

Age calculation

  • The query below outputs the name, date of birth, age, and current date.

  • In this example, the YEAR() function extracts the year from the date, and RIGHT() extracts the rightmost five characters representing the calendar day (MM-DD).
  • The part of the expression that compares dates returns 1 or 0, which allows you to reduce the result by one if the current day (CURRENT_DATE) is closer to the beginning of the calendar than the birthday.
  • age) is displayed in the header of the corresponding column of the results. - "Age").

SELECT name ,

birth,

CURRENT_DATE,

( YEAR (CURRENT_DATE) - YEAR (birth)) – (RIGHT(CURRENT_DATE, 5 ) < RIGHT(birth, 5 )) AS age

FROM pet

ORDER BY age;

DQL

229 of 372

Nested subquery

  • Nested subquery is a subquery enclosed in parentheses and nested in the WHERE clause of a SELECT statement or other statements that use WHERE.

  • In other words, it's a query within a query.

DQL

230 of 372

Subqueries

  • What specialties are workers assigned to facility No. 435?

SELECT SPECIALTY

FROM WORKERS​

WHERE LICENSE_NUMBER IN

( SELECT LICENSE_NUMBER

FROM WORK

WHERE OBJECT_NUMBER = 435 )

Nested subquery

Result of a nested subquery

Result of the entire query

DQL

231 of 372

Subqueries

  • Student with the highest score:

SELECT Surname , Name

FROM Student

WHERE Ball = MAX (Ball)

SELECT Surname, Name

FROM Student

WHERE Ball = (

        SELECT MAX (Ball)

        FROM Student

    );

DQL

232 of 372

SELECT

SELECT

[DISTINCT] column_names ,...

FROM table_name

[ INNER JOIN | LEFT JOIN … ] table name… ]

[ WHERE selection_conditions ]

[ GROUP BY grouping_field ]

[ HAVING grouping_conditions ]

[ ORDER BY sort_field } [ASC | DESC], ...] ;

DQL

233 of 372

Subqueries

  • Which employee has a higher than average hourly rate?

SELECT

    Full_name AS 'Last name' ,

    Speciality AS 'Work' ,

    Salary_per_hour AS 'Hourly pay'

FROM worker

WHERE Salary_per_hour > (

        SELECT avg (Salary_per_hour)

        FROM worker

);

DQL

234 of 372

Subqueries

  • Which employee has an hourly rate higher than the average hourly rate among subordinates of the same manager?

SELECT A.NAME

FROM EMPLOYEES A

WHERE A.HOURLY_RATE >

( SELECT AVG(B.HOURLY_RATE )

FROM EMPLOYEES B

WHERE B.BRIGADIR = A.BRIGADIR)

DQL

235 of 372

Subqueries

  • Find all restaurants with ratings from 3 to 9 (not inclusive):

SELECT name, rating

FROM restaurant_ratings

WHERE rating IN (

        SELECT rating

        FROM restaurant_ratings

        WHERE rating > 3

            AND rating < 9

)

DQL

name

rating

The Shack

7

Ribs'n'More

5

name

rating

Pizza House

3

The Shack

7

Arthur's

9

Ribs'n'More

5

236 of 372

Keyword ALL

  • Find all restaurants with a rating higher than the rating in the set:

SELECT name , rating

FROM restaurant_ratings

WHERE rating > ALL (

        SELECT rating

        FROM restaurant_ratings

        WHERE rating > 3

            AND rating < 9

)

DQL

name

rating

Pizza House

3

name

rating

Pizza House

3

The Shack

7

Arthur's

9

Ribs'n'More

5

< ALL

name

rating

The Shack

7

Ribs'n'More

5

Subquery result:

> ALL

name

rating

Arthur's

9

>= and <= also work

237 of 372

Keyword ANY

  • Find all restaurants with a rating higher than the rating of at least one restaurant in the set:

SELECT name, rating

FROM restaurant_ratings

WHERE rating > ANY (

        SELECT rating

        FROM restaurant_ratings

        WHERE rating > 3

            AND rating < 9

)

DQL

name

rating

Pizza House

3

Ribs'n'More

5

name

rating

Pizza House

3

The Shack

7

Arthur's

9

Ribs'n'More

5

< ANY

name

rating

The Shack

7

Ribs'n'More

5

Subquery result:

> ANY

name

rating

The Shack

7

Arthur's

9

238 of 372

Keyword SOME

  • SOME means the same as ANY in standard SQL syntax.

  • To find out if this keyword works in your RDBMS, consult the documentation.

DQL

239 of 372

CROSS JOIN + WHERE vs JOIN ON

  • If you add a join condition to the WHERE clause, that is, a restriction on combining tuples, the result is equivalent to the INNER JOIN operation with the same condition:

  • The CROSS JOIN + WHERE syntax for the join operation is considered obsolete and is not recommended by the ANSI SQL standard.

DQL

SELECT *

FROM

Person ,

City

WHERE Person.CityId = City.Id

SELECT *

FROM

Person

    INNER JOIN City

    ON Person.CityId = City.Id

240 of 372

JOIN USING vs JOIN ON

DQL

241 of 372

Comment syntax

  • Single-line comment ( MySQL , Oracle/PLSQL , Transact-SQL , SQLite ):

  • Multi-line comment ( MySQL , Oracle/PLSQL , Transact-SQL , SQLite ):

-- comment

/*

comment

*/

# single-line comment only in MySQL

242 of 372

List of key words SQL

ABSOLUTE

ACTION

ADD

ADMIN

AFTER

AGGREGATE

ALIAS

ALL

ALLOCATE

ALTER

AND

ANY

ARE

ARRAY

AS

ASC

ASSERTION

ASSERTION

AT

ATOMIC

AUTHORIZATION

BEFORE

BEGIN

BIGINT

BINARY

BIT

BLOB

BOOLEAN

BOTH

BREADTH

BY

CALL

CASCADE

CASCADED

CASE

CAST

CATALOGUE

CHAR

CHARACTER

CHECK

CLASS

CLOB

CLOSE

COLLATE

COLLECTION

COLLECT

COLUMN

COMMIT

COMPLETION

CONDITION

CONNECT

CONNECTION

CONSTRAINT

CONSTRAINTS

CONSTRUCTOR

CONTAINS

CONTINUE

CORRESPONDING

CREATE

CROSS

CUBE

CURRENT

CURRENT_DATE

CURRENT_PATH

CURRENT_ROLE

CURRENT_TIME

CURRENT_TIMESTAMP

CURRENT_USER

CURSOR

CYCLE

DATA

DATALINK

DATE

DAY

DEALLOCATE

DEC

DECIMAL

DECLARE

DEFAULT

DEFERRABLE

DELETE

DEPTH

DEREF

DESC

DESCRIPTOR

DESTRUCTOR

DIAGNOSTICS

DICTIONARY

DISCONNECT

DO

DOMAIN

DOUBLE

DROP

ELEMENT

END-EXEC

EQUALS

ESCAPE

EXCEPT

EXCEPTION

EXECUTE

EXIT

EXPAND

EXPANDING

FALSE

FIRST

FLOAT

FOR

FOREIGN

FREE

FROM

FUNCTION

FUSION

GENERAL

GET

GLOBAL

GO

GROUP

GROUPING

HANDLER

HASH

HOUR

IDENTITY

IF

IGNORE

IMMEDIATELY

IN

INDICATOR

INITIALIZE

INITIALLY

INNER

INOUT

INPUT

INSERT

INT

INTEGER

INTERSECT

INTERSECTION

INTERVAL

INTO

IS

ISOLATION

ITERATE

JOIN

KEY

LANGUAGE

LARGE

LAST

LATERAL

LEADING

LEAVE

LEFT

LESS

LEVEL

LIKE

LIMIT

LOCAL

LOCALTIME

LOCALTIMESTAMP

LOCATOR

LOOP

MATCH

MEMBER

MEETS

MERGE

MINUTE

MODIFIES

MODIFY

MODULE

MONTH

MULTISET

NAMES

NATIONAL

NATURAL

NCHAR

NCLOB

NEW

NEXT

NO

NONE

NORMALIZE

NOT

NULL

NUMERIC

OBJECT

OF

OFF

OLD

ON

ONLY

OPEN

OPERATION

OPTION

OR

ORDER

ORDINALITY

OUT

OUTER

OUTPUT

PAD

PARAMETER

PARAMETERS

PARTIAL

PATH

PERIOD

POSTFIX

PRECEDES

PRECISION

PREFIX

PREORDER

PREPARE

PRESERVE

PRIMARY

PRIOR

PRIVILEGES

PROCEDURE

PUBLIC

READ

READS

REAL

RECURSIVE

REDO

REF

REFERENCES

REFERENCE

RELATIVE

REPEAT

RESIGNATION

RESTRICTED

RESULT

RETURN

RETURNS

REVOKE

RIGHT

ROLE

ROLLBACK

ROLLUP

ROUTINE

ROW

ROWS

SAVEPOINT

SCHEMA

SCROLL

SEARCH

SECOND

SECTION

SELECT

SEQUENCE

SESSION

SESSION_USER

SET

SETS

SIGNAL

SIZE

SMALLINT

SPECIFIC

SPECIFICTYPE

SQL

SQLEXCEPTION

SQLSTATE

SQLWARNING

START

STATE

STATIC

STRUCTURE

SUBMULTISET

SUCCEEDS

SUM

SYSTEM_USER

TABLE

TABLESAMPLE

TEMPORARY

TERMINATE

THAN

THEN

TIME

TIMESTAMP

TIMEZONE_HOUR

TIMEZONE_MINUTE

TO

TRAILING

TRANSACTION

TRANSLATION

TREAT

TRIGGER

TRUE

ESCAPE

UNDER

UNDO

UNION

UNIQUE

UNKNOWN

UNTIL

UPDATE

USAGE

USER

USING

VALUE

VALUES

VARCHAR

VARIABLE

VARYING

VIEW

WHEN

WHENEVER

WHERE

WHILE

WITH

WRITE

YEAR

ZONE

243 of 372

Only in Transact-SQL

244 of 372

EXCEPT and INTERSECT

  • EXCEPT - excludes data from the second set from the first set.

  • INTERSECT - finds the intersection of the data of the first set with the data of the second set.

SELECT * FROM goods

EXCEPT

( SELECT * FROM goods WHERE Price > 150 )

SELECT * FROM goods WHERE Amount = 0

INTERSECT

( SELECT * FROM goods WHERE Price > 150 )

Only in Transact-SQL

245 of 372

SELECT

Subqueries

Let's solve a simple problem that requires comparing with the value returned from a subquery.

This example assumes that we want to request all customer order details, but we are only interested in the customer's most recent order.

246 of 372

SELECT

GROUP BY

Rows that have the same value in the columns specified in the list will be collected into a group.

Let's try using the GROUP BY clause to see what calculations can be performed with information about contract prices for groups of pop artists.

247 of 372

Text conversion

  • Often, text values are filled in differently by software users: some write the full name with a capital letter, some do not; some write everything in capital letters.
  • Many reporting forms require a unified approach, and not only reporting forms.
  • There are two functions in SQL to solve this problem:
  • UCASE
  • LCASE

248 of 372

Text conversion

SELECT UCASE(D_STAFF.S_NAME) AS [UCASE(S_NAME)],

LCASE(D_STAFF.S_NAME) AS [LCASE(S_NAME)] FROM D_STAFF

249 of 372

Text conversion

  • Returns the string str with all letters in it converted to lowercase.

  • Returns the string str with all letters in it converted to uppercase.

Examples:

mysql> select LCASE('QUADRATICALLY');

-> 'quadratically'

mysql> select LOWER('QUADRATICALLY');

-> 'quadratically'

mysql> select UCASE('Hey');

-> 'HEY'

mysql> select UPPER('Hey');

-> 'HEY'

string LCASE( str string)

string LOWER ( str string) -- in MySQL

string UCASE( str string)

string UPPER ( str string) -- in MySQL

250 of 372

SQL .TCL​

251 of 372

Transaction control operators

Operator

Sense

Action

COMMIT

Complete the transaction

Complete complex interconnected information processing combined into a transaction

ROLLBACK

Roll back transaction

Roll back changes made during a transaction

SAVEPOINT

Save an intermediate point of transaction execution

Save the intermediate state of the database, mark it so that you can return to it

TCL

252 of 372

Transactions

  • A transaction is an indivisible, from the point of view of a DBMS, sequence of data manipulation operations.
  • The user transaction is executed on an "all or nothing" basis, i.e. either the transaction is executed in its entirety and transfers the database from one coherent state to another coherent state, or, if for some reason one of the transaction actions cannot be performed, or some system failure occurs, the database returns to the original state it was in before the transaction began (a transaction rollback occurs).

253 of 372

Transaction completion

  • The transaction is completed in one of four possible ways:
  • COMMIT statement signifies the successful completion of a transaction ; its use makes the changes made to the database within the current transaction permanent;
  • ROLLBACK statement aborts a transaction, undoing the changes made to the database within that transaction; a new transaction is started immediately after using ROLLBACK;
  • successful completion of the program in which the current transaction was initiated means successful completion of the transaction (as if the COMMIT statement had been used);
  • an erroneous program termination aborts the transaction (as if a ROLLBACK statement had been used).

254 of 372

autocommit

  • By default, MySQL runs in autocommit mode . This means that as soon as a modification is made, MySQL will save it to disk.
  • You can put MySQL into non-autocommit mode with the following command:

  • If non -transactionally safe tables are used, changes will be saved immediately, regardless of the state of autocommit mode .

SET AUTOCOMMIT = 0

255 of 372

MySQL . Tables and Transactions

  • MySQL supports two different types of tables:
  • transactional ( InnoDB and BDB)
  • without transaction support (HEAP, ISAM, MERGE and MyISAM ).

256 of 372

MySQL . Memory Types

Property

MyISAM

Memory

InnoDB

Archive

NDB

Memory limitations

256TB

RAM

64TB

No

384 EB

Transactions

No

No

Yes

No

Yes

Feature

Previously it was the default

Stores the database in RAM. Energy-dependent

Slower, but more reliable

Does not support DELETE and UPDATE.

For distributed databases

EB = exabyte ( exabyte = 1024 * 1024 terabyte).

257 of 372

Setting the memory type

  • When creating a table, you can specify the storage type using the ENGINE option in the CREATE TABLE statement:

  • If ENGINE or TYPE is omitted, the default memory type is used.

  • By default, this is InnoDB .

CREATE TABLE t (i INT ) ENGINE = INNODB;

258 of 372

DCL

259 of 372

Data management operators�

Operator

Sense

Action

CREATE DATABASE

Create a database

Create a new database by defining the basic parameters for it

DROP DATABASE

Delete database

Delete existing database (if you have sufficient rights )

ALTER DATABASE

Change database

Change the set of basic objects in the database, restrictions that apply to the entire database

GRANT

Grant rights

Grant access rights to a number of actions on a certain database object

REVOKE

Deprive rights

Revoke access rights to a number of actions on a certain database object

CHANGE PASSWORD

Change password

Change password for the entire database

DCL

260 of 372

GRANT

  • The general form of granting SQL access rights with the GRANT command looks like this:

  • For example:

GRANT < list of powers >

ON < and ' me data or table section >

TO < user list >

GRANT READ ACCESS

ON 'Table_Estimates'

TO 'Shelepova'

DCL

261 of 372

REVOKE

  • Prohibition on insertion:

  • Remove all rights:

  • Override root password:

REVOKE INSERT ON * . * FROM 'jeffrey' @ 'localhost' ;

UPDATE user SET Password = PASSWORD ( 'new_password' )

WHERE user = 'root' ;

REVOKE ALL PRIVILEGES, GRANT OPTION FROM user [, user] FLUSH PRIVILEGES;

DCL

262 of 372

Interview. �Beginner Programmer – SQL Basics

  • What is the difference between INNER JOIN and OUTER ? JOIN?
  • Why are indexes needed? in the database?
  • Describe what the output data set would look like if the SELECT statement included two column names after ORDER BY .
  • What predicate will be used to check if the field value is NULL ?
  • What will the query SELECT 6 FROM TableName return if the query SELECT count (*) FROM TableName returns 13?

263 of 372

Interview. �Beginner programmer – basics SQL

  • TOP keyword used in the SELECT statement? and where is it placed?
  • What is the difference between the data types char (n), varchar (n), and nvarchar (n)? What is the maximum value that n can have?
  • What is a data slice ? Can data be changed through a data slice?
  • select count (*) from table return ? What if you specify a field name instead of an asterisk?

264 of 372

Stored procedures �and triggers

265 of 372

Plan

  1. Stored procedures
  2. Triggers
  3. Data slice ( VIEW)

266 of 372

Stored procedures

  • Stored procedures are database objects, in the form of a set of SQL statements, that are stored and executed on the server.

  • The structure of the stored procedure is as follows:

CREATE PROCEDURE procedure_name[(parameters)]

-- Procedure code

Alternative definition

Stored procedures are procedures and functions that are stored directly in the database in a compiled form and can be run by users or applications working with the database.

267 of 372

Advantages of stored procedures

  1. Sharing logic with other programs. Stored procedures encapsulate functionality; this provides consistency in data access and management between different programs.
  2. Isolating users from database tables. This allows you to grant access to stored procedures, but not to the table data itself.
  3. Provides a security mechanism . As per the previous point, if you can only access the data through stored procedures, no one else will be able to erase your data through the SQL DELETE command.
  4. Improved performance due to reduced network traffic . Using stored procedures, multiple queries can be combined.

268 of 372

Disadvantages of stored procedures

  1. Increased load on the database server due to the fact that most of the work is performed on the server side, and less on the client side.

  • There is a lot to learn . You will need to learn the syntax of MySQL expressions to write your stored procedures.

  • It gets complicated. the process of manipulating data .

  • Migrating from one DBMS to another (DB2, SQL Server, etc.) can lead to problems .

269 of 372

Stored procedures

  • The declaration of variables looks like this:

  • The operator block consists of the command:

  • The assignment operator looks like this:

DECLARE variable_name variable_type [(length)];

BEGIN

END

SET variable = value;

270 of 372

Stored procedures

  • The conditional operator has the form:

  • There are several loop operators, the most common of which is:

IF condition THEN

Operator1 or Operator Group1

[ ELSE

Operator2 or Operator Group2]

END IF ;

WHILE condition DO

Operator or Operator Group

END WHILE ;

271 of 372

Stored Procedures . Example

  • Example:

CREATE DEFINER = 'root' @ 'localhost'

PROCEDURE decanat.GetMark1( in ball INT , out mark INT )

BEGIN

IF ball BETWEEN 60 AND 75 THEN

SET mark = 3;

ELSEIF ball BETWEEN 75 AND 90 THEN

SET mark = 4;

ELSEIF ball BETWEEN 90 AND 100 THEN

SET mark = 5;

ELSE SET mark = 2;

END IF;

END

272 of 372

Stored procedures

  • CASE is a multiple-choice operator.

CREATE PROCEDURE proc_CASE( IN param1 INT )

BEGIN

  DECLARE variable1 INT ;

  SET variable1 = param1 + 1 ;

  CASE variable1

  WHEN 0 THEN

    INSERT INTO table1 VALUES (param1);

  WHEN 1 THEN

    INSERT INTO table1 VALUES (variable1);

  ELSE

    INSERT INTO table1 VALUES ( 99 );

  END CASE ;

END

273 of 372

Stored procedures

  • Loop with postcondition :

  • Infinite loop (almost):

REPEAT

 /*loop body*/

UNTIL condition END REPEAT

Label: LOOP

 /*loop body*/

 LEAVE Label; /*leave loop */

END LOOP Tag

274 of 372

Selecting a separator

  • A delimiter is a character or string of characters used to close an SQL statement.

  • By default, the semicolon ( ; ) is used as the separator.

  • But this causes problems in stored procedures and triggers. Because they can have many statements, and each must end with a semicolon.

SELECT count(*) FROM Student;

275 of 372

Selecting a separator

  • You can choose ANY separator.
  • For example, we will use a double dollar sign – $$ – as a separator.

  • Now the commands should be written like this:

  • To use ";" as a separator again later, run the command:

DELIMITER ;

DELIMITER $$

SELECT * FROM user$$

276 of 372

MySQL

  • Let's create a simple MySQL procedure called student_data , when executed it will output all the data from the " student " table:

  • CREATE PROCEDURE command creates a stored procedure.
  • The next part is the procedure name student_data ;
  • Procedure names are not case sensitive, so student_data is equivalent to STUDENT_DATA .

mysql > DELIMITER $$;

mysql > CREATE PROCEDURE student_data()

-> SELECT * FROM student; $$

277 of 372

Example of using DELIMITER

DELIMITER $$

CREATE FUNCTION `sakila` . `inventory_held_by_customer` (p_inventory_id INT ) RETURNS INT READS SQL DATA

BEGIN

  DECLARE v_customer_id INT ;

  DECLARE EXIT HANDLER FOR NOT FOUND RETURN NULL ;

  SELECT customer_id INTO v_customer_id

  FROM rental

  WHERE return_date IS NULL

    AND inventory_id = p_inventory_id;

  RETURN v_customer_id;

END $$

DELIMITER ;

278 of 372

Calling a stored procedure

  • To call a stored procedure, you must type the keyword CALL , followed by the name of the procedure, and specify the parameters (variables or values) in parentheses.

  • Parentheses are required.

CALL p2();

CALL stored_procedure_name (param1, param2, ....)

CALL procedure1( 10 , 'string parameter' , @parameter_var);

279 of 372

Deleting a stored procedure

  • IF EXISTS statement catches an error if such a procedure does not exist.

  • MySQL has an ALTER PROCEDURE statement for altering procedures, but it is only suitable for changing some characteristics. If you need to change the parameters or the body of a procedure, you must drop it and recreate it.

DROP PROCEDURE IF EXISTS p2;

280 of 372

Variables

  • Variables must be declared explicitly at the beginning of a BEGIN/END block, along with their data types.

CREATE PROCEDURE var_proc( IN paramstr VARCHAR ( 20 ))

BEGIN

  DECLARE a, b INT DEFAULT 5 ;

  DECLARE str VARCHAR ( 50 );

  DECLARE today TIMESTAMP DEFAULT CURRENT_DATE;

  DECLARE v1, v2, v3 TINYINT ;

�  INSERT INTO table1 VALUES (a);

  SET str = 'I am a string' ;

 

  SELECT CONCAT ( str , paramstr), today

  FROM table2 WHERE b >= 5 ;

END

281 of 372

MySQL

  • You cannot use two procedures with the same name in the same database.
  • You can use names in the format �" procedure- name.database - name ", for example, " hr.job_data ".
  • Procedure names can be split. If the name is split, it can contain spaces .
  • The maximum length of a procedure name is 64 characters .
  • Avoid using MySQL built-in function names .

282 of 372

Saved function

  • The CREATE FUNCTION statement is used to create a stored function and user functions.
  • A stored function is a set of SQL statements that perform some operation and return a single value.

CREATE [ AGGREGATE ] FUNCTION [ IF NOT EXISTS ] function_name

RETURNS { STRING | INTEGER | REAL | DECIMAL }

SONAME shared_library_name

283 of 372

Stored function. Example

  • We need to find out how many years the employee has worked at the company.

DELIMITER //

CREATE FUNCTION no_of_years(date1 date ) RETURNS int DETERMINISTIC

BEGIN

 DECLARE date2 DATE ;

  Select current_date() into date2;

  RETURN year (date2) - year (date1);

END //

�DELIMITER ;

Select emp_id, fname, lname, no_of_years( start_date ) as 'years'

from employee;

284 of 372

Triggers

  • Triggers are a special case of a stored procedure that is executed automatically when data update commands (INSERT, DELETE, UPDATE) are executed.

  • Triggers are tied to specific database tables.

  • Each team should have its own triggers.

285 of 372

Triggers. Example

  • A trigger that will remove extra spaces and change the first letter in the last name, first name, and middle name to uppercase.
  • For example, for a last name it looks like this:

SET NEW.`Surname` = TRIM(NEW. `Surname`);

SET NEW.`Surname` = concat(upper(left(NEW.`Surname`, 1)), substr(NEW.`Surname`, 2));

286 of 372

Data slice (VIEW)

  • VIEW is a data object that does not contain any data.

  • This is a type of table whose contents are selected from other tables by executing a query. They work in queries and DML statements in the same way as base tables, but they do not contain any data of their own.

  • The data slice can be changed by DML modification commands, but they do not affect the data slice itself. The commands will actually redirects to the base table.

DDL

287 of 372

Data slice ( VIEW)

  • Advantages of using data slicing:
    1. Allows for flexible configuration of data access rights due to the fact that rights are given not on the table, but on view. This is very convenient if you need to give the user rights to individual rows of a table or the ability to obtain not the data itself, but the result of some actions on it.
    2. Allows you to separate the logic of data storage and software. You can change the data structure without touching the program code, you only need to create a data slice similar to the tables that the applications previously accessed. This is very convenient when there is no possibility to change the program code or several applications with different requirements for the data structure access the same database.
    3. Ease of use due to automatic execution of actions such as accessing a certain part of rows and/or columns, retrieving data from multiple tables, and transforming it using various functions.

DDL

288 of 372

Data slice ( VIEW)

  • Create a data view that shows all information about an employee except their hourly rate:

CREATE VIEW EMPLOYEES AS

SELECT

LICENSE_PLATE_NUMBER,

Fullname,

SPECIALTY,

BRIGADIER

FROM WORKERS

DDL

289 of 372

Cursor

  • A cursor is a database object that allows programs to work with records one at a time, rather than many at once, as is done in regular SQL commands.

  • The procedure for working with the cursor is as follows:
  • Define cursor (DECLARE)
  • Open cursor (OPEN)
  • Get a record from a cursor (FETCH)
  • Process record
  • Close cursor (CLOSE)
  • Remove the cursor from memory (DEALLOCATE).

A cursor (in Oracle, Microsoft SQL Server) is a result set and its associated pointer to the current record, which is obtained when a query is executed.

Only in Oracle, Microsoft SQL Server

290 of 372

CREATE PROCEDURE `employees` . `f_p` ()

BEGIN

DECLARE id_p INT ;

DECLARE name_p TINYTEXT;

DECLARE is_end INT DEFAULT 0 ; -- end flag

/*declare a cursor for a query that selects from a table

Record ID and directory name, written in uppercase letters*/

DECLARE cur_p_name CURSOR FOR

SELECT id_project,

concat (ucase( substring (project, 1 , 1 )), substring (project, 2 ))

FROM projects;

/*declare a handler for the situation when the cursor reaches

  end of the resulting table*/

DECLARE CONTINUE HANDLER FOR NOT FOUND SET is_end = 1 ;

OPEN cur_p_name;

wet: LOOP /*in a loop, read cursor data and edit records*/

  FETCH cur_p_name into id_p, name_p;

IF is_end THEN LEAVE wet;

END IF ;

UPDATE projects SET project = name_p WHERE id_project = id_p;

END LOOP wet;

CLOSE cur_p_name;

END $$

291 of 372

SQL Injection and ACID

292 of 372

CRUD

Create

Insert

Retrieve

Select

Update

Update

Delete

Delete

Create

293 of 372

ACI D

  • A transaction has four important properties, known as ACI D properties:
  • (A) Atomicity.
  • (C) Consistency.
  • (I) Insulation.
  • ( D ) Durability.

294 of 372

Atomicity

  • Transactions are important in both multi-user and single-user systems. In single-user systems, transactions are logical units of work that, after completion, leave the database in a consistent state. Transactions are also units of data recovery after failures. When the system recovers, it eliminates traces of transactions that did not complete successfully due to a software or hardware failure. These two properties of transactions determine atomicity (indivisibility) of the transaction.

  • Atomicity . A transaction is executed as an atomic operation – either the entire transaction is executed in its entirety, or it is not executed in its entirety.

295 of 372

Atomicity

start transaction;

UPDATE Student SET Money = Money - 100 WHERE Surname="Orlov";

UPDATE Student SET Money = Money + 100 WHERE Surname="Gasimov";

commit;

296 of 372

Insulation

  • In multi-user systems, transactions also serve to ensure the isolation of individual users - users working with the same database at the same time seem to be working as if in a single-user system and do not interfere with each other.

  • Isolation : Transactions of different users should not interfere with each other (for example, if they were executed one after the other).

297 of 372

Isolation property ( Isolation)

  • A set of transactions is isolated if the effect of the system running them is the same as if it ran them one at a time.
  • The technical definition of isolation is serialization.
  • An execution is serialized (i.e., isolated) if its effect is the same as executing transactions sequentially, one after the other, in sequence, without overlapping when any two of them are executed. This has the same effect as running transactions one at a time.
  • A classic example of non-isolated execution is a banking system where two transactions each try to withdraw the last $100 to an account. If both transactions read the account balance before either one updates it, both transactions will determine that there is enough money to satisfy their requests, and both will withdraw the last $100. Clearly, this is an incorrect result .

298 of 372

Transaction isolation level

  • The term "transaction isolation level" refers to the degree of protection provided by the internal mechanisms of the DBMS (i.e., does not require special programming) against all or some of the types of data inconsistencies listed above that arise during parallel execution of transactions.
  • The SQL-92 standard defines a scale of four isolation levels:
    1. Read uncommitted,
    2. Read committed,
    3. Repeatable reading,
    4. Serializable.
  • The first is the weakest, the last is the strongest, each subsequent one includes the previous ones.

299 of 372

Read uncommitted �(Reading uncommitted data)

  • The lowest (first) isolation level. It only guarantees that there are no lost updates.
  • Lost update is a situation where, when the same block of data is changed simultaneously by different transactions, one of the changes is lost.
  • Suppose there are two transactions running simultaneously:

  • As a result, the value of the f2 field after the completion of both transactions may increase not by 45, but by 20 or 25, i.e. one of the transactions that change the data will “disappear”.

Transaction 1

Transaction 2

UPDATE tbl1 SET f2=f2+20 WHERE f1=1;

UPDATE tbl1 SET f2=f2+25 WHERE f1=1;

300 of 372

Read committed (reading committed data)

  • Most industrial DBMSs, including Microsoft SQL Server, PostgreSQL , and Oracle , use this level by default. This level provides protection against rough, dirty reads.
  • "Dirty" read - reading data added or changed by a transaction that is subsequently not confirmed (rolled back).

  • Transaction 1 changes the value of field f2, and then transaction 2 selects the value of this field. After that, transaction 1 is rolled back. As a result, the value received by the second transaction will differ from the value stored in the database.

Transaction 1

Transaction 2

UPDATE tbl1 SET f2=f2+1 WHERE f1=1;

SELECT f2 FROM tbl1 WHERE f1=1;

ROLLBACK WORK;

301 of 372

Repeatable read�

  • The level at which a reading transaction "does not see" changes to data it has previously read. In this case, no other transaction can change the data that the current transaction is reading until it completes.
  • Non-repeatable read - a situation where, when read again within the same transaction, previously read data turns out to be changed.

  • Transaction 2 selects the value of field f2, then transaction 1 modifies the value of field f2. If you try to select the value from field f2 again in transaction 2, a different result will be obtained. This situation is especially unacceptable when data is read with the aim of partially changing it and writing it back to the database.

Transaction 1

Transaction 2

SELECT f2 FROM tbl1 WHERE f1=1;

UPDATE tbl1 SET f2=f2+1 WHERE f1=1;

COMMIT;

SELECT f2 FROM tbl1 WHERE f1=1;

302 of 372

Serializable​​

  • The highest level of isolation; transactions are completely isolated from each other, each executing sequentially, as if there were no parallel transactions. Only at this level are parallel transactions not subject to the "phantom read" effect.
  • " Phantom " reads are situations where , when read repeatedly within the same transaction, the same sample yields different sets of rows.

  • Transaction 2 executes an SQL statement that uses the value of field f2. Transaction 1 then inserts a new row, which causes the SQL statement to be re-executed in transaction 2 to produce a different result. This is called a phantom read. It differs from a non-repeated read in that the result of a repeated access to the data is not due to the change/deletion of the data itself, but due to the appearance of new (phantom) data.

Transaction 1

Transaction 2

SELECT SUM(f2) FROM tbl1;

INSERT INTO tbl1 (f1, f2) VALUES (15,20);

COMMIT;

SELECT SUM(f2) FROM tbl1;

303 of 372

Consistency

  • A database is in a consistent state if all integrity constraints are satisfied for that state .

  • An integrity constraint is a specific statement that can be true or false depending on the state of the database.

304 of 372

Classification of integrity constraints

1. By methods of implementation.

    • 1.1 Declarative support of integrity constraints - by means of the data definition language (DDL).
    • 1.2 Procedural support of integrity constraints - by means of triggers and stored procedures.

2. At the time of inspection.

    • 2.1 Immediately verifiable restrictions.
    • 2.2 Limitations with deferred verification.

3. By scope.

    • 3.1 Domain restrictions.
    • 3.2 Limitations of the attribute.
    • 3.3 Tuple restrictions.
    • 3.4 Limitations of the relationship.
    • 3.5 Limitations of the database.

305 of 372

Limitation

  • The SQL language standard supports declarative integrity constraints, implemented as:
  • Domain restrictions.
  • Constraints included in the table definition.
  • database as independent assertions .

306 of 372

Examples of integrity constraints

  1. The employee's age cannot be less than 18 or more than 65 years.
  2. Each employee has a unique personnel number.
  3. An employee must be registered in one department.
  4. The invoice amount must be equal to the sum of the products of the prices of the goods and the quantity of goods of all the goods included in the invoice.

307 of 372

SQL CHECK

  • The CHECK option allows you to set a condition that a value entered into a table must satisfy before it is accepted.
  • Any attempt to modify or insert a field value that would make this predicate invalid will be rejected.
  • SQL Server / Oracle / MS Access:

  • CHECK is not supported in MySQL (a trigger should be used instead).

CREATE TABLE Persons(

ID int NOT NULL ,

LastName varchar ( 255 ) NOT NULL ,

FirstName varchar ( 255 ),

Age int CHECK (Age >= 18 )

);

DDL

308 of 372

SQL CHECK

  • To check multiple columns at once SQL Server / Oracle / MS Access:

  • Adding a condition to a table:

  • Delete a condition:

CREATE TABLE Persons (

ID int NOT NULL ,

LastName varchar ( 255 ) NOT NULL ,

FirstName varchar ( 255 ),

Age int ,

City varchar ( 255 ),

    CONSTRAINT CHK_Person CHECK (

Age >= 18

        AND City = 'Sandnes' )

);

ALTER TABLE Persons ADD CHECK (Age >= 18 );

ALTER TABLE Persons DROP CONSTRAINT CHK_PersonAge;

DDL

309 of 372

SQL CHECK

  • Checking for inclusion in a set SQL Server / Oracle / MS Access:

CREATE TABLE Student (

Kod_stud integer NOT NULL PRIMARY KEY ,

Fam char ( 30 ) NOT NULL UNIQUE,

Address char ( 50 ),

Ball decimal CHECK (Ball > 0 ),

Form_ob char ( 10 ) CHECK (Form_ob IN ( 'Day' , 'Extracurricular' , 'Evening' )

);

DDL

310 of 372

Consistency

  • Consistency: A transaction moves a database from one consistent (integer) state to another consistent (integer) state.
  • Within a transaction, database consistency may be violated.

311 of 372

Consistency property

  • For example, let's say we have a transaction that moves $100 from account A to account B.

  • If the transaction is not agreed upon (and non-atomic), there is a chance that Volkov's money will be written off and Chupryna's will not be credited .
  • Atomicity is ensured by the DBMS, consistency by the programmer.

UPDATE Person

SET cash_account = cash_account - 100

WHERE client_surname = 'Volkov' ;

UPDATE Person

SET cash_account = cash_account + 100

WHERE client_surname = 'Hair' ;

COMMIT ;

312 of 372

Durability

  • If a transaction is executed, the results of its operation must be saved in the database, even if the system crashes at a later time.

313 of 372

ACID – transaction properties

The property of atomicity ( Atomicity )

    • it is expressed in the fact that the transaction must be completed in its entirety or not at all.

Consistency property​​

    • guarantees that as the transactions are executed, the data goes from one consistent state to another — the transaction does not destroy the mutual consistency of the data.

Isolation property​​

    • means that transactions competing for access to the database are physically processed sequentially, isolated from each other, but to users it looks as if they are executed in parallel.

Durability​​​

    • is interpreted as follows: if the transaction is completed successfully, then the changes in the data that were made by it cannot be lost under any circumstances (even in case of subsequent errors).

314 of 372

SQL injection

  • SQL injection is the placement of malicious code in SQL statements through data entry on a web page.
  • This is one of the most common hacking techniques on the Internet.

315 of 372

SQL injection

  • SQL injection usually occurs when you request a username, and instead of the name, the user gives you an SQL statement that you execute against the database.

  • Consider the following example, which creates a SELECT statement by adding a variable ( txtUserId ) to the select string. The variable is selected from the user input (getRequestString):

txtUserId = getRequestString( "UserId" );

txtSQL = "SELECT * FROM Users WHERE UserId = " + txtUserId;

316 of 372

SQL injection. 1=1

  • Suppose the user enters:

  • Then the SQL query will look like this:

  • SQL injections can be more dangerous:

105 OR 1 = 1

SELECT * FROM Users WHERE UserId = 105 OR 1 = 1 ;

105 ; DROP TABLE Suppliers

317 of 372

318 of 372

SQL injection protection

  • PHP

$stmt = $dbh -> prepare( "INSERT INTO Customers (CustomerName, Address, City) VALUES (:nam, :add, :cit)" );

$stmt -> bindParam( ':nam' , $txtNam);

$stmt -> bindParam( ':add' , $txtAdd);

$stmt -> bindParam( ':cit' , $txtCit);

$stmt -> execute ();

319 of 372

  • Display a list of employees who receive a salary higher than their immediate supervisor.

SELECT a. *

FROM employee a,

employee b

WHERE b.id = a.chief_id

  AND a.salary > b.salary

320 of 372

  • Display a list of employees who receive the maximum salary in their department.

SELECT a. *

FROM employee a

WHERE a.salary = (

        SELECT max (Salary)

        FROM employee b

        WHERE b.department_id = a.department_id

)

321 of 372

  • Display a list of department IDs with no more than 3 employees.

SELECT department_id

FROM employee

GROUP BY department_id

HAVING count ( * ) <= 3

322 of 372

  • Display a list of employees who have a manager working in the same department.

SELECT a. *

FROM employee a

LEFT JOIN employee b

ON (b.id = a.chief_id

      AND b.department_id = a.department_id

    )

WHERE b.chief_id IS NULL

323 of 372

  • Find a list of department IDs with the maximum total employee salary.

WITH sum_salary AS (

    SELECT department_id,

        sum (Salary) Salary

    FROM employee

    GROUP BY department_id

)

SELECT department_id

FROM sum_salarya

WHERE a.salary = (

    SELECT max (Salary)

    FROM sum_salary

)

324 of 372

Practice

  • 1. Find the second highest salary of the employee.

  • 2. Find the maximum salary of each department.

  • 3. Write an SQL query to display the current date.

SELECT MAX (Salary) FROM Employee

WHERE Salary NOT IN ( SELECT MAX (Salary) FROM Employee);

SELECT DeptID, MAX (Salary) FROM Employee GROUP BY DeptID;

SELECT DeptName, MAX (Salary) FROM Employee e

RIGHT JOIN Department d ON e.DeptId = d.DeptID

GROUP BY DeptName;

SELECT GetDate ();

Transact-SQL :

CURRENT_DATE()

MySQL :

SELECT SYSDATE FROM DUAL;

Oracle PL/SQL :

325 of 372

Practice

  • 5. Name of the employee born from 01/02/1965 to 31/11/1970.

  • 6. Find the number of employees, depending on gender, who were born from 01/02/1965 to 31/11/1970.

  • 7. Find an employee whose salary is equal to or exceeds 20,000.

  • 8. Find the name of an employee whose name starts with 'M'

SELECT DISTINCT EmpName FROM Employees

WHERE DOB BETWEEN '01/02/1965' AND '31/11/1970' ;

SELECT COUNT ( * ), sex FROM Employees

WHERE DOB BETWEEN '01/02/1965' AND '31/11/1970'

GROUP BY sex;

SELECT EmpName FROM Employees WHERE Salary >= 20000 ;

SELECT * FROM Employees WHERE EmpName like 'M%' ;

326 of 372

Practice

  • 9. Find all employee records containing the word “Roman,” regardless of whether it was spelled as Romane, ROMAN, or novel.

  • 10. There is a table which contains two columns Student and Marks . You need to find all the students whose marks are more than the average marks, i.e. list of students above average.

SELECT * FROM Employees WHERE UPPER (EmpName) like '%NOVEL%' ;

SELECT

student,

marks

FROM table

WHERE marks >

( SELECT AVG (marks) FROM table )

327 of 372

Practice

  • Knowledge of this course is manifested in the ability to:
    1. form a data schema;
    2. normalize it;
    3. write an SQL query for it.

  • These questions are asked during the interview.

328 of 372

INSERT INTO SELECT

  • Inserts the result of an SQL query into a table.

  • For example, insert fields from the Suppliers table into the Customers table :

INSERT INTO Customers (CustomerName, City, Country) SELECT SupplierName, City, Country FROM Suppliers;

329 of 372

SQL SELECT INTO

  • Inserts the result of an SQL query into a table.
  • Not supported in MySQL .

  • For example, to create a backup of the Customers table :

  • Create a backup of the Customers table and save it to the Backup.mdb file:

SELECT * INTO CustomersBackup2017

FROM Customers;

SELECT * INTO CustomersBackup2017 IN 'Backup.mdb'

FROM Customers;

Only in Oracle, Microsoft SQL Server

DQL

330 of 372

SQL CASE

  • Data selection with condition checking:

SELECT OrderID, Quantity,

CASE WHEN Quantity > 30 THEN "The quantity is greater than 30"

WHEN Quantity = 30 THEN "The quantity is 30"

ELSE "The quantity is under 30"

END AS Quantity_Comparison FROM OrderDetails;

SELECT CustomerName, City, Country FROM Customers ORDER BY ( CASE WHEN City IS NULL THEN Country ELSE City END );

DQL

331 of 372

Test

  • Row order in SQL query results
  • specified after the SORTED BY keyword;
  • can be ascending or descending for each field;
  • is accepted as descending by default;
  • impossible to predict unless specified in the request;
  • can only be specified for fields that are included in the query result list.

332 of 372

Test

  • Specify the correct INSERT commands from the following:
  • INSERT INTO students (id, first_name, last_name) SELECT id, first_name FROM new_students WHERE last_name IS NULL
  • INSERT INTO students SELECT VALUES ( 95 , ' Yulia ' , ' Kirichenko ' )
  • INSERT INTO students (id, first_name, last_name)
  • INSERT INTO students (id, first_name, last_name) ( 118 , ' Ivan ' , ' Rumyantsev ' )
  • INSERT INTO students (id, first_name, last_name) VALUES ( 100 , ' Anna' , ' Cherednyk ' )

333 of 372

Number rows in a table

  • Sometimes it is convenient to display row numbers, but a primary key is not suitable for this:

SELECT

@rownum: = @rownum + 1 AS rank , t.

FROM

student t,

( SELECT @rownum: = 0 ) r;

334 of 372

Test

  • This table was created using the SQL statement:

  • Is it guaranteed to delete all records from this table by executing the following query:

  1. Yes
  2. No

CREATE TABLE people (

id INT PRIMARY KEY ,

name VARCHAR ( 45 ) )

DELETE FROM people WHERE id >= 0

335 of 372

Test

  • What keywords are required when selecting data with a table join (excluding cartesian) product (Cartesian join) and NATURAL JOIN)?
    1. WHERE
    2. FROM
    3. JOIN
    4. ON
    5. USING
    6. SELECT

336 of 372

Database "Computer Company"

  • Find the model number, speed, and hard drive size for all PCs under $500. Output : model , Speed and hd

SELECT model, speed, hd

FROM PC

WHERE price < 500 ;

Product table represents the manufacturer ( maker ), model number ( model ), and type (PC, Laptop , or Printer). The model numbers in this table are assumed to be unique across manufacturers and product types. The PC table lists the processor speed ( MHz ) , the total amount of RAM ( MB), the disk size ( GB ), the CD drive speed (e.g., '4x'), and the price ( price ). The Laptop table is similar to the PC table except that instead of the CD drive speed, it lists the screen size ( in inches). The Printer table lists for each printer model whether it is color ( color ('y' if color), the printer type ( Laser , Jet , or Matrix ) , and the price ( price) .

337 of 372

Database "Computer Company"

  • Find the most expensive printers. Output : model , price .

SELECT model, MAX(DISTINCT price)

FROM Printer

GROUP BY model;

SELECT model, price

FROM Printer

WHERE price =

(SELECT MAX(price) FROM Printer);

SELECT MAX(price)

FROM Printer;

338 of 372

SQL. Select from a table

  • There is a table table1 with columns id and datetime .
  • Write a query that returns the maximum value of id and the date value for that id .

SELECT id, datetime FROM table1

WHERE id = ( SELECT max (id) FROM table1);

SELECT id, datetime FROM table1 ORDER BY id DESC LIMIT 1

SELECT id, datetime FROM table1

WHERE id IN ( SELECT max (id) FROM table1);

339 of 372

Revising the Select Query I

  • Find all columns for all US cities in the CITY table with a population greater than 100,000 .
  • Country code for America: USA.
  • The CITY table is described as follows:

SELECT *

FROM City

WHERE CountryCode = 'USA'

AND    Population > 100000 ;

1E5

340 of 372

Weather Observation Station 3

  • Find a list of CITY names from STATION for cities that have even ID numbers . Print the results in random order, but exclude duplicates from the answer.
  • The STATION table is described as follows:

  • where LAT_N is the northern latitude and LONG_W is the western longitude.

SELECT DISTINCT City

FROM Station

WHERE ID % 2 = 0 ;

MOD(ID, 2)

341 of 372

Weather Observation Station 4

  • Find the difference between the total number of CITY records in the table and the number different records CITY in the table.
    • For example, if the table has 3 records with CITY values " New York ", " New York ", " Bengaluru ", then there are 2 different city names: " New York " and " Bengaluru ". The query returns 1 because 3 - 2 = 1 .

SELECT COUNT(City) - COUNT(DISTINCT City)

FROM Station;

342 of 372

Weather Observation Station 5

  • Query the two cities in STATION with the shortest and longest CITY names , as well as their respective lengths (i.e.: number of characters in the name).
    • If there is more than one smallest or largest city, choose the one that comes first in alphabetical order.

SELECT City, LENGTH (City)

FROM Station

WHERE    LENGTH (City) =

( SELECT MIN ( LENGTH (City)) FROM Station)

ORDER BY City

LIMIT    1 ;

SELECT City, LENGTH (City)

FROM Station

WHERE    LENGTH (City) =

( SELECT MAX ( LENGTH (City)) FROM Station)

ORDER BY City

LIMIT    1 ;

(SELECT City, LENGTH(City)

FROM Station

WHERE    LENGTH(City) =

(SELECT MIN(LENGTH(City)) FROM Station)

ORDER BY City

LIMIT    1)

UNION ALL

(SELECT City, LENGTH (City)

FROM Station

WHERE    LENGTH (City) =

(SELECT MAX(LENGTH (City)) FROM Station)

ORDER BY City

LIMIT    1);

343 of 372

UNION ALL

  • The UNION ALL operator is used to combine the result sets from two or more SELECT statements.
  • Each SELECT UNION ALL statement must have the same number of fields in the result sets with the same data types.

What is the difference between UNION and UNION ALL?

  • UNION removes duplicate lines.
  • UNION ALL does not remove duplicate rows

344 of 372

Weather Observation Station 6

  • Find a list of CITY names that start with vowels (i.e. a, e, i, o, or u) in STATION. The result cannot contain duplicates.

SELECT DISTINCT City

FROM Station

WHERE City LIKE 'A%'

  OR City LIKE 'E%'

  OR City LIKE 'I%'

  OR City LIKE 'O%'

  OR City LIKE 'U%' ;

SELECT City

FROM Station

WHERE SUBSTR(City, 1 , 1 )

IN ( 'A' , 'E' , 'I' , 'O' , 'U' );

SELECT DISTINCT (City)

FROM Station

WHERE City REGEXP '^[a,e,i,o,u]'

345 of 372

Test

  • What command is used to create a new virtual table based on the results of a previously executed SQL query?
    1. CREATE VIRTUAL TABLE
    2. CREATE VIEW
    3. ALTER VIEW

346 of 372

Test

  • Query that allows you to rename the LastName column in Surname in the Employees table
    1. RENAME LastName in Surname FROM Employees
    2. ALTER TABLE Employees CHANGE LastName Surname varchar(50)
    3. ALTER TABLE Surname( LastName ) FROM Employees

347 of 372

Rename a column

  • To change the employee_email column to email , you would use the ALTER TABLE statement with CHANGE:

ALTER TABLE employees

CHANGE employee_email email VARCHAR ( 45 );

348 of 372

Test

  • A query that returns the first names, last names, and dates of birth of employees from the " Employees" table.
  • Condition – the surname contains the combination “se”.
    1. SELECT * from Employees WHERE LastName like "_se_"
    2. SELECT FirstName, LastName , BirthDate from Employees WHERE LastName LIKE "%se%"

349 of 372

Test

  • What is a query that returns all values from the " Countries" table except the country with ID=8?
    1. SELECT * FROM Countries EXP ID=8
    2. SELECT * FROM Countries WHERE ID !=8
    3. SELECT ALL FROM Countries LIMIT 8

350 of 372

Test

  • The table " Emlpoyees " contains data about the first names, last names, and salaries of employees. Select the query that will change the salary value from 2000 to 2500 for employee ID=7.
    1. SET Salary=2500 FROM Salary=2000 FOR ID=7 FROM Employees
    2. ALTER TABLE Employees Salary=2500 FOR ID=7
    3. UPDATE Employees SET Salary=2500 WHERE ID=7

351 of 372

Test

  • What should be in place of the blanks in the query " SELECT ___, Country FROM ___", which returns the names of customers and the country where they are located, from the " Customers" table?
    1. NULL, Customers
    2. Name, Customers

352 of 372

Test

  • Are there UNPIVOT operators and PIVOT?
    1. Yes
    2. No

353 of 372

PIVOT and UNPIVOT

  • PIVOT transforms rows into columns.
  • UNPIVOT performs the reverse operation of PIVOT.

SELECT DaysToManufacture, AVG (StandardCost) AS AverageCost

FROM Production.Product

GROUP BY DaysToManufacture;

DaysToManufactureAverageCost

------------------ ------------

0 5.0885

1 223.88

2 359.1082

4 949.4105

-- Pivot table with one row and five columns

SELECT 'AverageCost' AS CostSortedByProductionDays,

[0], [1], [2], [3], [4]

FROM (

    SELECT DaysToManufacture,

StandardCost

    FROM Production.Product

) AS SourceTable

PIVOT (

    AVG (StandardCost) FOR DaysToManufacture IN

([0], [1], [2], [3], [4])

) AS PivotTable;

CostSortedByProductionDays 0 1 2 3 4

--------------------------- ----------- ----------- ----------- ----------- -----------

AverageCost 5.0885 223.88 359.1082 NULL 949.4105

Only in Oracle, Microsoft SQL Server

DQL

354 of 372

Test

  • What is a transaction?
    1. Constraints that ensure data integrity
    2. Data transfer by packet method
    3. Temporary storage area used for intermediate data processing
    4. A set of operations performed as a single unit
    5. A function that calculates aggregated values

355 of 372

Test

  • Does any SQL query remove duplicates by default?
    1. Yes
    2. No

356 of 372

Test

  • Which operator is used to successfully complete a transaction and apply the changes?
    1. END TRANSACTION
    2. FINALIZE
    3. END QUERY
    4. DISPOSE
    5. COMMIT
    6. SAVE TRANSACTION
    7. RETURN

357 of 372

Test

  • What will be the result of executing the query DROP DATABASE Users?
    1. Users" database
    2. Blocking changes to the " Users" database
    3. Deleting the Users table from the current database

358 of 372

Test

  • What keyword is used to get maximum value?
    1. BRIDGE
    2. TOP
    3. UPPER
    4. MAX

359 of 372

Test

  • What is the purpose of ROLLBACK in SQL?
    1. Commit changes made by the transaction
    2. Save changes made by the transaction
    3. Undo changes made by a transaction
    4. Finalize changes made by the transaction

360 of 372

Interview No. 2

  • What is SQL?
  • What are the types of JOINs ? Briefly describe each type.
  • What are LEFT JOIN, RIGHT JOIN? How are they different?
  • Why is the word HAVING used?
  • What is DDL?
  • What is DML?

361 of 372

Interview No. 2

  • What is TCL?
  • What is DCL?
  • What is the general syntax of SELECT?

362 of 372

What is the difference between COUNT(*) and COUNT(column)?

  • COUNT(column) counts the number of values in "column".
  • ignores NULL values when counting the number of values in a column .
  • COUNT(*) counts the number of rows in a table, it does not ignore NULL values because this function operates on rows, not columns.

363 of 372

Which is better to use, joins or subqueries?

  • It is usually better to use JOIN because in most cases it is clear and is better optimized with Database Engine , but not always. Joins take precedence over subqueries when the SELECT list in a query contains columns from more than one table.

  • Subqueries are best when you need to calculate aggregate values and use them in outer queries for comparisons.

364 of 372

SQL

  • Make sure the name is unique and not on the list of reserved keywords.
  • Limit the name length to 30 bytes (that's 30 characters unless you're using a multibyte character set).
  • Start names with a letter and do not end with an underscore.
  • Use only letters, numbers, and the underscore character in names.
  • Avoid multiple consecutive underscore characters.
  • Use the underscore character where you would put a space in real life (e.g. first name will become first_name ).
  • Avoid abbreviations. If you must use them, make sure they are universally understood.

365 of 372

Useful links

366 of 372

Test

  • Publisher table contains id and name columns. Id is auto incremented and name is of varchar(40) type. Which of the following queries will execute correctly?
    1. insert into publisher (name) values ('O-Reilly');
    2. insert into publisher (name) values ('O Reilly');
    3. insert into publisher (name) values ('O'Reilly');
    4. insert into publisher (name) values ('OREILLY');

367 of 372

Test

  • Which is the correct order for a proper SQL query?
    1. SELECT, FROM, WHERE, HAVING, GROUP BY, ORDER BY
    2. SELECT, FROM, WHERE, HAVING, ORDER BY, GROUP BY
    3. SELECT, FROM, GROUP BY, HAVING, ORDER BY, WHERE
    4. SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY
    5. SELECT, FROM, WHERE, GROUP BY, ORDER BY, HAVING

368 of 372

Test

  • What is executed first in SQL?
    1. WHERE
    2. ON
    3. FROM
    4. SELECT
    5. TOP

369 of 372

Test

  • Which of the following DROP statements is incorrect?
    1. DROP TABLE
    2. DROP DATABASE
    3. DROP INDEX
    4. DROP ROW

370 of 372

LearnSQL

  • Find doctors whose first name is Mark and whose last name starts with the letter D. Display two columns:
    • the doctor's last name,
    • the specialization of the doctor.

SELECT last_name, specialization FROM doctor

WHERE first_name = 'Mark' AND last_name LIKE 'D%';

371 of 372

LearnSQL

  • Display information about clinic visits. For each visit display five columns:
    • the first name of the patient,
    • the patient's last name,
    • the date of the visit,
    • the first name of the doctor,
    • the doctor's last name.

SELECT patient.first_name, patient.last_name, visit_date, doctor.first_name, doctor.last_name

FROM patient

INNER JOIN visit

ON patient.id = patient_id

INNER JOIN doctor

ON doctor_id = doctor.id;

372 of 372

LearnSQL

  • For each employee find out who their manager is. Display two columns:
    • the name of the employee; label it employee
    • the name of their manager; label it manager.
  • For employees who don't have a manager (for example the CEO) display NULL in the manager

SELECT

E1.full_name AS employee,

E2.full_name AS manager

FROM employee E1

LEFT JOIN employee E2 ON E1.manager_id = E2.id;