Mining Data Streams
UNIT-5
What is Data Stream?
Sources of Data Stream
Data streams come from many sources. A few widely used ones are listed below:
What are Data Streams in Data Mining?
Characteristics of Data Stream in Data Mining
A data stream in data mining should have the following characteristics:
Time-series data mining
Time Series Data Mining
Does the database play a vital role in Time Series mining?
A database is a collection of data retrieved from different sources, in which the data are stored in structured or unstructured form in their respective columns.
A time series database consists of sequences of values or events that change with time. The data are recorded at regular intervals.
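As a minimal sketch of this definition, a time series can be held as a list of (timestamp, value) records and checked for a regular recording interval. The hourly temperature readings and the helper name sampling_interval are hypothetical, chosen only for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical hourly temperature readings (illustrative values only).
start = datetime(2024, 1, 1, 0, 0)
values = [21.5, 21.1, 20.8, 20.9, 21.4]
series = [(start + timedelta(hours=i), v) for i, v in enumerate(values)]


def sampling_interval(records):
    """Return the common gap between consecutive timestamps.

    Raises ValueError if the records are not regularly spaced
    (or if there are fewer than two records).
    """
    gaps = {later[0] - earlier[0] for earlier, later in zip(records, records[1:])}
    if len(gaps) != 1:
        raise ValueError("series is not recorded at regular intervals")
    return gaps.pop()


interval = sampling_interval(series)
print("recording interval:", interval)
```

A real time series database would also have to cope with missing or late readings, which this sketch simply rejects.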
Applications of Time Series Mining:
1. Financial:
2. Industry:
3. Scientific:
4. Meteorological:
Characteristic components of a time series:
1. Trend
2. Cycle
3. Seasonal
4. Irregular
1. Long-term or trend movements:
The general direction in which a time series moves over a long interval of time. It shows the general tendency of the data to increase or decrease over a long period. It is represented by a trend curve.
2. Cyclic movements or cycle variations:
Long-term oscillations about a trend line or curve, for example business cycles. The period of oscillation is longer than a year.
3. Seasonal movements or seasonal variations:
Almost identical patterns that a time series appears to follow during the corresponding months of successive years. This variation is present in a time series when the data are recorded hourly, daily, weekly or monthly. For example, cake sales rise sharply around Christmas and New Year.
4. Irregular or random movements:
These fluctuations are unforeseen, uncontrollable and unpredictable. They follow no regular pattern and are purely random, for example labor disputes, floods or announced personnel changes in a company.
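The components above can be separated with a classical decomposition. The sketch below assumes an additive model (value = trend + seasonal + irregular), which these notes do not state explicitly, and it folds any cyclic movement into the trend estimate. The trend curve is estimated with a centered moving average. The quarterly sales figures and the function names are hypothetical.

```python
from statistics import mean


def centered_moving_average(values, period):
    """Estimate the trend curve; positions without a full window are None."""
    half = period // 2
    trend = [None] * len(values)
    for i in range(half, len(values) - half):
        window = values[i - half:i + half + 1]
        if period % 2:
            # Odd period: plain centered average.
            trend[i] = mean(window)
        else:
            # Even period: the two end points get half weight each.
            trend[i] = (sum(window) - 0.5 * (window[0] + window[-1])) / period
    return trend


def decompose_additive(values, period):
    """Split values into trend, seasonal and irregular parts.

    Assumes at least two full periods of data.
    """
    trend = centered_moving_average(values, period)
    detrended = [v - t if t is not None else None for v, t in zip(values, trend)]
    # Average the detrended values at each position within the period.
    seasonal_index = [
        mean(d for d in detrended[pos::period] if d is not None)
        for pos in range(period)
    ]
    # Centre the indices so that they sum to zero over one period.
    offset = mean(seasonal_index)
    seasonal_index = [s - offset for s in seasonal_index]
    seasonal = [seasonal_index[i % period] for i in range(len(values))]
    irregular = [
        v - t - s if t is not None else None
        for v, t, s in zip(values, trend, seasonal)
    ]
    return trend, seasonal, irregular


# Hypothetical quarterly sales: a steady upward trend plus a repeating
# seasonal pattern, with no random shocks added.
pattern = [5, -3, -6, 4]
sales = [100 + 2 * i + pattern[i % 4] for i in range(12)]
trend, seasonal, irregular = decompose_additive(sales, period=4)
print("trend:", trend)
print("seasonal:", seasonal[:4])
print("irregular:", irregular)
```

Because the made-up data contain no random shocks, the irregular part is zero wherever the trend is defined. With real data, events such as floods or labor disputes would show up in that residual.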
Example 1: Weather conditions
Example 2: Stock exchange
Example 3: Cluster monitoring of data usage in network operations
Example 4: Health monitoring (ECG report)
Sequence pattern mining
Introduction
What is Sequence Pattern Mining?
When you are performing Sequence Pattern Mining, you are essentially:
Applications of Sequence Pattern Mining:
Sequence Pattern Mining finds applications in multiple fields ranging from science, business, and finance to meteorology and geology. Some of them are listed below:
Types of Sequence Pattern Mining Problems
Sequence Database
A sequence is an ordered list of elements or events. A sequence database for Sequence Pattern Mining is a set of tuples <SID, S>, where SID is the Sequence ID and S is the Sequence.
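To make the <SID, S> representation concrete, the sketch below stores a toy sequence database and counts how many sequences contain a given pattern. It assumes each element of a sequence is a set of items and uses the usual notion of support (the number of sequences that contain the pattern as a subsequence). Neither assumption is stated in these notes, and the data and function names are made up for illustration.

```python
# A toy sequence database: each tuple is <SID, S>, and each sequence S is an
# ordered list of elements (here, sets of items). Illustrative data only.
sequence_db = [
    (1, [{"a"}, {"a", "b", "c"}, {"a", "c"}, {"d"}]),
    (2, [{"a", "d"}, {"c"}, {"b", "c"}]),
    (3, [{"e", "f"}, {"a", "b"}, {"d", "f"}, {"c"}]),
    (4, [{"e"}, {"g"}, {"a", "f"}, {"c"}, {"b"}]),
]


def contains(sequence, pattern):
    """Return True if pattern is a subsequence of sequence.

    Each pattern element must be a subset of a distinct sequence element,
    and the matched elements must appear in the same order.
    """
    pos = 0  # index of the next pattern element still to be matched
    for element in sequence:
        if pos < len(pattern) and pattern[pos] <= element:
            pos += 1
    return pos == len(pattern)


def support(db, pattern):
    """Count the sequences in db that contain pattern."""
    return sum(1 for _sid, seq in db if contains(seq, pattern))


print(support(sequence_db, [{"a"}, {"c"}]))
print(support(sequence_db, [{"a", "b"}, {"c"}]))
```

Matching each pattern element at the earliest possible position is enough to decide containment, so a single pass over every sequence suffices. Order matters: a sequence in which "c" only appears before "a" does not contain the pattern "a" followed by "c".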
GSP (Generalized Sequential Pattern Mining)