DATA MINING

Data Discretization and Concept Hierarchy

Typical methods for discretizing numeric data:

  • Binning
  • Histogram analysis
  • Cluster analysis
  • Discretization by intuitive partitioning
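As an illustration of binning, the sketch below shows equal-width binning, equal-frequency binning, and smoothing by bin means. The function names and the sample price list are assumptions made for this example; they are not taken from the slides.

```python
def equal_width_bins(values, k):
    """Assign each value to one of k equal-width intervals; returns bin indices."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    if width == 0:
        # All values are identical, so everything lands in the first bin.
        return [0] * len(values)
    # The maximum value would compute to index k, so clamp it into the last bin.
    return [min(int((v - lo) / width), k - 1) for v in values]


def equal_frequency_bins(values, k):
    """Split the sorted values into k bins holding (nearly) the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    # Bin i holds the slice between positions i*n//k and (i+1)*n//k.
    return [ordered[i * n // k:(i + 1) * n // k] for i in range(k)]


def smooth_by_bin_means(bins):
    """Replace every value in a bin by that bin's mean (empty bins are skipped)."""
    return [[sum(b) / len(b)] * len(b) for b in bins if b]


# Illustrative data only.
prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
width_labels = equal_width_bins(prices, 3)
freq_bins = equal_frequency_bins(prices, 3)
smoothed = smooth_by_bin_means(freq_bins)
```

Equal-width bins are simple but sensitive to outliers, while equal-frequency bins keep the bin sizes balanced; which one fits depends on the data.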

Methods for generating concept hierarchies for categorical data:

1. Specification of a partial ordering of attributes explicitly at the schema level by users or experts

2. Specification of a portion of a hierarchy by explicit data grouping

3. Specification of a set of attributes, but not of their partial ordering

4. Specification of only a partial set of attributes
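For case 3 (a set of attributes without an ordering), one common heuristic is to place the attribute with the fewest distinct values at the top of the hierarchy and the one with the most at the bottom. The sketch below assumes that heuristic; the function name and the location records are hypothetical.

```python
def order_attributes_by_distinct_values(records, attributes):
    """Order attributes from most general (fewest distinct values) to most specific."""
    distinct_counts = {
        attr: len({record[attr] for record in records}) for attr in attributes
    }
    return sorted(attributes, key=lambda attr: distinct_counts[attr])


# Hypothetical location records, made up for illustration only.
locations = [
    {"street": "12 Oak St", "city": "Vancouver", "province": "BC", "country": "Canada"},
    {"street": "9 Elm Ave", "city": "Vancouver", "province": "BC", "country": "Canada"},
    {"street": "3 Pine Rd", "city": "Victoria", "province": "BC", "country": "Canada"},
    {"street": "7 Main St", "city": "Toronto", "province": "ON", "country": "Canada"},
    {"street": "1 Lake Dr", "city": "Chicago", "province": "IL", "country": "USA"},
    {"street": "5 Hill Ct", "city": "Chicago", "province": "IL", "country": "USA"},
]

hierarchy = order_attributes_by_distinct_values(
    locations, ["street", "city", "province", "country"]
)
```

The heuristic is not foolproof: a time dimension may have many distinct years but only seven weekdays, so the generated ordering should be reviewed by a user or expert.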

Classification of Data Mining Systems

Data mining is an interdisciplinary field, the confluence of a set of disciplines including database systems, statistics, machine learning, visualization, and information science.

Data mining systems can be categorized according to various criteria, as follows:

i) Classification according to the kinds of databases mined:

  • Database systems can be classified according to their data models, so a data mining system can be classified accordingly, for example as a relational, transactional, object-relational, or data warehouse mining system.
  • Each class may require its own data mining technique.

ii) Classification according to the kinds of knowledge mined:

Data mining systems can be categorized according to the kinds of knowledge they mine, that is, based on data mining functionalities such as characterization, discrimination, association and correlation analysis, classification, prediction, clustering, outlier analysis, and evolution analysis.

iii) Classification according to the kinds of techniques utilized:

  • Data mining systems can be categorized according to the underlying data mining techniques employed.
  • These techniques can be described according to the degree of user interaction involved (e.g., autonomous systems, interactive exploratory systems, query-driven systems).

iv) Classification according to the applications adapted:

  • Data mining systems can also be categorized according to the applications they are adapted to.
  • For example, data mining systems may be tailored specifically for finance, telecommunications, DNA, stock markets, e-mail, and so on.

Data Mining Task Primitives

  • A data mining task can be specified in the form of a data mining query, which is input to the data mining system.
  • A data mining query is defined in terms of data mining task primitives.

The data mining primitives specify the following:

i) The set of task-relevant data to be mined:

  • This specifies the portions of the database or the set of data in which the user is interested.
  • This includes the database attributes or data warehouse dimensions of interest.

ii) The kind of knowledge to be mined:

  • This specifies the data mining functions to be performed, such as characterization, discrimination, association or correlation analysis, classification, prediction, clustering, outlier analysis, or evolution analysis.

iii) The background knowledge to be used in the discovery process:

  • This knowledge about the domain to be mined is useful for guiding the knowledge discovery process and for evaluating the patterns found.
  • Concept hierarchies are a popular form of background knowledge.

iv) The interestingness measures and thresholds for pattern evaluation:

  • They may be used to guide the mining process or, after discovery, to evaluate the discovered patterns.
  • Different kinds of knowledge may have different interestingness measures.
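For association rules, two widely used interestingness measures are support and confidence. The sketch below computes both and applies thresholds; the function names, the transactions, and the threshold values are assumptions chosen for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) over the transactions."""
    base = support(transactions, antecedent)
    both = support(transactions, set(antecedent) | set(consequent))
    return both / base if base else 0.0


def is_interesting(transactions, antecedent, consequent, min_support, min_confidence):
    """A rule passes only if it meets both thresholds."""
    rule_support = support(transactions, set(antecedent) | set(consequent))
    rule_confidence = confidence(transactions, antecedent, consequent)
    return rule_support >= min_support and rule_confidence >= min_confidence


# Hypothetical market-basket data.
transactions = [
    {"computer", "software"},
    {"computer", "printer"},
    {"computer", "software", "printer"},
    {"printer"},
]

rule_support = support(transactions, {"computer", "software"})
rule_confidence = confidence(transactions, {"computer"}, {"software"})
passes = is_interesting(transactions, {"computer"}, {"software"}, 0.4, 0.6)
```

Other kinds of knowledge call for other measures, so these two should be read as one example rather than a general rule.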

v) The expected representation for visualizing the discovered patterns:

  • This refers to the form in which discovered patterns are to be displayed, which may include rules, tables, charts, graphs, decision trees, and cubes.
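The five primitives can be pictured as the fields of a single query object. The sketch below is a hypothetical structure, not the syntax of any real data mining query language; the class name, field names, and sample values are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class MiningQuery:
    """Hypothetical container with one field per task primitive."""
    task_relevant_data: dict                                   # i) data, attributes, dimensions
    knowledge_kind: str                                        # ii) e.g. "association"
    background_knowledge: dict = field(default_factory=dict)   # iii) e.g. concept hierarchies
    interestingness: dict = field(default_factory=dict)        # iv) measures and thresholds
    presentation: list = field(default_factory=list)           # v) e.g. rules, tables, charts

    def missing_primitives(self):
        """Names of the primitives that were left empty."""
        return [name for name, value in vars(self).items() if not value]


# Illustrative values only.
query = MiningQuery(
    task_relevant_data={"table": "customer_purchases", "attributes": ["age", "income", "item"]},
    knowledge_kind="association",
    background_knowledge={"location": ["country", "province", "city", "street"]},
    interestingness={"min_support": 0.05, "min_confidence": 0.7},
    presentation=["rules", "tables"],
)

incomplete = MiningQuery(task_relevant_data={"table": "t"}, knowledge_kind="clustering")
```

Here `query.missing_primitives()` returns an empty list, while `incomplete.missing_primitives()` names the three primitives that were not supplied.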