Data Mining_Anoop Chaturvedi
Swayam Prabha
Course Title
Multivariate Data Mining- Methods and Applications
Lecture 02
Data Mining, Machine Learning and Artificial Intelligence
By
Anoop Chaturvedi
Department of Statistics, University of Allahabad
Prayagraj (India)
Slides can be downloaded from https://sites.google.com/view/anoopchaturvedi/swayam-prabha
Data Mining
Objective
Extract useful information from large datasets.
Use it to make predictions or to support better decision-making.
Predictive data mining ⇒ Predicting future outcomes based on historical data.
Descriptive data mining ⇒ Summarizing and interpreting data to understand its underlying patterns and relationships.
Data Mining Activities
Descriptive data mining
Predictive data mining
Descriptive data mining: Locate unexpected structures, relationships, patterns, trends, clusters, and outliers in massive data sets.
Predictive data mining: Build models and procedures for regression, classification, pattern recognition, or machine learning tasks. Assess the predictive accuracy of these models and procedures when they are applied to fresh data.
In machine-learning terminology, descriptive data mining is unsupervised learning, whereas predictive data mining is supervised learning.
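The contrast between the two activities can be sketched on toy data using only the Python standard library. The data values, the z-score cut-off of 2, and the function names below are illustrative assumptions and do not come from the lecture. The descriptive step summarizes the data and flags an outlier. The predictive step fits a least-squares line on historical data and checks its error on fresh data.

```python
from statistics import mean, stdev

def describe(values, z_cutoff=2.0):
    """Descriptive step: summarize the data and flag potential outliers."""
    m, s = mean(values), stdev(values)
    # A value is flagged when it lies more than z_cutoff standard deviations from the mean.
    outliers = [v for v in values if abs(v - m) / s > z_cutoff]
    return {"mean": m, "stdev": s, "outliers": outliers}

def fit_line(xs, ys):
    """Predictive step: least-squares fit of y = a + b*x on historical data."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    a = my - b * mx
    return a, b

def mean_squared_error(a, b, xs, ys):
    """Average squared prediction error of the fitted line on (xs, ys)."""
    return mean([(y - (a + b * x)) ** 2 for x, y in zip(xs, ys)])

# Descriptive: no target variable, only structure in the data (hypothetical values).
daily_sales = [10, 11, 9, 10, 12, 10, 11, 9, 10, 30]
summary = describe(daily_sales)

# Predictive: learn from historical pairs, then evaluate on fresh pairs.
train_x, train_y = [1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1]
fresh_x, fresh_y = [6, 7], [12.0, 14.1]
a, b = fit_line(train_x, train_y)
fresh_mse = mean_squared_error(a, b, fresh_x, fresh_y)

print("outliers:", summary["outliers"])
print("slope:", round(b, 2), "fresh-data MSE:", round(fresh_mse, 4))
```

The descriptive function never sees a target variable, while the predictive functions need labelled pairs and are judged on data not used for fitting.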
Data mining methods ⇒ Related to methods developed in statistics and machine learning such as regression, classification, clustering, and visualization.
Enormous sizes of the data sets ⇒ Data mining tools focus on dimensionality-reduction techniques, variable selection, and handling situations where high-dimensional data concentrate on lower-dimensional hyperplanes, nonlinear surfaces, or manifolds.
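As a small illustration of data concentrating near a lower-dimensional hyperplane, the sketch below takes two-dimensional points lying close to a line and computes the eigenvalues of their sample covariance matrix, which is the calculation behind principal component analysis. The points, the noise values, and the function names are assumptions made for this example only.

```python
import math

def covariance_2d(points):
    """Sample covariance entries (sxx, sxy, syy) of a list of (x, y) points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return sxx, sxy, syy

def eigenvalues_sym_2x2(sxx, sxy, syy):
    """Eigenvalues of the symmetric matrix [[sxx, sxy], [sxy, syy]], largest first."""
    half_trace = (sxx + syy) / 2
    radius = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    return half_trace + radius, half_trace - radius

# Hypothetical points lying close to the line y = 2x.
noise = [0.1, -0.1, 0.05, -0.05, 0.0, 0.1, -0.1, 0.05, -0.05, 0.0]
points = [(x, 2 * x + e) for x, e in zip(range(10), noise)]

lam1, lam2 = eigenvalues_sym_2x2(*covariance_2d(points))
explained = lam1 / (lam1 + lam2)  # share of total variance along the main direction
print("eigenvalues:", lam1, lam2)
print("variance explained by one direction:", explained)
```

Because nearly all of the variance lies along one direction, a single coordinate along that direction describes these two-dimensional points with very little loss.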
An important issue in data mining is scalability.
Scalability ⇒ Algorithm’s ability to handle and process large data sets efficiently and effectively.
Scalability can be expressed as the growth of running time and memory as a function of the data size (logarithmic, linear, polynomial, or exponential).
Ideally this growth should be linear or sublinear, i.e., time and memory grow in proportion to the data size or more slowly.
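To make the growth rates concrete, the sketch below counts the basic steps taken by two ways of checking a list for duplicates: comparing every pair (quadratic growth) and a single pass with a hash set (linear growth on average). The task and the function names are chosen for illustration and are not taken from the lecture.

```python
def has_duplicates_pairwise(items):
    """Compare every pair of items: the comparison count grows quadratically."""
    comparisons = 0
    found = False
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            comparisons += 1
            if items[i] == items[j]:
                found = True
    return found, comparisons

def has_duplicates_hashed(items):
    """Single pass with a set: the step count grows linearly with the data size."""
    seen = set()
    steps = 0
    found = False
    for item in items:
        steps += 1
        if item in seen:
            found = True
        seen.add(item)
    return found, steps

for n in (100, 200, 400):
    data = list(range(n))  # no duplicates, so both methods scan everything
    _, quadratic_steps = has_duplicates_pairwise(data)
    _, linear_steps = has_duplicates_hashed(data)
    print(n, quadratic_steps, linear_steps)
```

Each doubling of n doubles the step count of the hashed version (100, 200, 400) but roughly quadruples that of the pairwise version (4950, 19900, 79800), which is why the second quickly becomes impractical on massive data sets.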
Potential Applications of Data Mining ⇒ Used in fields where large amounts of data are stored and processed.
Marketing:
Banking:
Financial Markets:
Insurance and Health Care:
Molecular Biology:
Forensic:
Transportation:
Sports:
Astronomy:
Knowledge discovery in databases (KDD) covers the entire process of discovering useful knowledge from data, with data mining as a key component.
KDD is composed of six primary activities:
1. Selecting the target data set (the data set, variables, and cases used for data mining)
2. Data cleaning (removal of noise, identification of outliers, imputing missing data)
3. Preprocessing the data (data transformations, tracking time-dependent information)
4. Deciding appropriate data-mining tasks (regression, classification, clustering, etc.)
5. Analyzing the cleaned data using data-mining software (algorithms for data reduction, dimensionality reduction, fitting models, prediction, extracting patterns etc.)
6. Interpreting and assessing the knowledge obtained from data-mining results.
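The six activities can be sketched as a small pipeline in plain Python. Everything here is an assumption made for illustration: the customer records, the mean imputation, the z-score transformation, and the midpoint-threshold classifier are simple stand-ins, not the methods used in the lecture or in any particular data-mining software.

```python
from statistics import mean, stdev

# Hypothetical raw records; None marks a missing income.
raw = [
    {"id": 1, "age": 23, "income": 20.0, "bought": 0},
    {"id": 2, "age": 31, "income": 25.0, "bought": 0},
    {"id": 3, "age": 45, "income": 30.0, "bought": 0},
    {"id": 4, "age": 38, "income": 60.0, "bought": 1},
    {"id": 5, "age": 52, "income": None, "bought": 1},
    {"id": 6, "age": 29, "income": 65.0, "bought": 1},
    {"id": 7, "age": 41, "income": 75.0, "bought": 1},
]

def select(records, variables):
    """1. Select the target data: keep only the chosen variables."""
    return [{v: r[v] for v in variables} for r in records]

def clean(records, column):
    """2. Data cleaning: impute missing values with the observed mean."""
    fill = mean([r[column] for r in records if r[column] is not None])
    return [{**r, column: fill if r[column] is None else r[column]} for r in records]

def standardize(records, column):
    """3. Preprocessing: transform the column to z-scores."""
    values = [r[column] for r in records]
    m, s = mean(values), stdev(values)
    return [{**r, column: (r[column] - m) / s} for r in records]

def fit_threshold(records, feature, label):
    """5. Analysis: classify by the midpoint between the two class means."""
    pos = mean([r[feature] for r in records if r[label] == 1])
    neg = mean([r[feature] for r in records if r[label] == 0])
    return (pos + neg) / 2

def accuracy(records, feature, label, threshold):
    """6. Assessment: share of records whose predicted class matches the label."""
    hits = sum(int(r[feature] > threshold) == r[label] for r in records)
    return hits / len(records)

data = select(raw, ["income", "bought"])            # step 1
data = clean(data, "income")                        # step 2
data = standardize(data, "income")                  # step 3
# Step 4: the chosen task is classification of "bought" from "income".
threshold = fit_threshold(data, "income", "bought") # step 5
acc = accuracy(data, "income", "bought", threshold) # step 6
print("threshold (z-score):", round(threshold, 3), "in-sample accuracy:", acc)
```

The accuracy here is measured on the same records used for fitting. A realistic assessment in step 6 would use fresh data that played no part in steps 2 to 5.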
Artificial Intelligence (AI): Different Definitions
The approaches to AI can be organized into four categories: thinking humanly, thinking rationally, acting humanly, and acting rationally.
Acting humanly: the Turing Test approach to intelligent behavior
Turing Test (proposed by Alan Turing in 1950)
⇒ A measure of a machine’s ability to demonstrate human-like intelligence
Under this approach, an AI system is required to pass the Turing Test.
Thinking humanly requires the cognitive modeling approach (how humans think)
Thinking rationally requires the laws of thought approach (indisputable reasoning processes)
Turing Test: The objective is to evaluate a machine's ability to demonstrate human-like intelligence.
A human evaluator engages in a natural language conversation with a human and a machine through a computer interface, without knowing which is which.
Evaluator’s task ⇒ Distinguish between the human and machine.
If the machine can fool the evaluator into believing that it is a human a significant portion of the time, then it passes the Turing Test.
The Turing Test led to debates about what constitutes intelligence, consciousness, and the nature of human-computer interaction.
Limitations:
The machine may rely on superficial tricks or patterns rather than genuine comprehension.
Focuses primarily on linguistic abilities and may not capture other aspects of intelligence, such as creativity or emotional intelligence.