NEURAL NETWORKS AND DEEP LEARNING
by
Dr. Vikrant Chole
Amity School of Engineering & Technology
MODULE - III: Probabilistic Neural Network
Probabilistic Neural Network
A Probabilistic Neural Network (PNN) is a type of supervised machine learning algorithm used for classification tasks. It's based on Bayesian theory and uses Parzen window estimators to approximate the probability density/distribution function (PDF) of each class.
Probabilistic Neural Networks (PNNs) are a neural network architecture designed for classification tasks, built mainly on principles from Bayesian statistics and probability theory.
A Probabilistic Neural Network (PNN) is a type of feed-forward ANN in which the computation-intensive backpropagation is not used.
It’s a classifier that can estimate the pdf of a given set of data. PNNs are a scalable alternative to traditional backpropagation neural networks in classification and pattern recognition applications.
When used for classification problems, the network applies probability theory to reduce the number of incorrect classifications.
The structure of PNNs consists of four layers:
Architecture of PNN
The architecture of a PNN consists of four significant layers, described below:
1. Input Layer
The Input layer is the first stage of the network, where external data enters. Each neuron in this layer corresponds to one input feature. The input layer is responsible for taking the data in and passing it on for further processing.
2. Pattern Layer:
The Pattern layer is the part that distinguishes the PNN architecture from others. Each pattern-layer neuron corresponds to one training example from the given dataset. The neurons in this layer use radial basis functions (RBFs) to measure how similar the input is to their stored example. The RBF calculates the distance between the input pattern and the training example in feature space and outputs an activation value based on this distance.
3. Summation Layer:
The outputs of the Pattern layer are aggregated in the Summation layer. Each summation-layer neuron represents a class and sums the outputs of the pattern neurons that belong to that class. In effect, this layer computes a weighted sum of the RBF activations for each class.
4. Output Layer:
The Output layer is the final stage of the network, where the computed class scores are normalized into the final result. Each neuron in the Output layer represents a class, and this layer uses the softmax function to obtain the normalized probability of the corresponding class. The softmax function makes the probabilities across all classes add up to one, giving a valid probability distribution over the classes.
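As a sketch of the computation these layers perform, here is the standard PNN formulation with a Gaussian kernel; the smoothing parameter \(\sigma\) is an assumption, since the slides do not give it. For an input \(\mathbf{x}\), the pattern layer computes one activation per training sample \(\mathbf{x}_i\):
\[
\phi_i(\mathbf{x}) = \exp\left(-\frac{\lVert \mathbf{x} - \mathbf{x}_i \rVert^2}{2\sigma^2}\right)
\]
The summation layer averages these activations over the \(N_c\) samples of each class \(c\):
\[
f_c(\mathbf{x}) = \frac{1}{N_c} \sum_{i \in \text{class } c} \phi_i(\mathbf{x})
\]
The output layer normalizes the \(f_c(\mathbf{x})\) across classes, and the predicted class is \(\arg\max_c f_c(\mathbf{x})\).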
Working Principle of Probabilistic Neural Networks
The core operation of Probabilistic Neural Networks (PNNs) revolves around the concept of the Parzen window, a non-parametric approach for estimating probability density functions (PDFs). This methodology is central to PNNs' ability to handle uncertainties and variabilities in input data, enabling them to make highly accurate decisions.
Parzen Window Estimation
The Parzen window method, also known as kernel density estimation (KDE), is used in PNNs to estimate the PDF of a random variable in a non-parametric way. This method does not assume any underlying distribution for the data, which is particularly useful in real-world scenarios where the data may not follow known or standard distributions.
How it Works:
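As a rough illustration of the idea (not taken from the original slides), here is a minimal Python sketch of Parzen-window / kernel density estimation with a Gaussian kernel; the sample data and the bandwidth value are arbitrary assumptions.

import numpy as np

def parzen_kde(x, samples, bandwidth=0.5):
    # Each training sample contributes a small Gaussian "bump" centred on it;
    # the density estimate at x is the average bump height, scaled by the bandwidth.
    diffs = (x - samples) / bandwidth
    kernel_vals = np.exp(-0.5 * diffs ** 2) / np.sqrt(2 * np.pi)
    return kernel_vals.sum() / (len(samples) * bandwidth)

samples = np.array([1.0, 1.1, 1.2, 3.0, 3.2])   # hypothetical 1-D data
print(parzen_kde(1.1, samples))   # high density near the cluster around 1
print(parzen_kde(2.0, samples))   # lower density between the clusters

No distributional form is assumed here: the estimate is built entirely from the observed samples, which is the non-parametric property described above.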
Example Problem: Classifying Animals
Let’s say we want to classify animals into three categories: Mammal, Bird, and Reptile.
We use these simplified binary features, each encoded as 1 (yes) or 0 (no): warm-blooded (Temp), has feathers (Feathers), gives live birth (Birth), and can fly (Fly).
Dataset (Training Samples)

Animal  | Temp | Feathers | Birth | Fly | Class
--------|------|----------|-------|-----|--------
Human   | 1    | 0        | 1     | 0   | Mammal
Bat     | 1    | 0        | 1     | 1   | Mammal
Eagle   | 1    | 1        | 0     | 1   | Bird
Penguin | 1    | 1        | 0     | 0   | Bird
Lizard  | 0    | 0        | 0     | 0   | Reptile
Snake   | 0    | 0        | 0     | 0   | Reptile
How a PNN Works Here
Example Classification
Given an unknown animal with:
Let’s compare it using Gaussian distance to:
The PNN would assign the highest probability to "Mammal", likely classifying this animal as a mammal.
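To make this concrete, here is a minimal Python sketch of a PNN-style classifier on the table above. The unknown animal's feature vector and the smoothing parameter sigma are assumptions chosen for illustration; the slides do not specify them.

import numpy as np

# Training samples from the table: [Temp, Feathers, Birth, Fly]
X = np.array([
    [1, 0, 1, 0],   # Human   - Mammal
    [1, 0, 1, 1],   # Bat     - Mammal
    [1, 1, 0, 1],   # Eagle   - Bird
    [1, 1, 0, 0],   # Penguin - Bird
    [0, 0, 0, 0],   # Lizard  - Reptile
    [0, 0, 0, 0],   # Snake   - Reptile
])
y = np.array(["Mammal", "Mammal", "Bird", "Bird", "Reptile", "Reptile"])

def pnn_classify(x, X, y, sigma=0.5):
    # Pattern layer: Gaussian RBF activation for every training sample.
    activations = np.exp(-np.sum((X - x) ** 2, axis=1) / (2 * sigma ** 2))
    # Summation layer: average activation per class.
    classes = np.unique(y)
    scores = np.array([activations[y == c].mean() for c in classes])
    # Output layer: normalise the scores into class probabilities.
    return dict(zip(classes, scores / scores.sum()))

# Hypothetical unknown animal: warm-blooded, no feathers, live birth, cannot fly.
unknown = np.array([1, 0, 1, 0])
print(pnn_classify(unknown, X, y))   # "Mammal" gets by far the highest probability

With these assumed features, the mammal score dominates, matching the conclusion above.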
Applications of Probabilistic Neural Networks (PNNs)
Advantages of Probabilistic Neural Networks
Limitations of Probabilistic Neural Networks
Convolutional Neural Network
Architecture of a basic Convolutional Neural Network (CNN)
Layers in CNN Architecture
CNNs consist of multiple layers, such as the input layer, convolutional layers, pooling layers, and fully connected layers.
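As a rough sketch of how these layers are typically stacked (not taken from the slides), the following PyTorch snippet assumes a 28x28 grayscale input and 10 output classes:

import torch
import torch.nn as nn

# Minimal CNN: convolution -> pooling -> convolution -> pooling -> fully connected layers.
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),   # convolutional layer
    nn.ReLU(),
    nn.MaxPool2d(2),                               # pooling layer: 28x28 -> 14x14
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),                               # 14x14 -> 7x7
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 128),                    # fully connected layer
    nn.ReLU(),
    nn.Linear(128, 10),                            # one output score per class
)

x = torch.randn(1, 1, 28, 28)   # a dummy single-channel image
print(model(x).shape)           # torch.Size([1, 10])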
Advantages of CNNs
Disadvantages of CNNs
Applications:
Challenges:
Recurrent Neural Network
Below is how a feed-forward neural network can be converted into a recurrent neural network:
The nodes in the different layers of the neural network are compressed to form a single recurrent layer. A, B, and C are the parameters of the network.
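A minimal numpy sketch of the resulting recurrent step, assuming A is the input-to-hidden weight, B the hidden-to-hidden weight, and C the hidden-to-output weight (the slides do not define them explicitly):

import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size, output_size = 3, 4, 2

A = rng.normal(size=(hidden_size, input_size))    # input  -> hidden
B = rng.normal(size=(hidden_size, hidden_size))   # hidden -> hidden (the recurrence)
C = rng.normal(size=(output_size, hidden_size))   # hidden -> output

def rnn_step(x_t, h_prev):
    # The same A, B, C are reused at every time step (shared parameters).
    h_t = np.tanh(A @ x_t + B @ h_prev)
    y_t = C @ h_t
    return h_t, y_t

h = np.zeros(hidden_size)
for x_t in rng.normal(size=(5, input_size)):   # a sequence of 5 inputs
    h, y = rnn_step(x_t, h)
print(y)   # output after the final time step

The hidden state h carries information forward between time steps, which is the "internal memory" described below.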
Why Recurrent Neural Networks?
RNNs were created because feed-forward neural networks have a few limitations: they cannot handle sequential data, they consider only the current input, and they cannot memorize previous inputs.
The solution to these issues is the RNN.
An RNN can handle sequential data, accepting the current input as well as previously received inputs. RNNs can memorize previous inputs thanks to their internal memory.
How Do Recurrent Neural Networks Work?
How RNNs Differ from Feedforward Neural Networks
Types Of Recurrent Neural Networks
There are four types of RNNs based on the number of inputs and outputs in the network:
1. One-to-One RNN
This is the simplest type of neural network architecture where there is a single input and a single output. It is used for straightforward classification tasks such as binary classification where no sequential data is involved.
2. One-to-Many RNN
In a One-to-Many RNN, the network processes a single input to produce multiple outputs over time. This is useful in tasks where one input triggers a sequence of predictions (outputs). For example, in image captioning, a single image can be used as input to generate a sequence of words as a caption.
3. Many-to-One RNN
The Many-to-One RNN receives a sequence of inputs and generates a single output. This type is useful when the overall context of the input sequence is needed to make one prediction. In sentiment analysis, for example, the model receives a sequence of words (such as a sentence) and produces a single output such as positive, negative, or neutral.
4. Many-to-Many RNN
The Many-to-Many RNN type processes a sequence of inputs and generates a sequence of outputs. In a language-translation task, a sequence of words in one language is given as input, and a corresponding sequence in another language is generated as output.
Variants of Recurrent Neural Networks (RNNs)
There are several variations of RNNs, each designed to address specific challenges or optimize for certain tasks:
1. Vanilla RNN
This simplest form of RNN consists of a single hidden layer where weights are shared across time steps. Vanilla RNNs are suitable for learning short-term dependencies but are limited by the vanishing gradient problem, which hampers long-sequence learning.
2. Bidirectional RNNs
Bidirectional RNNs process inputs in both forward and backward directions, capturing both past and future context for each time step. This architecture is ideal for tasks where the entire sequence is available, such as named entity recognition and question answering.
3. Long Short-Term Memory Networks (LSTMs)
Long Short-Term Memory Networks (LSTMs) introduce a memory mechanism to overcome the vanishing gradient problem. Each LSTM cell has three gates: an input gate, a forget gate, and an output gate, which control what information is added to, removed from, and read out of the cell's memory.
4. Gated Recurrent Units (GRUs)
Gated Recurrent Units (GRUs) simplify LSTMs by combining the input and forget gates into a single update gate and streamlining the output mechanism. This design is computationally efficient, often performing similarly to LSTMs, and is useful in tasks where simplicity and faster training are beneficial.
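A minimal PyTorch sketch of the two variants; the input and hidden sizes are arbitrary assumptions:

import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)   # three gates plus a separate cell state
gru = nn.GRU(input_size=8, hidden_size=16, batch_first=True)     # simpler gating: update and reset gates only

x = torch.randn(2, 5, 8)               # batch of 2 sequences, 5 time steps, 8 features
lstm_out, (h_n, c_n) = lstm(x)         # the LSTM also returns the cell state c_n
gru_out, h_gru = gru(x)                # the GRU keeps only a hidden state
print(lstm_out.shape, gru_out.shape)   # both: torch.Size([2, 5, 16])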
Advantages of Recurrent Neural Networks
Limitations of Recurrent Neural Networks (RNNs)
While RNNs excel at handling sequential data, they face two main training challenges: the vanishing gradient problem and the exploding gradient problem.
These challenges can hinder the performance of standard RNNs on complex, long-sequence tasks.
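One common mitigation for the exploding-gradient problem is gradient clipping. Here is a minimal PyTorch sketch; the toy model, dummy loss, and clipping threshold are assumptions for illustration.

import torch
import torch.nn as nn
from torch.nn.utils import clip_grad_norm_

model = nn.RNN(input_size=4, hidden_size=8, batch_first=True)   # toy RNN
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(2, 20, 4)      # batch of 2 sequences, 20 time steps, 4 features
out, _ = model(x)
loss = out.pow(2).mean()       # dummy loss just to produce gradients
loss.backward()

# Rescale gradients whose global norm exceeds 1.0 before the parameter update.
clip_grad_norm_(model.parameters(), max_norm=1.0)
opt.step()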
Applications of Recurrent Neural Networks
RNNs are used in various applications where data is sequential or time-based: