
Artificial Intelligence and its Applications

by

Dr. Vikrant Chole

Amity School of Engineering & Technology


MODULE - II….


Machine Learning

  • Machine Learning (ML) is a subset of Artificial Intelligence (AI) that focuses on developing algorithms and statistical models that enable computers to perform tasks without explicit instructions, relying instead on patterns and inference. It has become a transformative technology across industries, driving advancements in automation, prediction, and decision-making.


Core Concepts of Machine Learning

  • Data: The foundation of ML. Algorithms learn from data, which can be structured (e.g., databases) or unstructured (e.g., images, text).
  • Model: A mathematical representation of a real-world process, trained using data.
  • Training: The process of teaching a model to recognize patterns by feeding it data.
  • Inference: Using the trained model to make predictions or decisions on new, unseen data.


Types of Machine Learning

A. Supervised Learning

  • Definition: The model is trained on labeled data (input-output pairs).
  • Goal: Learn a mapping from inputs to outputs.
  • Examples:
    • Classification (e.g., spam detection, image recognition).
    • Regression (e.g., predicting house prices, stock prices).

B. Unsupervised Learning

  • Definition: The model is trained on unlabeled data to find hidden patterns or structures.
  • Goal: Discover intrinsic relationships in the data.
  • Examples:
    • Clustering (e.g., customer segmentation, anomaly detection).
    • Dimensionality reduction (e.g., PCA for data visualization).


C. Reinforcement Learning

  • Definition: The model learns by interacting with an environment and receiving feedback in the form of rewards or penalties.
  • Goal: Maximize cumulative rewards over time.
  • Examples:
    • Game playing (e.g., AlphaGo, chess engines).
    • Robotics (e.g., autonomous navigation).

D. Semi-Supervised and Self-Supervised Learning

  • Semi-Supervised: Combines a small amount of labeled data with a large amount of unlabeled data.
  • Self-Supervised: Models generate their own labels from the data (e.g., predicting missing parts of an image).

E. Deep Learning

  • Definition: A subset of ML that uses neural networks with multiple layers to model complex patterns.
  • Applications:
    • Computer vision (e.g., facial recognition, object detection).
    • Natural language processing (e.g., chatbots, translation).
    • Speech recognition (e.g., virtual assistants).


Future Trends in Machine Learning

  • AutoML: Automating the process of model selection, training, and tuning.
  • Federated Learning: Training models across decentralized devices while preserving data privacy.
  • Explainable AI (XAI): Making ML models more transparent and interpretable.
  • Edge AI: Running ML models on edge devices (e.g., smartphones, IoT devices) for real-time processing.
  • Quantum Machine Learning: Leveraging quantum computing to solve complex ML problems faster.
  • AI Ethics and Regulation: Developing frameworks to ensure fairness, accountability, and transparency in ML systems.


Supervised Learning

In supervised learning, the algorithm is trained on a labeled dataset. This means that the training data contains both the input (features) and the correct output (labels). The goal of the model is to learn the mapping from inputs to outputs so that it can predict the output for new, unseen data.

Key Characteristics:

  • Labeled Data: The dataset contains pairs of inputs and corresponding correct outputs (labels).
  • Goal: The algorithm’s task is to learn the relationship between the input data and the target output, so it can predict the output for new, unseen data.
  • Training Process: The model is trained by comparing its predictions to the actual output, and adjustments are made to minimize the error.
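The predict-compare-adjust loop described above can be sketched for the simplest possible case: a one-weight linear model trained by per-example gradient descent. The data, learning rate, and epoch count below are purely illustrative.

```python
# Minimal sketch of the supervised training loop: predict, compare the
# prediction to the label, and adjust the weight to reduce the error.
# A one-weight linear model y = w * x is assumed for illustration only.
def train(pairs, lr=0.01, epochs=200):
    w = 0.0  # initial guess for the weight
    for _ in range(epochs):
        for x, y in pairs:       # labeled (input, output) pairs
            pred = w * x         # model's prediction
            error = pred - y     # compare prediction to the correct label
            w -= lr * error * x  # adjust to reduce squared error
    return w

# Data generated by the rule y = 2x; the learned weight should approach 2.
data = [(1, 2), (2, 4), (3, 6)]
w = train(data)
```

Because the labels follow y = 2x exactly, repeated error-driven adjustments drive the weight toward 2, which is the essence of supervised training.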


Examples of Supervised Learning Applications:

  • Classification:
    • Email spam detection: Predicting whether an email is spam or not based on features like the subject line, sender, etc.
    • Image recognition: Classifying images (e.g., identifying whether an image contains a dog or a cat).

  • Regression:
    • Predicting stock prices: Forecasting future stock prices based on historical data.
    • Predicting housing prices: Estimating the price of a house based on features like size, location, and age.


Unsupervised Learning

In unsupervised learning, the algorithm is given data that is not labeled. The goal here is to find patterns, structures, or relationships in the data without prior knowledge of the output. The algorithm tries to discover the underlying structure in the input data by itself.

Key Characteristics:

  • Unlabeled Data: The data does not have labeled output, meaning the algorithm only has the input features without any known target.
  • Goal: The goal is to find patterns, groupings, or structure in the data. The model tries to organize the data in a meaningful way or discover hidden relationships.
  • Training Process: Since there are no labels to compare against, the algorithm must learn from the structure of the data itself.


Examples of Unsupervised Learning Applications:

  • Clustering:
    • Customer Segmentation: Grouping customers based on purchasing behavior for targeted marketing campaigns.
    • Anomaly Detection: Identifying unusual data points in credit card transactions, network traffic, or medical records that might indicate fraud or errors.

  • Dimensionality Reduction:
    • Data Preprocessing: Reducing the number of features in datasets before applying other machine learning algorithms, improving performance and reducing overfitting.
    • Visualization: Visualizing high-dimensional data (e.g., reducing a 1000-dimensional dataset to 2 or 3 dimensions to visualize it).


Differences Between Supervised and Unsupervised Learning

| Aspect | Supervised Learning | Unsupervised Learning |
| --- | --- | --- |
| Data Type | Labeled data (input-output pairs) | Unlabeled data (only inputs, no known outputs) |
| Goal | Learn a mapping from input to output to make predictions or classifications | Find patterns or structure in the data (e.g., clusters or groups) |
| Examples of Tasks | Classification, Regression | Clustering, Dimensionality Reduction, Association Rules |
| Algorithms | Linear Regression, SVM, Decision Trees, Neural Networks | k-Means, Hierarchical Clustering, PCA, DBSCAN |
| Application Areas | Fraud detection, stock prediction, medical diagnosis | Customer segmentation, anomaly detection, data compression |
| Output | Specific label or numeric value | Groupings, patterns, reduced features |


Statistical learning models

Statistical learning models are a broad class of techniques used for making predictions, understanding relationships between variables, and drawing inferences from data. They rely heavily on statistical principles and are used across various fields, including economics, machine learning, biology, and engineering. These models can be broadly categorized into supervised and unsupervised learning methods.

Statistical learning models are a class of algorithms and techniques rooted in statistics and probability theory that are used to analyze and interpret data. These models form the foundation of many machine learning approaches and are widely used for prediction, classification, clustering, and inference.


1. Supervised Learning Models:

In supervised learning, the model is trained on labeled data (i.e., each input is paired with a correct output or label). The goal is to learn a mapping from inputs to outputs.

a) Linear Regression:

  • Purpose: Predicts a continuous outcome based on one or more input variables.
  • Model: Assumes a linear relationship between the input variables and the output.
  • Example: Predicting house prices based on features like square footage, number of bedrooms, etc.

b) Logistic Regression:

  • Purpose: Used for binary classification tasks (predicting one of two classes).
  • Model: Estimates the probability of the default class using the logistic function.
  • Example: Predicting whether a customer will buy a product (yes/no).
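The logistic function at the heart of this model can be shown directly. The coefficients and the "number of prior visits" feature below are hypothetical, chosen only to illustrate how a linear score is turned into a purchase probability.

```python
import math

# The logistic (sigmoid) function squashes any real-valued score into a
# probability in (0, 1); class "yes" is predicted when p >= 0.5.
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical learned coefficients for "will the customer buy?":
# score = b0 + b1 * visits  (names and values are illustrative only).
b0, b1 = -3.0, 0.5
p = sigmoid(b0 + b1 * 10)  # a customer with 10 prior visits
```

Here the score is -3.0 + 0.5 * 10 = 2, and sigmoid(2) ≈ 0.88, so the model would predict "yes".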


c) Support Vector Machines (SVM):

  • Purpose: Classifies data by finding a hyperplane that maximizes the margin between different classes.
  • Model: Uses a kernel function for non-linear classification.
  • Example: Handwriting recognition, image classification.

d) Decision Trees:

  • Purpose: Splits the data into subsets based on the value of input variables to predict an output.
  • Model: Creates a tree-like structure where each internal node represents a decision on an attribute, and each leaf node represents an outcome.
  • Example: Loan approval based on income, credit score, and debt history.


e) Random Forests:

  • Purpose: An ensemble method that combines multiple decision trees to improve prediction accuracy.
  • Model: Aggregates predictions from several decision trees (typically trained with bootstrapped samples) to reduce overfitting and variance.
  • Example: Predicting stock market trends.

f) K-Nearest Neighbors (KNN):

  • Purpose: A simple classification or regression technique based on proximity.
  • Model: Assigns a label to a data point based on the majority class (or average) of its k-nearest neighbors.
  • Example: Classifying new data points into categories based on labeled historical data.
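KNN is simple enough to write out in full. Below is a minimal sketch using one-dimensional points and majority voting; the data and labels are toy values for illustration.

```python
from collections import Counter

# Toy k-nearest-neighbors classifier: label a new point by majority vote
# among the k closest labeled points (1-D distance, for simplicity).
def knn_predict(train, x, k=3):
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Labeled historical data: small values are "A", large values are "B".
points = [(1, "A"), (2, "A"), (3, "A"), (10, "B"), (11, "B"), (12, "B")]
label = knn_predict(points, 2.5)  # its 3 nearest neighbors are all "A"
```

There is no training step at all: the "model" is the stored data, and all work happens at prediction time, which is why KNN is called a lazy learner.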

g) Neural Networks:

  • Purpose: Mimics the human brain to model complex relationships.
  • Model: Composed of layers of interconnected nodes (neurons) to learn non-linear mappings from input to output.
  • Example: Image and speech recognition, deep learning applications.


2. Unsupervised Learning Models:

Unsupervised learning is used when the output labels are not available. The goal is to identify patterns or groupings in the data.

a) K-Means Clustering:

  • Purpose: Groups data into a specified number of clusters based on similarity.
  • Model: Iteratively assigns each data point to the nearest centroid, adjusting centroids to minimize intra-cluster variance.
  • Example: Customer segmentation based on purchasing behavior.
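The assign-then-update iteration can be sketched in a few lines for one-dimensional data. The data and starting centroids below are illustrative only.

```python
import numpy as np

# One-dimensional k-means sketch: alternately assign each point to its
# nearest centroid, then move each centroid to the mean of its cluster.
def kmeans_1d(xs, centroids, iters=10):
    xs = np.asarray(xs, dtype=float)
    c = np.asarray(centroids, dtype=float)
    for _ in range(iters):
        # assignment step: index of the nearest centroid for each point
        assign = np.argmin(np.abs(xs[:, None] - c[None, :]), axis=1)
        # update step: each centroid moves to the mean of its assigned points
        c = np.array([xs[assign == j].mean() for j in range(len(c))])
    return sorted(c)

# Two clearly separated groups; centroids should settle near 2 and 20.
data = [1, 2, 3, 19, 20, 21]
centers = kmeans_1d(data, centroids=[0.0, 10.0])
```

Each iteration reduces the intra-cluster variance, which is exactly the quantity k-means minimizes.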

b) Hierarchical Clustering:

  • Purpose: Builds a hierarchy of clusters in a tree-like structure.
  • Model: Agglomerative or divisive methods are used to merge or split clusters based on distance metrics.
  • Example: Taxonomy creation in biology.


c) Principal Component Analysis (PCA):

  • Purpose: Reduces the dimensionality of data while retaining as much variance as possible.
  • Model: Identifies the most important directions (principal components) in the data.
  • Example: Image compression, reducing feature space in high-dimensional datasets.
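The "most important direction" can be computed directly as the top eigenvector of the covariance matrix. This is a minimal sketch; the toy data points are chosen to lie near the line y = x.

```python
import numpy as np

# PCA sketch: center the data, form the covariance matrix, and take the
# eigenvector with the largest eigenvalue as the direction of maximum
# variance (the first principal component).
def first_principal_component(X):
    Xc = X - X.mean(axis=0)           # center each feature
    cov = np.cov(Xc, rowvar=False)    # feature covariance matrix
    vals, vecs = np.linalg.eigh(cov)  # eigh: for symmetric matrices
    return vecs[:, np.argmax(vals)]   # eigenvector of the largest eigenvalue

# Points lying close to y = x; the first component should point roughly
# along the (1, 1) direction (up to sign).
X = np.array([[0.0, 0.1], [1.0, 0.9], [2.0, 2.1], [3.0, 2.9]])
pc = first_principal_component(X)
```

Projecting the data onto this single direction keeps almost all of its variance, which is how PCA reduces dimensionality with minimal information loss.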

d) Independent Component Analysis (ICA):

  • Purpose: Similar to PCA but assumes that components are statistically independent rather than just uncorrelated.
  • Model: Finds components that maximize statistical independence.
  • Example: Signal separation, such as separating sound sources in an audio recording.

e) Gaussian Mixture Models (GMM):

  • Purpose: Models data as a mixture of several Gaussian distributions.
  • Model: Each Gaussian component represents a cluster or group in the data.
  • Example: Anomaly detection, density estimation.


3. Semi-supervised and Reinforcement Learning Models:

a) Semi-supervised Learning:

  • Purpose: Combines both labeled and unlabeled data for training, which is useful when labeled data is scarce.
  • Example: Labeling images based on few known labels while leveraging a large amount of unlabeled data.

b) Reinforcement Learning:

  • Purpose: Models decision-making problems where an agent learns by interacting with an environment and receiving feedback (rewards or punishments).
  • Example: Robotics, game playing (e.g., AlphaGo).


Learning

  • Learning denotes “changes in the system that are adaptive in the sense that they enable the system to do the same task or tasks drawn from the same population more efficiently and more effectively the next time”
  • Learning is the improvement of performance with experience over time


Knowledge Acquisition

  • Knowledge acquisition is the process of adding new knowledge to a knowledge base and refining or otherwise improving knowledge that was previously acquired
  • Acquired knowledge may consist of facts, rules, concepts, procedures, heuristics, formulas, relationships, statistics, or other useful information
  • Sources of knowledge include experts in the domain of interest, textbooks, technical papers, databases, reports, the environment, etc.


Types of Learning

We all learn new knowledge through different methods, depending on the type of material to be learned, the amount of relevant knowledge we already possess, and the environment in which the learning takes place.

There are five methods of learning:

1. Memorization (rote learning)

2. Direct instruction (by being told)

3. Analogy

4. Induction

5. Deduction


1. Memorization

Learning by memorization is the simplest form of learning.

It requires the least amount of inference and is accomplished by simply copying the knowledge in the same form that it will be used directly into the knowledge base.

Example:- Memorizing multiplication tables, formulas, etc.


2. Direct instruction

Direct instruction is a more complex form of learning. This type of learning requires more inference than rote learning, since the knowledge must be transformed into an operational form before being stored in the knowledge base.

E.g., a teacher presenting a number of facts directly to us in a well-organized manner.


3. Analogy

Analogical learning is the process of learning a new concept or solution through the use of similar known concepts or solutions.

We use this type of learning when solving problems on an exam, where previously learned examples serve as a guide; in everyday problem solving we also make frequent use of analogical learning.

This form of learning requires still more inferring than either of the previous forms, since difficult transformations must be made between the known and the unknown situations.


4. Induction

Learning by induction is also one that is used frequently by humans.

It is a powerful form of learning which, like analogical learning, also requires more inferring than the first two methods.

This learning requires the use of inductive inference, a form of invalid but useful inference.

We use inductive learning when we generalize a concept from instances or examples of the concept.

For example we learn the concepts of color or sweet taste after experiencing the sensations associated with several examples of colored objects or sweet foods.


5. Deduction

Deductive learning is accomplished through a sequence of deductive inference steps using known facts.

From the known facts, new facts or relationships are logically derived.

Deductive learning usually requires more inference than the other methods.


General Learning Model


General learning model is depicted in figure where the environment has been included as a part of the overall learner system. The environment may be regarded as either a form of nature which produces random stimuli or as a more organized training source such as a teacher which provides carefully selected training examples for the learner component.

For some systems the environment may be a user working at a keyboard. Other systems will use program modules to simulate a particular environment. In even more realistic cases the system will have real physical sensors which interface with some world environment.

Inputs to the learner component may be physical stimuli of some type or descriptive, symbolic training examples. The information conveyed to the learner component is used to create and modify knowledge structures in the knowledge base.

This same knowledge is used by the performance component to carry out some tasks, such as solving a problem, playing a game, or classifying instances of some concept.


Given a task, the performance component produces a response describing its action in performing the task. The critic module then evaluates this response relative to an optimal response.

Feedback, indicating whether or not the performance was acceptable, is then sent by the critic module to the learner component for its subsequent use in modifying the structures in the knowledge base.

If proper learning was accomplished, the system’s performance will have improved with the changes made to the knowledge base.

The cycle described above may be repeated a number of times until the performance of the system has reached some acceptable level, until a known learning goal has been reached, or until changes cease to occur in the knowledge base after some chosen number of training examples have been observed.


Factors affecting learning performance

There are several important factors which influence a system’s ability to learn in addition to the form of representation used.

They include the types of training provided, the form and extent of any initial background knowledge, the type of feedback provided, and the learning algorithms used.

The type of training used in a system can have a strong effect on performance, much the same as it does for humans. Training may consist of randomly selected instances, or of examples that have been carefully selected and ordered for presentation. The instances may be positive examples of some concept or task being learned, they may be negative, or they may be a mixture of both positive and negative.

The instances may be well focused, using only relevant information, or they may contain a variety of facts and details including irrelevant data.


Many forms of learning can be characterized as a search through a space of possible hypotheses or solutions. To make learning more efficient, it is necessary to constrain this search process or reduce the search space.

One method of achieving this is through the use of background knowledge which can be used to constrain the search space or exercise control operations which limit the search process.

Feedback is essential to the learner component since otherwise it would never know if the knowledge structures in the knowledge base were improving or if they were adequate for the performance of the given tasks.

The feedback may be a simple yes or no type of evaluation, or it may contain more useful information describing why a particular action was good or bad. Also, the feedback may be completely reliable, providing an accurate assessment of the performance, or it may contain noise; that is, the feedback may actually be incorrect some of the time. Intuitively, the feedback must be accurate more than 50% of the time; otherwise the system gains little from it. When the feedback is reliable and carries useful information, the learner should be able to build up a useful corpus of knowledge quickly.

On the other hand, if the feedback is noisy or unreliable, the learning process may be very slow and the resultant knowledge incorrect.


Performance Measures

  • Generality: Generality is a measure of the ease with which the method can be adapted to different domains of application

  • Efficiency: The efficiency of a method is a measure of the average time required to construct the target knowledge structures from some specified initial structures

  • Robustness: Robustness is the ability of a learning system to function with unreliable feedback and with a variety of training examples, including noisy ones


  • Efficacy: The efficacy of a system is a measure of the overall power of the system. It is a combination of generality, efficiency and robustness

  • Ease of implementation: Ease of implementation relates to the complexity of the programs and data structures and the resources required to develop the given learning system


Learning

  • Skill refinement vs. knowledge acquisition
  • Rote learning
  • Taking advice
  • Learning through problem solving
  • Learning from examples

“… changes in the system that are adaptive in the sense that they enable the system to do the same task or tasks drawn from the same population more efficiently and more effectively the next time.” [Simon, 1983]


Rote Learning


Learning by Taking Advice


Analogy

  • Last month, the stock market was a roller coaster.

  • Bill is like a fire engine.

  • Problems in electromagnetism are just like problems in fluid flow.


Transformational Analogy


Solving a Problem by

Transformational Analogy


Derivational Analogy


Architectures of Neural Network

There exist five basic types of neuron connection architectures:

  1. Single-layer feed-forward network
  2. Multilayer feed-forward network
  3. Single node with its own feedback
  4. Single-layer recurrent network
  5. Multilayer recurrent network


1. Single-layer feed-forward network

In this type of network, we have only two layers, the input layer and the output layer, but the input layer does not count because no computation is performed in it. The output layer is formed when different weights are applied to the input nodes and the cumulative effect per node is taken. The neurons of the output layer then collectively compute the output signals.
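The computation of such a layer is just a weighted sum per output node. A minimal sketch with 3 inputs and 2 output nodes follows; the weights are arbitrary illustrative values, and the activation function is omitted for clarity.

```python
# Single-layer feed-forward pass: each output node takes the weighted sum
# ("cumulative effect") of all input nodes. No computation happens in the
# input layer itself, which is why it is not counted.
def forward(inputs, weights):
    # weights[j][i] connects input node i to output node j
    return [sum(w_i * x_i for w_i, x_i in zip(node_w, inputs))
            for node_w in weights]

out = forward([1.0, 2.0, 3.0],
              [[0.1, 0.2, 0.3],   # output node 0
               [0.0, 1.0, 0.0]])  # output node 1 just copies input 1
```

Node 0 computes 0.1·1 + 0.2·2 + 0.3·3 = 1.4, and node 1 computes 2.0; in a real network an activation function would then be applied to each sum.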


2. Multilayer feed-forward network

This network also has a hidden layer that is internal to the network and has no direct contact with the external layer. The existence of one or more hidden layers makes the network computationally stronger. It is a feed-forward network because information flows only forward, through the input function and the intermediate computations used to determine the output Z. There are no feedback connections in which outputs of the model are fed back into itself.


3. Single node with its own feedback

When outputs can be directed back as inputs to the same layer or preceding layer nodes, then it results in feedback networks. Recurrent networks are feedback networks with closed loops. The above figure shows a single recurrent network having a single neuron with feedback to itself.


4. Single-layer recurrent network 

The below network is a single-layer network with a feedback connection in which the processing element’s output can be directed back to itself or to another processing element or both. A recurrent neural network is a class of artificial neural networks where connections between nodes form a directed graph along a sequence. This allows it to exhibit dynamic temporal behavior for a time sequence. Unlike feedforward neural networks, RNNs can use their internal state (memory) to process sequences of inputs.


5. Multilayer recurrent network

In this type of network, processing element output can be directed to the processing element in the same layer and in the preceding layer forming a multilayer recurrent network. They perform the same task for every element of a sequence, with the output being dependent on the previous computations. Inputs are not needed at each time step. The main feature of a Recurrent Neural Network is its hidden state, which captures some information about a sequence.


Learning Process in ANN

Neural networks learn through a process called “training.” During training, a neural network iteratively adjusts its internal parameters (weights and biases) to minimize the difference between its predicted output and the actual target output for a given set of training examples.

The learning process involves the following steps:

  1. Initialization: The weights and biases of the neural network are initialized randomly. These initial values set the starting point for the learning process.

  2. Forward Propagation: In this step, the input data is fed into the neural network, and the activations of each neuron in each layer are computed through a series of matrix multiplications and activation functions. This process is known as forward propagation, and it produces the predicted output of the neural network.


3. Loss Calculation: The difference between the predicted output and the actual target output (ground truth) is computed using a loss function. The loss function quantifies how far off the predictions are from the true values.

4. Backpropagation: The backpropagation algorithm is used to calculate the gradients of the loss function with respect to the network’s weights and biases. It involves computing the derivative of the loss function with respect to each parameter, which indicates how the loss changes with small changes in the parameters.

5. Gradient Descent: The gradients calculated during backpropagation indicate the direction of steepest ascent, i.e., the direction of increasing loss. To minimize the loss, the network updates its parameters by moving in the opposite direction of the gradients. The magnitude of this update is controlled by a learning rate hyperparameter.

6. Iteration: Steps 2 to 5 are repeated for each batch of training data multiple times (epochs) until the neural network’s performance on the training data reaches a satisfactory level or converges to a solution.
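The whole cycle can be sketched for the smallest possible network, a single sigmoid neuron with one weight and one bias. The input, target, learning rate, and fixed initialization below are toy values for illustration (real networks initialize randomly).

```python
import math

# The learning steps above, for one sigmoid neuron trained to map
# input 1.0 to target 0.0 (a deliberately tiny example).
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

w, b = 0.5, 0.5                      # initialization (fixed here, not random)
x, target, lr = 1.0, 0.0, 0.5
for _ in range(300):                 # iteration over many epochs
    a = sigmoid(w * x + b)           # forward propagation
    loss = (a - target) ** 2         # loss calculation (squared error)
    grad_a = 2 * (a - target)        # backpropagation via the chain rule:
    grad_z = grad_a * a * (1 - a)    # sigmoid derivative is a * (1 - a)
    w -= lr * grad_z * x             # gradient descent update on the weight
    b -= lr * grad_z                 # ...and on the bias
final = sigmoid(w * x + b)           # output after training, near the target
```

Each pass computes the prediction, measures its error, pushes the error gradient back through the neuron, and nudges the parameters downhill, so the output steadily approaches the target of 0.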


The training process continues until the neural network’s performance on the training data is satisfactory. The network is then evaluated on a separate set of data called the validation set to check for overfitting (the model performing well on the training data but poorly on unseen data). If necessary, the hyperparameters of the neural network (learning rate, architecture, etc.) can be adjusted based on the validation set performance.

After the training process is complete, the neural network is expected to have learned the underlying patterns and relationships in the training data, allowing it to make accurate predictions on new, unseen data.

It’s important to note that neural networks learn by iteratively adjusting their parameters based on the gradients of the loss function, and this process is often computationally intensive, especially for large and deep networks. Hence, training neural networks is typically performed on powerful hardware or specialized hardware accelerators (GPUs or TPUs) to speed up the process.


Error functions

Error functions (also called loss functions or cost functions) are fundamental components of neural network training that quantify how well or poorly the network is performing. They measure the difference between the network's predicted outputs and the true target values.

An error function measures how far the network's predictions are from the actual target values. During training, the goal is to minimize this error so that the predictions become as accurate as possible.

Error functions measure the discrepancy between predicted and actual outputs, enabling the ANN to learn and improve its performance.

During training, the error function is minimized using optimization algorithms like gradient descent, adjusting the network's parameters (weights and biases) to reduce the error.


Types of error functions

Different error functions are suitable for different tasks. Common examples include:

  • Mean Squared Error (MSE): Popular for regression tasks, it measures the average squared difference between predicted and actual values. 
  • Mean Absolute Error (MAE): Also used for regression, it measures the average absolute difference between predicted and actual values, less sensitive to outliers than MSE. 
  • Binary Cross-Entropy: Commonly used for binary classification tasks. 
  • Categorical Cross-Entropy: Used for multi-class classification tasks. 

Other Specialized Loss Functions

  • Huber Loss: Combines MSE and MAE for robust regression.
  • KL Divergence: Measures how one probability distribution diverges from another.
  • Hinge Loss: Common in SVMs (used for margin-based classification).
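The first three functions in the list are short enough to implement directly. This is a minimal sketch operating on plain Python lists, averaging over the samples.

```python
import math

# Direct implementations of three common error functions (mean over samples).
def mse(pred, true):
    # average squared difference; penalizes large errors heavily
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true)

def mae(pred, true):
    # average absolute difference; less sensitive to outliers than MSE
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)

def binary_cross_entropy(pred, true):
    # pred are probabilities in (0, 1); true are 0/1 labels
    return -sum(t * math.log(p) + (1 - t) * math.log(1 - p)
                for p, t in zip(pred, true)) / len(true)

errors_mse = mse([2.0, 3.0], [1.0, 5.0])  # (1^2 + 2^2) / 2 = 2.5
errors_mae = mae([2.0, 3.0], [1.0, 5.0])  # (1 + 2) / 2 = 1.5
```

Note how the outlier-sized error of 2 contributes 4 to MSE but only 2 to MAE, which is exactly the sensitivity difference described above; cross-entropy, by contrast, rewards confident correct probabilities and heavily punishes confident wrong ones.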


Back Propagation Neural network

What is Backpropagation?

Backpropagation is a technique used in deep learning to train artificial neural networks, particularly feed-forward networks. It works iteratively to adjust weights and biases to minimize the cost function.

In each epoch the model adapts these parameters, reducing loss by following the error gradient. Backpropagation often uses optimization algorithms like gradient descent or stochastic gradient descent. The algorithm computes the gradient using the chain rule from calculus, allowing it to effectively navigate complex layers in the neural network to minimize the cost function.

Backpropagation is also known as "Backward Propagation of Errors", and it is a method used to train neural networks. Its goal is to reduce the difference between the model’s predicted output and the actual output by adjusting the weights and biases in the network.


Working of Backpropagation Algorithm

The Backpropagation algorithm involves two main steps: the Forward Pass and the Backward Pass.

How Does Forward Pass Work?

In the forward pass the input data is fed into the input layer. These inputs, combined with their respective weights, are passed to the hidden layers. Before applying an activation function, a bias is added to the weighted inputs.

Each hidden layer computes the weighted sum (`a`) of the inputs then applies an activation function like ReLU (Rectified Linear Unit) to obtain the output (`o`). The output is passed to the next layer where an activation function such as softmax converts the weighted outputs into probabilities for classification.


How Does the Backward Pass Work?

In the backward pass the error (the difference between the predicted and actual output) is propagated back through the network to adjust the weights and biases. One common method for error calculation is the Mean Squared Error (MSE) given by:

MSE = (Predicted Output − Actual Output)²

Once the error is calculated, the network adjusts weights using gradients, which are computed with the chain rule. These gradients indicate how much each weight and bias should be adjusted to minimize the error in the next iteration. The backward pass continues layer by layer, ensuring that the network learns and improves its performance.
