Deep Learning (DEEP-0001)
Prof. André E. Lazzaretti
https://sites.google.com/site/andrelazzaretti/graduate-courses/deep-learning-cpgei/2025
4 – Deep Neural Networks
Deep neural networks
Composing two networks.
Network 1:
Network 2:
Comparing to shallow with six hidden units
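As a numerical illustration of this comparison (a minimal sketch with hand-picked "tent" weights, not the slide's exact example): composing a ReLU network with itself multiplies the number of linear regions, whereas adding hidden units to a shallow network only adds regions one at a time.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def net(x):
    # One hidden layer, two ReLU units: a "tent" function on [0, 1]
    # (illustrative hand-picked weights, assumed for this sketch)
    return 2 * relu(x) - 4 * relu(x - 0.5)

def count_linear_regions(f, lo=0.0, hi=1.0, n=10001):
    x = np.linspace(lo, hi, n)
    slopes = np.diff(f(x)) / np.diff(x)
    jumps = np.flatnonzero(np.abs(np.diff(slopes)) > 1e-6)
    # a kink falling inside a grid cell can trigger two adjacent jumps;
    # count groups of consecutive jump indices as single kinks
    kinks = 0 if len(jumps) == 0 else 1 + int(np.sum(np.diff(jumps) > 1))
    return kinks + 1

print(count_linear_regions(net))                    # 2 regions
print(count_linear_regions(lambda x: net(net(x))))  # 4 regions
```

Composing the two-region tent with itself yields four regions; each further composition doubles the count again, which is the multiplicative effect depth provides.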
Composing networks in 2D
Deep neural networks
Combine two networks into one
Network 1:
Network 2:
Hidden units of second network in terms of first:
Create new variables
Two-layer network
Two-layer network as one equation
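Written out, the two-layer network collapses into a single equation. A sketch in the matrix notation used later in these slides (bias vectors β and weight matrices Ω are the assumed symbols):

```latex
\begin{aligned}
\mathbf{h}  &= a[\boldsymbol\beta_0 + \boldsymbol\Omega_0 \mathbf{x}] \\
\mathbf{h}' &= a[\boldsymbol\beta_1 + \boldsymbol\Omega_1 \mathbf{h}] \\
\mathbf{y}  &= \boldsymbol\beta_2 + \boldsymbol\Omega_2 \mathbf{h}' \\
            &= \boldsymbol\beta_2 + \boldsymbol\Omega_2\, a\bigl[\boldsymbol\beta_1 + \boldsymbol\Omega_1\, a[\boldsymbol\beta_0 + \boldsymbol\Omega_0 \mathbf{x}]\bigr]
\end{aligned}
```

Substituting the first hidden layer into the second, and the second into the output, gives the nested form in the last line.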
Deep neural networks
Hyperparameters
Deep neural networks
Notation change #1
Notation change #2
Notation change #3
Bias vector
Weight matrix
General equations for deep network
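The general equations translate directly into a forward pass. A minimal sketch, assuming ReLU activations, a linear final layer, and illustrative layer sizes not taken from the slides:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def deep_forward(x, betas, omegas):
    """Forward pass h_k = a[beta_k + Omega_k h_{k-1}],
    with a linear (no activation) final layer."""
    h = x
    for beta, omega in zip(betas[:-1], omegas[:-1]):
        h = relu(beta + omega @ h)
    return betas[-1] + omegas[-1] @ h

# Illustrative shapes (assumed): input 3 -> hidden 4 -> hidden 4 -> output 2
rng = np.random.default_rng(0)
sizes = [3, 4, 4, 2]
betas = [rng.standard_normal(sizes[k + 1]) for k in range(len(sizes) - 1)]
omegas = [rng.standard_normal((sizes[k + 1], sizes[k]))
          for k in range(len(sizes) - 1)]
y = deep_forward(rng.standard_normal(3), betas, omegas)
print(y.shape)  # (2,)
```

Each weight matrix Ω_k has one row per unit in layer k and one column per unit in layer k−1, matching the bias-vector and weight-matrix notation above.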
Example
Deep neural networks
Shallow vs. deep networks
The best results are produced by deep networks with many layers.
All state-of-the-art systems use deep networks. But why?
Shallow vs. deep networks
1. Both shallow and deep networks obey the universal approximation theorem.
Argument: one layer is enough.
Shallow vs. deep networks
2. Number of linear regions per parameter
Number of linear regions per parameter
5 layers
10 hidden units per layer
471 parameters
161,501 linear regions
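The slide's parameter count can be reproduced directly. A minimal sketch, assuming a scalar input and scalar output around the 5 hidden layers of 10 units each:

```python
def n_params(d_in, widths, d_out):
    """Count weights (fan_out x fan_in) plus biases for every layer."""
    dims = [d_in] + widths + [d_out]
    return sum(dims[k + 1] * dims[k] + dims[k + 1]
               for k in range(len(dims) - 1))

print(n_params(1, [10] * 5, 1))  # 471
```

The breakdown: 20 parameters into the first hidden layer, 110 for each of the four hidden-to-hidden layers, and 11 for the output layer.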
Shallow vs. deep networks
3. Depth efficiency
Shallow vs. deep networks
4. Large structured networks
Shallow vs. deep networks
5. Fitting and generalization
Where are we going?