Deep Networks
(DenseNet & ResNet)
DenseNet
Huang, Gao, et al. "Densely connected convolutional networks." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017.
Problem of Vanishing Gradient
Gradients vanish at shallow layers because backpropagation applies the chain rule: the gradient at each layer is a product of the local derivatives of all the layers above it.
When those local derivatives are small, the product shrinks multiplicatively as it propagates toward the shallow layers, until the gradient effectively vanishes.
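A minimal numeric sketch of this effect (my own illustration, not from the slides); it assumes sigmoid activations, whose local derivative never exceeds 0.25, so the chain-rule product collapses quickly with depth:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

np.random.seed(0)
depth = 50
grad = 1.0  # gradient magnitude at the output layer
for _ in range(depth):
    z = np.random.randn()                   # a pre-activation value
    local = sigmoid(z) * (1.0 - sigmoid(z)) # local derivative, at most 0.25
    grad *= local                           # chain rule: multiply local derivatives

print(f"gradient magnitude after {depth} layers: {grad:.3e}")  # vanishingly small
```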
Solution: DenseNet (>100 layers)
DenseNet input: each layer receives the concatenation of all preceding feature maps, x_l = H_l([x_0, x_1, ..., x_{l-1}]), so every layer keeps a short, direct path to the loss gradient.
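A minimal tf.keras sketch of this dense connectivity (my own illustration; the function name dense_block and the hyperparameters are assumptions, not from the slides):

```python
from tensorflow.keras import layers

def dense_block(x, num_layers, growth_rate):
    """Each layer sees the concatenation of ALL preceding feature maps."""
    for _ in range(num_layers):
        y = layers.BatchNormalization()(x)
        y = layers.Activation('relu')(y)
        y = layers.Conv2D(growth_rate, 3, padding='same')(y)
        # concatenate, not add: the k new feature maps are appended to the stack
        x = layers.Concatenate()([x, y])
    return x
```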
DenseNet vs ResNet: DenseNet concatenates the preceding feature maps, while ResNet adds the shortcut to the layer output (see the snippet below).
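A tiny runnable comparison of the two merge operations (illustration only; shapes are arbitrary):

```python
import tensorflow as tf
from tensorflow.keras import layers

x = tf.random.normal((1, 8, 8, 16))
y = tf.random.normal((1, 8, 8, 16))

dense_style = layers.Concatenate()([x, y])  # DenseNet: channels grow, 16 + 16 = 32
res_style = layers.Add()([x, y])            # ResNet: channels stay at 16
print(dense_style.shape, res_style.shape)   # (1, 8, 8, 32) (1, 8, 8, 16)
```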
DenseNet: 2 Issues to Address
Feature Map Growth Rate: each layer appends k new feature maps, so after l layers a block carries k_0 + k*l channels (e.g., with k = 12 and k_0 = 24, a 16-layer block ends with 24 + 12*16 = 216 channels).
Bottleneck Layer Controls Growth Rate
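A sketch of the DenseNet-B bottleneck design (a 1x1 convolution caps the ever-growing concatenated input at 4k maps before the 3x3 convolution); the function name is my own:

```python
from tensorflow.keras import layers

def bottleneck_layer(x, growth_rate):
    """Keeps compute bounded even as the concatenated input keeps growing."""
    y = layers.BatchNormalization()(x)
    y = layers.Activation('relu')(y)
    y = layers.Conv2D(4 * growth_rate, 1)(y)               # bottleneck: reduce to 4k maps
    y = layers.BatchNormalization()(y)
    y = layers.Activation('relu')(y)
    y = layers.Conv2D(growth_rate, 3, padding='same')(y)   # produce k new maps
    return layers.Concatenate()([x, y])
```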
Transition Layer Bridges Blocks with Different Feature Map Sizes
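A sketch of the transition layer between dense blocks (1x1 convolution compresses the channels, average pooling halves the spatial size); the 0.5 compression factor follows the DenseNet-C convention and is an assumption here:

```python
from tensorflow.keras import layers

def transition_layer(x, compression=0.5):
    """Bridges dense blocks whose feature maps differ in size."""
    channels = int(x.shape[-1] * compression)
    y = layers.BatchNormalization()(x)
    y = layers.Conv2D(channels, 1)(y)      # compress the channel count
    return layers.AveragePooling2D(2)(y)   # halve the spatial dimensions
```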
CIFAR10
10 categories
50k training images
10k test images
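For reference, CIFAR10 ships with tf.keras and can be loaded directly (a minimal sketch; the normalization choice is mine):

```python
import tensorflow as tf

# 50k training and 10k test 32x32 RGB images, each labeled with one of 10 categories
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
print(x_train.shape, x_test.shape)  # (50000, 32, 32, 3) (10000, 32, 32, 3)

# scale pixels to [0, 1] and one-hot encode the labels
x_train, x_test = x_train / 255.0, x_test / 255.0
y_train = tf.keras.utils.to_categorical(y_train, 10)
y_test = tf.keras.utils.to_categorical(y_test, 10)
```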
100-layer DenseNet Architecture
(Accuracy > 93.55%)
DenseNet on CIFAR10
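One way to assemble a 100-layer model from the bottleneck_layer and transition_layer sketches above, following the standard DenseNet-BC-100 layout (3 dense blocks of (100 - 4) / 6 = 16 bottleneck layers each, growth rate 12); the exact configuration in the slides may differ:

```python
from tensorflow.keras import layers, Model

def densenet_bc_100(input_shape=(32, 32, 3), num_classes=10, growth_rate=12):
    inputs = layers.Input(shape=input_shape)
    x = layers.Conv2D(2 * growth_rate, 3, padding='same')(inputs)
    for block in range(3):
        for _ in range(16):
            x = bottleneck_layer(x, growth_rate)  # defined in the earlier sketch
        if block < 2:
            x = transition_layer(x)               # defined in the earlier sketch
    x = layers.BatchNormalization()(x)
    x = layers.Activation('relu')(x)
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(num_classes, activation='softmax')(x)
    return Model(inputs, outputs)
```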
ResNet v1 & v2
He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016a.
He, Kaiming, et al. "Identity mappings in deep residual networks." European Conference on Computer Vision. Springer International Publishing, 2016b.
ResNet
ResNet Residual Block: y = F(x) + x; the identity shortcut gives gradients a direct path back to the shallow layers.
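A minimal tf.keras sketch of the v1 residual block (my own illustration; it assumes the shortcut and the block output have matching channel counts):

```python
from tensorflow.keras import layers

def residual_block_v1(x, filters):
    """ResNet v1: post-activation, y = F(x) + x, with ReLU after the addition."""
    shortcut = x
    y = layers.Conv2D(filters, 3, padding='same')(x)
    y = layers.BatchNormalization()(y)
    y = layers.Activation('relu')(y)
    y = layers.Conv2D(filters, 3, padding='same')(y)
    y = layers.BatchNormalization()(y)
    y = layers.Add()([shortcut, y])     # identity shortcut: gradients flow straight through
    return layers.Activation('relu')(y)
```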
ResNet v1 on CIFAR10
ResNet v2: pre-activation. BN and ReLU are moved before each convolution, and no activation follows the addition, which improves gradient flow in very deep networks.
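A sketch of the v2 pre-activation block for contrast with v1 (same assumptions as the v1 sketch above):

```python
from tensorflow.keras import layers

def residual_block_v2(x, filters):
    """ResNet v2: BN and ReLU come BEFORE each conv; the shortcut path
    stays a pure identity because nothing follows the addition."""
    shortcut = x
    y = layers.BatchNormalization()(x)
    y = layers.Activation('relu')(y)
    y = layers.Conv2D(filters, 3, padding='same')(y)
    y = layers.BatchNormalization()(y)
    y = layers.Activation('relu')(y)
    y = layers.Conv2D(filters, 3, padding='same')(y)
    return layers.Add()([shortcut, y])  # no activation after the add
```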
Accuracy of ResNet v1/v2 on CIFAR10