Machine Learning - Basic Principles & Practice: 11. Complex Networks

Cong Li 李聪

机器学习 - 基础原理与实践

11. 复杂网络

Should You Still Remember 如果你还记得

Machine Learning – Basic Principles & Practice: 11. Complex Networks

机器学习 – 基础原理与实践:11. 复杂网络

Handwritten ZIP code recognition 手写邮政编码识别

Methods 方法 | Error rate 错误率
Nearest neighbor 最近邻 | 4.98%
Linear perceptron 线性感知器 | 8.52%
Perceptron – RBF 感知器 – 径向基函数核计算 | 4.43%
Perceptron – Polynomial 感知器 – 多项式核计算 | 4.24%

Nonlinear Classifier 非线性分类器

  • Nonlinear Classification 非线性分类
    • Nonlinear classifiers or kernel methods outperform linear ones on this problem 在这个问题上非线性分类器或核计算方法比线性方法强
  • General Purpose Methods 通用方法
    • These methods are fairly universally applicable 这些方法相对比较通用
      • Lack of application-specific adaptation 缺少针对特定应用的适应
    • Perceptron + RBF kernel as an example 以感知器和径向基核计算为例

Do You Still Remember? 你还记得吗?

No assumption, no learning!

无假设,不学习!

Now where is the assumption?

那么假设在哪里?

2D Example 两维的例子

Something like an advanced version of nearest neighbor

如同最近邻的进阶版

 

 

Application Specific 特定应用

  • Adaptation 适应于
    • To a specific type of application 某一类具体问题
      • Leveraging its characteristics 利用这类问题的特性
  • Less Universally Applicable 不那么通用
    • Characteristics do not match 特性不匹配
  • Complex Neural Networks 复杂神经网络
    • Flexible in adaptation 可灵活适应

Perceptron as a Neuron 作为神经细胞的感知器

 

 

 

 

 

 

?

Logistic Regression as a Neuron 作为神经细胞的算术回归

 

 

 

 

 

 

 

 

Common Neuron 通常的神经细胞

 

 

 

 

 

Input (from other neurons)

(来自别的神经细胞的)输入

Weights

权重

Bias

偏向
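The formulas on the neuron slides did not survive extraction, so the following is only a minimal sketch of what a common neuron computes: a weighted sum of its inputs plus a bias, passed through an activation function. The function names and the example numbers are hypothetical; the three activations shown (hard threshold for the perceptron, sigmoid for logistic regression, ReLU as used later for the operators) are assumptions based on the surrounding slides.

```python
import math

def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus the bias, passed through an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

def sign(z):
    # Perceptron-style activation: a hard threshold at zero.
    return 1.0 if z >= 0 else -1.0

def sigmoid(z):
    # Logistic-regression-style activation: squashes z into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    # Rectified linear unit: negative values become 0.
    return max(0.0, z)

x = [0.5, -1.0, 2.0]   # hypothetical inputs from other neurons
w = [1.0, 0.5, 0.25]   # hypothetical weights
b = -0.1               # hypothetical bias

for act in (sign, sigmoid, relu):
    print(act.__name__, neuron(x, w, b, act))
```

Only the activation differs between the three calls, which is the sense in which different neuron types can share one structure.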

 

 

 

 

Complex Neural Network 复杂神经网络

Neurons connected together 互连的神经细胞

Different types of neurons: different numbers of input connections, different activation functions

不同的神经细胞:不同的输入连接数,不同的激活函数

Complex Neural Network 复杂神经网络

Neurons connected together 互连的神经细胞

Complex structure of connection 复杂的连接结构

Complex Neural Network 复杂神经网络

Neurons connected together 互连的神经细胞

Flexible in adapting to specific applications

能够灵活适应于特定的应用

Handwritten ZIP Code Recognition 手写邮政编码识别

  • Input: 2D Image 输入:两维图像
    • Motivation for 2D processing 两维处理的动机
      • Previous methods take a 1D vector as input 以前的方法输入的都是一维向量
  • Handwritten Numbers 手写数字
    • Strokes are composed of local patterns 由局部模式组合而成的笔画
    • Motivation for constructing high-level patterns from local patterns 从局部模式中构建高层次模式的动机

Convolutional Neural Network (CNN) 卷积神经网络

  • Complex Networks w/ Many Feed-Forward Layers 多层前向复杂网络
    • Different types of neurons/layers 多种不同的神经细胞/网络层
    • Hierarchical construction of patterns 逐层递进地构建模式
  • Commonly Used in 常用于
    • Detecting/recognizing objects in images 在图像中检测/识别物体

2D Operator 二维算子

3x3 weights 权重 + bias 偏向

Negative weight 负权重

Positive weight 正权重

3x3 image part

图像局部

Light, negative value

浅色,负值

Dark, positive value

深色,正值

2D Operator 二维算子

3x3 image part

图像局部

Multiply each pair of corresponding elements

对应元素相乘

x

Negative x negative = positive

负 x 负

3x3 weights 权重 + bias 偏向


2D Operator 二维算子

3x3 image part

图像局部

Multiply each pair of corresponding elements

对应元素相乘

x

Positive x positive = positive

正 x 正

3x3 weights 权重 + bias 偏向

2D Operator 二维算子

3x3 image part

图像局部

Multiply each pair of corresponding elements

对应元素相乘

Sum the 9 products together

把9个乘积加起来

Here the sum is quite a large positive number

这里和是个很大的正数

3x3 weights 权重 + bias 偏向

Then add the bias

然后加上偏向

2D Operator 二维算子

3x3 image part

图像局部

Multiply each pair of corresponding elements

对应元素相乘

3x3 weights 权重 + bias 偏向

Here the output is quite a large positive number

这里输出是一个很大的正数

2D Operator 二维算子

Another image part

另一个图像局部

x

Negative x positive = negative

负 x 正

3x3 weights 权重 + bias 偏向


2D Operator 二维算子

Another image part

另一个图像局部

x

Positive x negative = negative

正 x 负

3x3 weights 权重 + bias 偏向

2D Operator 二维算子

Another image part

另一个图像局部

 

x

3x3 weights 权重 + bias 偏向

2D Operator 二维算子

Good match

很好的匹配

Positive output

正值输出

3x3 weights 权重 + bias 偏向

2D Operator 二维算子

Bad match

不佳的匹配

0 output

输出0

3x3 weights 权重 + bias 偏向
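The operator steps walked through above (multiply corresponding elements, sum the 9 products, add the bias) can be sketched as follows. The weights, bias and image patches are hypothetical, chosen so that dark pixels are +1 and light pixels are -1. Clipping negative results to 0 with ReLU is an assumption here; it is consistent with the "0 output" shown for a bad match and with the ReLU named on the later tensor operator slide.

```python
def apply_operator(patch, weights, bias):
    """Multiply corresponding elements, sum the 9 products, add the bias, apply ReLU."""
    total = sum(p * w
                for patch_row, weight_row in zip(patch, weights)
                for p, w in zip(patch_row, weight_row))
    return max(0.0, total + bias)

# Hypothetical operator that responds to a vertical dark stroke.
weights = [[-1, 1, -1],
           [-1, 1, -1],
           [-1, 1, -1]]
bias = -2.0  # hypothetical bias

good_match = [[-1, 1, -1]] * 3   # dark stroke in the middle column
bad_match = [[1, -1, 1]] * 3     # the inverse pattern

print(apply_operator(good_match, weights, bias))  # 9 - 2 = 7.0
print(apply_operator(bad_match, weights, bias))   # max(0, -9 - 2) = 0.0
```

In the good match every product is positive (negative x negative or positive x positive), so the sum is large; in the bad match every product is negative and the output is clipped to 0.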

2D Operator 二维算子

We hope to learn many different operators

我们希望能够学习到不同的算子

Each corresponds to a certain local pattern

每一个对应于某种局部模式

Convolution 卷积

16x16

14x14


Convolution 卷积

16x16

14x14

Operator moves horizontally

算子水平移动


Convolution 卷积

16x16

14x14

Operator moves vertically

算子纵向移动

Convolution 卷积

16x16

14x14

Operator scans over the 2D array

算子扫过两维数组

This is called convolution

这叫做卷积

Spatial ‘strength’ of the local pattern

局部模式在不同位置的“强度”
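The scan described above can be sketched in plain Python: a 3x3 operator slides horizontally and vertically over a 16x16 array and yields a 14x14 map of pattern strengths. The image, weights and bias are hypothetical, and applying ReLU at each position is an assumption. As in most deep learning libraries, the operator is not flipped, so strictly speaking this is a cross-correlation.

```python
def convolve2d(image, weights, bias):
    """Slide a k x k operator over a square image; return the ReLU strength map."""
    k = len(weights)
    out_size = len(image) - k + 1          # 16 - 3 + 1 = 14
    out = []
    for r in range(out_size):              # vertical movement
        row = []
        for c in range(out_size):          # horizontal movement
            total = bias
            for i in range(k):
                for j in range(k):
                    total += image[r + i][c + j] * weights[i][j]
            row.append(max(0.0, total))
        out.append(row)
    return out

# Hypothetical 16x16 image: light background (-1) with a dark vertical line (+1) in column 5.
image = [[1.0 if c == 5 else -1.0 for c in range(16)] for _ in range(16)]

# Hypothetical operator for a vertical stroke, with a hypothetical bias.
weights = [[-1, 1, -1],
           [-1, 1, -1],
           [-1, 1, -1]]
strength = convolve2d(image, weights, bias=-4.0)

print(len(strength), len(strength[0]))   # 14 14
print(strength[0][3:6])                  # [0.0, 5.0, 0.0]
```

The map is non-zero only where the window is centered on the stroke, which is the sense in which the output records the strength of the local pattern at each position.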

Convolution 卷积

16x16

14x14

2nd operator

第二个算子


Convolution 卷积

16x16

14x14

3rd operator

第三个算子


Convolution 卷积

16x16

14x14

‘Strengths’ of different local patterns

不同局部模式的“强度”

Tensor Operator 张量算子

14x14

3x3xk weights 权重

12x12

x

Multiply accordingly, sum the products, activate through ReLU

对应位置相乘,把积加起来,进行线性整流

Extract high level patterns from local pattern strength map

从局部模式强度图中抽取高层次模式


Tensor Operator 张量算子

14x14

3x3xk weights 权重

12x12

x

Convolution w/ a tensor

用张量进行卷积

Tensor Operator 张量算子

14x14

Convolution w/ multiple tensor operators

进行多个张量算子卷积

12x12
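A minimal sketch of convolution with tensor operators, assuming the input is a stack of k strength maps of size 14x14 and each operator holds 3x3xk weights plus one bias. The layout `weights[channel][i][j]`, the two example operators and the all-ones input are hypothetical choices for illustration.

```python
def tensor_convolve(maps, weights, bias):
    """Apply one 3x3xk operator to k stacked maps: multiply, sum, add bias, ReLU."""
    k = len(weights[0])                    # spatial size of the operator (3)
    size = len(maps[0]) - k + 1            # 14 - 3 + 1 = 12
    out = [[0.0] * size for _ in range(size)]
    for r in range(size):
        for c in range(size):
            total = bias
            for ch, wmap in enumerate(weights):   # sum over all k input maps
                for i in range(k):
                    for j in range(k):
                        total += maps[ch][r + i][c + j] * wmap[i][j]
            out[r][c] = max(0.0, total)
    return out

# Hypothetical input: k = 2 strength maps of size 14x14, all ones.
maps = [[[1.0] * 14 for _ in range(14)] for _ in range(2)]

# Two hypothetical tensor operators, each with 3x3x2 weights and a bias.
operators = [
    ([[[1.0] * 3] * 3, [[-0.5] * 3] * 3], 0.0),
    ([[[-1.0] * 3] * 3, [[0.0] * 3] * 3], 0.0),
]

# Convolving with several tensor operators yields one 12x12 map per operator.
outputs = [tensor_convolve(maps, w, b) for w, b in operators]
print(len(outputs), len(outputs[0]), len(outputs[0][0]))  # 2 12 12
print(outputs[0][0][0], outputs[1][0][0])                 # 4.5 0.0
```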

Pooling 积聚

2x2 max pooling 最大值积聚

12x12

Take the maximum of the 4 values

取4个中的最大值

6x6

Down-sample the pattern map: reduce dimensionality

模式图降采样:降低维度

Pooling 积聚

2x2 max pooling 最大值积聚

12x12

Move horizontally w/o overlapping

不重叠地横向移动

6x6

Pooling 积聚

2x2 max pooling 最大值积聚

12x12

Move vertically w/o overlapping

不重叠地纵向移动

6x6

No weights: not to be learned

没有权重:不需要进行学习
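The pooling step can be sketched as below: a 2x2 window moves over the map without overlapping and keeps the maximum of its 4 values, so a 12x12 map becomes 6x6. The function name and the example maps are hypothetical; the sketch assumes even side lengths.

```python
def max_pool_2x2(fmap):
    """Non-overlapping 2x2 max pooling; there are no weights to learn."""
    return [[max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
             for c in range(0, len(fmap[0]) - 1, 2)]
            for r in range(0, len(fmap) - 1, 2)]

small = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
print(max_pool_2x2(small))  # [[6, 8], [14, 16]]

# A hypothetical 12x12 map is down-sampled to 6x6.
big = [[r * 12 + c for c in range(12)] for r in range(12)]
pooled = max_pool_2x2(big)
print(len(pooled), len(pooled[0]))  # 6 6
```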

Fully-Connected Layer 全连通层

Each output neuron has its weights & bias

每个输出神经细胞有自己的权重和偏向

Input 输入

Output 输出

 

Fully-Connected Layer 全连通层

Each input neuron is connected w/ each output neuron

每个输入神经细胞都和每个输出神经细胞相连

Input 输入

Output 输出

 

Fully-Connected Layer 全连通层

So-called ‘fully-connected layer’, sometimes ‘dense layer’

所谓全连通层,有时又称密集层

Input 输入

Output 输出
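A minimal sketch of a fully-connected layer, assuming each output neuron keeps one weight per input neuron plus its own bias. The activation is left out because the slide's formula did not survive extraction; in practice one would be applied to each output as in the common neuron. All numbers are hypothetical.

```python
def fully_connected(inputs, weights, biases):
    """Each output neuron sums all inputs with its own weights, then adds its bias."""
    return [sum(w * x for w, x in zip(weight_row, inputs)) + b
            for weight_row, b in zip(weights, biases)]

inputs = [1.0, 2.0, 3.0]             # 3 input neurons
weights = [[1.0, 0.0, 0.0],          # weights of output neuron 0
           [0.5, 0.5, 0.5]]          # weights of output neuron 1
biases = [0.0, -1.0]

print(fully_connected(inputs, weights, biases))  # [1.0, 2.0]
```

Because every input is connected to every output, a layer with n inputs and m outputs holds n x m weights plus m biases.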

 

Softmax Layer 指数归一化层

Similar to a fully-connected layer, but the output is different

和全连通层类似,但输出不同

Input 输入

Output 输出

Softmax Layer 指数归一化层

First apply a negative exponential function

首先应用一个负指数函数

Input 输入

Output 输出

 

 

 

Softmax Layer 指数归一化层

Input 输入

Output 输出

 

 

 

 

Normalize their sum to 1

将它们之和归一化为1

Softmax Layer 指数归一化层

Input 输入

Output 输出

 

 

 

 

One may regard the output as a probability value (recall the alternative margin)

可以把输出看成一个概率值(回忆一下别样的间距)
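The slide formulas were lost in extraction, so the sketch below uses the standard softmax form, exp(z_i) divided by the sum of all exp(z_j); whether the slides place a sign inside the exponent cannot be confirmed from the text. The scores are hypothetical outputs of the preceding weighted sums, and subtracting the maximum score is a common numerical-stability step that does not change the result.

```python
import math

def softmax(scores):
    """Exponentiate each score, then normalize so the outputs sum to 1."""
    m = max(scores)                            # shift for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.1]                       # hypothetical class scores
probs = softmax(scores)
print(probs)
print(sum(probs))
```

The outputs are positive and sum to 1, so each can be read as the probability of the corresponding class, and the largest score receives the largest probability.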

 

Softmax Layer 指数归一化层

Input 输入

Output 输出

 

 

 

 

Typically the last layer: each neuron corresponds to a class label

通常是最后一层:每个神经细胞对应于一个类别

 

 

CNN for USPS 用于USPS的卷积神经网络

3x3 convolution

卷积

16x16

14x14

CNN for USPS 用于USPS的卷积神经网络

16x16

32 convolution operations

32个卷积操作

14x14

14x14x32

3x3x32 convolution

卷积

12x12

CNN for USPS 用于USPS的卷积神经网络

16x16

64 convolution operations

64个卷积操作

14x14

14x14x32

12x12

12x12x64

2x2 max pooling

最大值积聚

6x6x64=2304
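The sizes on these slides follow from simple arithmetic, sketched below. The helper names are hypothetical. The parameter counts are an added illustration that assumes a single-channel 16x16 input, one bias per operator or output neuron, and the 128-neuron fully-connected layer and 10-class softmax layer that appear on the next slides.

```python
def conv_out(size, kernel=3):
    """Side length after sliding a kernel x kernel operator without padding."""
    return size - kernel + 1

def pool_out(size, window=2):
    """Side length after non-overlapping window x window pooling."""
    return size // window

s1 = conv_out(16)            # 16x16 -> 14x14, with 32 operators: 14x14x32
s2 = conv_out(s1)            # 14x14 -> 12x12, with 64 operators: 12x12x64
s3 = pool_out(s2)            # 12x12 -> 6x6
flat = s3 * s3 * 64          # 6x6x64 values fed to the fully-connected layer
print(s1, s2, s3, flat)      # 14 12 6 2304

# Learnable parameters per layer (weights + biases), under the assumptions above.
params = {
    "conv1": 32 * (3 * 3 * 1 + 1),
    "conv2": 64 * (3 * 3 * 32 + 1),
    "fully_connected": flat * 128 + 128,
    "softmax": 128 * 10 + 10,
}
print(params, sum(params.values()))
```

Under these assumptions most of the parameters sit in the fully-connected layer, while the pooling layer contributes none.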


CNN for USPS 用于USPS的卷积神经网络

16x16 -> Convolution 卷积 -> 14x14x32 -> Convolution 卷积 -> 12x12x64 -> Pooling 积聚 -> 2304 -> Fully-connected 全连通 -> 128 -> Softmax 指数归一化 -> 0~9

Application-specific: takes the characteristics of handwritten ZIP code images into account

特定应用:考虑了手写邮政编码图像特性

CNN for USPS 用于USPS的卷积神经网络

16x16 -> Convolution 卷积 -> 14x14x32 -> Convolution 卷积 -> 12x12x64 -> Pooling 积聚 -> 2304 -> Fully-connected 全连通 -> 128 -> Softmax 指数归一化 -> 0~9

General-purpose 通用

CNN Training 训练卷积神经网络

  • Similar to Logistic Regression 和算术回归类似
    • Minimize the loss (MLE); recall the alternative margin 损失最小化(极大似然估计);回忆一下别样的间距
      • Gradient descent 梯度下降
  • But It Is More Complicated 但它更复杂
    • Needs many additional tricks 需要很多额外技巧

Gradient calculation 梯度计算

16x16 -> Convolution 卷积 -> 14x14x32 -> Convolution 卷积 -> 12x12x64 -> Pooling 积聚 -> 2304 -> Fully-connected 全连通 -> 128 -> Softmax 指数归一化 -> 0~9

Forward calculation of loss

前向计算损失

Gradient calculation 梯度计算

16x16 -> Convolution 卷积 -> 14x14x32 -> Convolution 卷积 -> 12x12x64 -> Pooling 积聚 -> 2304 -> Fully-connected 全连通 -> 128 -> Softmax 指数归一化 -> 0~9

Back-propagation of gradient

梯度反向传播

Should You Still Remember 如果你还记得

Machine Learning – Basic Principles & Practice: 8. An Alternative Margin

机器学习 – 基础原理与实践:8. 别样的间距

Gradient descent in logistic regression

算术回归中的梯度下降

It works because the loss is convex: its only local minimum is the global minimum

它能成功,因为那里只有一个局部最小值
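A minimal sketch of gradient descent for logistic regression in one dimension, assuming labels in {0, 1} and the mean negative log-likelihood as the loss. The data set, learning rate and step count are hypothetical. Two different starting points are used to illustrate that both runs end up at the same minimum.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, b, data):
    """Mean negative log-likelihood of the labels under the logistic model."""
    total = 0.0
    for x, y in data:
        p = sigmoid(w * x + b)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(data)

def gradient(w, b, data):
    """Gradient of the loss with respect to w and b."""
    gw = sum((sigmoid(w * x + b) - y) * x for x, y in data) / len(data)
    gb = sum(sigmoid(w * x + b) - y for x, y in data) / len(data)
    return gw, gb

def gradient_descent(w, b, data, lr=0.5, steps=5000):
    for _ in range(steps):
        gw, gb = gradient(w, b, data)
        w -= lr * gw
        b -= lr * gb
    return w, b

# Hypothetical, not linearly separable data, so the minimum is finite.
data = [(-2.0, 0), (-1.0, 0), (-0.5, 1), (0.5, 0), (1.0, 1), (2.0, 1)]

w1, b1 = gradient_descent(0.0, 0.0, data)
w2, b2 = gradient_descent(5.0, -3.0, data)
print(w1, b1, loss(w1, b1, data))
print(w2, b2, loss(w2, b2, data))
```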

Trap 陷阱

In CNN training, there are many local minima

在卷积神经网络的训练中,存在很多局部最优

Plain gradient descent is likely to get trapped in a local minimum that is much worse than the global minimum

平凡的梯度下降很容易陷入一个比全局最小差很多的局部最小

Global minimum

全局最优

Trapped in a local minimum

陷入局部最优

Stochastic Gradient Descent 随机梯度下降

Randomly shuffle the training data

随机打乱训练数据的顺序

Many heuristics 很多试探式的方法

Each time, take a small batch of data samples for a gradient descent step

每次取一小批数据进行梯度下降

Also incorporates other tricks, e.g., momentum and learning rate decay

还加入了其他的技巧,例如惯性和学习速度的衰减

We will not go through all the details; the aim is only to get a general idea

我们不关注所有的细节了,只作一个简单的了解
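The ingredients listed above (shuffling, small batches, momentum, learning rate decay) can be sketched on a small logistic regression problem rather than a full CNN. This is one common way to combine them, not necessarily the variant used in the course; the data and all hyperparameters are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, b, data):
    """Mean negative log-likelihood of the labels under the logistic model."""
    return -sum(y * math.log(sigmoid(w * x + b)) +
                (1 - y) * math.log(1.0 - sigmoid(w * x + b))
                for x, y in data) / len(data)

def sgd(data, epochs=60, batch_size=2, lr=0.1, momentum=0.9, decay=0.95, seed=0):
    rng = random.Random(seed)
    data = list(data)
    w = b = 0.0
    vw = vb = 0.0                                # velocity terms for momentum
    for _ in range(epochs):
        rng.shuffle(data)                        # randomly shuffle the training data
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]   # a small batch of samples
            gw = sum((sigmoid(w * x + b) - y) * x for x, y in batch) / len(batch)
            gb = sum(sigmoid(w * x + b) - y for x, y in batch) / len(batch)
            vw = momentum * vw - lr * gw         # momentum keeps part of the last step
            vb = momentum * vb - lr * gb
            w += vw
            b += vb
        lr *= decay                              # learning rate decay after each epoch
    return w, b

data = [(-2.0, 0), (-1.0, 0), (-0.5, 1), (0.5, 0), (1.0, 1), (2.0, 1)]
w, b = sgd(data)
print(w, b, loss(w, b, data))
```

Each update uses only a noisy estimate of the gradient from the current batch; momentum smooths successive steps and the decaying learning rate shrinks them as training proceeds.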

Practice 11.1 实践11.1

How does it perform? 它的表现如何?

Practice time: try the convolutional neural network on handwritten ZIP code recognition 实践时刻:尝试用卷积神经网络进行手写邮政编码识别

Dropout 信息遗失

Part of a complex network

复杂网络中的一部分

Fully-connected

全连通

Softmax

指数归一化

Dropout 信息遗失

In training, in the middle layer, randomly disable 50% of the neurons

在训练时,在中间层随机禁用50%的神经细胞

Fully-connected

全连通

Softmax

指数归一化

Dropout 信息遗失

Forward calculation of loss

前向计算损失

Fully-connected

全连通

Softmax

指数归一化

Dropout 信息遗失

Back-propagation of gradient

梯度反向传播

Fully-connected

全连通

Softmax

指数归一化

Dropout 信息遗失

Fully-connected

全连通

Softmax

指数归一化

In a new batch, again randomly disable 50% of the neurons

在新的一批,再度随机禁用50%的神经细胞

Dropout 信息遗失

Fully-connected

全连通

Softmax

指数归一化

Forward calculation of loss

前向计算损失

Dropout 信息遗失

Fully-connected

全连通

Softmax

指数归一化

Back-propagation of gradient

梯度反向传播

Dropout 信息遗失

Restore all the neurons after finishing training

训练结束后恢复所有的神经细胞

Fully-connected

全连通

Softmax

指数归一化

Dropout 信息遗失

However, for each neuron in the last layer, the number of active input connections changes from 2 to 4

然而,对最后一层的神经细胞而言,连接数从2变到4

Fully-connected

全连通

Softmax

指数归一化

Dropout 信息遗失

Therefore we need to scale the weights down by half 所以我们需要把权重减半

Fully-connected

全连通

Softmax

指数归一化
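A minimal sketch of why the weights are halved, assuming a middle layer of 4 neurons of which exactly 2 are active in each training batch, as in the slides. The activations, weights and bias are hypothetical. The sketch enumerates every possible choice of 2 active neurons and compares the average training-time output with the test-time output that uses all 4 neurons and halved weights.

```python
import random
from itertools import combinations

def output_train(hidden, weights, bias, active):
    """Training: only the active (not dropped) middle-layer neurons contribute."""
    return sum(weights[i] * hidden[i] for i in active) + bias

def output_test(hidden, weights, bias, keep_fraction=0.5):
    """After training: all neurons are restored, weights scaled by the keep fraction."""
    return sum(keep_fraction * w * h for w, h in zip(weights, hidden)) + bias

hidden = [1.0, 2.0, 3.0, 4.0]      # hypothetical middle-layer activations
weights = [0.5, -1.0, 0.25, 1.0]   # hypothetical weights into one last-layer neuron
bias = 0.1

# In a training batch, a random half of the neurons is disabled.
rng = random.Random(0)
active = rng.sample(range(4), 2)
print(active, output_train(hidden, weights, bias, active))

# Averaged over all 6 ways of keeping 2 of 4 neurons, training matches the scaled test output.
masks = list(combinations(range(4), 2))
average = sum(output_train(hidden, weights, bias, m) for m in masks) / len(masks)
print(average, output_test(hidden, weights, bias))
```

Many libraries implement the equivalent "inverted" variant, which scales the surviving activations up during training so that nothing needs rescaling afterwards.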

Why Use Dropout 为什么要用信息遗失

  • Complex Networks 复杂网络
    • Strong classification capability 分类能力很强
      • Likely to overfit 容易过度拟合
  • Dropout 信息遗失
    • Implicitly creates different instances of the learning algorithm 隐式地构造出不同的学习实例
    • Ensemble learning 合奏学习
      • Averages out bad luck 消除坏运气

Practice 11.2 实践11.2

How does it perform? 它的表现如何?

Practice time: try the convolutional neural network w/ dropout on handwritten ZIP code recognition 实践时刻:尝试用信息遗失卷积神经网络进行手写邮政编码识别

Results 结果

Methods 方法 | Error rate 错误率
Nearest neighbor 最近邻 | 4.98%
Linear perceptron 线性感知器 | 8.52%
Perceptron – RBF 感知器 – 径向基函数核计算 | 4.43%
Perceptron – Polynomial 感知器 – 多项式核计算 | 4.24%
CNN 卷积神经网络 | 3.64%
CNN w/ dropout 带信息遗失的卷积神经网络 | 3.29%


The End