1 of 15

Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs

2 of 15

Why large kernels?

One reason ViTs perform well is their global view (a large effective receptive field).

So what happens if a CNN is also given a global view?

3 of 15

Why we used small kernels before

Why do we use small kernels?

Stacking multiple small kernels achieves the same receptive field as one large kernel, with fewer parameters and less computation.
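The receptive-field equivalence above can be checked with a one-line recurrence: each stacked k x k convolution (stride 1, no dilation) grows the field by k - 1. A minimal sketch (the function name is illustrative):

```python
def stacked_receptive_field(kernel_sizes):
    """Receptive field of a stack of stride-1, undilated convolutions.
    Each k x k layer enlarges the field by k - 1 on each side combined."""
    rf = 1
    for k in kernel_sizes:
        rf += k - 1
    return rf

# Three stacked 3x3 convs cover the same 7x7 field as a single 7x7 kernel,
# with 3 * 9 = 27 weights per channel instead of 49.
print(stacked_receptive_field([3, 3, 3]))  # → 7
```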

4 of 15

Key Points

Apply large kernels in CNNs to achieve performance comparable to Swin Transformer:

1. Large depth-wise convolutions

2. Identity shortcuts

3. Re-parameterization with small kernels
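Point 3 rests on the linearity of convolution: a parallel small-kernel branch can be merged into the large kernel after training by zero-padding the small kernel into the large one's center. A minimal numpy sketch under that assumption (single channel, odd kernels, "same" padding; function names are illustrative):

```python
import numpy as np

def merge_small_into_large(large, small):
    """Add a small kernel into the center of a large one.
    Because convolution is linear in the kernel, the merged kernel is
    equivalent to running both branches and summing their outputs."""
    K, k = large.shape[-1], small.shape[-1]
    pad = (K - k) // 2
    merged = large.copy()
    merged[pad:K - pad, pad:K - pad] += small
    return merged

def conv2d_same(x, k):
    """Naive single-channel cross-correlation with 'same' zero padding."""
    K = k.shape[0]
    p = K // 2
    xp = np.pad(x, p)
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = (xp[i:i + K, j:j + K] * k).sum()
    return out

# Two-branch output equals one merged-kernel pass.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))
large, small = rng.standard_normal((7, 7)), rng.standard_normal((3, 3))
merged = merge_small_into_large(large, small)
two_branch = conv2d_same(x, large) + conv2d_same(x, small)
one_branch = conv2d_same(x, merged)
```

The same idea extends per channel to the depth-wise kernels used here, since each channel is convolved independently.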

5 of 15

Key Points

6 of 15

Illustration

1. Large effective receptive field (ERF)

2. Shape bias instead of texture bias

7 of 15

Model

8 of 15

Re-parameterization

RepVGG: Making VGG-style ConvNets Great Again

Why re-parameterize?

The multi-branch structure is easier to train;

the merged single-branch structure saves memory and runs faster at inference.

9 of 15

Re-parameterization

BN parameters: γ, β, running mean μ, running variance σ²

Conv parameter: weight W (bias-free)

Input/Output: y = BN(W * x)

Refined conv and bias: W' = (γ / √(σ² + ε)) · W,  b' = β − γμ / √(σ² + ε)
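The BN-folding step can be sketched directly from y = γ(Wx − μ)/√(σ² + ε) + β: scale each output channel's weights by γ/√(σ² + ε) and fold the rest into a bias. A minimal numpy sketch (the function name is illustrative; BN statistics are per output channel):

```python
import numpy as np

def fold_bn(conv_w, gamma, beta, mean, var, eps=1e-5):
    """Fold BatchNorm into the preceding bias-free conv.
    y = gamma * (W*x - mean) / sqrt(var + eps) + beta
      = (gamma / sqrt(var + eps)) * W * x  +  (beta - gamma * mean / sqrt(var + eps))
    conv_w: (out_ch, in_ch, kh, kw); gamma/beta/mean/var: (out_ch,)."""
    std = np.sqrt(var + eps)
    w = conv_w * (gamma / std)[:, None, None, None]  # scale per output channel
    b = beta - gamma * mean / std                     # absorbed bias
    return w, b
```

A 1x1 conv reduces to a matrix multiply, which gives a quick equivalence check: conv-then-BN and the folded conv produce identical outputs.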

10 of 15

Re-parameterization

11 of 15

Ablation

12 of 15

Results - Classification

13 of 15

Results - Object Detection

14 of 15

Results - Semantic Segmentation

15 of 15

Thank You