Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs
Why Large Kernels?
ViT performs well largely because of its global view: self-attention gives every token a global receptive field
What if a CNN could also get a global view?
Why did we use small kernels before?
Stacking multiple small kernels achieves the same receptive field as one large kernel, with fewer parameters and FLOPs (e.g., two stacked 3x3 convs cover a 5x5 receptive field), as the sketch below shows
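A quick check of this claim in PyTorch (the channel count is an illustrative assumption): two stacked 3x3 convolutions cover the same 5x5 receptive field as one 5x5 convolution, with fewer parameters.

```python
# Parameter-count comparison: stacked 3x3 convs vs. one 5x5 conv
import torch.nn as nn

c = 64  # illustrative channel count
stacked = nn.Sequential(                            # receptive field: 5x5
    nn.Conv2d(c, c, kernel_size=3, padding=1),
    nn.Conv2d(c, c, kernel_size=3, padding=1),
)
single = nn.Conv2d(c, c, kernel_size=5, padding=2)  # receptive field: 5x5

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(stacked), count(single))  # 73856 vs 102464: stacking is cheaper
```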
Key Points
Apply large kernels in CNNs to reach performance comparable to Swin Transformer, via three design points (sketched in code after the list):
1. Large depth-wise convolution
2. Identity shortcut
3. Re-parameterization with small kernels
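A simplified PyTorch sketch of these three design points (class name and sizes are illustrative assumptions; the actual RepLKNet block also carries BN and 1x1 convs that this sketch omits):

```python
import torch
import torch.nn as nn

class LargeKernelBlock(nn.Module):
    def __init__(self, channels, big=31, small=5):
        super().__init__()
        # 1. Large depth-wise convolution (groups == channels)
        self.big = nn.Conv2d(channels, channels, big,
                             padding=big // 2, groups=channels)
        # 3. Parallel small-kernel branch; at inference it can be merged
        #    into the big kernel by zero-padding it to 31x31
        self.small = nn.Conv2d(channels, channels, small,
                               padding=small // 2, groups=channels)

    def forward(self, x):
        # 2. Identity shortcut around both depth-wise branches
        return x + self.big(x) + self.small(x)

x = torch.randn(1, 64, 56, 56)
print(LargeKernelBlock(64)(x).shape)  # torch.Size([1, 64, 56, 56])
```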
Key Points
Illustration
1. Large effective receptive field (ERF)
2. Stronger bias toward shape instead of texture
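A rough sketch of one common way to visualize an ERF (an assumption for illustration, not necessarily the paper's exact measurement script): backpropagate from the center output activation and inspect the input-gradient magnitude; a wider spread means a larger ERF.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=31, padding=15),
    nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=31, padding=15),
)
x = torch.randn(1, 3, 128, 128, requires_grad=True)
y = model(x)
y[0, :, 64, 64].sum().backward()       # gradient of the center output pixel
erf_map = x.grad.abs().sum(dim=1)[0]   # (128, 128) contribution map
print(erf_map.nonzero().shape)         # support widens as kernels grow
```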
Model
Re-parameterization
RepVGG: Making VGG-style ConvNets Great Again
Why reparam?
The multi-branch structure is easier to train
The merged single-branch structure saves memory and runs faster at inference (see the BN-fusion and branch-merging sketches below)
Re-parameterization
BN parameters: running mean μ, running variance σ², scale γ, shift β
Conv parameters: kernel W, bias b
Input/output: for input x, Conv followed by BN computes y = γ·(W∗x + b − μ)/√(σ² + ε) + β
Refined conv and bias: W′ = (γ/√(σ² + ε))·W,  b′ = β + (b − μ)·γ/√(σ² + ε)
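A minimal PyTorch sketch of this BN-folding algebra (the helper name fuse_conv_bn is illustrative, not the paper's code):

```python
import torch
import torch.nn as nn

def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    # scale = gamma / sqrt(running_var + eps)
    scale = bn.weight / (bn.running_var + bn.eps).sqrt()
    fused = nn.Conv2d(conv.in_channels, conv.out_channels, conv.kernel_size,
                      conv.stride, conv.padding, groups=conv.groups, bias=True)
    fused.weight.data = (conv.weight * scale.reshape(-1, 1, 1, 1)).detach()
    b = conv.bias.detach() if conv.bias is not None else torch.zeros_like(scale)
    fused.bias.data = (bn.bias + (b - bn.running_mean) * scale).detach()
    return fused

# Sanity check: the fused conv matches conv+BN in eval mode
conv, bn = nn.Conv2d(8, 8, 3, padding=1, bias=False), nn.BatchNorm2d(8)
bn.eval()  # use running statistics, as at inference time
x = torch.randn(1, 8, 16, 16)
print(torch.allclose(fuse_conv_bn(conv, bn)(x), bn(conv(x)), atol=1e-5))
```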
Re-parameterization
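The second conversion step in RepVGG merges the parallel 3x3, 1x1, and identity branches into a single 3x3 kernel. A minimal sketch under simplifying assumptions (groups=1, matching input/output channels, BN already fused; function names are illustrative):

```python
import torch
import torch.nn.functional as F

def merge_branches(w3, b3, w1, b1, channels):
    # Express the 1x1 branch as a 3x3 kernel (value at the center)
    w1_as_3 = F.pad(w1, [1, 1, 1, 1])
    # Express the identity branch as a 3x3 kernel: 1 at the center,
    # channel i mapped to channel i
    wid = torch.zeros(channels, channels, 3, 3)
    for i in range(channels):
        wid[i, i, 1, 1] = 1.0
    # Convolution is linear, so kernels and biases simply add
    return w3 + w1_as_3 + wid, b3 + b1

# Sanity check against running the three branches separately
c = 8
w3, b3 = torch.randn(c, c, 3, 3), torch.randn(c)
w1, b1 = torch.randn(c, c, 1, 1), torch.randn(c)
x = torch.randn(1, c, 16, 16)
ref = F.conv2d(x, w3, b3, padding=1) + F.conv2d(x, w1, b1) + x
w, b = merge_branches(w3, b3, w1, b1, c)
print(torch.allclose(F.conv2d(x, w, b, padding=1), ref, atol=1e-5))
```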
Ablation
Results – Classification
Results – Object Detection
Results – Segmentation
Thank You