RFB | Less is More

RFB: Receptive Field Block Net for Accurate and Fast Object Detection

Motivation
- RF block: Receptive Fields
- strengthen the lightweight features using a hand-crafted mechanism: lightweight, yet with strong feature representation
- assemble RFB to the top of SSD
Arguments
- lightweight
  - enhance feature representation
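- human visual system
  - the size of a population receptive field (pRF) is a function of eccentricity in the retinotopic map
  - pRF size increases with eccentricity
  - regions closer to the center carry more weight in recognizing objects
  - the brain is insensitive to small spatial shifts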
- fixed sampling grid (conv)
  - probably induces some loss in the feature discriminability as well as robustness
- Inception
  - RFs of multiple sizes
  - but at the same center
- ASPP
  - with different atrous rates
  - the resulting feature tends to be less distinctive
- Deformable CNN
  - sampling grid is flexible
  - but all pixels in an RF contribute equally
- RFB
  - varying kernel sizes
  - applies dilated convolution layers to control their eccentricities
  - the two are combined to simulate the human visual system
  - concat
  - 1x1 conv for fusion
- main contributions
  - RFB module: enhance deep features of lightweight CNN networks
  - RFB Net: gain on SSD
  - assemble on MobileNet
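
The pairing of kernel size and dilation rate described for RFB above can be checked with a little arithmetic. The sketch below assumes stride-1 convolutions and the standard relations: a k×k conv with dilation d spans k + (k-1)(d-1) input pixels, and each stacked stride-1 layer grows the receptive field by (span - 1). The three branch configurations are illustrative assumptions, not something these notes specify.

```python
def effective_kernel(kernel: int, dilation: int = 1) -> int:
    """Span (in input pixels) of a kernel x kernel conv with the given dilation."""
    return kernel + (kernel - 1) * (dilation - 1)


def branch_receptive_field(layers):
    """Receptive field of a stack of stride-1 convs.

    layers: list of (kernel, dilation) tuples, applied in order.
    """
    rf = 1
    for kernel, dilation in layers:
        rf += effective_kernel(kernel, dilation) - 1
    return rf


# Hypothetical RFB-style branches: a plain conv followed by a dilated 3x3 conv.
branches = {
    "1x1 -> 3x3, rate 1": [(1, 1), (3, 1)],
    "3x3 -> 3x3, rate 3": [(3, 1), (3, 3)],
    "5x5 -> 3x3, rate 5": [(5, 1), (3, 5)],
}

receptive_fields = {name: branch_receptive_field(cfg) for name, cfg in branches.items()}

for name, rf in receptive_fields.items():
    print(f"{name}: receptive field {rf}x{rf}")
```

Under these assumptions the branches cover 3x3, 9x9 and 15x15 regions around the same center. The larger kernel is paired with the larger rate, so its samples sit farther from the center, which is one way to read the eccentricity idea above.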
Method
- Receptive Field Block
  - multi-branch structure, similar to Inception
  - dilated pooling or convolution layer
- RFB Net
  - SSD-base
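  - conv layers in the head with relatively high-resolution feature maps are replaced by the RFB module
  - the last few conv layers in the head are kept, because their feature maps are too small to apply filters with large kernels like 5 × 5
  - stride-2 module: every conv has stride 2, so does the identity path have to become a 1x1 conv?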
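
On the open question in the last bullet, a quick shape check may help. The sketch below assumes the standard conv output-size formula and a hypothetical 38x38 input. It only shows that a plain identity shortcut would no longer match a stride-2 branch in spatial size, whereas a 1x1 conv with stride 2 would. Whether RFB Net actually resolves it this way is not established by these notes.

```python
def conv_out_size(size: int, kernel: int, stride: int = 1,
                  padding: int = 0, dilation: int = 1) -> int:
    """Spatial output size of a conv layer (standard floor formula)."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


size_in = 38  # hypothetical feature-map size, chosen for illustration

# Branch: 3x3 conv with stride 2 and padding 1.
branch_out = conv_out_size(size_in, kernel=3, stride=2, padding=1)

# A plain identity shortcut keeps the input size, so it cannot be summed with the branch.
identity_out = size_in

# A 1x1 conv shortcut with stride 2 downsamples to the same size as the branch.
shortcut_out = conv_out_size(size_in, kernel=1, stride=2)

print(branch_out, identity_out, shortcut_out)
```

The same helper also illustrates the previous bullet: on a very small map, an unpadded 5x5 kernel leaves almost nothing, e.g. `conv_out_size(5, kernel=5)` gives 1.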