PANet

PANet: Path Aggregation Network for Instance Segmentation

  1. Motivation

    • boost the information flow
    • bottom-up path
      • shorten information path
      • enhance accurate localization
    • adaptive feature pooling
      • aggregate all levels
      • avoid arbitrarily assigning each proposal to a single level
    • mask prediction head
      • fcn + fc
      • the two branches capture different views and have complementary properties
    • subtle extra computational overhead
  2. Key arguments

    • prior techniques: FCN, FPN, residual connections, dense connections
    • findings
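      • high-level features classify well and low-level features localize well, but the path between them is too long, which makes it hard to get both at once
      • previous methods make predictions for each proposal from a single feature level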
    • PANet
      • bottom-up path
        • shorten information path
        • enhance accurate localization
      • adaptive feature pooling
        • aggregate all levels
        • avoid arbitrarily assigning each proposal to a single level
      • mask prediction head
        • fcn + fc
        • the two branches capture different views and have complementary properties
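
To make the single-level finding concrete, here is a minimal sketch of the FPN-style heuristic that sends each proposal to exactly one pyramid level, k = floor(k0 + log2(sqrt(w*h) / 224)). The function name `fpn_level` and the defaults (k0 = 4, levels clamped to 2..5) are assumptions taken from the usual FPN setup, not from these notes.

```python
import math


def fpn_level(width, height, k0=4, k_min=2, k_max=5, canonical=224):
    """Assign a proposal to a single pyramid level (FPN-style heuristic)."""
    scale = math.sqrt(width * height)
    k = math.floor(k0 + math.log2(scale / canonical))
    # Clamp to the levels that actually exist in the pyramid.
    return max(k_min, min(k_max, k))


for size in (112, 223, 224, 448):
    print(size, fpn_level(size, size))
```

A 223x223 and a 224x224 proposal end up on different levels although they are nearly identical, which is the kind of arbitrary assignment that adaptive feature pooling is meant to avoid.
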
  3. Method

    • framework

      • b: bottom-up path
      • c: adaptive feature pooling
      • e: fusion mask branch
    • bottom-up path

      • FPN's top-down path:
        • propagates strong semantic information
        • ensures reasonable classification capability
        • long path from low-level features to the top: red line, 100+ layers
      • bottom-up path:
        • enhances the localization capability
        • short path: green line, fewer than 10 layers
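      • for each level $N_l$

        • input: $N_{l-1}$ & $P_l$ ($N_2$ is simply $P_2$)
        • $N_{l-1}$ through a 3x3 conv with stride 2, $P_l$ through an identity (lateral) path, element-wise add, then another 3x3 conv
        • 256 channels throughout
        • ReLU after each conv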
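
The building block above can be sketched in plain Python on single-channel toy maps. This is only an illustration under stated assumptions: the real block uses 256-channel tensors and learned weights per level, whereas the helper names (`conv3x3_relu`, `bottom_up_path`), the shared averaging kernel, and the 16/8/4/2 map sizes are all hypothetical placeholders.

```python
def conv3x3_relu(x, kernel, stride=1):
    """3x3 conv (zero padding 1) on a 2D list of floats, followed by ReLU."""
    h, w = len(x), len(x[0])
    out_h = (h - 1) // stride + 1
    out_w = (w - 1) // stride + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            for di in range(3):
                for dj in range(3):
                    y, xx = i * stride + di - 1, j * stride + dj - 1
                    if 0 <= y < h and 0 <= xx < w:
                        acc += kernel[di][dj] * x[y][xx]
            row.append(max(acc, 0.0))
        out.append(row)
    return out


def add(a, b):
    """Element-wise sum of two equally sized 2D lists."""
    return [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]


def bottom_up_path(pyramid, k_down, k_out):
    """pyramid: [P2, P3, P4, P5], finest first. Returns [N2, N3, N4, N5]."""
    n_levels = [pyramid[0]]  # N2 is simply P2
    for p in pyramid[1:]:
        down = conv3x3_relu(n_levels[-1], k_down, stride=2)  # halve resolution
        fused = add(down, p)  # lateral identity connection from P_l
        n_levels.append(conv3x3_relu(fused, k_out))
    return n_levels


# Placeholder weights (assumption): one averaging kernel reused everywhere.
AVG = [[1.0 / 9.0] * 3 for _ in range(3)]
pyramid = [[[1.0] * s for _ in range(s)] for s in (16, 8, 4, 2)]
n_maps = bottom_up_path(pyramid, AVG, AVG)
print([(len(m), len(m[0])) for m in n_maps])
```

Each $N_l$ keeps the spatial size of the matching $P_l$, so the later pooling step can treat the new pyramid exactly like the FPN one.
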

    • adaptive feature pooling

      • pool features from all levels, then fuse, then predict

      • steps

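        • map each proposal to all feature levels
        • RoIAlign on every level
        • pass each level's pooled features through the first layer of the following sub-network independently
        • fusion operation (element-wise max or sum)
        • e.g., the box branch has two fc layers: the RoIAlign-ed proposal features from each level first pass through one fc layer separately, are fused, and then share the remaining layers up to the head; the mask branch has four conv layers: the per-level features first pass through one conv layer separately, are fused, and then share the remaining layers up to the head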
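
The box-branch example can be sketched as follows, assuming RoIAlign has already produced one flattened feature vector per level. The function names, the bias-free fc layers, the separate first-layer weights per level, and the tiny 2-dimensional features are illustrative assumptions rather than the paper's exact configuration.

```python
def fc_relu(x, weight):
    """Fully connected layer without bias, followed by ReLU."""
    return [max(sum(w * v for w, v in zip(row, x)), 0.0) for row in weight]


def fuse(per_level, mode="max"):
    """Element-wise max or sum across pyramid levels."""
    if mode == "max":
        return [max(vals) for vals in zip(*per_level)]
    if mode == "sum":
        return [sum(vals) for vals in zip(*per_level)]
    raise ValueError(f"unknown fusion mode: {mode}")


def box_branch(roi_feats, first_fc, shared_fc, mode="max"):
    """roi_feats: one flattened RoIAlign feature vector per pyramid level."""
    # First layer is applied to each level independently.
    per_level = [fc_relu(x, w) for x, w in zip(roi_feats, first_fc)]
    # Fusion happens once; everything after it is shared.
    fused = fuse(per_level, mode)
    return fc_relu(fused, shared_fc)


IDENTITY = [[1.0, 0.0], [0.0, 1.0]]  # placeholder weights (assumption)
roi_feats = [[1.0, 2.0], [3.0, 0.0], [0.0, 5.0], [2.0, 2.0]]
print(box_branch(roi_feats, [IDENTITY] * 4, IDENTITY, mode="max"))
print(box_branch(roi_feats, [IDENTITY] * 4, IDENTITY, mode="sum"))
```

With max fusion each output element comes from whichever level responds most strongly, so the network, not a size heuristic, decides which level matters for a given proposal.
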

      • fusion mask branch

        • fc layers are location sensitive
        • helpful to differentiate instances and recognize separate parts belonging to the same object
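        • conv branch
          • 4 consecutive convs + 1 deconv: 3x3 conv, 256 channels, deconv upsampling factor 2
          • predicts a mask for each class: n_classes output channels
        • fc branch
          • takes the conv3 output of the conv branch
          • 2 consecutive convs, 256 then 128 channels
          • fc with output dim 28x28 (the mask feature-map size), used for class-agnostic foreground/background classification
        • final mask: element-wise add of the two branches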
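
The final addition can be sketched as below: the fc output (784 values for a 28x28 mask) is reshaped to the mask size and added to every per-class mask from the conv branch. The helper names, the row-major reshape, and the toy values are assumptions for illustration; the conv and fc layers themselves are omitted.

```python
def reshape(vec, height, width):
    """Reshape a flat list into height x width rows (row-major, assumed)."""
    assert len(vec) == height * width
    return [vec[r * width:(r + 1) * width] for r in range(height)]


def fuse_mask(class_masks, fc_vec):
    """Add the class-agnostic fc prediction to every per-class FCN mask."""
    size = len(class_masks[0])
    fg = reshape(fc_vec, size, size)
    return [[[m + f for m, f in zip(m_row, f_row)]
             for m_row, f_row in zip(mask, fg)]
            for mask in class_masks]


SIZE, N_CLASSES = 28, 3  # 28x28 masks; 3 classes is a toy value
class_masks = [[[float(c)] * SIZE for _ in range(SIZE)] for c in range(N_CLASSES)]
fc_vec = [float(i) for i in range(SIZE * SIZE)]
fused = fuse_mask(class_masks, fc_vec)
print(len(fused), len(fused[0]), len(fused[0][0]))
```

Because the fc prediction is shared across classes, it contributes one foreground/background map per proposal, while the conv branch keeps the class-specific detail.
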

  4. Experiments

    • heavier head
      • 4 consecutive 3x3 convs
      • shared between box regression and classification
      • helps box prediction in the multi-task setting