Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection
Motivation
- one-stage detectors
  - dense prediction with three fundamental elements
    - classification branch
    - box regression branch
    - an individual quality branch that estimates localization quality (e.g., centerness or IoU)
 
 
- current problems
  - inconsistent usage of the quality estimation between training and testing
  - inflexible Dirac delta distribution: the box regression value is modeled as an impulse at the ground-truth value, which may be inaccurate for cases with unclear boundaries or occlusion
 
- we design new representations for these three elements
  - merge quality estimation into the class prediction: objectness/centerness is folded into the cls prediction and used directly as the NMS score
  - labels therefore become continuous
  - propose GFL (Generalized Focal Loss), which generalizes Focal Loss from its discrete form to a continuous version
 
- tested on COCO
  - GFL with ResNet-101: 45.0% AP
  - beats ATSS with the same backbone
 
 
Arguments
inconsistent usage of localization quality estimation and classification score
- during training, the quality estimate and the cls score come from independent branches
  - the quality branch is supervised only on positive samples, so it is unreliable when predicting negatives
  - at test time, multiplying the quality score into the cls score can inflate the scores of negatives, so low-scoring positives may be squeezed out during NMS
 
inflexible representation of bounding boxes
- most methods model the regression target as a Dirac delta distribution: all probability mass sits on the ground-truth value, and every other value gets 0
  - some recent work models it as a Gaussian distribution instead
  - in fact the real distribution can be more arbitrary and flexible: continuous and not strictly symmetric
 
thus we propose
merge the quality representation into the class branch:
- each element of the class vector now encodes the localization quality (e.g., the IoU score) of that location for its class
  - at inference this joint score is used directly as the cls / NMS score (see the sketch below)
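A minimal PyTorch sketch of what changes at inference time; the tensor names and shapes are made up for illustration:

```python
import torch

num_locations, num_classes = 1000, 80

# old design: separate classification and quality (centerness/IoU) branches
cls_logits = torch.randn(num_locations, num_classes)
quality_logits = torch.randn(num_locations, 1)
nms_score_old = cls_logits.sigmoid() * quality_logits.sigmoid()  # only combined at test time

# GFL: the class branch is trained to predict the IoU quality itself,
# so its sigmoid output is the NMS score directly (consistent with training)
joint_logits = torch.randn(num_locations, num_classes)
nms_score_new = joint_logits.sigmoid()
```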

propose an arbitrary/general distribution for box regression
- for an object edge with a clear boundary, the learned distribution is sharp
  - for an edge without a clear boundary, the distribution is flatter

Generalized Focal Loss (GFL)
- the joint class representation uses a continuous IoU label (0~1)
  - the class imbalance problem still exists, but the standard Focal Loss only supports discrete {0, 1} labels
  - so FL is extended to a continuous form and specialized into Quality Focal Loss (QFL) and Distribution Focal Loss (DFL)
- QFL for the cls branch: focuses on a sparse set of hard examples while supporting continuous quality labels
  - DFL for the box branch: focuses on learning the probabilities of values around the continuous target locations
 
 
Method
Focal Loss (FL)

- standard CE part: $-\log(p_t)$
  - scaling factor: down-weights easy examples so training focuses on hard examples
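Writing the two parts out gives the standard form, with $(1-p_t)^{\gamma}$ as the scaling factor and $\gamma$ the focusing parameter:

$FL(p_t) = -(1-p_t)^{\gamma}\log(p_t)$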
 
Quality Focal Loss (QFL)
soft one-hot label: a positive sample carries a float score in (0, 1] on its ground-truth class; negatives are all 0
the float score is defined as the IoU between the predicted box and the gt box
adopt multiple binary classifications with sigmoid outputs
modify FL
- the CE part becomes the complete two-sided form: $-y\log(\hat y)-(1-y)\log(1-\hat y)$
  - the scaling part replaces $(1-p_t)$ with the absolute distance to the soft label: $|y-\hat y|^{\beta}$
  $\beta$ controls the down-weighting rate smoothly, and $\beta=2$ works best
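A rough single-channel sketch in PyTorch (the function name and shapes are mine; in practice QFL is applied to every class channel and normalized by the number of positives):

```python
import torch
import torch.nn.functional as F

def quality_focal_loss(pred_logits, target_score, beta=2.0):
    """pred_logits:  (N,) raw logits of the joint cls-quality prediction
    target_score: (N,) soft label in [0, 1], i.e. IoU with gt for positives, 0 for negatives"""
    pred_sigmoid = pred_logits.sigmoid()
    # complete binary cross-entropy against the soft target
    ce = F.binary_cross_entropy_with_logits(pred_logits, target_score, reduction='none')
    # modulating factor |y - sigma|^beta down-weights well-estimated samples
    modulator = (target_score - pred_sigmoid).abs().pow(beta)
    return (modulator * ce).sum()
```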

Distribution Focal Loss (DFL)
use the relative offsets from the location to the four sides of the bounding box as the regression targets
regression formulation
- continuous: $\hat y = \int_{y_0}^{y_n}P(x)\,x\,dx$
  - discretized: $\hat y = \sum_{i=0}^n P(y_i)\,y_i$
  - $P(x)$ can be easily implemented through a softmax layer with $n+1$ units (see the sketch below)
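A small sketch of this expectation step in PyTorch (the bin count and tensor layout are assumptions for illustration):

```python
import torch
import torch.nn.functional as F

def expected_offsets(reg_logits, n=16):
    """reg_logits: (N, 4, n + 1) logits over the discretized range {y_0 = 0, ..., y_n = n}
    returns:    (N, 4) expected offsets, i.e. sum_i P(y_i) * y_i"""
    bins = torch.arange(n + 1, dtype=reg_logits.dtype)  # y_0 .. y_n
    prob = F.softmax(reg_logits, dim=-1)                 # P(y_i) for each of the 4 sides
    return (prob * bins).sum(dim=-1)
```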
 
DFL
- force the network to quickly focus its probability mass on values near the label $y$: explicitly enlarge the probabilities of the two nearest bins $y_i$ and $y_{i+1}$, where $y_i \leq y \leq y_{i+1}$
  - $DFL(S_i, S_{i+1}) = -\left((y_{i+1}-y)\log(S_i) + (y-y_i)\log(S_{i+1})\right)$
  - the two weights balance the lower and upper bins so that at the global minimum $\hat y$ gets arbitrarily close to the true $y$; e.g., when $y$ is closer to $y_{i+1}$, the $\log(S_i)$ term is down-scaled by the small factor $(y_{i+1}-y)$
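A rough PyTorch sketch of DFL for one side of a box (names are mine; targets are assumed to lie strictly inside the discretized range so the upper bin index stays valid):

```python
import torch
import torch.nn.functional as F

def distribution_focal_loss(reg_logits, target):
    """reg_logits: (N, n + 1) logits over bins {0, 1, ..., n}
    target:     (N,) continuous regression target y, with 0 <= y < n"""
    left = target.floor().long()            # bin index i, with y_i <= y
    right = left + 1                        # bin index i+1
    weight_left = right.float() - target    # (y_{i+1} - y)
    weight_right = target - left.float()    # (y - y_i)
    log_prob = F.log_softmax(reg_logits, dim=-1)
    loss = -(weight_left * log_prob.gather(1, left.unsqueeze(1)).squeeze(1)
             + weight_right * log_prob.gather(1, right.unsqueeze(1)).squeeze(1))
    return loss.mean()
```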

Generalized Focal Loss (GFL)

previously, cls predictions had to be combined with quality predictions at test time to form the NMS score; now the cls prediction is the NMS score by itself
previously each regression target was a single value; now each side is represented by a distribution over n+1 values
overall loss

$L = \frac{1}{N_{pos}}\sum_{z} L_Q + \frac{1}{N_{pos}}\sum_{z} \mathbf{1}_{\{c^*_z > 0\}}\left(\lambda_0 L_B + \lambda_1 L_D\right)$

- first term: cls loss, i.e., QFL, computed densely over all locations and normalized by the number of positives
  - second term: box loss, GIoU loss + DFL with weights $\lambda_0$ (default 2) and $\lambda_1$ (default 1/4), computed only on positive locations ($c^*_z > 0$)
  - the predicted quality (IoU) scores are also used to weight $L_B$ and $L_D$ during training
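A toy composition of the pieces above, assuming the individual losses have already been summed over their respective locations (weights follow the defaults quoted above):

```python
def gfl_total_loss(loss_q, loss_b, loss_d, num_pos, lambda_0=2.0, lambda_1=0.25):
    """loss_q: summed QFL over all locations
    loss_b: summed GIoU loss over positive locations
    loss_d: summed DFL over positive locations (averaged over the 4 sides)
    num_pos: number of positive locations, used as the normalizer"""
    norm = max(float(num_pos), 1.0)
    return (loss_q + lambda_0 * loss_b + lambda_1 * loss_d) / norm
```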
 
Bonus
an IoU branch is consistently superior to a centerness branch
centerness values are inherently small, which hurts recall; IoU values are larger
