Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection
Motivation
- one-stage detectors
- dense prediction
- three fundamental elements
- class branch
- box localization branch
- an individual quality branch to estimate the quality of localization
- current problems
- the inconsistent usage of the quality estimation in train & test
- the inflexible Dirac delta distribution: modeling the box regression value as an impulse at the ground truth, which can be inaccurate for cases with unclear boundaries or occlusion
- we design new representations for these three elements
- merge quality estimation into class prediction: fold objectness/centerness into the cls prediction and use it directly as the NMS score
- continuous labels
- propose GFL (Generalized Focal Loss), which generalizes Focal Loss from its discrete form to a continuous version
- test on COCO
- ResNet-101-?-GFL: 45.0% AP
- surpasses ATSS and other one-stage detectors
Arguments
inconsistent usage of localization quality estimation and classification score
- during training, the quality and cls branches are independent
- the quality supervision is applied only to positive samples, which makes it unreliable at predicting negatives
- at test time, multiplying the quality and cls scores can inflate negatives' scores, squeezing low-scoring positives out during NMS
inflexible representation of bounding boxes
- most methods model it as a Dirac delta distribution: all probability mass sits on the ground-truth value, with zero everywhere else
- some recent work models it as a Gaussian distribution
- in fact, the real distribution can be more arbitrary and flexible: continuous, and not necessarily symmetric
thus we propose
merge the quality representation into the class branch:
- each element of the class vector represents the localization quality of that location (e.g., its IoU score)
- at inference this is used directly as the cls score
propose an arbitrary/general distribution
- for object edges with clear boundaries, the learned distribution is sharp
- for edges without clear boundaries, the distribution is flatter
Generalized Focal Loss (GFL)
- the joint class representation uses continuous IoU labels (0~1)
- the imbalance problem remains, but standard Focal Loss only supports discrete {0, 1} labels
- so FL is extended to a continuous form and specialized into Quality Focal Loss (QFL) and Distribution Focal Loss (DFL)
- QFL for the cls branch: focuses on a sparse set of hard examples
- DFL for the box branch: focuses on learning the probabilities of values around the continuous target locations
Method
Focal Loss (FL)
- standard CE part: $-\log(p_t)$
- scaling factor $(1-p_t)^{\gamma}$: down-weights the easy examples so training focuses on hard examples (sketched below)
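A minimal sketch of the binary form of FL in PyTorch; the function name and the `gamma=2.0` default are my assumptions, not from the note:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    """targets: hard {0, 1} labels, as standard FL assumes."""
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)  # probability of the true class
    return ((1 - p_t) ** gamma * ce).sum()       # (1 - p_t)^gamma down-weights easy examples
```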
Quality Focal Loss (QFL)
soft one-hot label: a positive carries a float score in (0, 1] on its ground-truth class; negatives are all 0
the float score is defined as the IoU between the predicted box and the gt box
we adopt multiple binary classification with sigmoid
modify FL
- the CE part becomes the complete form: $-y\log(\hat y)-(1-y)\log(1-\hat y)$
- in the scaling part, the subtraction is replaced by an absolute distance: $|y-\hat y|^{\beta}$
- $\beta$ controls the down-weighting rate smoothly; $\beta=2$ works best (see the sketch below)
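A sketch of QFL under these definitions, written as a hypothetical standalone function (the official implementation differs in its bookkeeping):

```python
import torch
import torch.nn.functional as F

def quality_focal_loss(logits, targets, beta=2.0):
    """targets: soft labels y in [0, 1] -- 0 for negatives,
    the predicted-box/gt IoU for positives on their gt class."""
    sigma = torch.sigmoid(logits)
    # complete binary CE: -y*log(sigma) - (1-y)*log(1-sigma), valid for continuous y
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    # |y - sigma|^beta replaces (1 - p_t)^gamma as the modulating factor
    return ((targets - sigma).abs().pow(beta) * ce).sum()
```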
Distribution Focal Loss (DFL)
use relative offsets from the location to the four sides of a bounding box as the regression targets
regression problem formulation
- continuous: $\hat y = \int_{y_0}^{y_n}P(x)\,x\,dx$
- discretized: $\hat y = \sum_{i=0}^n P(y_i)\,y_i$
- $P(x)$ can be easily implemented through a softmax layer containing $n+1$ units (sketched below)
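A sketch of the discretized expectation, assuming unit-spaced bins $y_i = 0, 1, \dots, n$; the function name is mine:

```python
import torch

def distribution_to_offset(logits):
    """logits: (..., n + 1) raw outputs for one box edge.
    Softmax gives P(y_i); the predicted offset is the expectation over bins."""
    probs = torch.softmax(logits, dim=-1)
    bins = torch.arange(probs.shape[-1], dtype=probs.dtype, device=probs.device)
    return (probs * bins).sum(dim=-1)  # y_hat = sum_i P(y_i) * y_i
```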
DFL
- forces predictions to focus on values near the label $y$: explicitly enlarge the probabilities of the two nearest bins $y_i$ and $y_{i+1}$, where $y_i \leq y \leq y_{i+1}$
- $\mathrm{DFL}(S_i, S_{i+1}) = -\big((y_{i+1}-y)\log(S_i) + (y-y_i)\log(S_{i+1})\big)$
- the gaps $(y_{i+1}-y)$ and $(y-y_i)$ balance the two log terms so that the global minimum of DFL pushes $\hat y$ arbitrarily close to the target $y$; if the target lies closer to $y_{i+1}$, the $\log(S_i)$ term is down-weighted accordingly (see the sketch below)
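A per-edge sketch of DFL with unit bin width, assuming targets lie strictly inside $[0, n)$; reduction is left to the caller so the quality weighting mentioned later can be applied:

```python
import torch
import torch.nn.functional as F

def distribution_focal_loss(logits, target):
    """logits: (N, n + 1) per-edge distribution logits;
    target: (N,) continuous regression targets in [0, n). Returns per-sample loss."""
    y_i = target.floor().long()      # left neighbor bin index i
    y_ip1 = y_i + 1                  # right neighbor bin index i + 1
    w_left = y_ip1.float() - target  # (y_{i+1} - y) weights log(S_i)
    w_right = target - y_i.float()   # (y - y_i) weights log(S_{i+1})
    log_s = F.log_softmax(logits, dim=-1)
    return -(w_left * log_s.gather(1, y_i[:, None]).squeeze(1)
             + w_right * log_s.gather(1, y_ip1[:, None]).squeeze(1))
```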
Generalized Focal Loss (GFL)
previously, cls predictions had to be combined with quality predictions to form the NMS score at test time; now the cls prediction is the NMS score directly
previously, each regression target was a single value; now each edge predicts a distribution over $n+1$ values
overall
- the first term is the cls loss, i.e., QFL, computed densely over all locations and normalized by the number of positives
- the second term is the box loss, GIoU loss + DFL, with $\lambda_0$ defaulting to 2 and $\lambda_1$ to 1/4, computed only on positive locations (those matched to a gt box) (combined in the sketch after this list)
- we also utilize the quality scores to weight $L_B$ and $L_D$ during training
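A hedged sketch of how the pieces could combine, using the functions sketched above. It simplifies to the single-class case for brevity; `pos_mask`, `dfl_per_pos`, the GIoU helper, and weighting by the target quality scores (rather than predicted ones) are my assumptions about the bookkeeping, not the paper's exact implementation:

```python
import torch

def giou_loss(pred, gt):
    """1 - GIoU for (N, 4) boxes in (x1, y1, x2, y2) form; per-sample."""
    ix1, iy1 = torch.max(pred[:, 0], gt[:, 0]), torch.max(pred[:, 1], gt[:, 1])
    ix2, iy2 = torch.min(pred[:, 2], gt[:, 2]), torch.min(pred[:, 3], gt[:, 3])
    inter = (ix2 - ix1).clamp(min=0) * (iy2 - iy1).clamp(min=0)
    union = ((pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
             + (gt[:, 2] - gt[:, 0]) * (gt[:, 3] - gt[:, 1]) - inter)
    cx1, cy1 = torch.min(pred[:, 0], gt[:, 0]), torch.min(pred[:, 1], gt[:, 1])
    cx2, cy2 = torch.max(pred[:, 2], gt[:, 2]), torch.max(pred[:, 3], gt[:, 3])
    hull = (cx2 - cx1) * (cy2 - cy1)  # smallest enclosing box area
    return 1 - (inter / union - (hull - union) / hull)

def gfl_loss(cls_logits, quality_targets, pred_boxes, gt_boxes,
             dfl_per_pos, pos_mask, lambda0=2.0, lambda1=0.25):
    """Single-class sketch: cls_logits/quality_targets are (N,), pos_mask is (N,) bool,
    dfl_per_pos is the per-positive DFL summed over the four edges
    (lambda1 = 1/4 then averages the four directions)."""
    num_pos = pos_mask.sum().clamp(min=1).float()  # normalizer: number of positives
    l_q = quality_focal_loss(cls_logits, quality_targets) / num_pos
    w = quality_targets[pos_mask]                  # quality scores weight L_B and L_D
    l_b = (w * giou_loss(pred_boxes[pos_mask], gt_boxes[pos_mask])).sum() / num_pos
    l_d = (w * dfl_per_pos).sum() / num_pos
    return l_q + lambda0 * l_b + lambda1 * l_d
```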
Bonus
- the IoU branch is consistently superior to the centerness branch
- centerness values are inherently small, which hurts recall; IoU values are larger