soft teacher

keywords: semi-supervised, curriculum, pseudo labels

End-to-End Semi-Supervised Object Detection with Soft Teacher

  1. Motivation

    • end-to-end training: in contrast to the multi-stage pipelines of other methods

    • semi-supervised: uses external unlabeled data with a pseudo-label based approach

    • propose two techniques

      • soft teacher mechanism: the classification loss on pseudo-labeled samples is weighted by the teacher model's prediction score
      • box jittering mechanism: selects reliable pseudo boxes
    • verified

      • use a Swin-L based detector as the baseline
      • metric on COCO: 60.4 mAP
      • if pretrained with Objects365: 61.3 mAP

  2. Key points

    • we present this end-to-end pseudo-label based semi-supervised object detection framework
      • simultaneously performs
        • pseudo-labeling: teacher
        • training the detector on the current pseudo-labels plus a few labeled samples: student
      • teacher is an exponential moving average (EMA) of the student model
      • teacher and student mutually reinforce each other
      • soft teacher approach
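        • the teacher model's role is to score the box candidates generated by the student model
        • boxes above a score threshold count as foreground, but some real foreground may still be classified as background, so the teacher's score is used as a reliability measure to weight the cls loss of boxes labeled as background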
      • reliability measure
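The EMA relation between teacher and student can be sketched as below. This is a minimal illustration, not the paper's code: `ema_update`, the dict-of-floats weight format, and the momentum value 0.999 are all assumptions made for the example.

```python
def ema_update(teacher, student, momentum=0.999):
    """Return new teacher weights as an EMA step toward the student.

    teacher, student: dicts mapping a parameter name to a float
    (a stand-in for real tensors; hypothetical format).
    momentum: assumed value, how much of the old teacher is kept.
    """
    return {
        name: momentum * teacher[name] + (1.0 - momentum) * student[name]
        for name in teacher
    }


# The teacher drifts slowly toward the student; it is never trained by gradients.
teacher = {"w": 0.0}
student = {"w": 1.0}
for _ in range(3):
    teacher = ema_update(teacher, student, momentum=0.9)
print(teacher["w"])
```

A momentum close to 1 keeps the teacher stable while the student changes quickly, which is why the teacher's pseudo labels are less noisy than the student's own predictions.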
  3. Method

    • overview

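      • two models: student and teacher
      • the teacher model generates the pseudo labels: two sets of pseudo boxes, one for the classification branch and one for the regression branch
      • the student model is updated with the losses on supervised & unsupervised samples
      • the teacher model is updated as the EMA of the student model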
      • two crucial designs
        • soft teacher
        • box jittering
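      • overall workflow: in each training iteration, labeled & unlabeled samples are drawn at a fixed ratio to form the data batch; the teacher model generates pseudo labels for the unlabeled data (thousands of box candidates + NMS + score filter); these serve as ground truth for the unlabeled samples when training the student model; the overall loss is the weighted sum of the supervised loss and the unsupervised loss
      • at the start of training both models are randomly initialized, and the teacher model is updated as the student model is updated
      • following FixMatch:
        • samples fed to the teacher model use weak aug
        • samples fed to the student model use strong aug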
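The per-iteration flow above might look roughly like this. The sketch only covers the score filter and the loss combination; NMS, augmentation, and the detector itself are left out. The function names and the `unsup_weight` parameter are hypothetical, and the 0.9 default is the score thresh these notes quote in the soft teacher part.

```python
def filter_pseudo_boxes(boxes, scores, score_thresh=0.9):
    """Keep teacher boxes whose foreground score exceeds the threshold.

    boxes: list of (x1, y1, x2, y2) tuples, assumed to be post-NMS.
    scores: list of foreground scores, one per box.
    """
    return [box for box, score in zip(boxes, scores) if score > score_thresh]


def overall_loss(sup_loss, unsup_loss, unsup_weight):
    """Weighted sum of supervised and unsupervised loss.

    unsup_weight is a hypothetical name for the weighting factor.
    """
    return sup_loss + unsup_weight * unsup_loss


# Toy iteration: two teacher boxes, only the confident one becomes a pseudo label.
teacher_boxes = [(0, 0, 10, 10), (5, 5, 20, 20)]
teacher_scores = [0.95, 0.40]
pseudo_gt = filter_pseudo_boxes(teacher_boxes, teacher_scores)
print(pseudo_gt)
print(overall_loss(sup_loss=1.0, unsup_loss=0.5, unsup_weight=4.0))
```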
    • soft teacher

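      • the quality of the detector's pseudo-labels matters a lot

      • so a score thresh of 0.9 is used to split the box candidates into foreground/background

      • but if the student model's box candidates are then assigned pos/neg by the usual IoU rule, some real foreground boxes end up treated as background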

      • to alleviate

        • assess the reliability of each student-generated box candidate to be a real background
        • given a student-generated box candidate, use the teacher model's detection head to predict the background score of that box
      • overall unsupervised cls loss

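        • $G_{cls}$ is the set of boxes the teacher generated for classification, i.e. the teacher model's top-1000 predictions after NMS and the score filter
        • $b_i^{fg}$ are the student candidates assigned as foreground and $b_j^{bg}$ are those assigned as background; the assignment relies on the teacher boxes kept by score>0.9
        • $w_j$ is the weight on each box assigned as background
        • $r_k$ is the reliability score: for a student candidate assigned as background under the hard score thresh, it is the bg score predicted by the teacher model's detection head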
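From the definitions above, the unsupervised cls loss can be written as follows. This is a reconstruction from the notes' symbols; $N_b^{fg}$ and $N_b^{bg}$ (the numbers of foreground and background candidates) and the per-box loss $l_{cls}$ are notation assumed here.

$$L_u^{cls} = \frac{1}{N_b^{fg}} \sum_{i=1}^{N_b^{fg}} l_{cls}(b_i^{fg}, G_{cls}) + \sum_{j=1}^{N_b^{bg}} w_j \, l_{cls}(b_j^{bg}, G_{cls}), \qquad w_j = \frac{r_j}{\sum_{k=1}^{N_b^{bg}} r_k}$$

A small Python sketch of the same weighting, taking precomputed per-box losses as plain floats. `soft_teacher_cls_loss` is a hypothetical helper written for this note, not an API of any library.

```python
def soft_teacher_cls_loss(fg_losses, bg_losses, bg_scores):
    """Combine per-box cls losses with soft background weights.

    fg_losses: cls loss of each student candidate assigned as foreground.
    bg_losses: cls loss of each student candidate assigned as background.
    bg_scores: teacher-predicted background score r_j for each background box.
    """
    # Foreground term: plain mean over foreground candidates.
    fg_term = sum(fg_losses) / len(fg_losses) if fg_losses else 0.0

    # Background term: weights w_j = r_j / sum_k r_k, so boxes the teacher
    # is unsure about (low background score) contribute less.
    total_r = sum(bg_scores)
    if total_r > 0:
        weights = [r / total_r for r in bg_scores]
    else:
        weights = [0.0] * len(bg_scores)  # degenerate case, assumed handling
    bg_term = sum(w * loss for w, loss in zip(weights, bg_losses))

    return fg_term + bg_term


print(soft_teacher_cls_loss([1.0, 3.0], [2.0, 4.0], [0.75, 0.25]))
```

In the example, the second background box has a low teacher background score (0.25), so its larger loss is down-weighted; that is the case of a likely-foreground box that the hard threshold missed.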
    • box jittering

      • the fg score and the box IoU do not show a strong positive correlation, so pseudo-label boxes picked by the fg score thresh are not necessarily suitable for box regression

      • localization reliability:

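        • measures the consistency of a pseudo box
        • given a pseudo box, sample a set of jittered boxes around it, then let the teacher model predict on these jittered boxes to get refined boxes
        • the smaller the variance of the refined boxes around the pseudo box, the higher the localization reliability of that box
        • $\hat b_i$ are the refined boxes
        • $\sigma_k$ is the standard deviation of the $k$-th of the four coordinates over the refined boxes
        • $\hat \sigma_k$ is that standard deviation normalized by the scale of the original box
        • $\overline\sigma$ is the mean of the normed std over the four coordinates of the refined boxes
        • only computed for teacher box candidates with fg score>0.5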
      • overall unsupervised reg loss

        • $b_i^{fg}$ are the student candidates assigned as foreground (matched against the pseudo boxes in $G_{reg}$)
        • $G_{reg}$ is the set of boxes the teacher generated for regression, i.e. the candidates whose jittering-based reliability passes a threshold (box regression variance small enough)
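Written out from the definitions above, and taking the box scale to be the mean of its height $h(b_i)$ and width $w(b_i)$, the quantities are as follows. $N_b^{fg}$ and $l_{reg}$ are notation assumed here.

$$\hat\sigma_k = \frac{\sigma_k}{0.5\,(h(b_i) + w(b_i))}, \qquad \overline\sigma = \frac{1}{4}\sum_{k=1}^{4}\hat\sigma_k, \qquad L_u^{reg} = \frac{1}{N_b^{fg}}\sum_{i=1}^{N_b^{fg}} l_{reg}(b_i^{fg}, G_{reg})$$

A minimal sketch of the variance computation and the selection step. The function names are hypothetical, the use of the population standard deviation is an assumption, and `var_thresh` is left as a required argument because these notes do not give its value.

```python
from statistics import pstdev


def box_regression_variance(pseudo_box, refined_boxes):
    """Mean normalized std of the refined boxes' four coordinates.

    pseudo_box: (x1, y1, x2, y2) teacher pseudo box.
    refined_boxes: teacher refinements of boxes jittered around pseudo_box.
    """
    x1, y1, x2, y2 = pseudo_box
    scale = 0.5 * ((y2 - y1) + (x2 - x1))  # mean of height and width
    normed_std = [
        pstdev([box[k] for box in refined_boxes]) / scale
        for k in range(4)
    ]
    return sum(normed_std) / 4.0


def select_reg_boxes(pseudo_boxes, variances, var_thresh):
    """Keep pseudo boxes whose variance is below the threshold (reliable ones)."""
    return [b for b, v in zip(pseudo_boxes, variances) if v < var_thresh]


box = (0, 0, 10, 10)
refined = [(0, 0, 10, 10), (2, 0, 10, 10)]
print(box_regression_variance(box, refined))
```

A box whose refinements all land in the same place gets variance 0 and is kept for regression regardless of its fg score.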
    • overall unsupervised loss: the sum of cls loss and reg loss, normalized by the number of samples
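Putting the pieces together, the losses can be written as below. This is a sketch under assumed notation: $N_u$ is the number of unlabeled images in the batch, $I_u^i$ is the $i$-th unlabeled image, $L_s$ is the supervised loss, and $\alpha$ is the weight on the unsupervised loss.

$$L_u = \frac{1}{N_u} \sum_{i=1}^{N_u} \left( L_u^{cls}(I_u^i) + L_u^{reg}(I_u^i) \right), \qquad L = L_s + \alpha L_u$$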

  4. Experiments