ArcFace

一些metric loss特点的总结：

* margin-loss：样本与自身类簇心的距离要小于样本与其他类簇心的距离——标准center loss
* intra-loss：对样本和对应类簇心的距离做约束——小于一定距离
* inter-loss：对样本和其他类簇心的距离做约束——大于一定距离
* triplet-loss：样本与同类样本的距离要小于样本与其他类样本的距离

ArcFace: Additive Angular Margin Loss for Deep Face Recognition

动机
- 场景：人脸，
- 常规要素：
  - hypersphere：投影空间
  - metric learning：距离(Angles/Euclidean) & class centres
- we propose
  - an additive angular margin loss：ArcFace
  - has a clear geometric interpretation
  - SOTA on face & video datasets
论点
- face recognition
  - given face image
  - pose normalisation
  - Deep Convolutional Neural Network (DCNN)
  - into feature that has small intra-class and large inter-class distance
- two main lines
  - train a classifier：softmax
    - 最后的分类层参数量与类别数成正比
    - not discriminative enough for the open-set
  - train embedding：triplet loss
    - triplet-pair的数量激增，大数据集的iterations特别多
    - sampling mining很重要，对精度&收敛速度
- to enhance softmax loss
  - center loss：在分类的基础上，压缩feature vecs的类内距离
  - multiplicative angular margin penalty：类特别多的时候，center就不好更新了，用last fc weights能够替代center，但是会不稳定
  - CosFace：直接计算logit的cosine margin penalty，better & easier
- ArcFace
  - improve the discriminative power
  - stabilise the training meanwhile
  - margin-loss：Distance(类内)+m < Distance(类间)
  - 核心idea：normed feature和normed weights的dot product等价于在求他俩的 cosine distance，我们用arccos就能得到feature vec和target weight的夹角，给这个夹角加上一个margin，然后求回cos，作为pred logit，最后softmax
方法
- ArcFace
  - transitional softmax
    - not explicitly enforce intra-class similarity & inter-class diversity
    - 对于类内variations大/large-scale测试集的场景往往有performance gap
  - our modification
    - fix the bias $b_j=0$ for simplicity
    - transform the logit $W_j^T x=||W_j||\ ||x||cos\theta_j$，$\theta_j$是weight $W_j \in R^d$和样本feature $x \in R^d$的夹角
    - fix the $||W_j||$ by l2 norm：$||W_j||=1$
    - fix the embedding $||x||$ by l2 norm and rescale： $||x||=s$
    - thus only depend on angle：这使得feature embedding分布在一个高维球面上，最小化与gt class的轴（对应channel的weight vec，也可以看作class center）夹角
    - add an additive angular margin penalty：simultaneously enhance the intra-class compactness and inter-class discrepancy
    - 作用
      - softmax produce noticeable ambiguity in decision boundaries
      - ArcFace loss can enforce a more evident gap
  - pipeline
  - 实现