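Summary of the characteristics of several metric losses:
* margin-loss: a sample's distance to its own class centre must be smaller than its distance to any other class centre
* intra-loss: constrains the distance between a sample and its own class centre to stay below a threshold (this is what the standard center loss does)
* inter-loss: constrains the distance between a sample and the other class centres to stay above a threshold
* triplet-loss: a sample's distance to same-class samples must be smaller than its distance to samples of other classes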
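
The four constraints above can be written as hinge-style penalties. This is only a sketch: the hinge form, the margin `m`, the threshold `r` and the use of Euclidean distance are assumptions made for illustration (the paper works with geodesic, i.e. angular, distances on the hypersphere).

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def margin_loss(x, own_centre, other_centre, m):
    # Zero only when d(x, own centre) + m <= d(x, other centre).
    return max(0.0, dist(x, own_centre) + m - dist(x, other_centre))

def intra_loss(x, own_centre, r):
    # Penalise a sample that lies farther than r from its own centre.
    return max(0.0, dist(x, own_centre) - r)

def inter_loss(x, other_centre, r):
    # Penalise a sample that lies closer than r to another class's centre.
    return max(0.0, r - dist(x, other_centre))

def triplet_loss(anchor, positive, negative, m):
    # Same shape as margin_loss, but compares samples instead of centres.
    return max(0.0, dist(anchor, positive) + m - dist(anchor, negative))
```

Note that margin-loss and triplet-loss share the same form; the only difference is whether the reference points are class centres or other samples.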

ArcFace: Additive Angular Margin Loss for Deep Face Recognition
Motivation
- scenario: face recognition
- common ingredients:
  - hypersphere: the projection space
  - metric learning: distance (angular/Euclidean) & class centres
- we propose
  - an additive angular margin loss: ArcFace
  - has a clear geometric interpretation
  - SOTA on face & video datasets
 
 
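Key points
- face recognition
  - given a face image
  - pose normalisation
  - a Deep Convolutional Neural Network (DCNN)
  - maps it into a feature with small intra-class and large inter-class distance

- two main lines
  - train a classifier: softmax
    - the number of parameters in the final classification layer grows linearly with the number of classes
    - not discriminative enough for the open-set setting
  - train an embedding: triplet loss
    - the number of triplets explodes, so large datasets need a very large number of iterations
    - sample mining matters a lot, for both accuracy and convergence speed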
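
To make the two scaling problems concrete, here is a small arithmetic sketch. It assumes a balanced dataset (the same number of images for every identity) and counts ordered (anchor, positive, negative) triplets; the function names and the example sizes are hypothetical.

```python
def classifier_weight_count(num_classes, embed_dim):
    """Weights in the final FC layer: one embed_dim-sized vector per class."""
    return num_classes * embed_dim

def triplet_count(num_classes, images_per_class):
    """Number of ordered (anchor, positive, negative) triplets."""
    n = num_classes * images_per_class
    positives = images_per_class - 1   # same identity, excluding the anchor
    negatives = n - images_per_class   # every image of every other identity
    return n * positives * negatives

# Hypothetical sizes: 1,000 identities, 10 images each, 512-d embedding.
print(classifier_weight_count(1000, 512))  # 512000
print(triplet_count(1000, 10))             # 899100000
```

The classifier grows linearly with the number of identities, while the triplet count grows roughly quadratically with the number of images, which is why mining a useful subset matters.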
 
 
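- to enhance softmax loss
  - center loss: on top of classification, compresses the intra-class distances of the feature vectors
  - multiplicative angular margin penalty: when the number of classes is very large the centres become hard to update; the weights of the last FC layer can replace the centres, but training becomes unstable
  - CosFace: applies the cosine margin penalty directly to the logit; better & easier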
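
The margin variants differ only in where the margin enters the target-class logit. A minimal sketch, assuming the combined form cos(m1·θ + m2) − m3: `m1` is the multiplicative angular margin (as in SphereFace), `m2` the additive angular margin (ArcFace) and `m3` the additive cosine margin (CosFace). The parameter names and the example values are illustrative assumptions, and the sketch does not handle the case m1·θ + m2 > π, where the cosine stops being monotonic.

```python
import math

def target_logit(theta, m1=1.0, m2=0.0, m3=0.0):
    """Target-class logit (before scaling) for an angle theta in radians."""
    return math.cos(m1 * theta + m2) - m3

theta = math.pi / 6  # angle between the feature and its class weight

plain = target_logit(theta)                   # ordinary normalised softmax
multiplicative = target_logit(theta, m1=2.0)  # cos(m1 * theta)
additive_angle = target_logit(theta, m2=0.5)  # cos(theta + m2)
additive_cos = target_logit(theta, m3=0.35)   # cos(theta) - m3

# Every margin lowers the target logit, so the network has to shrink theta
# further to reach the same confidence.
print(plain, multiplicative, additive_angle, additive_cos)
```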
 
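- ArcFace
  - improves the discriminative power
  - stabilises the training at the same time
  - margin-loss: Distance(intra-class) + m < Distance(inter-class)
  - core idea: the dot product of the normalised feature and the normalised weight is their cosine similarity, so arccos recovers the angle between the feature vector and the target weight; add a margin to that angle, take the cosine again to get the predicted logit, then apply softmax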
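
The core idea can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the authors' implementation: the function names are made up, the defaults `s=64.0` and `m=0.5` are assumed typical settings, and the case θ + m > π is not handled.

```python
import math

def l2_normalise(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]

def arcface_logits(feature, weights, label, s=64.0, m=0.5):
    """Scaled cosine logits with an angular margin on the target class."""
    x = l2_normalise(feature)
    logits = []
    for j, w in enumerate(weights):
        # Dot product of unit vectors = cosine of the angle between them.
        cos_theta = sum(a * b for a, b in zip(x, l2_normalise(w)))
        cos_theta = max(-1.0, min(1.0, cos_theta))  # guard acos against rounding
        if j == label:
            theta = math.acos(cos_theta)     # recover the angle
            cos_theta = math.cos(theta + m)  # add the margin, go back to cosine
        logits.append(s * cos_theta)
    return logits

def softmax_cross_entropy(logits, label):
    """Numerically stable -log softmax(logits)[label]."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[label]

# A feature exactly halfway between two class weights.
weights = [[1.0, 0.0], [0.0, 1.0]]
with_margin = softmax_cross_entropy(arcface_logits([1.0, 1.0], weights, 0, s=1.0), 0)
no_margin = softmax_cross_entropy(arcface_logits([1.0, 1.0], weights, 0, s=1.0, m=0.0), 0)
print(with_margin > no_margin)  # True: the margin makes the same sample "harder"
```

Without the margin the halfway sample gets a loss of log 2; with the margin the loss is larger, so the gradient keeps pulling the feature towards its own class weight.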
 
 
Method
ArcFace
traditional softmax

- does not explicitly enforce intra-class similarity & inter-class diversity
- often shows a performance gap when intra-class variations are large or on large-scale test sets
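
For reference, the softmax loss discussed here is usually written as follows. The symbols are assumptions chosen to match the notation used in these notes: $N$ is the batch size, $n$ the number of classes, $x_i \in R^d$ the feature of the $i$-th sample with label $y_i$, and $W_j$, $b_j$ the weight and bias of class $j$.

$$
L_1 = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{W_{y_i}^T x_i + b_{y_i}}}{\sum_{j=1}^{n} e^{W_j^T x_i + b_j}}
$$

Nothing in this expression asks same-class features to be close to each other or different-class features to be far apart; it only asks the target logit to be the largest.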
 
our modification
fix the bias $b_j=0$ for simplicity
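transform the logit $W_j^T x=\|W_j\|\,\|x\|\cos\theta_j$, where $\theta_j$ is the angle between the weight $W_j \in R^d$ and the sample feature $x \in R^d$
fix $\|W_j\|$ by l2 normalisation: $\|W_j\|=1$
fix the embedding norm $\|x\|$ by l2 normalisation and rescale it: $\|x\|=s$
the prediction thus depends only on the angle: the feature embeddings are distributed on a hypersphere of radius $s$, and training minimises the angle to the axis of the ground-truth class (the weight vector of the corresponding channel, which can also be seen as the class centre)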
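
A quick numerical check of the steps above, as a sketch with hypothetical helper names: once the weight is normalised to length 1 and the feature to length $s$, the logit equals $s\cos\theta_j$ and no longer depends on the original magnitudes.

```python
import math

def l2_normalise(v, scale=1.0):
    """Rescale v so that its L2 norm equals `scale`."""
    norm = math.sqrt(sum(a * a for a in v))
    return [scale * a / norm for a in v]

def angle_between(u, v):
    """Angle in radians between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(a * a for a in v))
    return math.acos(max(-1.0, min(1.0, dot / (nu * nv))))

def normalised_logit(w, x, s):
    w_hat = l2_normalise(w)           # ||W_j|| = 1
    x_hat = l2_normalise(x, scale=s)  # ||x|| = s
    return sum(a * b for a, b in zip(w_hat, x_hat))

w, x, s = [3.0, 4.0], [10.0, 0.0], 64.0
logit = normalised_logit(w, x, s)
# Same value as s * cos(theta), and unchanged if x or w is rescaled.
print(logit, s * math.cos(angle_between(w, x)))
print(normalised_logit([30.0, 40.0], [1000.0, 0.0], s))
```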

add an additive angular margin penalty: simultaneously enhances the intra-class compactness and inter-class discrepancy
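
Putting the modifications together (zero bias, $\|W_j\|=1$, $\|x\|=s$, margin $m$ added to the target angle only), the loss takes the following form. The batch size $N$, the number of classes $n$ and the label $y_i$ are notation assumed here for completeness.

$$
L = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{s\cos(\theta_{y_i}+m)}}{e^{s\cos(\theta_{y_i}+m)}+\sum_{j=1,\,j\neq y_i}^{n}e^{s\cos\theta_j}}
$$

Setting $m=0$ recovers the plain normalised softmax, which makes it easy to see that the margin is the only change.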

Effect
- softmax produces noticeable ambiguity in decision boundaries
- ArcFace loss can enforce a more evident gap between the nearest classes

pipeline

Implementation