VerSe

  1. challenge

    Large Scale Vertebrae Segmentation Challenge

    • task 1: Vertebra Labelling (keypoint detection)
    • task 2: Vertebra Segmentation (multi-class segmentation)
  2. data

    1. variation: the data are inconsistent in affine axes, volume size, scan range, and FOV region
    2. The two main NIfTI parsing tools differ in axis order: nibabel loads data with the xyz order matching the axcodes, e.g. an ['R','A','S'] orientation yields an (x, y, z) array, whereas SimpleITK is exactly reversed and returns a (z, y, x) array. This is why, when writing DICOM to NIfTI, we used to specify an affine other than np.eye(4): to transpose these three axes (see the sketch below).
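
    A minimal sketch contrasting the two loaders (the file name scan.nii.gz is a hypothetical example):

    ```python
    import nibabel as nib
    import numpy as np
    import SimpleITK as sitk

    path = "scan.nii.gz"  # hypothetical example file

    # nibabel: the array axes follow the affine's axcodes,
    # so an ('R', 'A', 'S') image is indexed (x, y, z)
    img = nib.load(path)
    print(nib.aff2axcodes(img.affine))    # e.g. ('R', 'A', 'S')
    arr_xyz = np.asanyarray(img.dataobj)  # shape (X, Y, Z)

    # SimpleITK: GetArrayFromImage reverses the index order to (z, y, x)
    arr_zyx = sitk.GetArrayFromImage(sitk.ReadImage(path))  # shape (Z, Y, X)

    # the two arrays agree up to a transpose of the three axes
    # (voxel values may further differ in dtype/scaling between the readers)
    assert arr_xyz.shape == arr_zyx.shape[::-1]
    ```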
  3. model

    1. team paper

      • Three stages: stage 1, due to the large FOV variation of the dataset, a coarse segmentation localizes the spine; stage 2, at higher resolution, multi-class keypoint localization regresses the center of each vertebra, yielding each located vertebra; stage 3, binary segmentation for each located vertebra.

      • keywords: 1. uniform voxel spacing: do not resize arbitrarily (todo: trilinear interp; see the resampling sketch after this stage list); 2. on-the-fly data augmentation using SimpleITK

      • Stage 1: Spine Localization

        • Unet
        • regress the Gaussian heatmap of the spinal centerline
        • L2-loss
        • uniform voxel spacing of 8mm
        • input shape: [64, 64, 128]; pad?

      • Stage 2: Vertebrae Localization

        • SpatialConfiguration-Net
        • regress each located vertebra's heatmap in an individual channel
        • resampling: bi/tricubic interpolation
        • normalization: min-max over the whole dataset
        • uniform voxel spacing of 2mm
        • input shape: [96, 96, 128]; random crop along the z-axis; the xy-plane uses the ROI from stage 1

      • Stage 3: Vertebrae Segmentation

        • Unet
        • binary segmentation of each vertebra's mask
        • sigmoid cross-entropy loss
        • uniform voxel spacing of 1mm
        • input shape: [128, 128, 96]; crop the original image & the heatmap image based on the centroids
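
      A minimal sketch of the uniform-spacing resampling used at each stage (8 mm / 2 mm / 1 mm isotropic), with SimpleITK and linear (trilinear) interpolation; resample_to_spacing is a hypothetical helper, not the authors' code:

      ```python
      import SimpleITK as sitk

      def resample_to_spacing(image: sitk.Image, spacing_mm: float) -> sitk.Image:
          """Resample to isotropic spacing_mm with trilinear interpolation."""
          old_size = image.GetSize()
          old_spacing = image.GetSpacing()
          # keep the physical extent: new_size * new_spacing == old_size * old_spacing
          new_size = [int(round(sz * sp / spacing_mm))
                      for sz, sp in zip(old_size, old_spacing)]
          return sitk.Resample(image, new_size, sitk.Transform(), sitk.sitkLinear,
                               image.GetOrigin(), (spacing_mm,) * 3,
                               image.GetDirection(), 0.0, image.GetPixelID())

      img = sitk.ReadImage("scan.nii.gz")        # hypothetical input
      stage1_in = resample_to_spacing(img, 8.0)  # spine localization
      stage2_in = resample_to_spacing(img, 2.0)  # vertebrae localization
      stage3_in = resample_to_spacing(img, 1.0)  # vertebrae segmentation
      ```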

    2. reference paper

      • Core contributions: 1. MIP (maximum intensity projection): combines the information across reformations, reducing 3D to 2D; 2. a discriminator-based training scheme: encodes local spine structure as an anatomical prior, reinforcing the spatial information of inter-vertebra classes & positions

      • MIP:
        • localisation and identification rely on a large context
        • large receptive field
        • in full-body scans where the spine is not spatially centred or is obstructed by the ribcage, such cases are handled with a pre-processing stage that detects the occluded spine
      • adversarial learning:

        • the FCN performs the segmentation
        • the AE evaluates how good the segmentation is
        • do not ‘pre-train’ it (the AE)
        • loss:an anatomically-inspired supervision instead of the usual binary adversarial supervision (vanilla GAN)
      • First, the FCN: the Btrfly Network

        • modeled as a regression problem: each keypoint corresponds to a Gaussian heatmap in its own channel, and the background channel is $1-\max_i(y_i)$ (see the sketch after this block)

        • two inputs and two outputs (sagittal & coronal)

        • the feature maps of the two views are fused in the deeper layers of the network, to learn their inter-dependency

        • Batch normalisation is used after every convolution layer, along with 20% dropout in the fused layers of Btrfly

        • loss: l2 distance + weighted cross-entropy

          where $\omega$ is the median-frequency weighting map, boosting the learning of less frequent classes (ECB)
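
        A minimal sketch of building that regression target for one 2D view (the keypoint coordinates, image shape, and $\sigma$ are hypothetical inputs):

        ```python
        import numpy as np

        def heatmap_target(keypoints, shape, sigma=2.0):
            """keypoints: (N, 2) array of (row, col); returns an (N + 1, H, W) target."""
            h, w = shape
            rr, cc = np.mgrid[0:h, 0:w]
            chans = [np.exp(-((rr - r) ** 2 + (cc - c) ** 2) / (2 * sigma ** 2))
                     for r, c in keypoints]
            y = np.stack(chans)                      # one Gaussian per keypoint channel
            bg = 1.0 - y.max(axis=0, keepdims=True)  # background = 1 - max_i(y_i)
            return np.concatenate([y, bg], axis=0)
        ```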

      • Next, the discriminator: an energy-based adversary for encoding the prior

        • a fully-convolutional discriminator's predictions across voxels are independent of each other, owing to the spatial invariance of convolutions (hence the AE instead)

        • to impose the anatomical prior of the spine’s shape onto the Btrfly net

        • look at $\hat{Y}_{sag}$ and $\hat{Y}_{cor}$ as a 3D volume and employ a 3D AE with a receptive field covering a part of the spine

        • since $\hat{Y}_{sag}$ consists of Gaussians and is less informative than an image, max-pooling is avoided in favour of average pooling

        • employ spatially dilated convolution kernels

        • the AE's mission: predict the l2 distance between its input and the input's reconstruction; it learns to discriminate by predicting a low energy E for real annotations, while G learns to generate annotations that would trick D (see the loss sketch below)
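
        A hinge-style sketch of that energy game (the margin m is an assumption; the paper's anatomically-inspired supervision replaces vanilla binary adversarial supervision, and its exact form may differ):

        ```python
        import torch.nn.functional as F

        def energy(ae, y):
            # E(y) = l2 distance between the heatmap volume and its AE reconstruction
            return F.mse_loss(ae(y), y)

        def d_loss(ae, y_real, y_fake, m=1.0):
            # AE (discriminator): low energy on real annotations, high on generated ones
            return energy(ae, y_real) + F.relu(m - energy(ae, y_fake.detach()))

        def g_loss(ae, y_fake):
            # Btrfly (generator): produce annotations the AE reconstructs well
            return energy(ae, y_fake)
        ```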

      • inference:

        • The values below a threshold (T) are ignored in order to remove noisy predictions
        • take the outer product: $\hat{Y}=\hat{Y}_{sag}\otimes\hat{Y}_{cor}$
        • the maximum of each channel gives that vertebra's centroid (see the decoding sketch below)
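
        A minimal sketch of that decoding step (names, axis conventions, and the threshold value are hypothetical); y_sag and y_cor are per-channel 2D heatmaps sharing the vertical axis:

        ```python
        import numpy as np

        def decode_centroids(y_sag, y_cor, thresh=0.1):
            """y_sag: (C, A, S) heatmaps, y_cor: (C, R, S); returns one (r, a, s) per channel."""
            y_sag = np.where(y_sag > thresh, y_sag, 0.0)  # drop noisy low responses
            y_cor = np.where(y_cor > thresh, y_cor, 0.0)
            # per-channel outer product over the two in-plane axes
            vol = np.einsum('cas,crs->cras', y_sag, y_cor)
            cents = []
            for ch in vol:  # one channel per vertebra
                # all responses below threshold: treat the vertebra as absent
                cents.append(None if ch.max() == 0
                             else np.unravel_index(ch.argmax(), ch.shape))
            return cents
        ```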
      • experiments

        • 【IMPORTANT】10 MIPs are obtained from one 3D scan per view, each time randomly choosing half the slices of interest for the projection (see the sketch below)
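
        A minimal sketch of that MIP augmentation (axis conventions and names are assumptions; the slices of interest are taken to be pre-selected):

        ```python
        import numpy as np

        def random_mips(vol, axis, n_mips=10, rng=None):
            """vol: 3D array of the slices of interest; axis: projection axis
            (e.g. 0 for sagittal, 1 for coronal). Each MIP uses a random half
            of the slices."""
            rng = rng or np.random.default_rng()
            n = vol.shape[axis]
            mips = []
            for _ in range(n_mips):
                keep = rng.choice(n, size=n // 2, replace=False)  # random half
                sub = np.take(vol, keep, axis=axis)
                mips.append(sub.max(axis=axis))  # maximum intensity projection
            return mips  # n_mips 2D projections for this view
        ```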
  • similar local appearance: adjacent vertebrae look alike locally, so identification cannot rely on local features alone

  • strong spatial configuration: whenever vertebra-wise information is involved, start from global information