Over the past few years, many methods [8, 9, 10] have been proposed to segment objects with weak boundaries. Among these, active contour (AC) methods are powerful because they can adapt their geometry and incorporate prior knowledge about the structure of interest. For instance, Level Set (LS) [11], an implementation of AC based on energy functional minimization [12], has been shown to overcome the limitations of purely gradient-based models, especially on data suffering from noise and low contrast. Recently, [1] integrated LS [11] into a DL framework for medical image segmentation, with the energy terms computed from constant mean values inside and outside the contour over the entire image domain. In contrast, our proposed network uses LS as an attention gate on a narrow band around the contour, and the mean values inside and outside the contour are computed from the deep feature map. In addition, the imbalanced-data problem in medical image segmentation has received considerable attention [13]. One solution is a boundary loss, a distance metric on the space of contours [13], which is computed between individual pixels on the contours and is therefore time consuming. Hence, our proposed NB-AC focuses on the contour length and the narrow band energy, treating the object contour as a hyperplane and all data inside the narrow band as support information for the position and orientation of the hyperplane.
To address the above problems, we build on the advantages of LS [11] and propose a two-branch deep network that explicitly takes into account both higher level features (object region, via a classical CNN with an encoder-decoder structure) and lower level features (object shape, via the narrow band around the contour). The proposed loss for our NB-AC attention model contains two fitting terms: (i) the length of the contour; (ii) the narrow band energy, formed by a homogeneity criterion in both the inner and outer bands neighboring the evolving curve or surface (Fig. 1). The higher level feature is connected to the lower level feature through our proposed transitional gates, and both are designed in an end-to-end architecture and processed simultaneously. Thus, our loss function pays attention to both region information and support information in the narrow band around the boundary. We consider the object contour as a hyperplane, with the information in the inner and outer bands acting as supporters that influence the position and direction of the hyperplane. The key points of our architecture are: (1) The proposed NB-AC attention model extracts edges and applies the narrow band principle under AC energy minimization, which has been shown to be efficient in LS evolution and in weak boundary object segmentation [11, 14]. (2) The NB-AC attention model focuses on a subset of supportive pixels within the narrow band, acting as an under-sampling approach that removes samples from the majority class to compensate for the imbalanced class distribution. (3) We propose a new type of transitional gate that allows the higher level feature to interact with the lower level one in an end-to-end framework. To the best of our knowledge, this is one of the first works to take both the imbalanced-class data and weak boundary problems into account by integrating the length of the boundary and minimizing the energy of the inner and outer bands around the curve or surface.
We evaluate both 2D and 3D networks on several challenging datasets: DRIVE [15] (retinal vessel), iSeg [16] (infant brain), MRBrainS [17] (adult brain), and Brats [18] (brain tumor segmentation).
The first branch of the network is a standard segmentation CNN, which can use any encoder-decoder network. Both Unet and FCN are chosen as network backbones in our experiments. For a region segmentation of $K$ classes, the first branch outputs the categorical distribution and the loss is: $\mathcal{L}_1 = -\sum_{c=1}^{K} y^c_o \log p^c_o$, where $y^c_o$ is a binary indicator (0 or 1) of whether class label $c$ is the correct classification for observation $o$, and $p^c_o$ is the predicted probability that observation $o$ is of class $c$. A transition gate then transfers information from the first branch to the second. Let $F_{\mathcal{H}}$ denote the output feature representation of the first branch. The outputs of the NB-AC attention model in the second branch are denoted $F^C_{\mathcal{L}}$ and $F^N_{\mathcal{L}}$ (the contour and narrow band feature maps). The contour feature map $F^C_{\mathcal{L}}$ is obtained by applying the edge extraction operator $\chi$ to the higher level feature map $F_{\mathcal{H}}$, and the narrow band feature map $F^N_{\mathcal{L}}$ is obtained by applying the parallel curves operator $\zeta$ to $F^C_{\mathcal{L}}$. In our experiments, $\chi$ and $\zeta$ are chosen as the gradient operator and the dilation operator, respectively.
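The two ingredients of this paragraph, the cross-entropy loss of the first branch and the gradient/dilation operators of the transition to the second branch, can be illustrated with a minimal NumPy sketch. The function names (`cross_entropy`, `edge_extract`, `dilate`), the 3x3 structuring element, the thresholding of the gradient magnitude, and the toy binary map standing in for the higher level feature map are all assumptions made for illustration; the text above only states that the operators are a gradient and a dilation.

```python
import numpy as np

def cross_entropy(y_onehot, p, eps=1e-12):
    """L_1 = -sum_c y^c_o log p^c_o, averaged over observations o."""
    return float(-np.mean(np.sum(y_onehot * np.log(p + eps), axis=-1)))

def edge_extract(fmap):
    """Assumed form of chi: nonzero gradient magnitude marks the contour."""
    gy, gx = np.gradient(fmap.astype(float))
    return np.hypot(gx, gy) > 0

def dilate(binary, iterations=1):
    """Assumed form of zeta: 3x3 binary dilation, repeated `iterations` times."""
    out = binary.astype(bool)
    h, w = out.shape
    for _ in range(iterations):
        padded = np.pad(out, 1, mode="constant")  # pad with False
        acc = np.zeros_like(out)
        # OR together the nine shifted copies of the padded map.
        for dy in range(3):
            for dx in range(3):
                acc |= padded[dy:dy + h, dx:dx + w]
        out = acc
    return out

# Toy stand-in for the higher level map: a 7x7 image with a 3x3 foreground square.
f_h = np.zeros((7, 7))
f_h[2:5, 2:5] = 1.0
f_c = edge_extract(f_h)           # contour map (role of F_L^C)
f_n = dilate(f_c, iterations=1)   # narrow band map (role of F_L^N)
```

In this sketch the number of dilation iterations plays the role of the band width: a larger value widens the set of pixels that the second branch attends to.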
For the second branch, instead of dealing with the entire domain $\Omega$ defined by the evolving curve, we only consider the narrow band: the inner band $B_{in}$ and the outer band $B_{out}$ on the two sides of the curve $C$ (note: $C$ is represented by the zero level set $\Phi = 0$, where $\Phi$ is the signed distance map from every pixel of the image to the given contour). Our NB-AC loss for the second branch is: $\mathcal{L}_2 = \mu \int_w |\mathrm{Length}(\Phi)| \, dx \, dy + \lambda_1 \int_{B_{in}} |p - b_{in}|^2 \, dx \, dy + \lambda_2 \int_{B_{out}} |p - b_{out}|^2 \, dx \, dy$. Hence, the whole objective function for this Unet-like network with the NB-AC attention model is as follows: $\mathcal{L}_{NB-AC} = \lambda_1 \mathcal{L}_1 + \lambda_2 \mathcal{L}_2$, where $\lambda_1$ and $\lambda_2$ are parameters controlling the weights between the losses.
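A discrete version of the narrow band loss can be sketched as follows. This is a rough illustration under several assumptions that the text does not fix: the bands are built by morphological dilation of a binary region rather than from a signed distance map, $b_{in}$ and $b_{out}$ are taken to be the means of the prediction $p$ inside each band, and the length term is approximated by the total variation of $p$. The names `narrow_bands`, `nb_ac_loss`, `lam_in`, `lam_out`, and `width` are hypothetical.

```python
import numpy as np

def _dilate(binary, width):
    """3x3 binary dilation repeated `width` times."""
    out = binary.astype(bool)
    h, w = out.shape
    for _ in range(width):
        padded = np.pad(out, 1, mode="constant")
        acc = np.zeros_like(out)
        for dy in range(3):
            for dx in range(3):
                acc |= padded[dy:dy + h, dx:dx + w]
        out = acc
    return out

def narrow_bands(region, width=1):
    """Inner band: region pixels near the background. Outer band: the reverse."""
    region = region.astype(bool)
    inner = region & _dilate(~region, width)
    outer = _dilate(region, width) & ~region
    return inner, outer

def nb_ac_loss(p, region, mu=1.0, lam_in=1.0, lam_out=1.0, width=1):
    """mu * length + lam_in * inner band energy + lam_out * outer band energy."""
    inner, outer = narrow_bands(region, width)
    # Assumed definition: band means of the prediction.
    b_in = p[inner].mean() if inner.any() else 0.0
    b_out = p[outer].mean() if outer.any() else 0.0
    # Assumed length approximation: smoothed total variation of p.
    gy, gx = np.gradient(p)
    length = np.sum(np.sqrt(gx ** 2 + gy ** 2 + 1e-8))
    e_in = np.sum((p[inner] - b_in) ** 2)
    e_out = np.sum((p[outer] - b_out) ** 2)
    return float(mu * length + lam_in * e_in + lam_out * e_out)

# Toy prediction: a crisp 3x3 square on a 9x9 background.
p = np.zeros((9, 9))
p[3:6, 3:6] = 1.0
region = p > 0.5
inner, outer = narrow_bands(region, width=1)
loss = nb_ac_loss(p, region)
```

Only pixels in the two bands contribute to the fitting terms, which is what makes the loss insensitive to the large homogeneous background that dominates imbalanced data. For the crisp square above, both band energies vanish and only the length term remains.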
The proposed NB-AC loss function outperforms the other common losses under both the Unet and FCN frameworks. Taking the DSC metric and the CE loss as the point of comparison, for instance, our loss gains 3.97%, 1.36%, 1.39%, and 1.0% on DRIVE, MRBrainS18, Brats18, and iSeg17, respectively, with the Unet framework, and 0.53%, 1.65%, 0.98%, and 0.82% on the same datasets, respectively, with the FCN framework.