Author Contributions
Conceptualization, Y.M. and L.C.; methodology, Y.M.; software, Y.M.; validation, Y.M., L.C., L.Z. and Q.Z.; formal analysis, Y.M.; investigation, Y.M.; resources, L.C. and Q.Z.; data curation, Y.M.; writing—original draft preparation, Y.M. and L.Z.; writing—review and editing, Y.M., L.C., L.Z. and Q.Z.; visualization, Y.M. and L.Z.; supervision, L.C. and Q.Z.; project administration, L.C. and Q.Z.; funding acquisition, L.C. and Q.Z. All authors have read and agreed to the published version of the manuscript.
Figure 1.
Different labels for the nuclei segmentation task. (a) Input image. (b) Pixel-level instance label. (c) Point annotation. (d) Binary mask. (e) Voronoi label. (f) Cluster label. (g) Distance map. In (b), different instances are marked with specific colors. In (c,d), white and black pixels, respectively, represent foreground and background regions, while in (e,f), green, red, and black pixels indicate foreground, background, and ignored areas, respectively. In (g), each pixel value represents the distance from that pixel to the nearest centroid, depicted as a grayscale image.
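For readers who want a concrete picture of how such point-derived labels can be built, the following minimal sketch (assuming NumPy and SciPy; the function name `point_derived_labels` and the ridge definition are illustrative, not the paper's implementation) computes the distance map of Figure 1g together with a nearest-center partition from which Voronoi ridges can be read off.

```python
import numpy as np
from scipy import ndimage

def point_derived_labels(points, shape):
    """Derive a distance map and a nearest-centre partition from point
    annotations; Voronoi ridges are the boundaries between partition cells."""
    points = np.asarray(points)
    rows, cols = points[:, 0], points[:, 1]

    mask = np.ones(shape, dtype=bool)
    mask[rows, cols] = False                      # zeros only at annotated centres

    # Distance from every pixel to its nearest annotated centre (cf. Fig. 1g),
    # plus the coordinates of that nearest centre.
    dist, inds = ndimage.distance_transform_edt(mask, return_indices=True)
    ir, ic = inds

    # Nearest-centre partition: pixels sharing a nearest centre form one cell.
    centre_id = np.zeros(shape, dtype=np.int32)
    centre_id[rows, cols] = np.arange(1, len(rows) + 1)
    partition = centre_id[ir, ic]

    # Voronoi ridges: pixels whose right/bottom neighbour lies in another cell.
    ridge = np.zeros(shape, dtype=bool)
    ridge[:-1, :] |= partition[:-1, :] != partition[1:, :]
    ridge[:, :-1] |= partition[:, :-1] != partition[:, 1:]
    return dist, partition, ridge
```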
Figure 2.
A bad case of segmenting adjacent nuclei. (a) Input image. (b) Ground truth. (c) Predicted instance map. As indicated within the yellow box, multiple closely adjacent nuclei are predicted as a single nucleus.
Figure 3.
Overview of our proposed method. The Voronoi label, cluster label, and Gaussian masks in the figure are all generated from point annotations. The model outputs two maps: (1) the center-point prediction from the Gaussian branch and (2) the segmentation prediction from the segmentation branch. During post-processing, the center-point map refines the segmentation map for more precise instance segmentation.
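As a rough illustration of how Gaussian masks can be rendered from point annotations at several scales, the sketch below places a 2D Gaussian around each annotated center and keeps the strongest response per pixel; the function name `gaussian_masks` and the sigma values are placeholders and are not taken from the paper.

```python
import numpy as np

def gaussian_masks(points, shape, sigmas=(2.0, 4.0, 8.0)):
    """Render one Gaussian center-point mask per sigma from point annotations.

    points : (N, 2) array of (row, col) centres
    shape  : (H, W) mask size
    sigmas : assumed example scales; the paper's values may differ
    """
    rr, cc = np.mgrid[:shape[0], :shape[1]]
    masks = []
    for sigma in sigmas:
        m = np.zeros(shape, dtype=np.float32)
        for r, c in points:
            g = np.exp(-((rr - r) ** 2 + (cc - c) ** 2) / (2.0 * sigma ** 2))
            m = np.maximum(m, g)          # keep the strongest response per pixel
        masks.append(m)
    return masks
```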
Figure 4.
Issues arising during inference. The images in (a,b), respectively, compare the segmentation before and after adding a centroid to an instance that lacks a corresponding center point. The images in (c,d), respectively, compare the nuclei segmentation before and after merging instances.
Figure 5.
Pseudo-label update strategy. Black arrows denote label updates during a cycle of training, while green arrows represent updates after completing a training cycle.
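The figure does not spell out the update rule itself; one plausible form of such an update, shown purely as an assumption-laden sketch rather than the paper's actual procedure, is to promote pixels the current model predicts confidently into the next round's foreground and background pseudo-labels while leaving the rest in the ignored band.

```python
import numpy as np

def update_cluster_label(prob, fg_thresh=0.8, bg_thresh=0.2):
    """Illustrative pseudo-label refresh after a training round.

    prob : (H, W) predicted nuclei probability map in [0, 1]
    Thresholds are assumed example values, not taken from the paper.
    Returns 1 = nuclei, 0 = background, -1 = ignored.
    """
    label = np.full(prob.shape, -1, dtype=np.int8)
    label[prob >= fg_thresh] = 1      # confident foreground becomes nuclei
    label[prob <= bg_thresh] = 0      # confident background stays background
    return label
```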
Figure 6.
The structure of the point-guided attention module. (a) The interaction of the point-guided blocks within the model. (b) The processing of feature maps by the point-guided block in the segmentation branch and the Gaussian branches. (c) The main part of point-guided attention. Gray, blue, and yellow blocks indicate feature maps from the encoder, segmentation decoder, and Gaussian decoder, respectively. The abbreviations “Norm”, “Attn”, “PG”, and “FFN” refer to layer normalization, attention, point-guided block, and feed-forward network, respectively.
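The Norm/Attn/FFN layout named in the caption can be sketched as a pre-norm cross-attention block in which segmentation-branch features attend to Gaussian-branch features. This PyTorch snippet is only an illustrative reading of Figure 6, not the paper's exact module; the class name and head/FFN sizes are assumptions.

```python
import torch
import torch.nn as nn

class PointGuidedBlock(nn.Module):
    """Sketch of a pre-norm attention block following the caption's
    Norm -> Attn -> Norm -> FFN layout. Segmentation-branch features act
    as queries and Gaussian-branch features as keys/values; the real
    point-guided interaction may differ from this assumption."""

    def __init__(self, dim, num_heads=4, ffn_mult=4):
        super().__init__()
        self.norm_q = nn.LayerNorm(dim)
        self.norm_kv = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_ffn = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(
            nn.Linear(dim, ffn_mult * dim), nn.GELU(),
            nn.Linear(ffn_mult * dim, dim))

    def forward(self, seg_feat, gauss_feat):
        # seg_feat, gauss_feat: (B, C, H, W) decoder feature maps
        b, c, h, w = seg_feat.shape
        q = seg_feat.flatten(2).transpose(1, 2)      # (B, HW, C)
        kv = gauss_feat.flatten(2).transpose(1, 2)   # (B, HW, C)
        attn_out, _ = self.attn(self.norm_q(q), self.norm_kv(kv), self.norm_kv(kv))
        x = q + attn_out                             # residual around attention
        x = x + self.ffn(self.norm_ffn(x))           # residual around FFN
        return x.transpose(1, 2).reshape(b, c, h, w)
```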
Figure 7.
Segmentation results of different methods. The regions marked with yellow borders highlight the advantages of our approach. The test images are cropped from the Monuseg (first and second rows), CPM17 (third and fourth rows), and CoNSeP (fifth and sixth rows) datasets. Different nuclei are represented in different colors.
Figure 8.
The results of the ablation experiments. The letters below the figures correspond to the models listed in Table 3.
Figure 9.
The results of using different and values across three datasets.
Figure 10.
Quantitative results with different numbers of Gaussian branches on the Monuseg dataset. (a) Model complexity. (b) Training time. (c) Inference time.
Figure 11.
Visualization results of the pseudo-label updating strategy. Rows 2, 3, and 4 depict the ground truth, initial clustering labels, and updated clustering labels obtained after one round of training, where green, red, and black represent nuclei regions, non-nuclei regions, and uncertain regions, respectively.
Figure 12.
Failure cases of our method on elongated and densely packed cells, together with a successful case. (a) Missed detection. (b) Failed segmentation of densely packed elongated cells. (c) A successful case.
Table 1.
Comparative experiments on the Monuseg, CPM17, and CoNSeP datasets.
| Methods | Monuseg | | | | | CPM17 | | | | | CoNSeP | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | AJI | DQ | SQ | PQ | | AJI | DQ | SQ | PQ | | AJI | DQ | SQ | PQ |
| Fully Supervised | | | | | | | | | | | | | | | |
| Mask-RCNN [53] | 0.807 | 0.623 | 0.806 | 0.761 | 0.614 | 0.837 | 0.686 | 0.847 | 0.801 | 0.680 | 0.732 | 0.505 | 0.626 | 0.711 | 0.446 |
| Micro-Net [18] | 0.793 | 0.603 | 0.760 | 0.757 | 0.576 | 0.845 | 0.674 | 0.838 | 0.782 | 0.657 | 0.756 | 0.531 | 0.609 | 0.747 | 0.455 |
| HoverNet [25] | 0.817 | 0.618 | 0.770 | 0.773 | 0.597 | 0.848 | 0.705 | 0.854 | 0.814 | 0.697 | 0.807 | 0.571 | 0.702 | 0.778 | 0.547 |
| Point-Supervised | | | | | | | | | | | | | | | |
| Weak-Anno [40] | 0.729 | 0.546 | 0.714 | 0.717 | 0.513 | 0.771 | 0.607 | 0.750 | 0.716 | 0.542 | 0.605 | 0.353 | 0.458 | 0.679 | 0.312 |
| PseudoEdgeNet [42] | 0.711 | 0.506 | 0.637 | 0.712 | 0.454 | 0.728 | 0.553 | 0.733 | 0.639 | 0.469 | 0.557 | 0.270 | 0.391 | 0.684 | 0.269 |
| C2FNet [35] | 0.717 | 0.539 | 0.701 | 0.715 | 0.501 | 0.735 | 0.567 | 0.657 | 0.683 | 0.448 | 0.569 | 0.259 | 0.429 | 0.683 | 0.293 |
| WSPP [45] | 0.733 | 0.546 | 0.724 | 0.697 | 0.506 | 0.746 | 0.561 | 0.742 | 0.689 | 0.517 | 0.609 | 0.358 | 0.460 | 0.683 | 0.315 |
| SPN+IEN [43] | 0.748 | 0.578 | 0.746 | 0.718 | 0.537 | 0.775 | 0.612 | 0.775 | 0.726 | 0.565 | 0.635 | 0.405 | 0.484 | 0.669 | 0.326 |
| SC-Net [44] | 0.738 | 0.562 | 0.730 | 0.712 | 0.521 | 0.759 | 0.598 | 0.746 | 0.706 | 0.530 | 0.609 | 0.372 | 0.436 | 0.695 | 0.305 |
| Ours | 0.772 | 0.612 | 0.797 | 0.714 | 0.569 | 0.787 | 0.636 | 0.825 | 0.721 | 0.597 | 0.651 | 0.433 | 0.493 | 0.667 | 0.331 |
Table 2.
The segmentation results on the MonuSeg test set across different organs.
| Organ | | AJI | DQ | SQ | PQ |
|---|---|---|---|---|---|
| Bladder | 0.787 | 0.635 | 0.819 | 0.721 | 0.590 |
| Brain | 0.781 | 0.620 | 0.786 | 0.707 | 0.556 |
| Breast | 0.754 | 0.591 | 0.762 | 0.696 | 0.530 |
| Colon | 0.731 | 0.544 | 0.719 | 0.698 | 0.502 |
| Kidney | 0.777 | 0.623 | 0.820 | 0.723 | 0.593 |
| Lung | 0.776 | 0.624 | 0.834 | 0.718 | 0.599 |
| Prostate | 0.774 | 0.610 | 0.785 | 0.726 | 0.570 |
Table 3.
Ablation study on the proposed methods, where “MG”, “PA”, and “LU” denote the multi-scale Gaussian kernel module, the point-guided attention module, and the pseudo-label updating module, respectively. Results are presented as mean ± standard deviation over 5 runs.
| | MG | PA | LU | Monuseg | | | | | CPM17 | | | | | CoNSeP | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | | | AJI | DQ | SQ | PQ | | AJI | DQ | SQ | PQ | | AJI | DQ | SQ | PQ |
| A | ✕ | ✕ | ✕ | 0.726 ± 0.012 | 0.534 ± 0.022 | 0.695 ± 0.021 | 0.714 ± 0.003 | 0.496 ± 0.018 | 0.764 ± 0.007 | 0.597 ± 0.010 | 0.742 ± 0.008 | 0.712 ± 0.004 | 0.531 ± 0.011 | 0.582 ± 0.023 | 0.337 ± 0.016 | 0.440 ± 0.018 | 0.672 ± 0.007 | 0.296 ± 0.016 |
| B | 🗸 | ✕ | ✕ | 0.742 ± 0.005 | 0.571 ± 0.003 | 0.729 ± 0.003 | 0.717 ± 0.002 | 0.523 ± 0.004 | 0.771 ± 0.003 | 0.612 ± 0.005 | 0.780 ± 0.003 | 0.716 ± 0.002 | 0.558 ± 0.003 | 0.629 ± 0.010 | 0.415 ± 0.005 | 0.467 ± 0.006 | 0.673 ± 0.003 | 0.316 ± 0.006 |
| C | ✕ | ✕ | 🗸 | 0.739 ± 0.007 | 0.559 ± 0.007 | 0.733 ± 0.006 | 0.719 ± 0.004 | 0.528 ± 0.007 | 0.772 ± 0.004 | 0.613 ± 0.002 | 0.799 ± 0.004 | 0.715 ± 0.003 | 0.573 ± 0.006 | 0.633 ± 0.012 | 0.353 ± 0.014 | 0.446 ± 0.13 | 0.675 ± 0.006 | 0.302 ± 0.011 |
| D | 🗸 | 🗸 | ✕ | 0.761 ± 0.003 | 0.593 ± 0.004 | 0.769 ± 0.005 | 0.717 ± 0.002 | 0.552 ± 0.006 | 0.776 ± 0.002 | 0.625 ± 0.002 | 0.804 ± 0.003 | 0.718 ± 0.002 | 0.579 ± 0.005 | 0.632 ± 0.006 | 0.425 ± 0.007 | 0.481 ± 0.005 | 0.666 ± 0.007 | 0.319 ± 0.002 |
| E | 🗸 | ✕ | 🗸 | 0.753 ± 0.004 | 0.581 ± 0.005 | 0.738 ± 0.006 | 0.719 ± 0.003 | 0.531 ± 0.006 | 0.780 ± 0.003 | 0.625 ± 0.001 | 0.807 ± 0.003 | 0.715 ± 0.002 | 0.579 ± 0.005 | 0.641 ± 0.006 | 0.423 ± 0.007 | 0.454 ± 0.006 | 0.677 ± 0.004 | 0.308 ± 0.006 |
| F | 🗸 | 🗸 | 🗸 | 0.770 ± 0.002 | 0.609 ± 0.003 | 0.793 ± 0.004 | 0.715 ± 0.003 | 0.565 ± 0.004 | 0.785 ± 0.002 | 0.634 ± 0.002 | 0.821 ± 0.004 | 0.718 ± 0.003 | 0.592 ± 0.005 | 0.647 ± 0.004 | 0.428 ± 0.005 | 0.488 ± 0.005 | 0.667 ± 0.004 | 0.327 ± 0.004 |
Table 4.
Ablation study on the number of Gaussian branches.
| Num | Monuseg | | | | | CPM17 | | | | | CoNSeP | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | AJI | DQ | SQ | PQ | | AJI | DQ | SQ | PQ | | AJI | DQ | SQ | PQ |
| 1 | 0.742 | 0.561 | 0.723 | 0.719 | 0.520 | 0.772 | 0.596 | 0.758 | 0.713 | 0.541 | 0.619 | 0.376 | 0.463 | 0.674 | 0.313 |
| 2 | 0.746 | 0.572 | 0.728 | 0.718 | 0.523 | 0.773 | 0.614 | 0.781 | 0.716 | 0.559 | 0.639 | 0.419 | 0.472 | 0.676 | 0.320 |
| 3 | 0.746 | 0.573 | 0.730 | 0.718 | 0.523 | 0.774 | 0.616 | 0.781 | 0.717 | 0.560 | 0.640 | 0.421 | 0.475 | 0.676 | 0.322 |
| 4 | 0.747 | 0.574 | 0.732 | 0.719 | 0.527 | 0.774 | 0.617 | 0.783 | 0.718 | 0.561 | 0.639 | 0.420 | 0.473 | 0.676 | 0.322 |
| 5 | 0.746 | 0.572 | 0.731 | 0.719 | 0.525 | 0.774 | 0.616 | 0.782 | 0.717 | 0.561 | 0.641 | 0.422 | 0.476 | 0.675 | 0.323 |
| 6 | 0.746 | 0.571 | 0.731 | 0.719 | 0.525 | 0.775 | 0.617 | 0.782 | 0.718 | 0.562 | 0.640 | 0.419 | 0.472 | 0.676 | 0.321 |
Table 5.
Ablation study on the inference strategy. For each dataset, the first row represents the results using the coarse instance map obtained by directly multiplying the segmentation map S with the Voronoi partition P derived from the coarse center-point map C. The second row represents the results using the refined instance maps obtained from Algorithm 1.
| Dataset | Method | | AJI | DQ | SQ | PQ |
|---|---|---|---|---|---|---|
| Monuseg | ✕ | 0.765 | 0.592 | 0.790 | 0.716 | 0.566 |
| | 🗸 | 0.772 | 0.612 | 0.797 | 0.714 | 0.569 |
| CPM17 | ✕ | 0.786 | 0.636 | 0.820 | 0.720 | 0.593 |
| | 🗸 | 0.787 | 0.639 | 0.825 | 0.721 | 0.597 |
| CoNSeP | ✕ | 0.642 | 0.402 | 0.488 | 0.669 | 0.329 |
| | 🗸 | 0.651 | 0.433 | 0.493 | 0.667 | 0.331 |
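A minimal sketch (assuming NumPy and SciPy) of the coarse instance map used in the first rows of Table 5, i.e., the product of the segmentation map S with the Voronoi partition P of the predicted centers. Extraction of the centers from the heat map C and the refinement of Algorithm 1 are omitted, and the function name is illustrative rather than the paper's.

```python
import numpy as np
from scipy import ndimage

def coarse_instance_map(seg, centers):
    """Assign each foreground pixel of the binary segmentation S to the
    nearest predicted centre, i.e. multiply S by the Voronoi partition P.

    seg     : (H, W) binary segmentation map S
    centers : (N, 2) array of predicted centre coordinates (row, col)
    """
    mask = np.ones(seg.shape, dtype=bool)
    rows, cols = np.asarray(centers).T
    mask[rows, cols] = False
    _, inds = ndimage.distance_transform_edt(mask, return_indices=True)
    ir, ic = inds

    centre_id = np.zeros(seg.shape, dtype=np.int32)
    centre_id[rows, cols] = np.arange(1, len(rows) + 1)   # instance ids from 1
    voronoi = centre_id[ir, ic]                            # Voronoi partition P
    return voronoi * (seg > 0)                             # instances only on S
```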
Table 6.
The training and inference times on the Monuseg dataset. The parameter count in the table refers to the total number of parameters during the model training phase. The last two rows report the times of our method when generating a coarse instance map (C) and a refined instance map (R), respectively.
| Methods | Params [M] | Training Time [s/epoch] | Total Inference Time [s] | Average Inference Time [s/img] |
|---|---|---|---|---|
| HoverNet [25] | 45.68 | 16.25 | 27.69 | 1.978 |
| Micro-Net [18] | 25.83 | 13.17 | 11.65 | 0.832 |
| Mask-RCNN [53] | 43.98 | 26.58 | 38.82 | 2.773 |
| Weak-Anno [40] | 24.91 | 11.78 | 3.402 | 0.243 |
| WSPP [45] | 49.82 | 12.06 | 5.348 | 0.382 |
| SC-Net [44] | 69.06 | 23.58 | 9.002 | 0.506 |
| SPN+IEN [43] | 49.86 | 43.28 | 84.42 | 6.030 |
| Ours (C) | 49.25 | 17.82 | 7.182 | 0.513 |
| Ours (R) | 49.25 | 17.82 | 28.62 | 2.044 |