TongueNet: A Precise and Fast Tongue Segmentation System Using U-Net with a Morphological Processing Layer
Abstract
1. Introduction
- Figure panels (a)–(h): examples of complicated tongue images, including a tongue with an apparent gap in the mouth (a), abnormal color (b), abnormal texture (c), teeth showing (d), irregular poses (e), a tongue not completely protruding (f), teeth imprints on the edges (g), and a tongue closely surrounded by the lips (h).
- A dedicated automatic tongue segmentation system is proposed.
- A deep architecture, U-Net with a morphological processing layer, is applied.
- The proposed tongue segmentation system is more precise and much faster than other state-of-the-art tongue segmentation methods.
2. Methodology
2.1. Overview of TongueNet
2.2. Network Architecture
2.3. The Morphological Processing Layer
Algorithm 1: Morphological reconstruction
Algorithm 2: Morphological processing layer
Input: mask image M; threshold for global binarization; open-operation filter size; close-operation filter size.
Output: binary image G.
1. Initialize the structuring elements for the open and close operations.
2. Perform global binarization on M using Equation (1) to obtain the connected components.
3. Generate the binary image R according to Equation (2).
4. Perform edge detection with the Prewitt operator and let S be the set of edge pixels.
5. Retain the pixels of S in R according to Equation (3).
6. Generate the hole-filled image H using Equation (4).
7. Overlap the binary mask image I and the hole-filled image H according to Equation (5).
8. Perform the open operation according to Equation (6).
9. Perform the close operation according to Equation (7).
10. Return G.
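To make the flow of Algorithm 2 concrete, the following is a minimal Python sketch of the refinement stage using NumPy and SciPy. It is a sketch under assumptions, not the paper's implementation: the threshold, the filter sizes, the edge-pixel criterion, the way the hole-filled image is combined with the mask (Equation (5)), and the use of the largest connected component as a stand-in for the morphological reconstruction of Algorithm 1 are all illustrative placeholders.

```python
import numpy as np
from scipy import ndimage

def morphological_processing_layer(prob_map, threshold=0.5,
                                   open_size=5, close_size=5):
    """Sketch of the refinement steps in Algorithm 2 (parameter values are placeholders)."""
    # Steps 1-2: structuring elements and global binarization of the mask M.
    se_open = np.ones((open_size, open_size), dtype=bool)
    se_close = np.ones((close_size, close_size), dtype=bool)
    binary = prob_map > threshold

    # Step 3: keep the largest connected component as the region image R
    # (a stand-in for the morphological reconstruction of Algorithm 1).
    labels, n = ndimage.label(binary)
    if n > 0:
        sizes = ndimage.sum(binary, labels, index=np.arange(1, n + 1))
        region = labels == (np.argmax(sizes) + 1)
    else:
        region = binary

    # Steps 4-5: Prewitt edge detection; retain edge pixels in R
    # (the edge threshold below is an assumption).
    edges = np.hypot(ndimage.prewitt(prob_map, axis=0),
                     ndimage.prewitt(prob_map, axis=1))
    region = region | (edges > edges.mean())

    # Step 6: fill interior holes to obtain H.
    filled = ndimage.binary_fill_holes(region)

    # Step 7: overlap H with the binary mask I (logical OR as a stand-in
    # for Equation (5), which is not reproduced here).
    overlapped = filled | binary

    # Steps 8-9: open then close to smooth the boundary, giving G.
    opened = ndimage.binary_opening(overlapped, structure=se_open)
    closed = ndimage.binary_closing(opened, structure=se_close)
    return closed.astype(np.uint8)
```

In use, the network's 576 × 768 probability map would be passed through this function to obtain the final binary segmentation.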
3. Experimental Results
3.1. Dataset Description
3.2. Experimental Setup
3.3. Qualitative Evaluation
3.4. Quantitative Evaluation
3.4.1. Metrics
3.4.2. IOU Results on 20 Test Samples
3.4.3. PA Results on 20 Test Samples
3.4.4. F-score, Precision, and Recall Results on 20 Test Samples
3.4.5. Overall Comparison with Other Segmentation Methods
3.5. Efficiency Comparison
4. Discussion
- The comparisons in the experimental results demonstrate both the effectiveness and the efficiency of TongueNet in tongue segmentation, as well as its robustness in many complicated circumstances, making it a promising tool for integration with TCTD. TongueNet achieves better segmentation performance than other state-of-the-art methods in both the qualitative and quantitative evaluations (refer to Figure 11 and Table 4). Furthermore, TongueNet requires less computation time than REF (see Table 5), indicating a much faster processing speed.
- Although TongueNet performs better at segmentation in general, its pixel predictions still need improvement, as they are not complete on some occasions in the quantitative evaluation. From Table 4, even though the mean F-score of TongueNet (95.38%) is higher than that of the other methods, the Recall values of some samples (from the 200 test images) are lower than those of REF (e.g., the 11th sample: TongueNet 97.94%, REF 99.77%; Figure 14). This indicates that the predicted pixels do not entirely cover all tongue regions, which is caused by a higher loss at the tongue boundary. Therefore, more focus will be placed on the boundary loss as part of our future work. (The metrics referred to here follow the standard pixel-wise definitions; see the sketch after this list.)
- The parameters in the morphological layer determine the effectiveness of the refinement as well as the efficiency of the whole system. Since three filters (morphological reconstruction, the open operation, and the close operation; refer to Section 2.3) are used for different purposes in the morphological processing, the system's sensitivity to these parameters is high, and there is a trade-off between effectiveness and efficiency. To manage this sensitivity and trade-off, we fine-tuned the system over different parameters and combinations, including the filter kernel size, the type of operator, the tolerance of the morphological reconstruction, and the global threshold used to generate the binary mask.
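For reference, the evaluation metrics discussed above (IOU, PA, Precision, Recall, and F-score) follow their standard pixel-wise definitions. The short Python sketch below computes them from a predicted and a ground-truth binary mask; the function name is illustrative, and non-empty, non-degenerate masks are assumed.

```python
import numpy as np

def segmentation_metrics(pred, gt):
    """Standard pixel-wise metrics for two binary masks of the same shape."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)

    tp = np.sum(pred & gt)     # true positives
    fp = np.sum(pred & ~gt)    # false positives
    fn = np.sum(~pred & gt)    # false negatives
    tn = np.sum(~pred & ~gt)   # true negatives

    iou = tp / (tp + fp + fn)                 # Intersection over Union
    pa = (tp + tn) / (tp + tn + fp + fn)      # Pixel Accuracy
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    fscore = 2 * precision * recall / (precision + recall)
    return {"IOU": iou, "PA": pa, "Precision": precision,
            "Recall": recall, "F-score": fscore}
```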
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
Abbreviations
TCM | Traditional Chinese Medicine |
TCTD | Traditional Chinese Tongue Diagnosis |
BEDC | Bi-Elliptical Deformable Contour |
FCN | Fully Convolutional Neural Network |
BMP | Bitmap |
RGB | RGB color model |
REF | Region-based and Edge-based fusion approach |
IOU | Intersection over Union |
PA | Pixel Accuracy |
TP | True Positive |
FP | False Positive |
TN | True Negative |
FN | False Negative |
MIOU | Mean IOU |
MPA | Mean PA |
MPrecision | Mean Precision |
MRecall | Mean Recall |
MFscore | Mean F-score |
References
- Press, C.U. Cambridge Academic Content Dictionary; Cambridge University Press: Cambridge, UK, 2017.
- Maciocia, G. Tongue Diagnosis in Chinese Medicine; Eastland: Seattle, WA, USA, 1995.
- Kirschbaum, B. Atlas of Chinese Tongue Diagnosis; Eastland: Seattle, WA, USA, 2000.
- Zhao, Q.; Zhang, D.; Zhang, B. Digital tongue image analysis in medical applications using a new tongue ColorChecker. In Proceedings of the 2016 2nd IEEE International Conference on Computer and Communications (ICCC), Chengdu, China, 14–17 October 2016; pp. 803–807.
- Zhang, H.; Zhang, B. Disease detection using tongue geometry features with sparse representation classifier. In Proceedings of the 2014 International Conference on Medical Biometrics, Shenzhen, China, 30 May–1 June 2014; pp. 102–107.
- Zhang, B.; Nie, W.; Zhao, S. A novel Color Rendition Chart for digital tongue image calibration. Color Res. Appl. 2018, 43, 749–759.
- Zhang, D.; Zhang, H.; Zhang, B. Tongue Image Analysis; Springer: Berlin, Germany, 2017.
- Zhang, H.; Wang, K.; Zhang, D.; Pang, B.; Huang, B. Computer aided tongue diagnosis system. In Proceedings of the 2005 IEEE Engineering in Medicine and Biology 27th Annual Conference, Shanghai, China, 17–18 January 2006; pp. 6754–6757.
- Zhang, B.; Zhang, H. Significant geometry features in tongue image analysis. Evid.-Based Complement. Altern. Med. 2015, 2015, 897580.
- Wang, X.; Zhang, B.; Yang, Z.; Wang, H.; Zhang, D. Statistical analysis of tongue images for feature extraction and diagnostics. IEEE Trans. Image Process. 2013, 22, 5336–5347.
- Zhang, B.; Wang, X.; You, J.; Zhang, D. Tongue Color Analysis for Medical Application. Evid.-Based Complement. Altern. Med. 2013, 2015, 264742.
- Zhang, B.; Kumar, B.V.; Zhang, D. Detecting Diabetes Mellitus and Nonproliferative Diabetic Retinopathy Using Tongue Color, Texture, and Geometry Features. IEEE Trans. Biomed. Eng. 2014, 61, 491–501.
- Pang, B.; Zhang, D.; Wang, K. The bi-elliptical deformable contour and its application to automated tongue segmentation in Chinese medicine. IEEE Trans. Med. Imaging 2005, 24, 946–956.
- McInerney, T.; Terzopoulos, D. Deformable models in medical image analysis: A survey. Med. Image Anal. 1996, 1, 91–108.
- Kass, M.; Witkin, A.; Terzopoulos, D. Snakes: Active contour models. Int. J. Comput. Vis. 1988, 1, 321–331.
- Ning, J.; Zhang, L.; Zhang, D.; Wu, C. Interactive image segmentation by maximal similarity based region merging. Pattern Recognit. 2010, 43, 445–456.
- Ning, J.; Zhang, D.; Wu, C.; Yue, F. Automatic tongue image segmentation based on gradient vector flow and region merging. Neural Comput. Appl. 2012, 21, 1819–1826.
- Wu, K.; Zhang, D. Robust tongue segmentation by fusing region-based and edge-based approaches. Expert Syst. Appl. 2015, 42, 8027–8038.
- Liu, Z.; Yan, J.Q.; Zhang, D.; Li, Q.L. Automated tongue segmentation in hyperspectral images for medicine. Appl. Opt. 2007, 46, 8328–8334.
- Zhang, D.; Zhang, H.; Zhang, B. A Snake-Based Approach to Automated Tongue Image Segmentation. In Tongue Image Analysis; Springer: Berlin, Germany, 2017; pp. 71–88.
- LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
- Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 3431–3440.
- Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention; Springer: Berlin, Germany, 2015; pp. 234–241.
- Badrinarayanan, V.; Kendall, A.; Cipolla, R. SegNet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2481–2495.
- Garcia-Garcia, A.; Orts-Escolano, S.; Oprea, S.; Villena-Martinez, V.; Martinez-Gonzalez, P.; Garcia-Rodriguez, J. A survey on deep learning techniques for image and video semantic segmentation. Appl. Soft Comput. 2018, 70, 41–65.
- Otsu, N. A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man Cybern. 1979, 9, 62–66.
- Gonzalez, R.C.; Woods, R.E. Digital Image Processing, 3rd ed.; Prentice-Hall, Inc.: Upper Saddle River, NJ, USA, 2006.
- Torbert, S. Applied Computer Science; Springer: Berlin, Germany, 2012.
- Wada, K. labelme: Image Polygonal Annotation with Python. 2016. Available online: https://github.com/wkentaro/labelme (accessed on 31 July 2019).
- Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
- Kosub, S. A note on the triangle inequality for the Jaccard distance. Pattern Recognit. Lett. 2019, 120, 36–38.
- Powers, D.M. Evaluation: From precision, recall and F-measure to ROC, informedness, markedness and correlation. J. Mach. Learn. Technol. 2011, 2, 37–63.
- Sasaki, Y. The truth of the F-measure. Teach Tutor Mater 2007, 1, 1–5.
Layer | Type | Kernel Size | Number of Kernels | Input Dimensions | Activation Function |
---|---|---|---|---|---|
1 | Convolution2D | 3 × 3 | 64 | 576 × 768 | ReLU |
2 | Convolution2D | 3 × 3 | 64 | 576 × 768 | ReLU |
3 | Maxpooling | 2 × 2 | - | 576 × 768 | - |
4 | Convolution2D | 3 × 3 | 128 | 288 × 384 | ReLU |
5 | Convolution2D | 3 × 3 | 128 | 288 × 384 | ReLU |
6 | Maxpooling | 2 × 2 | - | 288 × 384 | - |
7 | Convolution2D | 3 × 3 | 256 | 144 × 192 | ReLU |
8 | Convolution2D | 3 × 3 | 256 | 144 × 192 | ReLU |
9 | Maxpooling | 2 × 2 | - | 144 × 192 | - |
10 | Dropout | 2 × 2 | - | - | - |
11 | Convolution2D | 3 × 3 | 512 | 72 × 96 | ReLU |
12 | Convolution2D | 3 × 3 | 512 | 72 × 96 | ReLU |
13 | Maxpooling | 2 × 2 | - | 72 × 96 | - |
14 | Dropout | 2 × 2 | - | - | - |
15 | Convolution2D | 3 × 3 | 1024 | 36 × 48 | ReLU |
16 | Convolution2D | 3 × 3 | 1024 | 36 × 48 | ReLU |
17 | Up-convolution2D | 2 × 2 | 512 | 36 × 48 | - |
18 | Convolution2D | 3 × 3 | 512 | 72 × 96 | ReLU |
19 | Convolution2D | 3 × 3 | 512 | 72 × 96 | ReLU |
20 | Up-convolution2D | 2 × 2 | 256 | 72 × 96 | - |
21 | Convolution2D | 3 × 3 | 256 | 144 × 192 | ReLU |
22 | Convolution2D | 3 × 3 | 256 | 144 × 192 | ReLU |
23 | Up-convolution2D | 2 × 2 | 128 | 144 × 192 | - |
24 | Convolution2D | 3 × 3 | 128 | 288 × 384 | ReLU |
25 | Convolution2D | 3 × 3 | 128 | 288 × 384 | ReLU |
26 | Up-convolution2D | 2 × 2 | 64 | 288 × 384 | - |
27 | Convolution2D | 3 × 3 | 64 | 576 × 768 | ReLU |
28 | Convolution2D | 3 × 3 | 64 | 576 × 768 | ReLU |
29 | Convolution2D | 1 × 1 | 1 | 576 × 768 | ReLU |
30 | Morphological processing | - | 1 | 576 × 768 | - |
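The layer listing above can be read as a standard U-Net encoder-decoder. Below is a minimal sketch that follows the kernel sizes and channel counts in the table, assuming a TensorFlow/Keras implementation; the skip connections (concatenating encoder features before each pair of decoder convolutions), the dropout rate, and the use of `padding="same"` are assumptions not spelled out in the table, and the final 1 × 1 convolution uses ReLU as listed (a sigmoid is the more common choice for a binary mask).

```python
from tensorflow.keras import layers, Model

def conv_block(x, filters):
    """Two 3x3 convolutions with ReLU, as in each encoder/decoder stage of the table."""
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x

def build_tonguenet_backbone(input_shape=(576, 768, 3), dropout_rate=0.5):
    inputs = layers.Input(shape=input_shape)

    # Encoder (layers 1-16 in the table).
    c1 = conv_block(inputs, 64)                                  # 576 x 768
    p1 = layers.MaxPooling2D(2)(c1)
    c2 = conv_block(p1, 128)                                     # 288 x 384
    p2 = layers.MaxPooling2D(2)(c2)
    c3 = conv_block(p2, 256)                                     # 144 x 192
    p3 = layers.Dropout(dropout_rate)(layers.MaxPooling2D(2)(c3))
    c4 = conv_block(p3, 512)                                     # 72 x 96
    p4 = layers.Dropout(dropout_rate)(layers.MaxPooling2D(2)(c4))
    c5 = conv_block(p4, 1024)                                    # 36 x 48 (bottleneck)

    # Decoder (layers 17-28): up-convolutions with assumed skip connections.
    u4 = layers.Conv2DTranspose(512, 2, strides=2, padding="same")(c5)
    d4 = conv_block(layers.concatenate([u4, c4]), 512)
    u3 = layers.Conv2DTranspose(256, 2, strides=2, padding="same")(d4)
    d3 = conv_block(layers.concatenate([u3, c3]), 256)
    u2 = layers.Conv2DTranspose(128, 2, strides=2, padding="same")(d3)
    d2 = conv_block(layers.concatenate([u2, c2]), 128)
    u1 = layers.Conv2DTranspose(64, 2, strides=2, padding="same")(d2)
    d1 = conv_block(layers.concatenate([u1, c1]), 64)

    # Layer 29: 1 x 1 convolution producing the single-channel mask.
    mask = layers.Conv2D(1, 1, activation="relu")(d1)
    return Model(inputs, mask)
```

The morphological processing layer (layer 30) is then applied to this mask output, as sketched after Algorithm 2 above.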
Number of Images | Resolution | Format | Bit Depth | Color Model |
---|---|---|---|---|
1000 | 576 × 768 | Bitmap | 24 | RGB |
Number of Images | Resolution | Format | Bit Depth | Color Model |
---|---|---|---|---|
800 | 576 × 768 | Bitmap | 2 | Binary (255-foreground, 0-background) |
Methods | MIOU (%) | MPA (%) | MF-score (%) | MPrecision (%) | MRecall (%) |
---|---|---|---|---|---|
Snake | |||||
Flood fill | |||||
REF | |||||
TongueNet | |||||
p-value | 0.002 | 0.002 | 0.010 | 0.004 | 0.015 |
Methods | Computational Time (s) |
---|---|
REF | 0.302 |
TongueNet | 0.267 |
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).