Symmetry Encoder-Decoder Network with Attention Mechanism for Fast Video Object Segmentation
Abstract
1. Introduction
- We introduce SAVOS, which requires only one forward pass through the symmetry encoder-decoder network to generate all parameters that are needed to adapt to the specific object instance.
- We design an attention module providing guidance to focus on the target object in the current frame, which helps to improve accuracy.
- Extensive experiments on three datasets, namely DAVIS 2016, DAVIS 2017, and SegTrack v2, demonstrate that SAVOS compares favorably with state-of-the-art methods.
2. Related Work
2.1. Unsupervised Video Object Segmentation
2.2. Semi-Supervised Video Object Segmentation
2.3. Attention Mechanism
3. Motivation
3.1. Baselines
- Two-stage paradigm: Many CNN-based semi-supervised VOS methods adopt a two-stage paradigm (see Figure 2): first, a base CNN is trained to segment the target object; second, the trained network is fine-tuned on the first frame of the test video to adapt to the object's appearance, which boosts performance. Perazzi et al. [11] proposed a method combining offline and online learning strategies. In the offline training phase, the coarsened mask of the previous frame is fed into the network to predict the object mask in the current frame; online fine-tuning then further improves segmentation quality. Caelles et al. [13] first trained a base CNN to separate the foreground object from the background and then used online fine-tuning to adapt to the specific object. Compared with these two-stage strategies, Voigtlaender et al. [12] added a further pre-training step on the PASCAL dataset in the first stage and fine-tuned the model by online adaptation in the second stage.
- Post-processing: Many VOS approaches also adopt post-processing to improve accuracy. In [13], boundary snapping was used to snap the object mask to accurate contours, which improved the results. Maninis et al. [34] employed two conditional classifiers for post-processing to better model different distributions, one predicting instance foreground pixels and the other predicting background pixels. Post-processing techniques such as conditional random fields (CRF) and optical flow have proven helpful for further refining segmentation masks, yielding additional gains in many methods.
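The two-stage paradigm described above can be illustrated with a deliberately small sketch. It is not the method of [11], [12], or [13]: a two-prototype intensity classifier stands in for the base CNN, and all function names, data values, and hyperparameters (`lr`, `steps`) are assumptions chosen for illustration. The point is only the control flow: generic offline training, followed by gradient-based adaptation on the annotated first frame.

```python
# Toy illustration of the two-stage paradigm (offline training, then online
# fine-tuning on the first frame). A two-prototype intensity classifier stands
# in for the base CNN; this is an assumption made for illustration only.

def mean(values):
    return sum(values) / len(values)

def offline_train(pixels, labels):
    """Stage 1: fit generic background/foreground prototypes (class means)."""
    bg = mean([x for x, y in zip(pixels, labels) if y == 0])
    fg = mean([x for x, y in zip(pixels, labels) if y == 1])
    return {"bg": bg, "fg": fg}

def online_finetune(model, frame, mask, lr=0.5, steps=10):
    """Stage 2: gradient steps on the annotated first frame of the test video.

    Each step descends the mean squared distance between a prototype and the
    pixels of its class, so the prototypes drift toward the target appearance.
    """
    tuned = dict(model)
    for _ in range(steps):
        for name, label in (("bg", 0), ("fg", 1)):
            grad = mean([tuned[name] - x for x, y in zip(frame, mask) if y == label])
            tuned[name] -= lr * grad
    return tuned

def segment(model, frame):
    """Label a pixel foreground (1) if it is closer to the foreground prototype."""
    return [1 if abs(x - model["fg"]) < abs(x - model["bg"]) else 0 for x in frame]

def accuracy(pred, mask):
    return sum(p == m for p, m in zip(pred, mask)) / len(mask)

# Generic offline data: bright foreground, dark background.
base = offline_train([0.1, 0.2, 0.3, 0.7, 0.8, 0.9], [0, 0, 0, 1, 1, 1])

# Test video whose target is darker than any foreground seen offline.
first_frame, first_mask = [0.05, 0.10, 0.15, 0.35, 0.40, 0.45], [0, 0, 0, 1, 1, 1]
next_frame, next_mask = [0.06, 0.12, 0.14, 0.36, 0.41, 0.44], [0, 0, 0, 1, 1, 1]

tuned = online_finetune(base, first_frame, first_mask)
acc_base = accuracy(segment(base, next_frame), next_mask)
acc_tuned = accuracy(segment(tuned, next_frame), next_mask)
print(acc_base, acc_tuned)  # prints "0.5 1.0"
```

In this toy setting the offline model labels every pixel of the later frame as background, while the adapted model segments it correctly. The gain comes at the cost of running an optimization loop per test video, which is the overhead the two-stage methods pay for their accuracy.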
3.2. Challenges and Solutions
4. Network Architecture
4.1. Symmetry Encoder
4.2. Global Convolution Block
4.3. Decoder
5. Inference
6. Experiments
6.1. Implementation Details
6.1.1. Datasets
6.1.2. Data Augmentation
6.1.3. Implementation
6.1.4. Evaluation Measure
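As a minimal sketch, and assuming the evaluation follows the DAVIS benchmark convention [59] of scoring each frame by the Jaccard index (intersection over union) between the predicted and ground-truth masks, a region-similarity measure could be computed as follows. The function names and the nested-list mask representation are hypothetical, as is the convention chosen for two empty masks.

```python
def region_similarity(pred, gt):
    """Jaccard index (IoU) between two binary masks given as nested lists."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union

def mean_region_similarity(pred_masks, gt_masks):
    """Average the per-frame Jaccard index over a sequence of frames."""
    scores = [region_similarity(p, g) for p, g in zip(pred_masks, gt_masks)]
    return sum(scores) / len(scores)

pred = [[0, 1, 1],
        [0, 1, 0]]
gt = [[0, 1, 1],
      [0, 0, 1]]
j = region_similarity(pred, gt)  # 2 shared pixels out of 4 in the union
print(j)  # prints "0.5"
```

Averaging the per-frame score over a sequence, as `mean_region_similarity` does, gives a single number per video that can then be averaged over a dataset.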
6.2. Comparing to the State-of-the-Art
6.2.1. DAVIS 2016 Dataset
6.2.2. DAVIS 2017
6.2.3. SegTrack v2
6.3. Ablation Study
6.3.1. Lucid Dream Augmentation
6.3.2. Attention Module
6.4. Add-On Study
6.4.1. Online Learning
6.4.2. CRF Refinement
7. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
- Ji, Z.; Xiong, K.; Pang, Y.; Li, X. Video Summarization with Attention-Based Encoder-Decoder Networks. IEEE Trans. Circuits Syst. Video Technol. 2019.
- Bakkay, M.C.; Pizenberg, M.; Carlier, A. Protocols and software for simplified educational video capture and editing. J. Comput. Educ. 2019, 6, 257–276.
- Pham, Q.H.; Nguyen, T.; Hua, B.S.; Roig, G.; Yeung, S.K. JSIS3D: Joint Semantic-Instance Segmentation of 3D Point Clouds with Multi-Task Pointwise Networks and Multi-Value Conditional Random Fields. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019.
- Faktor, A.; Irani, M. Video Segmentation by Non-Local Consensus Voting. In Proceedings of the British Machine Vision Conference, Nottingham, UK, 1–5 September 2014.
- Jain, S.D.; Xiong, B.; Grauman, K. FusionSeg: Learning to Combine Motion and Appearance for Fully Automatic Segmentation of Generic Objects in Videos. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; Volume 1.
- Keuper, M.; Andres, B.; Brox, T. Motion Trajectory Segmentation via Minimum Cost Multicuts. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015.
- Papazoglou, A.; Ferrari, V. Fast Object Segmentation in Unconstrained Video. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Sydney, NSW, Australia, 1–8 December 2013.
- Cheng, J.; Tsai, Y.H.; Wang, S.; Yang, M.H. SegFlow: Joint Learning for Video Object Segmentation and Optical Flow. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 686–695.
- Jang, W.; Kim, C. Online Video Object Segmentation via Convolutional Trident Network. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 7474–7483.
- Khoreva, A.; Benenson, R.; Ilg, E.; Brox, T.; Schiele, B. Lucid Data Dreaming for Video Object Segmentation. Int. J. Comput. Vis. 2019, 127, 1175–1197.
- Perazzi, F.; Khoreva, A.; Benenson, R.; Schiele, B.; Sorkine-Hornung, A. Learning Video Object Segmentation from Static Images. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 3491–3500.
- Voigtlaender, P.; Leibe, B. Online Adaptation of Convolutional Neural Networks for Video Object Segmentation. arXiv 2017, arXiv:1706.09364.
- Caelles, S.; Maninis, K.K.; Pont-Tuset, J.; Leal-Taixé, L.; Cremers, D.; Van Gool, L. One-Shot Video Object Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017.
- Tsai, Y.H.; Yang, M.H.; Black, M.J. Video Segmentation via Object Flow. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 26 June–1 July 2016; pp. 3899–3908.
- Märki, N.; Perazzi, F.; Wang, O.; Sorkine-Hornung, A. Bilateral Space Video Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 743–751.
- Yoon, J.S.; Rameau, F.; Kim, J.; Lee, S.; Shin, S.; Kweon, I.S. Pixel-Level Matching for Video Object Segmentation Using Convolutional Neural Networks. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2186–2195.
- Zhang, D.; Luo, M.; He, F. Reconstructed Similarity for Faster GANs-Based Word Translation to Mitigate Hubness. Neurocomputing 2019.
- Long, J.; Shelhamer, E.; Darrell, T. Fully Convolutional Networks for Semantic Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 3431–3440.
- Hu, P.; Wang, G.; Kong, X.; Kuen, J.; Tan, Y.P. Motion-Guided Cascaded Refinement Network for Video Object Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 1400–1409.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 26 June–1 July 2016; pp. 770–778.
- Ji, S.; Xu, W.; Yang, M.W.; Yu, K. 3D Convolutional Neural Networks for Human Action Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 35, 221–231.
- Chen, L.C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 834–848.
- Strand, R.; Ciesielski, K.; Malmberg, F.; Saha, P.K. The minimum barrier distance. Comput. Vis. Image Underst. 2013, 117, 429–437.
- Krähenbühl, P.; Koltun, V. Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials. In Advances in Neural Information Processing Systems 24; Shawe-Taylor, J., Zemel, R.S., Bartlett, P.L., Pereira, F., Weinberger, K.Q., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2011; pp. 109–117.
- Dosovitskiy, A.; Fischer, P.; Ilg, E.; Häusser, P.; Hazirbas, C.; Golkov, V.; Smagt, P.; Cremers, D.; Brox, T. FlowNet: Learning Optical Flow with Convolutional Networks. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 2758–2766.
- Vijayanarasimhan, S.; Grauman, K. Active Frame Selection for Label Propagation in Videos. In Proceedings of the European Conference on Computer Vision, Florence, Italy, 7–13 October 2012.
- Tokmakov, P.; Alahari, K.; Schmid, C. Learning Motion Patterns in Videos. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 531–539.
- Fragkiadaki, K.; Arbeláez, P.A.; Felsen, P.; Malik, J. Learning to segment moving objects in videos. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 4083–4090.
- Yu, G.; Yuan, J. Fast action proposals for human action detection and search. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 1302–1311.
- Perazzi, F.; Wang, O.; Gross, M.; Sorkine-Hornung, A. Fully Connected Object Proposals for Video Segmentation. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 3227–3234.
- Badrinarayanan, V.; Budvytis, I.; Cipolla, R. Semi-Supervised Video Segmentation Using Tree Structured Graphical Models. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 2751–2764.
- Wen, L.; Du, D.; Lei, Z.; Li, S.Z.; Yang, M.H. JOTS: Joint Online Tracking and Segmentation. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 2226–2234.
- Caelles, S.; Chen, Y.; Pont-Tuset, J.; Gool, L.V. Semantically-Guided Video Object Segmentation. arXiv 2017, arXiv:1704.01926.
- Maninis, K.K.; Caelles, S.; Chen, Y.; Pont-Tuset, J.; Leal-Taixé, L.; Cremers, D.; Gool, L.V. Video Object Segmentation without Temporal Information. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 41, 1515–1530.
- He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2961–2969.
- Kumar, A.; Irsoy, O.; Ondruska, P.; Iyyer, M.; Bradbury, J.; Gulrajani, I.; Zhong, V.; Paulus, R.; Socher, R. Ask Me Anything: Dynamic Memory Networks for Natural Language Processing. In Proceedings of the 33rd International Conference on Machine Learning (PMLR), New York, NY, USA, 19–24 June 2016; pp. 1378–1387.
- Choi, H.; Cho, K.; Bengio, Y. Fine-grained attention mechanism for neural machine translation. Neurocomputing 2018, 284, 171–176.
- Zhang, B.; Xiong, D.; Su, J. Neural Machine Translation with Deep Attention. IEEE Trans. Pattern Anal. Mach. Intell. 2019.
- Guo, H.; Zheng, K.; Fan, X.; Yu, H.; Wang, S. Visual Attention Consistency Under Image Transforms for Multi-Label Image Classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019.
- Dai, T.; Cai, J.; Zhang, Y.; Xia, S.T.; Zhang, L. Second-Order Attention Network for Single Image Super-Resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019.
- Wang, L.; Wang, Y.; Liang, Z.; Lin, Z.; Yang, J.; An, W.; Guo, Y. Learning Parallax Attention for Stereo Image Super-Resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019.
- Si, C.; Chen, W.; Wang, W.; Wang, L.; Tan, T. An Attention Enhanced Graph Convolutional LSTM Network for Skeleton-Based Action Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019.
- Zhang, T.; Lin, G.; Cai, J.; Shen, T.; Shen, C.; Kot, A.C. Decoupled Spatial Neural Attention for Weakly Supervised Semantic Segmentation. IEEE Trans. Multimed. 2019.
- Yu, C.; Wang, J.; Peng, C.; Gao, C.; Yu, G.; Sang, N. BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018.
- Yu, C.; Wang, J.; Peng, C.; Gao, C.; Yu, G.; Sang, N. Learning a Discriminative Feature Network for Semantic Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018.
- Li, Y.; Chen, X.; Zhu, Z.; Xie, L.; Huang, G.; Du, D.; Wang, X. Attention-Guided Unified Network for Panoptic Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019.
- Li, X.; Change Loy, C. Video Object Segmentation with Joint Re-identification and Attention-Aware Mask Propagation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018.
- Lu, X.; Wang, W.; Ma, C.; Shen, J.; Shao, L.; Porikli, F. See More, Know More: Unsupervised Video Object Segmentation With Co-Attention Siamese Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019.
- Yang, Z.; Yang, D.; Dyer, C.; He, X.; Smola, A.; Hovy, E. Hierarchical Attention Networks for Document Classification. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, CA, USA, 12–17 June 2016; pp. 1480–1489.
- Cheng, J.; Dong, L.; Lapata, M. Long Short-Term Memory-Networks for Machine Reading. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, TX, USA, 1–5 November 2016; pp. 551–561.
- Xiong, C.; Zhong, V.; Socher, R. Dynamic Coattention Networks for Question Answering. arXiv 2017, arXiv:1611.01604.
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-Excitation Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018.
- Deng, J.; Dong, W.; Socher, R.; Li, L.J.; Li, K.; Li, F. ImageNet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 248–255.
- Peng, C.; Zhang, X.; Yu, G.; Luo, G.; Sun, J. Large Kernel Matters - Improve Semantic Segmentation by Global Convolutional Network. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 1743–1751.
- Ioffe, S.; Szegedy, C. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. In Proceedings of the 32nd International Conference on Machine Learning (PMLR), Lille, France, 6–11 July 2015; p. 9.
- Xu, K.; Wen, L.; Li, G.; Bo, L.; Huang, Q. Spatiotemporal CNN for Video Object Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019.
- Arribas, J.I.; Cid-Sueiro, J.; Adali, T.; Figueiras-Vidal, A.R. Neural architectures for parametric estimation of a posteriori probabilities by constrained conditional density functions. In Proceedings of the Neural Networks for Signal Processing IX: Proceedings of the 1999 IEEE Signal Processing Society Workshop (Cat. No. 98TH8468), Madison, WI, USA, 25 August 1999; pp. 263–272.
- Oh, S.W.; Lee, J.Y.; Sunkavalli, K.; Kim, S.J. Fast Video Object Segmentation by Reference-Guided Mask Propagation. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 7376–7385.
- Perazzi, F.; Pont-Tuset, J.; McWilliams, B.; Gool, L.V.; Gross, M.; Sorkine-Hornung, A. A Benchmark Dataset and Evaluation Methodology for Video Object Segmentation. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 26 June–1 July 2016; pp. 724–732.
- Pont-Tuset, J.; Perazzi, F.; Caelles, S.; Arbeláez, P.; Sorkine-Hornung, A.; Van Gool, L. The 2017 DAVIS Challenge on Video Object Segmentation. arXiv 2017, arXiv:1704.00675.
- Li, F.; Kim, T.; Humayun, A.; Tsai, D.; Rehg, J.M. Video Segmentation by Tracking Many Figure-Ground Segments. In Proceedings of the 2013 IEEE International Conference on Computer Vision, Sydney, Australia, 1–8 December 2013; pp. 2192–2199.
- Jampani, V.; Gadde, R.; Gehler, P.V. Video Propagation Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017.
- Yang, L.; Wang, Y.; Xiong, X.; Yang, J.; Katsaggelos, A.K. Efficient Video Object Segmentation via Network Modulation. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 6499–6507.
- Cheng, J.; Tsai, Y.H.; Hung, W.C.; Wang, S.; Yang, M.H. Fast and Accurate Online Video Object Segmentation via Tracking Parts. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 7415–7424.

Method | OL | PP | J Mean | F Mean | Time |
---|---|---|---|---|---|
MSK [11] | ✓ | ✓ | 79.7 | 75.4 | 12 s |
OSVOS [13] | ✓ | ✓ | 79.8 | 80.6 | 9 s |
Maninis et al. [34] | ✓ | ✓ | 85.6 | 86.4 | 4.5 s |
OnAVOS [12] | ✓ | ✓ | 86.1 | 84.9 | 13 s |
VPN [62] | | | 70.2 | 65.5 | 0.63 s |
BVS [15] | | | 60 | 58.8 | 0.37 s |
OFL [14] | | | 68.0 | 63.4 | 120 s |
OnAVOS | | | 72.7 | - | - |
Ours | | | 80.3 | 79.5 | 0.51 s |

Method | J Mean | F Mean |
---|---|---|
OSVOS [13] | 52.1 | - |
OnAVOS [12] | 61 | 66.1 |
FAVOS [64] | 45.1 | 55.4 |
RGMP [58] | 64.8 | 68.6 |
OSMN [63] | 52.5 | 57.1 |
Ours | 62.1 | 63.5 |

Method | Lucid Dream | Attention Module | J Mean | F Mean |
---|---|---|---|---|
Basic version | | | 78.3 | 77.5 |
Variant 1 | ✓ | | 79.2 | 77.8 |
Variant 2 | ✓ | ✓ | 80.3 | 79.5 |

Measure | Ours | +OL | +CRF |
---|---|---|---|
J Mean | 80.3 | 81.0 | 80.8 |
F Mean | 79.5 | 79.8 | 79.3 |
Time | 0.51 s | +1.81 s | +2.71 s |
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Guo, M.; Zhang, D.; Sun, J.; Wu, Y. Symmetry Encoder-Decoder Network with Attention Mechanism for Fast Video Object Segmentation. Symmetry 2019, 11, 1006. https://doi.org/10.3390/sym11081006