Symmetry in Computer Vision and Its Applications

A special issue of Symmetry (ISSN 2073-8994). This special issue belongs to the section "Computer".

Deadline for manuscript submissions: closed (28 March 2022) | Viewed by 53896

Special Issue Editors


Prof. Dr. Dejun Zhang
Guest Editor
School of Computer Science, China University of Geosciences, Wuhan 430074, China
Interests: computer graphics; computer-aided design; computer vision; image and video processing

Prof. Dr. Whoi-Yul Kim
Guest Editor
Department of Electronics and Computer Engineering, Hanyang University, Seoul, Korea
Interests: computer vision; pattern recognition; robot vision; 2D/3D vision systems; surveillance systems; 3D display; intelligent vehicles

Dr. Moonsoo Ra
Guest Editor
LightVision Inc., Seoul, Korea
Interests: pattern recognition; machine learning; autonomous vehicles; video surveillance

Special Issue Information

Dear Colleagues,

Computer vision has been one of the fastest-changing and most rapidly evolving areas of computer science. From the beginning of computer vision research, a central problem has been finding good features, and symmetry, which is ubiquitous in nature, has long been a rich source of such features. By exploiting symmetry, objects captured in real environments can be detected, classified, or recognized more reliably across a wide variety of fields. Although recent research trends focus on deep learning, the importance of symmetry has not diminished. On the contrary, expectations have risen with the success of computer vision in many fields in recent years, and there are now more practical problems to address. In solving these problems, we would like to see symmetry exploited more widely, leading to better results.

In this context, for this Special Issue we invite academic advances and interesting applications in the field of computer vision that highlight symmetry, including its contribution to image processing applications. Symmetry plays an important role in many areas, for example: data augmentation in deep learning, stochastic gradient descent, feature extraction and matching, object detection and tracking, and image segmentation.

Prof. Dr. Dejun Zhang
Prof. Dr. Whoi-Yul Kim
Dr. Moonsoo Ra
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Symmetry is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • deep learning
  • deep convolutional neural network
  • data augmentation
  • stochastic gradient descent
  • feature extraction
  • feature matching
  • symmetry in object detection
  • symmetry in multi-object tracking
  • symmetry in image segmentation

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (14 papers)


Research

15 pages, 5644 KiB  
Article
Murine Motion Behavior Recognition Based on DeepLabCut and Convolutional Long Short-Term Memory Network
by Ruiqing Liu, Juncai Zhu and Xiaoping Rao
Symmetry 2022, 14(7), 1340; https://doi.org/10.3390/sym14071340 - 29 Jun 2022
Cited by 5 | Viewed by 2764
Abstract
Murine behavior recognition is widely used in biology, neuroscience, pharmacology, and other aspects of research, and provides a basis for judging the psychological and physiological state of mice. To solve the problem whereby traditional behavior recognition methods only model behavioral changes in mice over time or space, we propose a symmetrical algorithm that can capture spatiotemporal information based on behavioral changes. The algorithm first uses the improved DeepLabCut keypoint detection algorithm to locate the nose, left ear, right ear, and tail root of the mouse, and then uses the ConvLSTM network to extract spatiotemporal information from the keypoint feature map sequence to classify five behaviors of mice: walking straight, resting, grooming, standing upright, and turning. We developed a murine keypoint detection and behavior recognition dataset, and experiments showed that the method achieved a percentage of correct keypoints (PCK) of 87±1% at three scales and against four backgrounds, while the classification accuracy for the five kinds of behaviors reached 93±1%. The proposed method is thus accurate for keypoint detection and behavior recognition, and is a useful tool for murine motion behavior recognition. Full article
(This article belongs to the Special Issue Symmetry in Computer Vision and Its Applications)
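
The authors' network itself is not reproduced on this page. As a rough illustration of the idea in the abstract (a ConvLSTM consuming a sequence of keypoint feature maps and emitting one of five behavior classes), the following minimal PyTorch sketch defines a single ConvLSTM cell and a small classifier head; the class names, channel sizes, and sequence length are illustrative assumptions, and the four input channels stand in for the nose, ear, and tail-root keypoints.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Minimal ConvLSTM cell: all four gates come from a single convolution."""
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.hid_ch = hid_ch
        self.conv = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        gates = self.conv(torch.cat([x, h], dim=1))
        i, f, o, g = torch.chunk(gates, 4, dim=1)
        i, f, o, g = i.sigmoid(), f.sigmoid(), o.sigmoid(), g.tanh()
        c = f * c + i * g
        h = o * c.tanh()
        return h, c

class BehaviorClassifier(nn.Module):
    """Classify a sequence of keypoint heatmaps (B, T, K, H, W) into 5 behaviors."""
    def __init__(self, n_keypoints=4, hid_ch=32, n_classes=5):
        super().__init__()
        self.cell = ConvLSTMCell(n_keypoints, hid_ch)
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(hid_ch, n_classes))

    def forward(self, seq):
        b, t, k, height, width = seq.shape
        h = seq.new_zeros(b, self.cell.hid_ch, height, width)
        c = torch.zeros_like(h)
        for step in range(t):
            h, c = self.cell(seq[:, step], (h, c))
        return self.head(h)

logits = BehaviorClassifier()(torch.randn(2, 8, 4, 64, 64))  # -> shape (2, 5)
```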

19 pages, 14625 KiB  
Article
Learning Multifeature Correlation Filter and Saliency Redetection for Long-Term Object Tracking
by Liqiang Liu, Tiantian Feng and Yanfang Fu
Symmetry 2022, 14(5), 911; https://doi.org/10.3390/sym14050911 - 29 Apr 2022
Cited by 4 | Viewed by 1934
Abstract
Recently, owing to its good balance between performance and tracking speed, the discriminative correlation filter (DCF) has become a popular and effective method for short-term tracking. The correlation response map can be computed efficiently in the Fourier domain via the discrete Fourier transform (DFT) of the input, where the DFT of an image exhibits symmetry in the Fourier domain. However, most correlation filter (CF)-based trackers cannot assess their own tracking results and lack an effective mechanism to correct tracking errors during the tracking process, so they usually perform poorly in long-term tracking. In this paper, we propose a long-term tracking framework that includes a tracking-by-detection part and a redetection part. The tracking-by-detection part is built on a DCF framework and integrates a multifeature fusion model, which effectively improves the discriminative ability of the correlation filter in challenging situations such as occlusion and color change. The redetection part searches for the tracked object in a larger region and refines the tracking results after tracking has failed. Thanks to the proposed redetection strategy, the tracking results are re-evaluated and, if necessary, refined in each frame. Moreover, the reliable estimation module in the redetection part effectively identifies whether the tracking results are correct and determines whether the redetector needs to be activated. The redetection part uses a saliency detection algorithm, which is fast and effective for object detection in a limited region. These two parts can be integrated into DCF-based tracking methods to improve long-term tracking performance and robustness. Extensive experiments on the OTB2015 and VOT2016 benchmarks show that our proposed long-term tracking method is effective and highly efficient compared with various tracking methods. Full article
(This article belongs to the Special Issue Symmetry in Computer Vision and Its Applications)
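
The full tracker combines several components; the sketch below only illustrates the Fourier-domain correlation step common to DCF-style trackers, using a MOSSE-like single-patch filter in NumPy. The window size, Gaussian label width, and regularization constant are arbitrary assumptions, not values from the paper.

```python
import numpy as np

def gaussian_label(h, w, sigma=2.0):
    """Desired response: a Gaussian peak centered on the target patch."""
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((ys - h // 2) ** 2 + (xs - w // 2) ** 2) / (2 * sigma ** 2))

def train_filter(patch, lam=1e-3):
    """Learn a correlation filter in the Fourier domain from one patch."""
    F = np.fft.fft2(patch)
    G = np.fft.fft2(gaussian_label(*patch.shape))
    return (G * np.conj(F)) / (F * np.conj(F) + lam)   # element-wise, MOSSE-style

def respond(H, patch):
    """Correlate a new patch with the filter; the response peak gives the shift."""
    response = np.real(np.fft.ifft2(H * np.fft.fft2(patch)))
    dy, dx = np.unravel_index(np.argmax(response), response.shape)
    return response, (dy, dx)

template = np.random.rand(64, 64)            # stand-in for a cropped target patch
H = train_filter(template)
_, peak = respond(H, np.roll(template, (3, 5), axis=(0, 1)))
```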

19 pages, 6745 KiB  
Article
A Multi-Attention UNet for Semantic Segmentation in Remote Sensing Images
by Yu Sun, Fukun Bi, Yangte Gao, Liang Chen and Suting Feng
Symmetry 2022, 14(5), 906; https://doi.org/10.3390/sym14050906 - 29 Apr 2022
Cited by 39 | Viewed by 10128
Abstract
In recent years, with the development of deep learning, semantic segmentation of remote sensing images has gradually become a hot topic in computer vision. However, segmentation of multicategory targets is still a difficult problem. To address the issues of poor precision and the multiple scales of different categories, we propose a multi-attention-based UNet (MA-UNet). Specifically, we propose a residual encoder based on a simple attention module to improve the backbone's ability to extract fine-grained features. By applying multi-head self-attention to the lowest-level feature map, its semantic representation is reconstructed, further enabling fine-grained segmentation of different categories of pixels. Then, to address the problem of multiple scales across categories, we increase the number of down-sampling operations to subdivide the feature sizes of targets at different scales, and use channel attention and spatial attention at the different feature fusion stages to better fuse the feature information of targets at these scales. We conducted experiments on the WHDLD and DLRSD datasets. The results show that, with multiple visual attention feature enhancements, our method achieves a mean intersection over union (IOU) of 63.94% on the WHDLD dataset, 4.27% higher than that of UNet; on the DLRSD dataset, our method improves the mean IOU from UNet’s 56.17% to 61.90%, exceeding those of other advanced methods. Full article
(This article belongs to the Special Issue Symmetry in Computer Vision and Its Applications)
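
As a rough sketch of the abstract's use of multi-head self-attention on the lowest-level feature map, the PyTorch snippet below flattens a bottleneck feature map into tokens, applies self-attention with a residual connection, and reshapes back; the channel count, head count, and feature size are assumptions, and this is not the authors' MA-UNet code.

```python
import torch
import torch.nn as nn

class BottleneckSelfAttention(nn.Module):
    """Reconstruct the lowest-level feature map with multi-head self-attention."""
    def __init__(self, channels=256, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, x):                       # x: (B, C, H, W)
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)   # (B, H*W, C)
        attended, _ = self.attn(tokens, tokens, tokens)
        tokens = self.norm(tokens + attended)   # residual connection + layer norm
        return tokens.transpose(1, 2).reshape(b, c, h, w)

feat = torch.randn(1, 256, 16, 16)              # bottleneck of a UNet-like encoder
out = BottleneckSelfAttention()(feat)           # same shape, globally re-weighted
```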

12 pages, 1689 KiB  
Article
A Large-Scale Mouse Pose Dataset for Mouse Pose Estimation
by Jun Sun, Jing Wu, Xianghui Liao, Sijia Wang and Mantao Wang
Symmetry 2022, 14(5), 875; https://doi.org/10.3390/sym14050875 - 25 Apr 2022
Cited by 1 | Viewed by 4236
Abstract
Mouse pose estimation has important applications in animal behavior research, biomedicine, and animal conservation studies. Accurate and efficient mouse pose estimation using computer vision is therefore necessary. Although methods for mouse pose estimation have been developed, bottlenecks still exist. One of the most prominent problems is the lack of uniform and standardized training datasets. Here, we resolve this difficulty by introducing a mouse pose dataset. Our dataset contains 40,000 frames of RGB images and large-scale 2D ground-truth motion images. All the images were captured from interacting lab mice from a stable single viewpoint, covering 5 distinct species and 20 mice in total. Moreover, to improve annotation efficiency, five mouse keypoints are proposed, with one keypoint at the center and the other two pairs of keypoints symmetric. We also created simple yet effective software for annotating the images, a further step toward establishing a benchmark for 2D mouse pose estimation. We employed modified object detection and pose estimation algorithms to achieve precise, effective, and robust performance. As the first large and standardized mouse pose dataset, our proposed dataset will help advance research on animal pose estimation and assist in application areas related to animal experiments. Full article
(This article belongs to the Special Issue Symmetry in Computer Vision and Its Applications)
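
The symmetric keypoint layout (one central point plus two left/right pairs) lends itself to flip augmentation that swaps the paired keypoints. The sketch below shows one way to do this in NumPy; the keypoint names and ordering are hypothetical, since the dataset's actual annotation format is not given here.

```python
import numpy as np

# Assumed keypoint order: centre, left_ear, right_ear, left_hip, right_hip.
# Only the pairing structure (one central point, two symmetric pairs) comes
# from the abstract; the real names/order in the dataset may differ.
SYMMETRIC_PAIRS = [(1, 2), (3, 4)]

def hflip_with_keypoints(image, keypoints):
    """Horizontally flip an image and its (K, 2) keypoints, swapping L/R pairs."""
    h, w = image.shape[:2]
    flipped = image[:, ::-1].copy()
    kps = keypoints.copy()
    kps[:, 0] = w - 1 - kps[:, 0]          # mirror the x coordinates
    for left, right in SYMMETRIC_PAIRS:    # keep the semantic left/right labels
        kps[[left, right]] = kps[[right, left]]
    return flipped, kps

img = np.zeros((480, 640, 3), dtype=np.uint8)
kps = np.array([[320, 240], [300, 200], [340, 200], [300, 280], [340, 280]], float)
flipped_img, flipped_kps = hflip_with_keypoints(img, kps)
```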

16 pages, 1709 KiB  
Article
PointSCNet: Point Cloud Structure and Correlation Learning Based on Space-Filling Curve-Guided Sampling
by Xingye Chen, Yiqi Wu, Wenjie Xu, Jin Li, Huaiyi Dong and Yilin Chen
Symmetry 2022, 14(1), 8; https://doi.org/10.3390/sym14010008 - 22 Dec 2021
Cited by 7 | Viewed by 3964
Abstract
Geometrical structures and the internal local region relationship, such as symmetry, regular array, junction, etc., are essential for understanding a 3D shape. This paper proposes a point cloud feature extraction network named PointSCNet, to capture the geometrical structure information and local region correlation information of a point cloud. The PointSCNet consists of three main modules: the space-filling curve-guided sampling module, the information fusion module, and the channel-spatial attention module. The space-filling curve-guided sampling module uses Z-order curve coding to sample points that contain geometrical correlation. The information fusion module uses a correlation tensor and a set of skip connections to fuse the structure and correlation information. The channel-spatial attention module enhances the representation of key points and crucial feature channels to refine the network. The proposed PointSCNet is evaluated on shape classification and part segmentation tasks. The experimental results demonstrate that the PointSCNet outperforms or is on par with state-of-the-art methods by learning the structure and correlation of point clouds effectively. Full article
(This article belongs to the Special Issue Symmetry in Computer Vision and Its Applications)
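
Z-order (Morton) coding is a standard way to linearize 3D points along a space-filling curve. The NumPy sketch below computes Morton codes for a normalized point cloud and samples points along the resulting curve; the 10-bit quantization and the sampling stride are arbitrary choices, and this is not the PointSCNet sampling module itself.

```python
import numpy as np

def part1by2(v):
    """Spread the bits of a 10-bit integer so they occupy every third bit."""
    v = v & 0x3FF
    v = (v | (v << 16)) & 0x30000FF
    v = (v | (v << 8)) & 0x300F00F
    v = (v | (v << 4)) & 0x30C30C3
    v = (v | (v << 2)) & 0x9249249
    return v

def z_order_keys(points, bits=10):
    """Morton codes for an (N, 3) point cloud normalized into the unit cube."""
    p = points - points.min(axis=0)
    p = p / (p.max() + 1e-9)
    grid = np.minimum((p * (2 ** bits - 1)).astype(np.int64), 2 ** bits - 1)
    return (part1by2(grid[:, 0])
            | (part1by2(grid[:, 1]) << 1)
            | (part1by2(grid[:, 2]) << 2))

pts = np.random.rand(1024, 3)
order = np.argsort(z_order_keys(pts))   # points reordered along the Z curve
sampled = pts[order][::4]               # e.g. keep every 4th point along the curve
```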

29 pages, 23243 KiB  
Article
PDAM–STPNNet: A Small Target Detection Approach for Wildland Fire Smoke through Remote Sensing Images
by Jialei Zhan, Yaowen Hu, Weiwei Cai, Guoxiong Zhou and Liujun Li
Symmetry 2021, 13(12), 2260; https://doi.org/10.3390/sym13122260 - 27 Nov 2021
Cited by 34 | Viewed by 4147
Abstract
The target detection of smoke through remote sensing images obtained by means of unmanned aerial vehicles (UAVs) can be effective for monitoring early forest fires. However, smoke targets in UAV images are often small and difficult to detect accurately. In this paper, we use YOLOX-L as a baseline and propose a forest smoke detection network based on the parallel spatial domain attention mechanism and a small-scale transformer feature pyramid network (PDAM–STPNNet). First, to enhance the proportion of small forest fire smoke targets in the dataset, we use component stitching data enhancement to generate small forest fire smoke target images in a scaled collage. Then, to fully extract the texture features of smoke, we propose a parallel spatial domain attention mechanism (PDAM) to consider the local and global textures of smoke with symmetry. Finally, we propose a small-scale transformer feature pyramid network (STPN), which uses the transformer encoder to replace all CSP_2 blocks in turn on top of YOLOX-L’s FPN, effectively improving the model’s ability to extract small-target smoke. We validated the effectiveness of our model with recourse to a home-made dataset, the Wildfire Observers and Smoke Recognition Homepage, and the Bowfire dataset. The experiments show that our method has a better detection capability than previous methods. Full article
(This article belongs to the Special Issue Symmetry in Computer Vision and Its Applications)
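
As a loose illustration of the component-stitching idea (raising the share of small smoke targets by pasting down-scaled crops into a collage), the sketch below builds a simple grid collage with OpenCV and returns the pasted boxes; the canvas size, grid layout, and scale factor are assumptions rather than the paper's settings.

```python
import numpy as np
import cv2

def stitch_collage(crops, canvas_size=(640, 640), grid=2, scale=0.4):
    """Paste down-scaled smoke crops onto a grid collage, returning their boxes."""
    canvas = np.zeros((*canvas_size, 3), dtype=np.uint8)
    boxes = []
    cell_h, cell_w = canvas_size[0] // grid, canvas_size[1] // grid
    for idx, crop in enumerate(crops[: grid * grid]):
        small = cv2.resize(crop, None, fx=scale, fy=scale)
        r, c = divmod(idx, grid)
        y0 = r * cell_h + np.random.randint(0, max(1, cell_h - small.shape[0]))
        x0 = c * cell_w + np.random.randint(0, max(1, cell_w - small.shape[1]))
        canvas[y0:y0 + small.shape[0], x0:x0 + small.shape[1]] = small
        boxes.append((x0, y0, small.shape[1], small.shape[0]))  # x, y, w, h
    return canvas, boxes

crops = [np.full((160, 160, 3), 200, np.uint8) for _ in range(4)]  # stand-in smoke crops
collage, boxes = stitch_collage(crops)
```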

15 pages, 3047 KiB  
Article
Bi-SANet—Bilateral Network with Scale Attention for Retinal Vessel Segmentation
by Yun Jiang, Huixia Yao, Zeqi Ma and Jingyao Zhang
Symmetry 2021, 13(10), 1820; https://doi.org/10.3390/sym13101820 - 29 Sep 2021
Cited by 4 | Viewed by 2162
Abstract
The segmentation of retinal vessels is critical for the diagnosis of some fundus diseases. Retinal vessel segmentation requires abundant spatial information and receptive fields of different sizes, while existing methods usually sacrifice spatial resolution to achieve real-time inference speed, resulting in inadequate segmentation of vessels in low-contrast regions and weak robustness to noise. The asymmetry of capillaries in fundus images further increases the difficulty of segmentation. In this paper, we propose a two-branch network based on multi-scale attention to alleviate these problems. First, a coarse network with a multi-scale U-Net as the backbone is designed to capture more semantic information and to generate high-resolution features. A multi-scale attention module is used to obtain sufficiently large receptive fields. The other branch is a fine network, which uses residual blocks with small convolution kernels to make up for the deficiency of spatial information. Finally, a feature fusion module aggregates the information of the coarse and fine networks. Experiments were performed on the DRIVE, CHASE, and STARE datasets. The accuracy reached 96.93%, 97.58%, and 97.70%, the specificity reached 97.72%, 98.52%, and 98.94%, and the F-measure reached 83.82%, 81.39%, and 84.36%, respectively. The experimental results show that, compared with state-of-the-art methods such as Sine-Net and SA-Net, our proposed method performs better on all three datasets. Full article
(This article belongs to the Special Issue Symmetry in Computer Vision and Its Applications)
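
The coarse/fine two-branch design boils down to fusing a low-resolution semantic feature map with a full-resolution detail feature map. The PyTorch sketch below shows one plausible fusion module (upsample, concatenate, convolve); the channel counts and the sigmoid vessel-map head are assumptions, not the Bi-SANet implementation.

```python
import torch
import torch.nn as nn

class FusionModule(nn.Module):
    """Fuse coarse (semantic) and fine (spatial-detail) branch features."""
    def __init__(self, coarse_ch=64, fine_ch=32, out_ch=32):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(coarse_ch + fine_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, 1, 1))          # 1-channel vessel probability map

    def forward(self, coarse, fine):
        coarse = nn.functional.interpolate(coarse, size=fine.shape[2:],
                                           mode="bilinear", align_corners=False)
        return torch.sigmoid(self.fuse(torch.cat([coarse, fine], dim=1)))

coarse = torch.randn(1, 64, 128, 128)       # low-resolution, semantically rich branch
fine = torch.randn(1, 32, 512, 512)         # full-resolution, detail-preserving branch
vessel_map = FusionModule()(coarse, fine)   # (1, 1, 512, 512)
```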

14 pages, 4311 KiB  
Article
Reinforced Neighbour Feature Fusion Object Detection with Deep Learning
by Ningwei Wang, Yaze Li and Hongzhe Liu
Symmetry 2021, 13(9), 1623; https://doi.org/10.3390/sym13091623 - 3 Sep 2021
Cited by 5 | Viewed by 2014
Abstract
Neural networks have enabled state-of-the-art approaches to achieve remarkable results on computer vision tasks such as object detection. However, previous works have tried to improve performance through various object detection necks but have failed to extract features efficiently. To address the insufficient feature representation of objects, this work builds on some of the most advanced and representative network models based on the Faster R-CNN architecture, such as Libra R-CNN, Grid R-CNN, guided anchoring, and GRoIE. We observed the precision, at different object scales, of Neighbour Feature Pyramid Network (NFPN) fusion, ResNet Region of Interest Feature Extraction (ResRoIE), and the Recursive Feature Pyramid (RFP) architecture when these components were used in place of the corresponding original members of various networks evaluated on the MS COCO dataset. When the neck and RoIE parts of these models are replaced with our Reinforced Neighbour Feature Fusion (RNFF) model, the average precision (AP) is increased by 3.2 percentage points with respect to the baseline network. Full article
(This article belongs to the Special Issue Symmetry in Computer Vision and Its Applications)
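
Neighbour feature fusion can be illustrated by resampling each pyramid level's adjacent levels to its resolution and summing them. The sketch below does exactly that with PyTorch's interpolation; it conveys the neighbour-fusion idea named in the abstract rather than the published NFPN/RNFF code, and assumes all levels share a channel count.

```python
import torch
import torch.nn.functional as F

def neighbour_fuse(pyramid):
    """Fuse each pyramid level with its up- and down-sampled neighbours."""
    fused = []
    for i, feat in enumerate(pyramid):
        acc = feat.clone()
        if i > 0:                            # finer neighbour, downsample to match
            acc = acc + F.interpolate(pyramid[i - 1], size=feat.shape[2:],
                                      mode="bilinear", align_corners=False)
        if i < len(pyramid) - 1:             # coarser neighbour, upsample to match
            acc = acc + F.interpolate(pyramid[i + 1], size=feat.shape[2:],
                                      mode="bilinear", align_corners=False)
        fused.append(acc)
    return fused

levels = [torch.randn(1, 256, s, s) for s in (64, 32, 16, 8)]   # P2-P5 style pyramid
fused_levels = neighbour_fuse(levels)
```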

18 pages, 82446 KiB  
Article
Research on Remote Sensing Image Matching with Special Texture Background
by Sen Wang, Xiaoming Sun, Pengfei Liu, Kaige Xu, Weifeng Zhang and Chenxu Wu
Symmetry 2021, 13(8), 1380; https://doi.org/10.3390/sym13081380 - 29 Jul 2021
Cited by 4 | Viewed by 1900
Abstract
The purpose of image registration is to find the symmetry between the reference image and the image to be registered. In order to improve the registration of unmanned aerial vehicle (UAV) remote sensing imagery with a special texture background, this paper proposes an improved scale-invariant feature transform (SIFT) algorithm that combines image color and exposure information based on an adaptive quantization strategy (AQCE-SIFT). By using the color and exposure information of the image, this method can enhance the contrast between the textures of an image with a special texture background, which makes feature extraction easier. The descriptor is constructed through an adaptive quantization strategy, so that remote sensing images with large geometric distortion or affine changes achieve a higher correct matching rate during registration. The experimental results showed that the feature points extracted by the proposed AQCE-SIFT algorithm were more reasonably distributed than those of the traditional SIFT algorithm. Under 0-degree, 30-degree, and 60-degree geometric distortion, when the remote sensing image contained texture-scarce regions, the number of matching points increased by 21.3%, 45.5%, and 28.6%, respectively, and the correct matching rate increased by 0%, 6.0%, and 52.4%, respectively. When the remote sensing image contained many similar, repetitive texture regions, the number of matching points increased by 30.4%, 30.9%, and −11.1%, respectively, and the correct matching rate increased by 1.2%, 0.8%, and 20.8%, respectively. When processing remote sensing images with special texture backgrounds, the AQCE-SIFT algorithm also has advantages over existing common algorithms such as color SIFT (CSIFT), gradient location and orientation histogram (GLOH), and speeded-up robust features (SURF) in searching for the symmetry of features between images. Full article
(This article belongs to the Special Issue Symmetry in Computer Vision and Its Applications)
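
AQCE-SIFT itself is not distributed with OpenCV, but the baseline it improves on is. The sketch below runs plain SIFT detection plus Lowe's ratio-test matching (requires opencv-python 4.4 or later); the random arrays stand in for a real UAV image pair, which would normally be loaded with cv2.imread.

```python
import cv2
import numpy as np

def sift_match(img_ref, img_mov, ratio=0.75):
    """Plain SIFT keypoint matching with Lowe's ratio test (baseline, not AQCE-SIFT)."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img_ref, None)
    kp2, des2 = sift.detectAndCompute(img_mov, None)
    if des1 is None or des2 is None or len(des2) < 2:
        return np.empty((0, 2), np.float32), np.empty((0, 2), np.float32)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
            if m.distance < ratio * n.distance]
    src = np.float32([kp1[m.queryIdx].pt for m in good])
    dst = np.float32([kp2[m.trainIdx].pt for m in good])
    return src, dst

ref = np.random.randint(0, 255, (480, 640), np.uint8)   # stand-ins for a UAV image pair
mov = np.random.randint(0, 255, (480, 640), np.uint8)
src_pts, dst_pts = sift_match(ref, mov)
# A homography between the two views could then be estimated with cv2.findHomography.
```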

18 pages, 31516 KiB  
Article
MRDA-MGFSNet: Network Based on a Multi-Rate Dilated Attention Mechanism and Multi-Granularity Feature Sharer for Image-Based Butterflies Fine-Grained Classification
by Maopeng Li, Guoxiong Zhou, Weiwei Cai, Jiayong Li, Mingxuan Li, Mingfang He, Yahui Hu and Liujun Li
Symmetry 2021, 13(8), 1351; https://doi.org/10.3390/sym13081351 - 26 Jul 2021
Cited by 5 | Viewed by 2527
Abstract
To address the high background complexity of some butterfly images and the difficulty of identification caused by their small inter-class variance, we propose a new fine-grained butterfly classification architecture, the Network based on a Multi-rate Dilated Attention Mechanism and Multi-granularity Feature Sharer (MRDA-MGFSNet). First, in order to effectively identify similar patterns between butterflies and suppress background information that resembles the butterfly’s features but is irrelevant, a Multi-rate Dilated Attention Mechanism (MRDA) with a symmetrical structure, which assigns different weights to channel and spatial features, is designed. Second, by fusing a multi-scale receptive field module with a depthwise separable convolution module, a Multi-granularity Feature Sharer (MGFS) is proposed, which better handles recognition under small inter-class variance and limits the growth in parameters caused by multi-scale receptive fields. To verify the feasibility and effectiveness of the model in a complex environment, we compared it with existing methods: our proposed method obtained a mAP of 96.64% and an F1 value of 95.44%, which shows that the method proposed in this paper performs well on the fine-grained classification of butterflies. Full article
(This article belongs to the Special Issue Symmetry in Computer Vision and Its Applications)
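
Two ingredients named in the abstract, multi-rate dilated convolution and depthwise separable convolution, can be combined in a small parallel block. The PyTorch sketch below does so; the dilation rates and channel count are guesses, and the block illustrates those building blocks rather than the MRDA or MGFS modules themselves.

```python
import torch
import torch.nn as nn

class SeparableDilatedBranch(nn.Module):
    """Depthwise separable convolution with a given dilation rate."""
    def __init__(self, ch, rate):
        super().__init__()
        self.depthwise = nn.Conv2d(ch, ch, 3, padding=rate, dilation=rate, groups=ch)
        self.pointwise = nn.Conv2d(ch, ch, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

class MultiRateBlock(nn.Module):
    """Parallel branches with different dilation rates, merged by a 1x1 conv."""
    def __init__(self, ch=64, rates=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(SeparableDilatedBranch(ch, r) for r in rates)
        self.merge = nn.Conv2d(ch * len(rates), ch, 1)

    def forward(self, x):
        return self.merge(torch.cat([b(x) for b in self.branches], dim=1))

y = MultiRateBlock()(torch.randn(1, 64, 56, 56))   # same spatial size, fused receptive fields
```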

14 pages, 4870 KiB  
Article
A Descriptor-Based Advanced Feature Detector for Improved Visual Tracking
by Kai Yit Kok and Parvathy Rajendran
Symmetry 2021, 13(8), 1337; https://doi.org/10.3390/sym13081337 - 24 Jul 2021
Cited by 3 | Viewed by 2222
Abstract
Despite years of work, a robust, widely applicable generic “symmetry detector” that can parallel other kinds of computer vision/image processing tools for the more basic structural characteristics, such as an “edge” or “corner” detector, remains a computational challenge. A new symmetry feature detector with a descriptor is proposed in this paper, namely the Simple Robust Features (SRF) algorithm. A performance comparison is made among SRF with SRF, Speeded-up Robust Features (SURF) with SURF, Maximally Stable Extremal Regions (MSER) with SURF, Harris with Fast Retina Keypoint (FREAK), Minimum Eigenvalue with FREAK, Features from Accelerated Segment Test (FAST) with FREAK, and Binary Robust Invariant Scalable Keypoints (BRISK) with FREAK. A visual tracking dataset is used for this performance evaluation in terms of accuracy and computational cost. The results show that combining the SRF detector with the SRF descriptor is preferable, as it has the highest accuracy on average. Additionally, the computational cost of SRF with SRF is much lower than that of the others. Full article
(This article belongs to the Special Issue Symmetry in Computer Vision and Its Applications)
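
The SRF detector/descriptor is not publicly available, but the style of comparison the abstract reports (keypoint count and per-frame cost for each detector/descriptor pairing) is easy to reproduce for detectors that ship with OpenCV. The sketch below times ORB and BRISK on a stand-in frame; FREAK and SURF live in opencv-contrib and would be benchmarked the same way, and the accuracy evaluation against ground truth is omitted here.

```python
import time
import cv2
import numpy as np

def benchmark(name, detector, image, runs=10):
    """Time keypoint detection + description for one detector/descriptor pair."""
    start = time.perf_counter()
    for _ in range(runs):
        kps, des = detector.detectAndCompute(image, None)
    elapsed = (time.perf_counter() - start) / runs
    print(f"{name:6s}: {len(kps):5d} keypoints, {elapsed * 1000:.1f} ms/frame")

frame = np.random.randint(0, 255, (480, 640), np.uint8)   # stand-in for a tracking frame
benchmark("ORB", cv2.ORB_create(), frame)
benchmark("BRISK", cv2.BRISK_create(), frame)
# The paper's SRF, plus FREAK/SURF pairings, would be added the same way where available.
```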

16 pages, 2295 KiB  
Article
FE-RetinaNet: Small Target Detection with Parallel Multi-Scale Feature Enhancement
by Hong Liang, Junlong Yang and Mingwen Shao
Symmetry 2021, 13(6), 950; https://doi.org/10.3390/sym13060950 - 27 May 2021
Cited by 11 | Viewed by 2724
Abstract
Because small targets have fewer pixels and carry fewer features, most target detection algorithms cannot effectively use the edge and semantic information of small targets in the feature map, resulting in low detection accuracy and occasional missed and false detections. To address RetinaNet’s insufficient feature information for small targets, this work introduces a parallel-assisted multi-scale feature enhancement module, MFEM (Multi-scale Feature Enhancement Model), which uses dilated convolutions with different dilation rates to avoid repeated down-sampling. MFEM thus avoids the information loss caused by repeated down-sampling and at the same time helps the shallow layers extract multi-scale context information. Additionally, this work adopts a backbone improvement scheme specifically designed for target detection tasks, which effectively preserves small-target information in high-level feature maps. The traditional top-down pyramid structure focuses on transferring high-level semantics from top to bottom, and this one-way information flow is not conducive to the detection of small targets. In this work, the auxiliary MFEM branch is combined with RetinaNet to construct a model with a bidirectional feature pyramid network, which effectively integrates the strong semantic information of the high-level layers with the high-resolution information of the low-level layers. The bidirectional feature pyramid network designed in this work has a symmetrical structure, comprising a top-down branch and a bottom-up branch, and performs the transfer and fusion of strong semantic and high-resolution information. To demonstrate the effectiveness of the FE-RetinaNet (Feature Enhancement RetinaNet) algorithm, this work conducts experiments on MS COCO. Compared with the original RetinaNet, the improved network achieves a 1.8% improvement in detection accuracy (mAP) on MS COCO, with a COCO AP of 36.2%; FE-RetinaNet also detects small targets well, with APs (AP for small objects) increased by 3.2%. Full article
(This article belongs to the Special Issue Symmetry in Computer Vision and Its Applications)
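
The symmetric top-down plus bottom-up pyramid described in the abstract can be sketched as two sequential fusion passes over a feature pyramid. The snippet below is such a sketch in PyTorch, assuming all levels share the same channel count; it illustrates the bidirectional flow only, not FE-RetinaNet's actual MFEM-assisted architecture.

```python
import torch
import torch.nn.functional as F

def bidirectional_fpn(pyramid):
    """Top-down pass (semantics flow down) followed by a bottom-up pass
    (high-resolution detail flows up); levels are ordered fine -> coarse."""
    feats = list(pyramid)
    for i in range(len(feats) - 2, -1, -1):                 # top-down branch
        feats[i] = feats[i] + F.interpolate(feats[i + 1], size=feats[i].shape[2:],
                                            mode="nearest")
    for i in range(1, len(feats)):                          # bottom-up branch
        feats[i] = feats[i] + F.max_pool2d(feats[i - 1], kernel_size=2)
    return feats

levels = [torch.randn(1, 256, s, s) for s in (64, 32, 16, 8)]
out = bidirectional_fpn(levels)
```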

19 pages, 9107 KiB  
Article
Multi-Stroke Thai Finger-Spelling Sign Language Recognition System with Deep Learning
by Thongpan Pariwat and Pusadee Seresangtakul
Symmetry 2021, 13(2), 262; https://doi.org/10.3390/sym13020262 - 4 Feb 2021
Cited by 17 | Viewed by 4882
Abstract
Sign language is a form of language for the hearing impaired that the general public commonly does not understand. A sign language recognition system therefore serves as an intermediary between the two sides. As such a communication tool, a multi-stroke Thai finger-spelling sign language (TFSL) recognition system based on deep learning was developed in this study. This research uses a vision-based technique on complex backgrounds, with semantic segmentation via dilated convolution for hand segmentation, hand strokes separated using optical flow, and feature learning and classification performed with a convolutional neural network (CNN). We then compared five CNN structures with different configurations. The first configuration set the number of filters to 64 and the filter size to 3 × 3 with 7 layers; the second used 128 filters, each 3 × 3 in size, with 7 layers; the third used an ascending number of filters across 7 layers, all with an equal 3 × 3 filter size; the fourth used an ascending number of filters with small filter sizes and 7 layers; the final configuration was a structure based on AlexNet. The resulting average accuracies were 88.83%, 87.97%, 89.91%, 90.43%, and 92.03%, respectively. We adopted the AlexNet-based CNN structure to build models for the multi-stroke TFSL recognition system. The experiment was performed on isolated videos of 42 Thai letters, which are divided into three categories consisting of one stroke, two strokes, and three strokes. The results show an average accuracy of 88.00% for one stroke, 85.42% for two strokes, and 75.00% for three strokes. Full article
(This article belongs to the Special Issue Symmetry in Computer Vision and Its Applications)
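
Separating hand strokes with optical flow typically means looking for frames where motion nearly stops. The OpenCV sketch below computes dense Farneback flow between consecutive frames and flags low-motion frames as candidate stroke boundaries; the motion threshold and the random stand-in clip are assumptions, not the paper's pipeline parameters.

```python
import cv2
import numpy as np

def stroke_boundaries(frames, motion_thresh=0.5):
    """Return frame indices where hand motion pauses (candidate stroke boundaries),
    based on the mean dense optical-flow magnitude between consecutive frames."""
    boundaries = []
    prev = cv2.cvtColor(frames[0], cv2.COLOR_BGR2GRAY)
    for idx in range(1, len(frames)):
        cur = cv2.cvtColor(frames[idx], cv2.COLOR_BGR2GRAY)
        flow = cv2.calcOpticalFlowFarneback(prev, cur, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        magnitude = np.linalg.norm(flow, axis=2).mean()
        if magnitude < motion_thresh:      # little motion -> pause between strokes
            boundaries.append(idx)
        prev = cur
    return boundaries

clip = [np.random.randint(0, 255, (240, 320, 3), np.uint8) for _ in range(30)]
pauses = stroke_boundaries(clip)   # segments between pauses go to the CNN classifier
```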

23 pages, 12149 KiB  
Article
Vacant Parking Slot Recognition Method for Practical Autonomous Valet Parking System Using around View Image
by Seunghyun Kim, Joongsik Kim, Moonsoo Ra and Whoi-Yul Kim
Symmetry 2020, 12(10), 1725; https://doi.org/10.3390/sym12101725 - 19 Oct 2020
Cited by 8 | Viewed by 5672
Abstract
The parking assist system (PAS) provides information about parking slots around the vehicle. As the demand for autonomous systems increases, intelligent PASs have been developed to park the vehicle without the driver’s intervention. To locate parking slots, most existing methods detect slot markings on the ground using an around-view monitoring (AVM) image. In the real world, there are many types of parking slots with different shapes. Consequently, these methods either limit their target slot types or rely on predefined slot information for the different types to cover them. However, the approach using predefined slot information cannot handle more complex cases, where the slot markings are connected to other line markings or the angle between slot markings differs slightly from the predefined settings. To overcome this problem, we propose a method to detect parking slots of various shapes without predefined type information. The proposed method is the first to introduce a free junction type feature to represent the structure of a parking slot junction. Since a parking slot has a modular or repeated junction pattern on both sides, the junction pair forming one parking slot can be detected using the free junction type feature. In this process, the geometrically symmetric characteristic of the junction pair is crucial for finding each pair. The entrance of the parking slot is reconstructed according to the structure of the junction pair, and the vacancy of the parking slot is then determined by a support vector machine. A Kalman tracker is applied to each detected parking slot to ensure stable detection in consecutive frames. We evaluate the performance of the proposed method using manually collected datasets captured in different parking environments. The experimental results show that the proposed method successfully detects various types of parking slots without predefined slot type information in different environments. Full article
(This article belongs to the Special Issue Symmetry in Computer Vision and Its Applications)
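
Applying a Kalman tracker to a detected slot can be sketched with OpenCV's cv2.KalmanFilter smoothing one junction position under a constant-velocity model. The state layout, noise covariances, and example detections below are assumptions chosen for illustration, not values from the paper.

```python
import cv2
import numpy as np

def make_junction_tracker(x0, y0):
    """Constant-velocity Kalman filter for one slot junction: state = (x, y, vx, vy)."""
    kf = cv2.KalmanFilter(4, 2)
    kf.transitionMatrix = np.array([[1, 0, 1, 0],
                                    [0, 1, 0, 1],
                                    [0, 0, 1, 0],
                                    [0, 0, 0, 1]], np.float32)
    kf.measurementMatrix = np.array([[1, 0, 0, 0],
                                     [0, 1, 0, 0]], np.float32)
    kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-2
    kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1e-1
    kf.statePost = np.array([[x0], [y0], [0], [0]], np.float32)
    return kf

tracker = make_junction_tracker(120.0, 340.0)
for detected_xy in [(121.5, 339.0), (123.0, 338.2), (124.8, 337.1)]:  # per-frame detections
    tracker.predict()
    smoothed = tracker.correct(np.array([[detected_xy[0]], [detected_xy[1]]], np.float32))
    # smoothed[:2] is the stabilized junction position fed to slot reconstruction
```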
