Machine Learning and Deep Learning Based Pattern Recognition

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Artificial Intelligence".

Deadline for manuscript submissions: 14 May 2025 | Viewed by 25448

Special Issue Editors


Guest Editor
Pattern Processing Lab, School of Computer Science and Engineering, The University of Aizu, Aizu-Wakamatsu, Fukushima 965-8580, Japan
Interests: pattern recognition; character recognition; image processing; computer vision; human–computer interaction; neurological disease analysis; machine learning

Guest Editor
Computer Science and Engineering, Rajshahi University of Engineering and Technology (RUET), Rajshahi 6204, Bangladesh
Interests: bioinformatics; artificial intelligence; pattern recognition; medical image and signal processing; machine learning; computer vision

Guest Editor
Computer Communications Laboratory, School of Computer Science and Engineering, The University of Aizu, Aizu-Wakamatsu, Fukushima 965-8580, Japan
Interests: applications of artificial intelligence/machine learning for wireless networks; wireless communication networks; network security

Special Issue Information

Dear Colleagues,

In the modern digital world, patterns can be found in many facets of daily life. They can be observed physically or detected computationally using algorithms. In a digital environment, a pattern is represented by a feature vector or matrix. Recently, numerous machine learning (ML)- and deep learning (DL)-based techniques have been widely used to handle and analyze these feature values in the artificial intelligence (AI) domain. ML is a branch of AI whose goal is to let computers make their own decisions, with minimal human involvement, using pattern data; DL, in turn, is a branch of ML and a popular topic in the field of AI. Using ML and DL models to extract meaningful features from given text, image, video, or sensor data and to analyze those features is known as pattern recognition (PR). PR has been applied across engineering fields such as computer vision, sensor data analysis, natural language processing, speech recognition, robotics, and bioinformatics. The goal of this Special Issue is to publish innovative and technically sound research papers that make theoretical and practical contributions to PR utilizing ML and DL methodologies.

In this Special Issue, original research articles and reviews are welcome. Research areas may include (but are not limited to) the following:

  • Image processing/segmentation/recognition;
  • Computer vision;
  • Speech recognition;
  • Automated target recognition;
  • Character recognition;
  • Gesture and human activity recognition;
  • Industrial inspection;
  • Medical diagnosis;
  • Health informatics;
  • Biosignal processing;
  • Bioinformatics;
  • Remote sensing;
  • Healthcare applications;
  • ML and DL and the Internet of Things (IoT);
  • Large dataset analysis;
  • Current state-of-the-art and future trends of ML and DL.

Prof. Dr. Jungpil Shin
Prof. Dr. Md. Al Mehedi Hasan 
Dr. Hoang D. Le
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • machine learning
  • deep learning
  • pattern recognition

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (13 papers)


Research

33 pages, 28200 KiB  
Article
Hierarchical Image Quality Improvement Based on Illumination, Resolution, and Noise Factors for Improving Object Detection
by Tae-su Wang, Gi-Tae Kim, Jungpil Shin and Si-Woong Jang
Electronics 2024, 13(22), 4438; https://doi.org/10.3390/electronics13224438 - 12 Nov 2024
Viewed by 460
Abstract
Object detection performance is significantly impacted by image quality factors such as illumination, resolution, and noise. This paper proposes a hierarchical image quality improvement process that dynamically prioritizes these factors based on severity, enhancing detection accuracy in diverse conditions. The process evaluates each factor—illumination, resolution, and noise—using discriminators that analyze brightness, edge strength, and noise levels. Improvements are applied iteratively with an adaptive weight update mechanism that adjusts factor importance based on improvement effectiveness. Following each improvement, a quality assessment is conducted, updating weights to fine-tune subsequent adjustments. This allows the process to learn optimal parameters for varying conditions, enhancing adaptability. The image improved through the proposed process shows improved quality through quality index (PSNR, SSIM) evaluation, and the object detection accuracy is significantly improved when the performance is measured using deep learning models called YOLOv8 and RT-DETR. The detection rate is improved by 7% for the ‘Bottle’ object in a high-light environment, and by 4% and 2.5% for the ‘Bicycle’ and ‘Car’ objects in a low-light environment, respectively. Additionally, segmentation accuracy saw a 9.45% gain, supporting the effectiveness of this method in real-world applications. Full article
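The PSNR/SSIM indices used in the quality evaluation above are straightforward to reproduce; below is a minimal NumPy sketch with a simplified, single-window SSIM (global statistics rather than the sliding-window form typically used; shapes and values are illustrative only):

```python
import numpy as np

def psnr(ref, test, max_val=255.0):
    # Peak signal-to-noise ratio between two same-shaped images
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)

def global_ssim(ref, test, max_val=255.0):
    # Simplified SSIM computed from global statistics (the standard metric
    # averages a sliding window; this is a rough stand-in)
    x, y = ref.astype(np.float64), test.astype(np.float64)
    c1, c2 = (0.01 * max_val) ** 2, (0.03 * max_val) ** 2
    mx, my = x.mean(), y.mean()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (x.var() + y.var() + c2))
```

A rise in PSNR/SSIM after an enhancement step indicates the step helped, which is the kind of signal an adaptive weight-update mechanism such as the one above can use.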
(This article belongs to the Special Issue Machine Learning and Deep Learning Based Pattern Recognition)

17 pages, 4756 KiB  
Article
CFE-YOLOv8s: Improved YOLOv8s for Steel Surface Defect Detection
by Shuxin Yang, Yang Xie, Jianqing Wu, Weidong Huang, Hongsheng Yan, Jingyong Wang, Bi Wang, Xiangchun Yu, Qiang Wu and Fei Xie
Electronics 2024, 13(14), 2771; https://doi.org/10.3390/electronics13142771 - 15 Jul 2024
Viewed by 1409
Abstract
Due to the low detection accuracy in steel surface defect detection and the constraints of limited hardware resources, we propose an improved model for steel surface defect detection, named CBiF-FC-EFC-YOLOv8s (CFE-YOLOv8s), including CBS-BiFormer (CBiF) modules, Faster-C2f (FC) modules, and EMA-Faster-C2f (EFC) modules. Firstly, because of the potential information loss that convolutional neural networks (CNN) may encounter when dealing with miniature targets, the CBiF combines CNN with Transformer to optimize local and global features. Secondly, to address the increased computational complexity caused by the extensive use of convolutional layers, the FC uses the FasterNet block to reduce redundant computations and memory access. Lastly, the EMA is incorporated into the FC to design the EFC module and enhance feature fusion capability while ensuring the light weight of the model. CFE-YOLOv8s achieves mAP@0.5 values of 77.8% and 69.5% on the NEU-DET and GC10-DET datasets, respectively, representing enhancements of 3.1% and 2.8% over YOLOv8s, with reductions of 22% and 18% in model parameters and FLOPS. The CFE-YOLOv8s demonstrates superior overall performance and balance compared to other advanced models. Full article

32 pages, 18590 KiB  
Article
Data-Driven Machine Fault Diagnosis of Multisensor Vibration Data Using Synchrosqueezed Transform and Time-Frequency Image Recognition with Convolutional Neural Network
by Dominik Łuczak
Electronics 2024, 13(12), 2411; https://doi.org/10.3390/electronics13122411 - 20 Jun 2024
Cited by 4 | Viewed by 1011
Abstract
Accurate vibration classification using inertial measurement unit (IMU) data is critical for various applications such as condition monitoring and fault diagnosis. This study proposes a novel convolutional neural network (CNN) based approach, the IMU6DoF-SST-CNN in six variants, for robust vibration classification. The method utilizes Fourier synchrosqueezed transform (FSST) and wavelet synchrosqueezed transform (WSST) for time-frequency analysis, effectively capturing the temporal and spectral characteristics of the vibration data. Additionally, the IMU6DoF-SST-CNN was used to explore three different sensor-data fusion strategies to combine information from the IMU’s multiple axes, allowing the CNN to learn from complementary information across the axes. The efficacy of the proposed method was validated using three datasets. The first dataset consisted of constant fan velocity data (three classes: idle, normal operation, and fault) at 200 Hz. The second dataset contained variable fan velocity data (also with three classes: normal operation, fault 1, and fault 2) at 2000 Hz. Finally, a third dataset, from Case Western Reserve University (CWRU), comprised bearing fault data with thirteen classes, sampled at 12 kHz. The proposed method achieved a perfect validation accuracy for the investigated vibration classification task. While all variants of the method achieved high accuracy, a trade-off between training speed and image generation efficiency was observed. Furthermore, FSST demonstrated superior localization capabilities compared to traditional methods like continuous wavelet transform (CWT) and short-time Fourier transform (STFT), as confirmed by image representations and interpretability analysis. This improved localization allows the CNN to effectively capture transient features associated with faults, leading to more accurate vibration classification.
Overall, this study presents a promising and efficient approach for vibration classification using IMU data with the proposed IMU6DoF-SST-CNN method. The best result was obtained for IMU6DoF-SST-CNN with FSST and sensor-type fusion. Full article
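Generating the time-frequency images that feed such a CNN can be sketched with SciPy; plain STFT is used here as a simpler stand-in for the FSST/WSST sharpening described in the paper, and the synthetic signal, sampling rate, and window length are illustrative:

```python
import numpy as np
from scipy.signal import stft

fs = 200.0                                  # sampling rate (Hz)
t = np.arange(0, 2.0, 1.0 / fs)
vib = np.sin(2 * np.pi * 25 * t) + 0.5 * np.sin(2 * np.pi * 60 * t)  # synthetic vibration

f, tau, Z = stft(vib, fs=fs, nperseg=64)    # time-frequency decomposition
img = np.abs(Z)                             # magnitude "image" for the CNN
img = (img - img.min()) / (img.max() - img.min() + 1e-12)  # normalize to [0, 1]
# Stacking such images from all six IMU axes along a channel axis yields a
# multi-channel CNN input; how the axes are stacked is one of the fusion choices.
```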

19 pages, 2103 KiB  
Article
GDCP-YOLO: Enhancing Steel Surface Defect Detection Using Lightweight Machine Learning Approach
by Zhaohui Yuan, Hao Ning, Xiangyang Tang and Zhengzhe Yang
Electronics 2024, 13(7), 1388; https://doi.org/10.3390/electronics13071388 - 6 Apr 2024
Cited by 5 | Viewed by 3112
Abstract
Surface imperfections in steel materials potentially degrade quality and performance, thereby escalating the risk of accidents in engineering applications. Manual inspection, while traditional, is laborious and lacks consistency. However, recent advancements in machine learning and computer vision have paved the way for automated steel defect detection, yielding superior accuracy and efficiency. This paper introduces an innovative deep learning model, GDCP-YOLO, devised for multi-category steel defect detection. We enhance the reference YOLOv8n architecture by incorporating adaptive receptive fields via the DCNV2 module and channel attention in C2f. These integrations aim to concentrate on valuable features and minimize parameters. We incorporate the efficient Faster Block and employ Ghost convolutions to generate more feature maps with reduced computation. These modifications streamline feature extraction, curtail redundant information processing, and boost detection accuracy and speed. Comparative trials on the NEU-DET dataset underscore the state-of-the-art performance of GDCP-YOLO. Ablation studies and generalization experiments reveal consistent performance across a variety of defect types. The optimized lightweight architecture facilitates real-time automated inspection without sacrificing accuracy, offering invaluable insights to further deep learning techniques for surface defect identification across manufacturing sectors. Full article

15 pages, 7888 KiB  
Article
Sign Language Recognition with Multimodal Sensors and Deep Learning Methods
by Chenghong Lu, Misaki Kozakai and Lei Jing
Electronics 2023, 12(23), 4827; https://doi.org/10.3390/electronics12234827 - 29 Nov 2023
Cited by 2 | Viewed by 3012
Abstract
Sign language recognition is essential in hearing-impaired people’s communication. Wearable data gloves and computer vision are partially complementary solutions. However, sign language recognition using a general monocular camera suffers from occlusion and recognition accuracy issues. In this research, we aim to improve accuracy through data fusion of 2-axis bending sensors and computer vision. We obtain the hand key point information of sign language movements captured by a monocular RGB camera and use key points to calculate hand joint angles. The system achieves higher recognition accuracy by fusing multimodal data of the skeleton, joint angles, and finger curvature. In order to effectively fuse data, we spliced multimodal data and used CNN-BiLSTM to extract effective features for sign language recognition. CNN is a method that can learn spatial information, and BiLSTM can learn time series data. We built a data collection system with bending sensor data gloves and cameras. A dataset was collected that contains 32 Japanese sign language movements of seven people, including 27 static movements and 5 dynamic movements. Each movement is repeated 10 times, totaling about 112 min. In particular, we obtained data containing occlusions. Experimental results show that our system can fuse multimodal information and perform better than using only skeletal information, with the accuracy increasing from 68.34% to 84.13%. Full article
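The multimodal "splicing" step amounts to concatenating the per-frame features of each modality before the CNN-BiLSTM; a minimal NumPy sketch with invented feature sizes (the actual key point, angle, and sensor counts may differ from the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)
T = 60                                    # frames in one sign clip (illustrative)
skeleton = rng.normal(size=(T, 42))       # e.g., 21 hand key points x (x, y)
angles = rng.normal(size=(T, 15))         # joint angles computed from key points
curvature = rng.normal(size=(T, 10))      # 2-axis bending-sensor readings

# Concatenate modalities per frame; the resulting (time, feature) sequence is
# what a CNN-BiLSTM consumes: CNN over features, BiLSTM over time
fused = np.concatenate([skeleton, angles, curvature], axis=1)
```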

18 pages, 1035 KiB  
Article
Comparison of Different Methods for Building Ensembles of Convolutional Neural Networks
by Loris Nanni, Andrea Loreggia and Sheryl Brahnam
Electronics 2023, 12(21), 4428; https://doi.org/10.3390/electronics12214428 - 27 Oct 2023
Cited by 1 | Viewed by 1370
Abstract
In computer vision and image analysis, Convolutional Neural Networks (CNNs) and other deep-learning models are at the forefront of research and development. These advanced models have proven to be highly effective in tasks related to computer vision. One technique that has gained prominence in recent years is the construction of ensembles using deep CNNs. These ensembles typically involve combining multiple pretrained CNNs to create a more powerful and robust network. The purpose of this study is to evaluate the effectiveness of building CNN ensembles by combining several advanced techniques. Tested here are CNN ensembles constructed by replacing ReLU layers with different activation functions, employing various data-augmentation techniques, and utilizing several algorithms, including some novel ones, that perturb network weights. Experiments performed across many datasets representing different tasks demonstrate that our proposed methods for building deep ensembles produce superior results. Full article
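At inference time, such ensembles are commonly combined by averaging the members' softmax outputs (the sum rule); a minimal NumPy sketch, with the logits invented for illustration (the paper's exact fusion may differ):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))   # numerically stable softmax per row
    return e / e.sum(axis=1, keepdims=True)

def ensemble_predict(logits_per_model):
    # Sum-rule fusion: average each member's class probabilities, then argmax
    probs = np.mean([softmax(l) for l in logits_per_model], axis=0)
    return probs.argmax(axis=1)

rng = np.random.default_rng(1)
logits = [rng.normal(size=(4, 3)) for _ in range(3)]   # 3 models, 4 samples, 3 classes
preds = ensemble_predict(logits)
```

An ensemble built this way can only help when its members disagree in useful ways, which is why the paper diversifies them via activation functions, augmentation, and weight perturbation.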

12 pages, 16406 KiB  
Article
A Study on Webtoon Generation Using CLIP and Diffusion Models
by Kyungho Yu, Hyoungju Kim, Jeongin Kim, Chanjun Chun and Pankoo Kim
Electronics 2023, 12(18), 3983; https://doi.org/10.3390/electronics12183983 - 21 Sep 2023
Viewed by 2301
Abstract
This study focuses on harnessing deep-learning-based text-to-image transformation techniques to support webtoon creators’ creative output. We converted publicly available datasets (e.g., MSCOCO) into a multimodal webtoon dataset using CartoonGAN. First, the dataset was leveraged for training contrastive language image pre-training (CLIP), a model composed of multi-lingual BERT and a Vision Transformer that learnt to associate text with images. Second, a pre-trained diffusion model was employed to generate webtoons through text and text-similar image input. The webtoon dataset comprised treatments (i.e., textual descriptions) paired with their corresponding webtoon illustrations. CLIP (operating through contrastive learning) extracted features from different data modalities and aligned similar data more closely within the same feature space while pushing dissimilar data apart. This model learnt the relationships between various modalities in multimodal data. To generate webtoons using the diffusion model, the process involved providing the CLIP features of the desired webtoon’s text with those of the most text-similar image to a pre-trained diffusion model. Experiments were conducted using both single- and continuous-text inputs to generate webtoons, and the results showed an inception score of 7.14 when using continuous-text inputs. The text-to-image technology developed here could streamline the webtoon creation process for artists by enabling the efficient generation of webtoons based on the provided text. However, the current inability to generate webtoons from multiple sentences or images while maintaining a consistent artistic style was noted. Therefore, further research is imperative to develop a text-to-image model capable of handling multi-sentence and multilingual input while ensuring coherence in the artistic style across the generated webtoon images. Full article

22 pages, 7626 KiB  
Article
Principal Component Analysis-Based Logistic Regression for Rotated Handwritten Digit Recognition in Consumer Devices
by Chao-Chung Peng, Chao-Yang Huang and Yi-Ho Chen
Electronics 2023, 12(18), 3809; https://doi.org/10.3390/electronics12183809 - 8 Sep 2023
Viewed by 1347
Abstract
Handwritten digit recognition has been used in many consumer electronic devices for a long time. However, we found that the recognition system used in current consumer electronics is sensitive to image or character rotations. To address this problem, this study builds a low-cost and light computation consumption handwritten digit recognition system. A Principal Component Analysis (PCA)-based logistic regression classifier is presented, which is able to provide a certain degree of robustness in the digit subject to rotations. To validate the effectiveness of the developed image recognition algorithm, the popular MNIST dataset is used to conduct performance evaluations. Compared to other popular classifiers installed in MATLAB, the proposed method is able to achieve better prediction results with a smaller model size, which is 18.5% better than the traditional logistic regression. Finally, real-time experiments are conducted to verify the efficiency of the presented method, showing that the proposed system is successfully able to classify the rotated handwritten digit. Full article
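The PCA-plus-logistic-regression pipeline is easy to sketch with scikit-learn; the small built-in 8x8 digits set stands in for MNIST here, no rotation handling is included, and the component count and split are illustrative:

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

X, y = load_digits(return_X_y=True)          # 1797 8x8 digit images, 64 pixels each
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# PCA compresses 64 raw pixels to 30 components before classification,
# shrinking the model while retaining most of the variance
clf = make_pipeline(PCA(n_components=30), LogisticRegression(max_iter=1000))
clf.fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
```

Augmenting the training set with rotated copies of each digit is one simple way to add the rotation robustness the paper targets.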

12 pages, 306 KiB  
Article
Latent Regression Bayesian Network for Speech Representation
by Liang Xu, Yue Zhao, Xiaona Xu, Yigang Liu and Qiang Ji
Electronics 2023, 12(15), 3342; https://doi.org/10.3390/electronics12153342 - 4 Aug 2023
Viewed by 985
Abstract
In this paper, we present a novel approach for speech representation using latent regression Bayesian networks (LRBN) to address the issue of poor performance in low-resource language speech systems. LRBN, a lightweight unsupervised learning model, learns data distribution and high-level features, unlike computationally expensive large models, such as Wav2vec 2.0. To evaluate the effectiveness of LRBN in learning speech representations, we conducted experiments on five different low-resource languages and applied them to two downstream tasks: phoneme classification and speech recognition. Our experimental results demonstrate that LRBN outperforms prevailing speech representation methods in both tasks, highlighting its potential in the realm of speech representation learning for low-resource languages. Full article

20 pages, 4433 KiB  
Article
Maintain a Better Balance between Performance and Cost for Image Captioning by a Size-Adjustable Convolutional Module
by Yan Lyu, Yong Liu and Qiangfu Zhao
Electronics 2023, 12(14), 3187; https://doi.org/10.3390/electronics12143187 - 22 Jul 2023
Cited by 1 | Viewed by 1685
Abstract
Image captioning is a challenging AI problem that connects computer vision and natural language processing. Many deep learning (DL) models have been proposed in the literature for solving this problem. So far, the primary concern of image captioning has been focused on increasing the accuracy of generating human-style sentences for describing given images. As a result, state-of-the-art (SOTA) models are often too expensive to be implemented in computationally weak devices. In contrast, the primary concern of this paper is to maintain a balance between performance and cost. For this purpose, we propose using a DL model pre-trained for object detection to encode the given image so that features of various objects can be extracted simultaneously. We also propose adding a size-adjustable convolutional module (SACM) before decoding the features into sentences. The experimental results show that the model with the properly adjusted SACM could reach a BLEU-1 score of 82.3 and a BLEU-4 score of 43.9 on the Flickr 8K dataset, and a BLEU-1 score of 83.1 and a BLEU-4 score of 44.3 on the MS COCO dataset. With the SACM, the number of parameters is decreased to 108M, which is about 1/4 of the original YOLOv3-LSTM model with 430M parameters. Specifically, compared with mPLUG with 510M parameters, which is one of the SOTA methods, the proposed method can achieve almost the same BLEU-4 scores, but the number of parameters is 78% less than the mPLUG. Full article

15 pages, 1906 KiB  
Article
Multi-Stream General and Graph-Based Deep Neural Networks for Skeleton-Based Sign Language Recognition
by Abu Saleh Musa Miah, Md. Al Mehedi Hasan, Si-Woong Jang, Hyoun-Sup Lee and Jungpil Shin
Electronics 2023, 12(13), 2841; https://doi.org/10.3390/electronics12132841 - 27 Jun 2023
Cited by 13 | Viewed by 1805
Abstract
Sign language recognition (SLR) aims to bridge speech-impaired and general communities by recognizing signs from given videos. However, due to the complex background, light illumination, and subject structures in videos, researchers still face challenges in developing effective SLR systems. Many researchers have recently sought to develop skeleton-based sign language recognition systems to overcome the subject and background variation in hand gesture sign videos. However, skeleton-based SLR is still under exploration, mainly due to a lack of information and hand key point annotations. More recently, researchers have included body and face information along with hand gesture information for SLR; however, the obtained performance accuracy and generalizability properties remain unsatisfactory. In this paper, we propose a multi-stream graph-based deep neural network (SL-GDN) for a skeleton-based SLR system in order to overcome the above-mentioned problems. The main purpose of the proposed SL-GDN approach is to improve the generalizability and performance accuracy of the SLR system while maintaining a low computational cost based on the human body pose in the form of 2D landmark locations. We first construct a skeleton graph based on 27 whole-body key points selected among 67 key points to address the high computational cost problem. Then, we utilize the multi-stream SL-GDN to extract features from the whole-body skeleton graph considering four streams. Finally, we concatenate the four different features and apply a classification module to refine the features and recognize corresponding sign classes. Our data-driven graph construction method increases the system’s flexibility and brings high generalizability, allowing it to adapt to varied data. We use two large-scale benchmark SLR data sets to evaluate the proposed model: The Turkish Sign Language data set (AUTSL) and Chinese Sign Language (CSL). 
The reported performance accuracy results demonstrate the outstanding ability of the proposed model, and we believe that it will be considered a great innovation in the SLR domain. Full article
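The graph convolutions at the core of such skeleton models propagate joint features over a normalized adjacency matrix; a toy NumPy sketch with 5 hypothetical joints and invented connectivity (the paper's graph uses 27 whole-body key points):

```python
import numpy as np

edges = [(0, 1), (1, 2), (1, 3), (3, 4)]    # hypothetical joint connections
n = 5
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

A_tilde = A + np.eye(n)                     # add self-loops
d = A_tilde.sum(axis=1)
A_hat = A_tilde / np.sqrt(np.outer(d, d))   # D^-1/2 (A + I) D^-1/2

X = np.random.default_rng(0).normal(size=(n, 2))  # 2D landmark features per joint
H = A_hat @ X                               # one feature-propagation step
```

Each propagation step mixes every joint's features with those of its neighbors, which is how skeletal structure enters the network without any image pixels.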

19 pages, 17533 KiB  
Article
Stochastic Neighbor Embedding Feature-Based Hyperspectral Image Classification Using 3D Convolutional Neural Network
by Md. Moazzem Hossain, Md. Ali Hossain, Abu Saleh Musa Miah, Yuichi Okuyama, Yoichi Tomioka and Jungpil Shin
Electronics 2023, 12(9), 2082; https://doi.org/10.3390/electronics12092082 - 2 May 2023
Cited by 7 | Viewed by 1978
Abstract
The ample amount of information from hyperspectral image (HSI) bands allows the non-destructive detection and recognition of earth objects. However, dimensionality reduction (DR) of hyperspectral images (HSI) is required before classification as the classifier may suffer from the curse of dimensionality. Therefore, dimensionality reduction plays a significant role in HSI data analysis (e.g., effective processing and seamless interpretation). In this article, a sophisticated technique established as t-Distributed Stochastic Neighbor Embedding (tSNE) following the dimension reduction along with a blended CNN was implemented to improve the visualization and characterization of HSI. In the procedure, first, we employed principal component analysis (PCA) to reduce the HSI dimensions and remove non-linear consistency features between the wavelengths to project them to a smaller scale. Then we proposed tSNE to preserve the local and global pixel relationships and check the HSI information visually and experimentally. Lastly, it yielded two-dimensional data, improving the visualization and classification accuracy compared to other standard dimensionality-reduction algorithms. Finally, we employed deep-learning-based CNN to classify the reduced and improved HSI intra- and inter-band relationship-feature vector. The evaluation performance of 95.21% accuracy and 6.2% test loss proved the superiority of the proposed model compared to other state-of-the-art DR reduction algorithms. Full article
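The PCA-then-t-SNE stage can be sketched with scikit-learn; the small digits set stands in for HSI pixel vectors, and the component counts, subsample size, and perplexity are illustrative:

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

X, y = load_digits(return_X_y=True)
X = X[:500]                                   # subsample to keep t-SNE fast

# PCA first strips redundant dimensions; t-SNE then maps the result to 2D
# while preserving local neighborhood structure, aiding visualization
X_pca = PCA(n_components=20, random_state=0).fit_transform(X)
X_2d = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X_pca)
```

The 2D embedding can be plotted with class colors to inspect band separability before the CNN stage, which is the visual check the paper describes.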

11 pages, 431 KiB  
Communication
Supervised Learning Spectrum Sensing Method via Geometric Power Feature
by Qian Hu, Zhongqiang Luo and Wenshi Xiao
Electronics 2023, 12(7), 1616; https://doi.org/10.3390/electronics12071616 - 29 Mar 2023
Cited by 1 | Viewed by 1406
Abstract
In order to improve the spectrum sensing (SS) performance under a low Signal Noise Ratio (SNR), this paper proposes a supervised learning spectrum sensing method based on Geometric Power (GP) feature. The GP is used as the feature vector in the supervised learning spectrum sensing method for training and testing based on the actual captured data set. Experimental results show that the detection performance of the GP-based supervised learning spectrum sensing method is better than that of the Energy Statistics (ES) and Differential Entropy (DE)-based supervised learning spectrum sensing methods. Full article
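The geometric power feature itself is simple to compute, and the supervised detector can be sketched end to end; the synthetic frames, tone amplitude, and classifier choice below are illustrative stand-ins for the paper's captured data set:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def geometric_power(x, eps=1e-12):
    # Geometric power = geometric mean of the sample magnitudes,
    # exp(mean(log|x|)); eps guards against log(0)
    return np.exp(np.mean(np.log(np.abs(x) + eps)))

rng = np.random.default_rng(0)

def frame(signal_present, n=256):
    # One sensing frame: unit-variance noise, plus a tone when a primary user
    # transmits (amplitude chosen for a comfortable SNR; illustrative only)
    noise = rng.normal(size=n)
    tone = 2.0 * np.sin(2 * np.pi * 0.05 * np.arange(n)) if signal_present else 0.0
    return noise + tone

# One GP value per frame is the entire feature vector for the detector
labels = np.array([0, 1] * 200)
feats = np.array([[geometric_power(frame(p))] for p in labels])
clf = LogisticRegression().fit(feats[:300], labels[:300])
acc = clf.score(feats[300:], labels[300:])
```

Swapping `geometric_power` for an energy or differential-entropy statistic reproduces the baselines the paper compares against.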
