Advances of Artificial Intelligence and Vision Applications

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Artificial Intelligence".

Deadline for manuscript submissions: closed (29 June 2024) | Viewed by 61653

Special Issue Editors

School of Electronics and Information Technology, Sun Yat-sen University, Guangzhou 510006, China
Interests: computer vision and pattern recognition

Special Issue Information

Dear Colleagues,

Artificial intelligence technologies, represented by deep learning and convolutional neural networks, have greatly advanced computer vision research and development over the last decade. At the same time, advances in software and hardware have enabled engineers to implement elaborate computer vision algorithms on powerful platforms. These advancements have enabled computer vision to succeed across many sectors of modern society, including agriculture, retail, insurance, manufacturing, logistics, smart cities, healthcare, pharmaceuticals, and construction. Nevertheless, the performance of an AI-based computer vision system is still constrained by the quality and quantity of training data, as well as by the computing power and processing speed of the hardware platform. This Special Issue aims to collect advances and contributions in the design, optimization, and implementation of artificial intelligence and computer vision applications.

General topics covered in this Special Issue include, but are not limited to, the following:

  • Image interpretation;
  • Object recognition and tracking;
  • Shape analysis, monitoring, and surveillance;
  • Biologically inspired computer vision;
  • Motion analysis;
  • Document image understanding;
  • Face and gesture recognition;
  • Vision-based human–computer interaction;
  • Human activity and behavior understanding;
  • Emotion recognition.

Dr. Dong Zhang
Prof. Dr. Dah-Jye Lee
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • artificial intelligence
  • computer vision
  • deep learning
  • convolutional neural networks
  • affective computing

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (15 papers)

Research

17 pages, 29032 KiB  
Article
Real-Time Dense Visual SLAM with Neural Factor Representation
by Weifeng Wei, Jie Wang, Xiaolong Xie, Jie Liu and Pengxiang Su
Electronics 2024, 13(16), 3332; https://doi.org/10.3390/electronics13163332 - 22 Aug 2024
Viewed by 927
Abstract
Developing a high-quality, real-time, dense visual SLAM system poses a significant challenge in the field of computer vision. NeRF introduces neural implicit representation, marking a notable advancement in visual SLAM research. However, existing neural implicit SLAM methods suffer from long runtimes and face challenges when modeling complex structures in scenes. In this paper, we propose a neural implicit dense visual SLAM method that enables high-quality real-time reconstruction even on a desktop PC. Firstly, we propose a novel neural scene representation, encoding the geometry and appearance information of the scene as a combination of the basis and coefficient factors. This representation allows for efficient memory usage and the accurate modeling of high-frequency detail regions. Secondly, we introduce feature integration rendering to significantly improve rendering speed while maintaining the quality of color rendering. Extensive experiments on synthetic and real-world datasets demonstrate that our method achieves an average improvement of more than 60% for Depth L1 and ATE RMSE compared to existing state-of-the-art methods when running at 9.8 Hz on a desktop PC with a 3.20 GHz Intel Core i9-12900K CPU and a single NVIDIA RTX 3090 GPU. This remarkable advancement highlights the crucial importance of our approach in the field of dense visual SLAM. Full article
(This article belongs to the Special Issue Advances of Artificial Intelligence and Vision Applications)

22 pages, 13011 KiB  
Article
RA-YOLOv8: An Improved YOLOv8 Seal Text Detection Method
by Han Sun, Chaohong Tan, Si Pang, Hancheng Wang and Baohua Huang
Electronics 2024, 13(15), 3001; https://doi.org/10.3390/electronics13153001 - 30 Jul 2024
Viewed by 1200
Abstract
To detect text from electronic seals that have significant background interference, blurring, text overlapping, and curving, an improved YOLOv8 model named RA-YOLOv8 was developed. The model is primarily based on YOLOv8, with optimized structures in its backbone and neck. The receptive-field attention and efficient multi-scale attention (RFEMA) module is introduced in the backbone. The model’s ability to extract and integrate local and global features is enhanced by combining the attention on the receptive-field spatial feature of the receptive-field attention and coordinate attention (RFCA) module and the cross-spatial learning of the efficient multi-scale attention (EMA) module. The Alterable Kernel Convolution (AKConv) module is incorporated in the neck, enhancing the model’s detection accuracy of curved text by dynamically adjusting the sampling position. Furthermore, to boost the model’s detection performance, the original loss function is replaced with the bounding box regression loss function of Minimum Point Distance Intersection over Union (MPDIoU). Experimental results demonstrate that RA-YOLOv8 surpasses YOLOv8 in terms of precision, recall, and F1 value, with improvements of 0.4%, 1.6%, and 1.03%, respectively, validating its effectiveness and utility in seal text detection. Full article
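As a rough illustration of the bounding box regression loss named in the abstract above, the sketch below computes plain IoU and an MPDIoU-style loss as that loss is commonly stated: IoU minus the squared distances between the two boxes' top-left and bottom-right corners, each normalized by the squared image diagonal. The function names, the `(x1, y1, x2, y2)` box layout, and the example image size are assumptions made for illustration; the abstract does not give the exact formulation used in RA-YOLOv8.

```python
def iou(box_a, box_b):
    """Intersection over Union for boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mpdiou_loss(pred, target, img_w, img_h):
    """1 - MPDIoU, penalizing corner distances on top of plain IoU (assumed form)."""
    norm = img_w ** 2 + img_h ** 2  # squared image diagonal
    # Squared distance between the top-left corners.
    d1 = (pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2
    # Squared distance between the bottom-right corners.
    d2 = (pred[2] - target[2]) ** 2 + (pred[3] - target[3]) ** 2
    return 1.0 - (iou(pred, target) - d1 / norm - d2 / norm)


if __name__ == "__main__":
    truth = (0, 0, 10, 10)
    shifted = (5, 0, 15, 10)
    print(iou(truth, shifted))
    print(mpdiou_loss(shifted, truth, img_w=100, img_h=100))
```

A perfectly aligned prediction gives a loss of zero, and the corner terms keep the loss informative even when two boxes have the same IoU but different offsets.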

13 pages, 1897 KiB  
Article
Driver Abnormal Expression Detection Method Based on Improved Lightweight YOLOv5
by Keming Yao, Zhongzhou Wang, Fuao Guo and Feng Li
Electronics 2024, 13(6), 1138; https://doi.org/10.3390/electronics13061138 - 20 Mar 2024
Viewed by 1082
Abstract
The rapid advancement of intelligent assisted driving technology has significantly enhanced transportation convenience in society and contributed to the mitigation of traffic safety hazards. Addressing the potential for drivers to experience abnormal physical conditions during the driving process, an enhanced lightweight network model based on YOLOv5 for detecting abnormal facial expressions of drivers is proposed in this paper. Initially, the lightweighting of the YOLOv5 backbone network is achieved by integrating the FasterNet Block, a lightweight module from the FasterNet network, with the C3 module in the main network. This combination forms the C3-faster module. Subsequently, the original convolutional modules in the YOLOv5 model are replaced with the improved GSConvns module to reduce computational load. Building upon the GSConvns module, the VoV-GSCSP module is constructed to ensure the lightweighting of the neck network while maintaining detection accuracy. Finally, channel pruning and fine-tuning operations are applied to the entire model. Channel pruning involves removing channels with minimal impact on output results, further reducing the model’s computational load, parameters, and size. The fine-tuning operation compensates for any potential loss in detection accuracy. Experimental results demonstrate that the proposed model achieves a substantial reduction in both parameter count and computational load while maintaining a high detection accuracy of 84.5%. The improved model has a compact size of only 4.6 MB, making it more conducive to the efficient operation of onboard computers. Full article

13 pages, 6704 KiB  
Article
A Multi-Object Tracking Approach Combining Contextual Features and Trajectory Prediction
by Peng Zhang, Qingyang Jing, Xinlei Zhao, Lijia Dong, Weimin Lei, Wei Zhang and Zhaonan Lin
Electronics 2023, 12(23), 4720; https://doi.org/10.3390/electronics12234720 - 21 Nov 2023
Cited by 1 | Viewed by 1496
Abstract
Aiming to solve the problem of the identity switching of objects with similar appearances in real scenarios, a multi-object tracking approach combining contextual features and trajectory prediction is proposed. This approach integrates the motion and appearance features of objects. The motion features are mainly used for trajectory prediction, and the appearance features are divided into contextual features and individual features, which are mainly used for trajectory matching. In order to accurately distinguish the identities of objects with similar appearances, a context graph is constructed by taking the specified object as the master node and its neighboring objects as the branch nodes. A preprocessing module is applied to exclude unnecessary connections in the graph model based on the speed of the historical trajectory of the object, and to distinguish the features of objects with similar appearances. Feature matching is performed using the Hungarian algorithm, based on the similarity matrix obtained from the features. Post-processing is performed for the temporarily unmatched frames to obtain the final object matching results. The experimental results show that the approach proposed in this paper can achieve the highest MOTA. Full article
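The matching step in the abstract above, which pairs existing tracks with new detections from a similarity matrix, can be sketched as an assignment problem. The abstract names the Hungarian algorithm; the sketch below instead uses an exhaustive search over permutations, which returns the same optimal assignment for small square matrices but does not scale. A real tracker would likely use a Hungarian implementation such as `scipy.optimize.linear_sum_assignment`. The function name and the similarity values are hypothetical.

```python
from itertools import permutations


def best_assignment(similarity):
    """Return (track, detection) pairs maximizing total similarity.

    Exhaustive search stands in for the Hungarian algorithm here; both
    find the optimal one-to-one assignment on a square matrix.
    """
    n = len(similarity)
    best_perm, best_score = None, float("-inf")
    for perm in permutations(range(n)):
        score = sum(similarity[i][perm[i]] for i in range(n))
        if score > best_score:
            best_perm, best_score = perm, score
    return list(enumerate(best_perm)), best_score


if __name__ == "__main__":
    # Rows are existing tracks, columns are detections in the new frame.
    similarity = [
        [0.90, 0.80, 0.10],
        [0.85, 0.20, 0.30],
        [0.20, 0.10, 0.70],
    ]
    pairs, total = best_assignment(similarity)
    print(pairs, total)
```

In this example a greedy matcher would give track 0 its best detection (column 0) and leave track 1 with a poor match, whereas the optimal assignment trades a little similarity on track 0 for a much better total.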

18 pages, 4763 KiB  
Article
Automated Facial Emotion Recognition Using the Pelican Optimization Algorithm with a Deep Convolutional Neural Network
by Mohammed Alonazi, Hala J. Alshahrani, Faiz Abdullah Alotaibi, Mohammed Maray, Mohammed Alghamdi and Ahmed Sayed
Electronics 2023, 12(22), 4608; https://doi.org/10.3390/electronics12224608 - 11 Nov 2023
Cited by 12 | Viewed by 2436
Abstract
Facial emotion recognition (FER) stands as a pivotal artificial intelligence (AI)-driven technology that exploits the capabilities of computer-vision techniques for decoding and comprehending emotional expressions displayed on human faces. With the use of machine-learning (ML) models, specifically deep neural networks (DNN), FER empowers the automatic detection and classification of a broad spectrum of emotions, encompassing surprise, happiness, sadness, anger, and more. Challenges in FER include handling variations in lighting, poses, and facial expressions, as well as ensuring that the model generalizes well to various emotions and populations. This study introduces an automated facial emotion recognition using the pelican optimization algorithm with a deep convolutional neural network (AFER-POADCNN) model. The primary objective of the AFER-POADCNN model lies in the automatic recognition and classification of facial emotions. To accomplish this, the AFER-POADCNN model exploits the median-filtering (MF) approach to remove the noise present in it. Furthermore, the capsule-network (CapsNet) approach can be applied to the feature-extraction process, allowing the model to capture intricate facial expressions and nuances. To optimize the CapsNet model’s performance, hyperparameter tuning is undertaken with the aid of the pelican optimization algorithm (POA). This ensures that the model is finely tuned to detect a wide array of emotions and generalizes effectively across diverse populations and scenarios. Finally, the detection and classification of different kinds of facial emotions take place using a bidirectional long short-term memory (BiLSTM) network. The simulation analysis of the AFER-POADCNN system is tested on a benchmark FER dataset. The comparative result analysis showed the better performance of the AFER-POADCNN algorithm over existing models, with a maximum accuracy of 99.05%. Full article
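The median-filtering preprocessing mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a grayscale image stored as a list of rows, a 3 x 3 window, and edge pixels handled by replication; none of these choices are stated in the abstract, and the function name is hypothetical.

```python
from statistics import median


def median_filter(image, size=3):
    """Apply a size x size median filter to a 2D list, replicating edge pixels."""
    h, w = len(image), len(image[0])
    r = size // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Gather the neighborhood, clamping coordinates at the borders.
            window = [
                image[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in range(-r, r + 1)
                for dx in range(-r, r + 1)
            ]
            out[y][x] = median(window)
    return out


if __name__ == "__main__":
    # A flat patch with a single impulse-noise pixel in the center.
    noisy = [
        [10, 10, 10],
        [10, 255, 10],
        [10, 10, 10],
    ]
    print(median_filter(noisy))
```

Because the median ignores isolated extreme values, the single bright pixel is replaced by the value of its neighbors while the flat region is left unchanged.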

15 pages, 2716 KiB  
Article
Fault Detection in Solar Energy Systems: A Deep Learning Approach
by Zeynep Bala Duranay
Electronics 2023, 12(21), 4397; https://doi.org/10.3390/electronics12214397 - 24 Oct 2023
Cited by 10 | Viewed by 6174
Abstract
While solar energy holds great significance as a clean and sustainable energy source, photovoltaic panels serve as the linchpin of this energy conversion process. However, defects in these panels can adversely impact energy production, necessitating the rapid and effective detection of such faults. This study explores the potential of using infrared solar module images for the detection of photovoltaic panel defects through deep learning, which represents a crucial step toward enhancing the efficiency and sustainability of solar energy systems. A dataset comprising 20,000 images, derived from infrared solar modules, was utilized in this study, consisting of 12 classes: cell, cell-multi, cracking, diode, diode-multi, hot spot, hot spot-multi, no-anomaly, offline-module, shadowing, soiling, and vegetation. The methodology employed the exemplar Efficientb0 model. From the exemplar model, 17,000 features were selected using the NCA feature selector. Subsequently, classification was performed using an SVM classifier. The proposed method applied to a dataset consisting of 12 classes has yielded successful results in terms of accuracy, F1-score, precision, and sensitivity metrics. These results indicate average values of 93.93% accuracy, 89.82% F1-score, 91.50% precision, and 88.28% sensitivity, respectively. The proposed method in this study accurately classifies photovoltaic panel defects based on images of infrared solar modules. Full article

12 pages, 652 KiB  
Article
Deform2NeRF: Non-Rigid Deformation and 2D–3D Feature Fusion with Cross-Attention for Dynamic Human Reconstruction
by Xiaolong Xie, Xusheng Guo, Wei Li, Jie Liu and Jianfeng Xu
Electronics 2023, 12(21), 4382; https://doi.org/10.3390/electronics12214382 - 24 Oct 2023
Cited by 2 | Viewed by 1797
Abstract
Reconstructing dynamic human body models from multi-view videos poses a substantial challenge in the field of 3D computer vision. Currently, the Animatable NeRF method addresses this challenge by mapping observed points from the viewing space to a canonical space. However, this mapping introduces positional shifts in predicted points, resulting in artifacts, particularly in intricate areas. In this paper, we propose an innovative approach called Deform2NeRF that incorporates non-rigid deformation correction and image feature fusion modules into the Animatable NeRF framework to enhance the reconstruction of animatable human models. Firstly, we introduce a non-rigid deformation field network to address the issue of point position shift effectively. This network adeptly corrects positional discrepancies caused by non-rigid deformations. Secondly, we introduce a 2D–3D feature fusion learning module with cross-attention and integrate it with the NeRF network to mitigate artifacts in specific detailed regions. Our experimental results demonstrate that our method significantly improves the PSNR index by approximately 5% compared to representative methods in the field. This remarkable advancement underscores the profound importance of our approach in the domains of new view synthesis and digital human reconstruction. Full article

19 pages, 4827 KiB  
Article
Global Individual Interaction Network Based on Consistency for Group Activity Recognition
by Cheng Huang, Dong Zhang, Bing Li, Yun Xian and Dah-Jye Lee
Electronics 2023, 12(19), 4104; https://doi.org/10.3390/electronics12194104 - 30 Sep 2023
Viewed by 970
Abstract
Modeling the interactions among individuals in a group is essential for group activity recognition (GAR). Various graph neural networks (GNNs) are regarded as popular modeling methods for GAR, as they can characterize the interaction among individuals at a low computational cost. The performance of the current GNN-based modeling methods is affected by two factors. Firstly, their local receptive field in the mapping layer limits their ability to characterize the global interactions among individuals in spatial–temporal dimensions. Secondly, GNN-based GAR methods do not have an efficient mechanism to use global activity consistency and individual action consistency. In this paper, we argue that the global interactions among individuals, as well as the constraints of global activity and individual action consistencies, are critical to group activity recognition. We propose new convolutional operations to capture the interactions among individuals from a global perspective. We use contrastive learning to maximize the global activity consistency and individual action consistency for more efficient recognition. Comprehensive experiments show that our method achieved better GAR performance than the state-of-the-art methods on two popular GAR benchmark datasets. Full article

19 pages, 6756 KiB  
Article
An AIoT-Based Assistance System for Visually Impaired People
by Jiawen Li, Lianglu Xie, Zhe Chen, Liang Shi, Rongjun Chen, Yongqi Ren, Leijun Wang and Xu Lu
Electronics 2023, 12(18), 3760; https://doi.org/10.3390/electronics12183760 - 6 Sep 2023
Cited by 8 | Viewed by 4206
Abstract
In this work, an assistance system based on the Artificial Intelligence of Things (AIoT) framework was designed and implemented to provide convenience for visually impaired people. This system aims to be low-cost and multi-functional with object detection, obstacle distance measurement, and text recognition achieved by wearable smart glasses, heart rate detection, fall detection, body temperature measurement, and humidity-temperature monitoring offered by an intelligent walking stick. The total hardware cost is approximately $66.8, as diverse low-cost sensors and modules are embedded. Meanwhile, a voice assistant is adopted, which helps to convey detection results to users. As for the performance evaluation, the accuracies of object detection and text recognition in the wearable smart glasses experiments are 92.16% and 99.91%, respectively, and the maximum deviation rate compared to the mobile app on obstacle distance measurement is 6.32%. In addition, the intelligent walking stick experiments indicate that the maximum deviation rates compared to the commercial devices on heart rate detection, body temperature measurement, and humidity-temperature monitoring are 3.52%, 0.19%, and 3.13%, respectively, and the fall detection accuracy is 87.33%. Such results demonstrate that the proposed assistance system yields reliable performances similar to commercial devices and is impressive when considering the total cost as a primary concern. Consequently, it satisfies the fundamental requirements of daily life, benefiting the safety and well-being of visually impaired people. Full article

13 pages, 10750 KiB  
Article
AI-Driven High-Precision Model for Blockage Detection in Urban Wastewater Systems
by Ravindra R. Patil, Rajnish Kaur Calay, Mohamad Y. Mustafa and Saniya M. Ansari
Electronics 2023, 12(17), 3606; https://doi.org/10.3390/electronics12173606 - 26 Aug 2023
Cited by 1 | Viewed by 1652
Abstract
In artificial intelligence (AI), computer vision consists of intelligent models to interpret and recognize the visual world, similar to human vision. This technology relies on a synergy of extensive data and human expertise, meticulously structured to yield accurate results. Tackling the intricate task of locating and resolving blockages within sewer systems is a significant challenge due to their diverse nature and lack of robust technique. This research utilizes the previously introduced “S-BIRD” dataset, a collection of frames depicting sewer blockages, as the foundational training data for a deep neural network model. To enhance the model’s performance and attain optimal results, transfer learning and fine-tuning techniques are strategically implemented on the YOLOv5 architecture, using the corresponding dataset. The outcomes of the trained model exhibit a remarkable accuracy rate in sewer blockage detection, thereby boosting the reliability and efficacy of the associated robotic framework for proficient removal of various blockages. Particularly noteworthy is the achieved mean average precision (mAP) score of 96.30% at a confidence threshold of 0.5, maintaining a consistently high-performance level of 79.20% across Intersection over Union (IoU) thresholds ranging from 0.5 to 0.95. It is expected that this work contributes to advancing the applications of AI-driven solutions for modern urban sanitation systems. Full article

14 pages, 3989 KiB  
Article
Efficient Reversible Data Hiding Using Two-Dimensional Pixel Clustering
by Junying Yuan, Huicheng Zheng and Jiangqun Ni
Electronics 2023, 12(7), 1645; https://doi.org/10.3390/electronics12071645 - 30 Mar 2023
Cited by 5 | Viewed by 1219
Abstract
Pixel clustering is a technique of content-adaptive data embedding in the area of high-performance reversible data hiding (RDH). Using pixel clustering, the pixels in a cover image can be classified into different groups based on a single factor, which is usually the local complexity. Since finer pixel clustering seems to improve the embedding performance, in this manuscript, we propose using two factors for two-dimensional pixel clustering to develop high-performance RDH. Firstly, in addition to the local complexity, a novel factor was designed as the second factor for pixel clustering. Specifically, the proposed factor was defined using the rotation-invariant code derived from pixel relationships in the four-neighborhood. Then, pixels were allocated to the two-dimensional clusters based on the two clustering factors, and cluster-based pixel prediction was realized. As a result, two-dimensional prediction-error histograms (2D-PEHs) were constructed, and performance optimization was based on the selection of expansion bins from the 2D-PEHs. Next, an algorithm for fast expansion-bin selection was introduced to reduce the time complexity. Lastly, data embedding was realized using the technique of prediction-error expansion according to the optimally selected expansion bins. Extensive experiments show that the embedding performance was significantly enhanced, particularly in terms of improved image quality and reduced time complexity, and embedding capacity also moderately improved. Full article

12 pages, 2843 KiB  
Article
Applying Monte Carlo Dropout to Quantify the Uncertainty of Skip Connection-Based Convolutional Neural Networks Optimized by Big Data
by Abouzar Choubineh, Jie Chen, Frans Coenen and Fei Ma
Electronics 2023, 12(6), 1453; https://doi.org/10.3390/electronics12061453 - 19 Mar 2023
Cited by 4 | Viewed by 4529
Abstract
Although Deep Learning (DL) models have been introduced in various fields as effective prediction tools, they often do not care about uncertainty. This can be a barrier to their adoption in real-world applications. The current paper aims to apply and evaluate Monte Carlo (MC) dropout, a computationally efficient approach, to investigate the reliability of several skip connection-based Convolutional Neural Network (CNN) models while keeping their high accuracy. To do so, a high-dimensional regression problem is considered in the context of subterranean fluid flow modeling using 376,250 generated samples. The results demonstrate the effectiveness of MC dropout in terms of reliability with a Standard Deviation (SD) of 0.012–0.174, and of accuracy with a coefficient of determination (R²) of 0.7881–0.9584 and Mean Squared Error (MSE) of 0.0113–0.0508, respectively. The findings of this study may contribute to the distribution of pressure in the development of oil/gas fields. Full article
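The idea behind MC dropout, as described in the abstract above, is to keep dropout active at prediction time, run several stochastic forward passes, and read the spread of the outputs as an uncertainty estimate. The sketch below shows this on a toy linear model with input dropout; it is not the skip connection-based CNNs studied in the paper. The weights, dropout rate, number of passes, and function names are all assumptions chosen for illustration.

```python
import random
from statistics import mean, stdev


def dropout_forward(x, weights, p, rng):
    """One stochastic pass: drop each input with probability p (inverted dropout)."""
    keep = 1.0 - p
    return sum(
        w * xi / keep if rng.random() < keep else 0.0
        for w, xi in zip(weights, x)
    )


def mc_dropout_predict(x, weights, p=0.2, passes=2000, seed=0):
    """Return the mean prediction and its standard deviation over many passes."""
    rng = random.Random(seed)
    samples = [dropout_forward(x, weights, p, rng) for _ in range(passes)]
    return mean(samples), stdev(samples)


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0]
    weights = [0.5, -1.0, 2.0]  # deterministic output would be 4.5
    prediction, uncertainty = mc_dropout_predict(x, weights)
    print(prediction, uncertainty)
```

With inverted dropout scaling, the mean over passes approaches the deterministic prediction, while the standard deviation plays the role of the SD reliability measure reported in the abstract.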

20 pages, 1551 KiB  
Article
CAE-Net: Cross-Modal Attention Enhancement Network for RGB-T Salient Object Detection
by Chengtao Lv, Bin Wan, Xiaofei Zhou, Yaoqi Sun, Ji Hu, Jiyong Zhang and Chenggang Yan
Electronics 2023, 12(4), 953; https://doi.org/10.3390/electronics12040953 - 14 Feb 2023
Viewed by 2175
Abstract
RGB salient object detection (SOD) performs poorly in low-contrast and complex background scenes. Fortunately, the thermal infrared image can capture the heat distribution of scenes as complementary information to the RGB image, so the RGB-T SOD has recently attracted more and more attention. Many researchers have committed to accelerating the development of RGB-T SOD, but some problems still remain to be solved. For example, the defective sample and interfering information contained in the RGB or thermal image hinder the model from learning proper saliency features, meanwhile the low-level features with noisy information result in incomplete salient objects or false positive detection. To solve these problems, we design a cross-modal attention enhancement network (CAE-Net). First, we concretely design a cross-modal fusion (CMF) module to fuse cross-modal features, where the cross-attention unit (CAU) is employed to enhance the two modal features, and channel attention is used to dynamically weigh and fuse the two modal features. Then, we design the joint-modality decoder (JMD) to fuse cross-level features, where the low-level features are purified by higher level features, and multi-scale features are sufficiently integrated. Besides, we add two single-modality decoder (SMD) branches to preserve more modality-specific information. Finally, we employ a multi-stream fusion (MSF) module to fuse three decoders’ features. Comprehensive experiments are conducted on three RGB-T datasets, and the results show that our CAE-Net is comparable to the other methods. Full article
(This article belongs to the Special Issue Advances of Artificial Intelligence and Vision Applications)

30 pages, 19008 KiB  
Article
Automated Pre-Play Analysis of American Football Formations Using Deep Learning
by Jacob Newman, Andrew Sumsion, Shad Torrie and Dah-Jye Lee
Electronics 2023, 12(3), 726; https://doi.org/10.3390/electronics12030726 - 1 Feb 2023
Cited by 7 | Viewed by 7733
Abstract
Annotation and analysis of sports videos is a time-consuming task that, once automated, will benefit coaches, players, and spectators. American football, as the most watched sport in the United States, could especially benefit from this automation. Manual annotation and analysis of recorded videos of American football games is an inefficient and tedious process. Currently, most college football programs focus on annotating offensive formations to help them develop game plans for their upcoming games. As a first step toward further research on this unique application, we use computer vision and deep learning to analyze an overhead image of a football play immediately before the play begins. This analysis consists of locating individual football players and labeling their positions or roles, as well as identifying the formation of the offensive team. We obtain greater than 90% accuracy on both player detection and labeling, and 84.8% accuracy on formation identification. These results demonstrate the feasibility of building a complete American football strategy analysis system using artificial intelligence. Collecting a larger dataset in real-world situations will enable further improvements and would, in turn, allow American football teams to analyze game footage quickly.
(This article belongs to the Special Issue Advances of Artificial Intelligence and Vision Applications)

Other

44 pages, 1279 KiB  
Systematic Review
Intelligent Robotics—A Systematic Review of Emerging Technologies and Trends
by Josip Tomo Licardo, Mihael Domjan and Tihomir Orehovački
Electronics 2024, 13(3), 542; https://doi.org/10.3390/electronics13030542 - 29 Jan 2024
Cited by 13 | Viewed by 21859
Abstract
Intelligent robotics has the potential to revolutionize various industries by amplifying output, streamlining operations, and enriching customer interactions. This systematic literature review aims to analyze emerging technologies and trends in intelligent robotics, addressing key research questions, identifying challenges and opportunities, and proposing best practices for responsible and beneficial integration into various sectors. Our research uncovers the significant improvements brought by intelligent robotics across industries such as manufacturing, logistics, tourism, agriculture, healthcare, and construction. The main results indicate the importance of focusing on human–robot collaboration, ethical considerations, sustainable practices, and industry-specific challenges in order to fully harness the opportunities presented by intelligent robotics. The implications and future directions of intelligent robotics involve addressing both challenges and potential risks, maximizing benefits, and ensuring responsible implementation. The continuous improvement and refinement of existing technology will shape human life and industries, driving innovation and advancements in intelligent robotics.
(This article belongs to the Special Issue Advances of Artificial Intelligence and Vision Applications)