J. Imaging, Volume 9, Issue 7 (July 2023) – 24 articles

Cover Story: In this paper, we propose some strategies to improve stability without losing too much accuracy when deblurring images with deep-learning-based methods. First, we suggest a very small neural architecture, which reduces the execution time for training, satisfying a green AI need, and does not extremely amplify noise in the computed image. Second, we introduce a unified framework in which a pre-processing step balances the lack of stability of the following neural-network-based step. Two different pre-processors are presented. The former implements a strong parameter-free denoiser, and the latter is a variational-model-based regularized formulation of the latent imaging problem.
  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive the table of contents of newly released issues.
  • Papers are published in both HTML and PDF formats; PDF is the official version. To view a paper in PDF format, click the "PDF Full-text" link and open it with the free Adobe Reader.
18 pages, 4509 KiB  
Article
Automatic Localization of Five Relevant Dermoscopic Structures Based on YOLOv8 for Diagnosis Improvement
by Esther Chabi Adjobo, Amadou Tidjani Sanda Mahama, Pierre Gouton and Joël Tossa
J. Imaging 2023, 9(7), 148; https://doi.org/10.3390/jimaging9070148 - 21 Jul 2023
Cited by 10 | Viewed by 2583
Abstract
The automatic detection of dermoscopic features is a task that provides specialists with an image annotated with indications of the different patterns present in it. This information can help them fully understand the image and improve their decisions. However, the automatic analysis of dermoscopic features can be a difficult task because of their small size. Some work has been performed in this area, but the results can be improved. The objective of this work is to improve the precision of the automatic detection of dermoscopic features. To achieve this goal, an algorithm named yolo-dermoscopic-features is proposed. The algorithm consists of four steps: (i) generate annotations in the JSON format for supervised learning of the model; (ii) propose a model based on the latest version of YOLO; (iii) pre-train the model for the segmentation of skin lesions; (iv) train five models for the five dermoscopic features. The experiments are performed on the ISIC 2018 task2 dataset. After training, the model is evaluated and compared to the performance of two other methods. The proposed method allows us to reach average performances of 0.9758, 0.954, 0.9724, 0.938, and 0.9692, respectively, for the Dice similarity coefficient, Jaccard similarity coefficient, precision, recall, and average precision. Furthermore, compared to other methods, the proposed method reaches a better Jaccard similarity coefficient of 0.954 and, thus, presents the best similarity with the annotations made by specialists. This method can also be used to automatically annotate images and, therefore, can be a solution to the lack of feature annotations in the dataset. Full article
(This article belongs to the Special Issue Imaging Informatics: Computer-Aided Diagnosis)
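The Dice and Jaccard coefficients reported above are closely related overlap measures; a minimal sketch of how they are computed on flat binary masks (toy data, not the authors' code):

```python
def dice(a, b):
    """Dice similarity coefficient between two flat binary masks."""
    inter = sum(x & y for x, y in zip(a, b))
    return 2 * inter / (sum(a) + sum(b))

def jaccard(a, b):
    """Jaccard (IoU) similarity coefficient between two flat binary masks."""
    inter = sum(x & y for x, y in zip(a, b))
    union = sum(x | y for x, y in zip(a, b))
    return inter / union

pred = [1, 1, 1, 0, 0]  # predicted feature mask
gt   = [1, 1, 0, 0, 0]  # specialist annotation
# The identity Dice = 2J / (1 + J) always holds, so the two scores move together.
```

Because of that identity, a better Jaccard score (as reported for yolo-dermoscopic-features) necessarily implies a better Dice score on the same masks.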
4 pages, 523 KiB  
Editorial
Deep Learning and Vision Transformer for Medical Image Analysis
by Yudong Zhang, Jiaji Wang, Juan Manuel Gorriz and Shuihua Wang
J. Imaging 2023, 9(7), 147; https://doi.org/10.3390/jimaging9070147 - 21 Jul 2023
Cited by 9 | Viewed by 3872
Abstract
Artificial intelligence (AI) refers to the field of computer science theory and technology [...] Full article
(This article belongs to the Section Medical Imaging)
27 pages, 6394 KiB  
Article
Algebraic Multi-Layer Network: Key Concepts
by Igor Khanykov, Vadim Nenashev and Mikhail Kharinov
J. Imaging 2023, 9(7), 146; https://doi.org/10.3390/jimaging9070146 - 18 Jul 2023
Cited by 3 | Viewed by 1306
Abstract
The paper refers to interdisciplinary research in the areas of hierarchical cluster analysis of big data and ordering of primary data to detect objects in a color or grayscale image. To perform this on a limited domain of multidimensional data, an NP-hard problem of calculating close-to-optimal piecewise constant data approximations with the smallest possible standard deviations or total squared errors (approximation errors) is solved. The solution is achieved by revisiting, modernizing, and combining classical Ward’s clustering, split/merge, and K-means methods. The concepts of objects, images, and their elements (superpixels) are formalized as structures that are distinguishable from each other. The results of structuring and ordering the image data are presented to the user in two ways, as tabulated approximations of the image showing the available object hierarchies. For both theoretical reasoning and practical implementation, reversible calculations with pixel sets are performed as easily as with individual pixels, in terms of Sleator–Tarjan dynamic trees and cyclic graphs forming an Algebraic Multi-Layer Network (AMN). The detailing of the latter significantly distinguishes this paper from our prior works. The establishment of the invariance of detected objects with respect to changing the context of the image and its transformation into grayscale is also new. Full article
(This article belongs to the Special Issue Image Segmentation Techniques: Current Status and Future Directions)
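Ward's criterion mentioned in the abstract merges, at each step, the pair of clusters whose union increases the total squared error the least; a toy 1D sketch of that greedy rule (illustrative only, not the AMN implementation):

```python
def ward_merge(values, k):
    """Greedy Ward-style agglomeration of 1D values into k clusters."""
    clusters = [[v] for v in values]

    def merge_cost(c1, c2):
        # Increase in total squared error when c1 and c2 are merged.
        n1, n2 = len(c1), len(c2)
        m1, m2 = sum(c1) / n1, sum(c2) / n2
        return n1 * n2 / (n1 + n2) * (m1 - m2) ** 2

    while len(clusters) > k:
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] += clusters.pop(j)
    return clusters
```

Applied to pixel values instead of scalars, the same cost drives the piecewise constant approximations with small approximation error described in the paper.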
24 pages, 8111 KiB  
Article
Semi-Automatic GUI Platform to Characterize Brain Development in Preterm Children Using Ultrasound Images
by David Rabanaque, Maria Regalado, Raul Benítez, Sonia Rabanaque, Thais Agut, Nuria Carreras and Christian Mata
J. Imaging 2023, 9(7), 145; https://doi.org/10.3390/jimaging9070145 - 18 Jul 2023
Viewed by 1597
Abstract
The third trimester of pregnancy is the most critical period for human brain development, during which significant changes occur in the morphology of the brain. The development of sulci and gyri allows for a considerable increase in the brain surface. In preterm newborns, these changes occur in an extrauterine environment that may cause a disruption of the normal brain maturation process. We hypothesize that a normalized atlas of brain maturation with cerebral ultrasound images from birth to term-equivalent age will help clinicians assess these changes. This work proposes a semi-automatic Graphical User Interface (GUI) platform for segmenting the main cerebral sulci from ultrasound images in the clinical setting. The platform was built from images of a neonatal cerebral ultrasound database provided by two clinical researchers from the Hospital Sant Joan de Déu in Barcelona, Spain. The primary objective is to provide clinicians with a user-friendly platform for running and visualizing an atlas of images validated by medical experts. The GUI offers different segmentation approaches and pre-processing tools for running, visualizing images, and segmenting the principal sulci. The presented results are discussed in detail in this paper, providing an exhaustive analysis of the proposed approach’s effectiveness. Full article
(This article belongs to the Special Issue Imaging Informatics: Computer-Aided Diagnosis)
16 pages, 3403 KiB  
Article
Varroa Destructor Classification Using Legendre–Fourier Moments with Different Color Spaces
by Alicia Noriega-Escamilla, César J. Camacho-Bello, Rosa M. Ortega-Mendoza, José H. Arroyo-Núñez and Lucia Gutiérrez-Lazcano
J. Imaging 2023, 9(7), 144; https://doi.org/10.3390/jimaging9070144 - 14 Jul 2023
Cited by 4 | Viewed by 2035
Abstract
Bees play a critical role in pollination and food production, so their preservation is essential, particularly highlighting the importance of detecting diseases in bees early. The Varroa destructor mite is the primary factor contributing to increased viral infections that can lead to hive mortality. This study presents an innovative method for identifying Varroa destructor mites on honey bees using multichannel Legendre–Fourier moments. The descriptors derived from this approach possess distinctive characteristics, such as rotation and scale invariance, and noise resistance, allowing the representation of digital images with minimal descriptors. This characteristic is advantageous when analyzing images of living organisms that are not in a static posture. The proposal evaluates the algorithm’s efficiency using different color models, and to enhance its capacity, a subdivision of the VarroaDataset is used. This enhancement allows the algorithm to process additional information about the color and shape of the bee’s legs, wings, eyes, and mouth. To demonstrate the advantages of our approach, we compare it with other deep learning methods, including semantic segmentation techniques such as DeepLabV3 and object detection techniques such as YOLOv5. The results suggest that our proposal offers a promising means for the early detection of the Varroa destructor mite, which could be an essential pillar in the preservation of bees and, therefore, in food production. Full article
(This article belongs to the Topic Applications in Image Analysis and Pattern Recognition)
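The Legendre polynomials underlying the Legendre–Fourier moment basis satisfy the Bonnet recurrence (n+1)P_{n+1}(x) = (2n+1)x·P_n(x) − n·P_{n−1}(x); a minimal sketch of evaluating that basis (the radial basis only, not the full multichannel moment computation):

```python
def legendre(n, x):
    """Evaluate the Legendre polynomial P_n at x via the Bonnet recurrence."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x  # P_0 and P_1
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p
```

Projecting an image onto products of such polynomials (and Fourier terms in the angular direction) yields the compact, invariance-friendly descriptors the abstract refers to.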
26 pages, 28184 KiB  
Article
The Dangers of Analyzing Thermographic Radiometric Data as Images
by Časlav Livada, Hrvoje Glavaš, Alfonzo Baumgartner and Dina Jukić
J. Imaging 2023, 9(7), 143; https://doi.org/10.3390/jimaging9070143 - 12 Jul 2023
Cited by 1 | Viewed by 1737
Abstract
Thermography is probably the most used method of measuring surface temperature by analyzing radiation in the infrared part of the spectrum, whose accuracy depends on factors such as emissivity and reflected radiation. Contrary to the popular belief that thermographic images represent temperature maps, they are actually thermal radiation converted into an image, and if not properly calibrated, they show incorrect temperatures. The objective of this study is to analyze commonly used image processing techniques and their impact on radiometric data in thermography, in particular the extent to which a thermogram can be considered an image and how image processing affects the radiometric data. Three analyses are presented in the paper. The first examines how image processing techniques, such as contrast and brightness adjustments, affect physical reality and its representation in thermographic imaging. The second analysis examines the effects of JPEG compression on radiometric data and how degradation of the data varies with the compression parameters. The third analysis aims to determine the optimal resolution increase required to minimize the effects of compression on the radiometric data. The output from an IR camera in CSV format was used for these analyses and compared to images from the manufacturer’s software. An IR camera providing data in JPEG format was used, and the data included thermographic images, visible images, and a matrix of thermal radiation data. The study was verified with a reference blackbody radiation source set at 60 °C. The results highlight the dangers of interpreting thermographic images as temperature maps without considering the underlying radiometric data, which can be affected by image processing and compression. The paper concludes with the importance of accurate and precise thermographic analysis for reliable temperature measurement. Full article
(This article belongs to the Special Issue Data Processing with Artificial Intelligence in Thermal Imagery)
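The paper's central warning, that a thermogram is a rendering rather than a temperature map, can be illustrated by what is lost when radiometric values are squeezed into 8-bit pixels (a toy sketch under an assumed linear mapping; real camera formats differ, and JPEG compression only adds further loss):

```python
def to_8bit(temps, tmin, tmax):
    """Map temperatures linearly into 8-bit pixel values (lossy quantization)."""
    return [round(255 * (t - tmin) / (tmax - tmin)) for t in temps]

def from_8bit(pixels, tmin, tmax):
    """Invert the mapping: the best that can be recovered from the image alone."""
    return [tmin + p * (tmax - tmin) / 255 for p in pixels]

temps = [20.0, 37.37, 60.0]
recovered = from_8bit(to_8bit(temps, 20.0, 60.0), 20.0, 60.0)
# Each value is off by up to (tmax - tmin) / 510 degrees before any compression.
```

This is why the authors compare against the camera's CSV radiometric matrix rather than trusting pixel values in the exported image.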
20 pages, 9805 KiB  
Article
Augmented Reality in Maintenance—History and Perspectives
by Ana Malta, Torres Farinha and Mateus Mendes
J. Imaging 2023, 9(7), 142; https://doi.org/10.3390/jimaging9070142 - 10 Jul 2023
Cited by 5 | Viewed by 3178
Abstract
Augmented Reality (AR) is a technology that allows virtual elements to be superimposed over images of real contexts, whether these are text elements, graphics, or other types of objects. Smart AR glasses are increasingly optimized, and modern ones have features such as Global Positioning System (GPS), a microphone, and gesture recognition, among others. These devices allow users to have their hands free to perform tasks while they receive instructions in real time through the glasses. This allows maintenance professionals to carry out interventions more efficiently and in a shorter time than would be necessary without the support of this technology. In the present work, a timeline of important achievements is established, including important findings in object recognition, real-time operation, and integration of technologies for shop floor use. Perspectives on future research and related recommendations are proposed as well. Full article
(This article belongs to the Topic Applications in Image Analysis and Pattern Recognition)
19 pages, 7045 KiB  
Article
An Effective Hyperspectral Image Classification Network Based on Multi-Head Self-Attention and Spectral-Coordinate Attention
by Minghua Zhang, Yuxia Duan, Wei Song, Haibin Mei and Qi He
J. Imaging 2023, 9(7), 141; https://doi.org/10.3390/jimaging9070141 - 10 Jul 2023
Cited by 1 | Viewed by 1752
Abstract
In hyperspectral image (HSI) classification, convolutional neural networks (CNNs) have been widely employed and achieved promising performance. However, CNN-based methods face difficulties in achieving both accurate and efficient HSI classification due to their limited receptive fields and deep architectures. To alleviate these limitations, we propose an effective HSI classification network based on multi-head self-attention and spectral-coordinate attention (MSSCA). Specifically, we first reduce the redundant spectral information of HSI by using a point-wise convolution network (PCN) to enhance discriminability and robustness of the network. Then, we capture long-range dependencies among HSI pixels by introducing a modified multi-head self-attention (M-MHSA) model, which applies a down-sampling operation to alleviate the computing burden caused by the dot-product operation of MHSA. Furthermore, to enhance the performance of the proposed method, we introduce a lightweight spectral-coordinate attention fusion module. This module combines spectral attention (SA) and coordinate attention (CA) to enable the network to better weight the importance of useful bands and more accurately localize target objects. Importantly, our method achieves these improvements without increasing the complexity or computational cost of the network. To demonstrate the effectiveness of our proposed method, experiments were conducted on three classic HSI datasets: Indian Pines (IP), Pavia University (PU), and Salinas. The results show that our proposed method is highly competitive in terms of both efficiency and accuracy when compared to existing methods. Full article
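The down-sampling trick in M-MHSA, subsampling keys and values before the dot product to shrink the attention matrix, can be sketched for a single head in plain Python (an illustrative sketch; the paper's exact operator and multi-head arrangement may differ):

```python
import math

def attention(queries, keys, values, stride=2):
    """Single-head dot-product attention with down-sampled keys/values."""
    keys = keys[::stride]      # subsample: QK^T cost drops by the stride factor
    values = values[::stride]
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        m = max(scores)  # subtract max for numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        weights = [w / z for w in weights]
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out
```

With stride s, each query attends to n/s positions instead of n, which is the computational relief the abstract attributes to the down-sampling operation.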
20 pages, 3285 KiB  
Article
Conv-ViT: A Convolution and Vision Transformer-Based Hybrid Feature Extraction Method for Retinal Disease Detection
by Pramit Dutta, Khaleda Akther Sathi, Md. Azad Hossain and M. Ali Akber Dewan
J. Imaging 2023, 9(7), 140; https://doi.org/10.3390/jimaging9070140 - 10 Jul 2023
Cited by 23 | Viewed by 5781
Abstract
Current advances in retinal disease detection have mainly focused on distinct feature extraction using either a convolutional neural network (CNN) or a transformer-based end-to-end deep learning (DL) model. The individual end-to-end DL models are capable of only processing texture- or shape-based information for performing detection tasks. However, extraction of only texture- or shape-based features does not provide the model robustness needed to classify different types of retinal diseases. Therefore, concerning these two features, this paper develops a fusion model called ‘Conv-ViT’ to detect retinal diseases from foveal cut optical coherence tomography (OCT) images. The transfer learning-based CNN models, such as Inception-V3 and ResNet-50, are utilized to process texture information by calculating the correlation of nearby pixels. Additionally, the vision transformer model is fused to process shape-based features by determining the correlation between long-distance pixels. The hybridization of these three models results in shape-based texture feature learning during the classification of retinal diseases into four classes: choroidal neovascularization (CNV), diabetic macular edema (DME), DRUSEN, and NORMAL. The weighted average classification accuracy, precision, recall, and F1 score of the model are found to be approximately 94%. The results indicate that the fusion of both texture and shape features assisted the proposed Conv-ViT model to outperform state-of-the-art retinal disease classification models. Full article
19 pages, 17842 KiB  
Article
Order Space-Based Morphology for Color Image Processing
by Shanqian Sun, Yunjia Huang, Kohei Inoue and Kenji Hara
J. Imaging 2023, 9(7), 139; https://doi.org/10.3390/jimaging9070139 - 7 Jul 2023
Cited by 4 | Viewed by 1854
Abstract
Mathematical morphology is a fundamental tool based on order statistics for image processing, such as noise reduction, image enhancement and feature extraction, and is well-established for binary and grayscale images, whose pixels can be sorted by their pixel values, i.e., each pixel has a single number. On the other hand, each pixel in a color image has three numbers corresponding to three color channels, e.g., red (R), green (G) and blue (B) channels in an RGB color image. Therefore, it is difficult to sort color pixels uniquely. In this paper, we propose a method for unifying the orders of pixels sorted in each color channel separately, where we consider that a pixel exists in a three-dimensional space called order space, and derive a single order by a monotonically nondecreasing function defined on the order space. We also fuzzify the proposed order space-based morphological operations, and demonstrate the effectiveness of the proposed method by comparing with a state-of-the-art method based on hypergraph theory. The proposed method treats three orders of pixels sorted in respective color channels equally. Therefore, the proposed method is consistent with the conventional morphological operations for binary and grayscale images. Full article
(This article belongs to the Topic Color Image Processing: Models and Methods (CIP: MM))
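The paper's unified order can be sketched as sorting pixels by a monotonically nondecreasing function of their three per-channel ranks; summing the ranks is one such function (an assumption chosen here for illustration, the paper studies more general choices on the order space):

```python
def channel_ranks(pixels, ch):
    """Rank of each pixel when sorted by one color channel."""
    order = sorted(range(len(pixels)), key=lambda i: pixels[i][ch])
    ranks = [0] * len(pixels)
    for r, i in enumerate(order):
        ranks[i] = r
    return ranks

def unified_order(pixels):
    """Sort pixels by a monotone function (here: the sum) of per-channel ranks."""
    rs = [channel_ranks(pixels, c) for c in range(3)]
    key = [sum(r[i] for r in rs) for i in range(len(pixels))]
    return sorted(range(len(pixels)), key=lambda i: (key[i], i))

pixels = [(0, 0, 0), (255, 255, 255), (100, 100, 100)]
# A morphological erosion over this window would pick pixels[unified_order(pixels)[0]].
```

On grayscale input all three channel ranks coincide, so the unified order reduces to the ordinary pixel-value order, matching the consistency with classical morphology claimed in the abstract.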
22 pages, 1615 KiB  
Article
VGG16 Feature Extractor with Extreme Gradient Boost Classifier for Pancreas Cancer Prediction
by Wilson Bakasa and Serestina Viriri
J. Imaging 2023, 9(7), 138; https://doi.org/10.3390/jimaging9070138 - 7 Jul 2023
Cited by 14 | Viewed by 2609
Abstract
The prognosis of patients with pancreatic ductal adenocarcinoma (PDAC) is greatly improved by an early and accurate diagnosis. Several studies have created automated methods to forecast PDAC development utilising various medical imaging modalities. These papers give a general overview of the classification, segmentation, or grading of many cancer types, including pancreatic cancer, utilising conventional machine learning techniques and hand-engineered characteristics. This study uses cutting-edge deep learning techniques to identify PDAC utilising computerised tomography (CT) medical imaging modalities. This work proposes the hybrid model VGG16–XGBoost (VGG16 as the backbone feature extractor and Extreme Gradient Boosting as the classifier) for PDAC images. The proposed hybrid model performs better, obtaining an accuracy of 0.97 and a weighted F1 score of 0.97 on the dataset under study. The experimental validation of the VGG16–XGBoost model uses the Cancer Imaging Archive (TCIA) public-access dataset, which has pancreas CT images. The results of this study can be extremely helpful for PDAC diagnosis from CT pancreas images, categorising them into the five tumour (T), node (N), and metastases (M) (TNM) staging system class labels T0, T1, T2, T3, and T4. Full article
14 pages, 3330 KiB  
Article
Improving Visual Defect Detection and Localization in Industrial Thermal Images Using Autoencoders
by Sasha Behrouzi, Marcel Dix, Fatemeh Karampanah, Omer Ates, Nissy Sasidharan, Swati Chandna and Binh Vu
J. Imaging 2023, 9(7), 137; https://doi.org/10.3390/jimaging9070137 - 7 Jul 2023
Cited by 2 | Viewed by 2129
Abstract
Reliable anomaly detection in thermal image datasets is crucial for defect detection of industrial products. Nevertheless, achieving it is challenging, especially when datasets are image sequences captured during equipment runtime with a smooth transition from healthy to defective images, which contaminates the healthy training data with defective samples. Anomaly detection methods based on autoencoders are susceptible to even a slight violation of a clean training dataset, which makes threshold determination for sample classification challenging. This paper shows that combining anomaly scores leads to better threshold determination that effectively separates healthy and defective data. The autoencoder models in our research are trained on healthy images, optimizing two loss functions: mean squared error (MSE) and structural similarity index measure (SSIM). Anomaly score outputs are used for classification. Three anomaly scores are applied: MSE, SSIM, and kernel density estimation (KDE). The proposed method is trained and tested on 32 × 32-sized thermal images, including one contaminated dataset. The model achieved the following average accuracies across the datasets: MSE, 95.33%; SSIM, 88.37%; and KDE, 92.81%. Combining anomaly scores can thus remedy low classification accuracy, and the use of KDE improves performance when healthy training data are contaminated. The MSE+ and SSIM+ methods, as well as two parameters to control quantitative anomaly localization using SSIM, are introduced. Full article
(This article belongs to the Special Issue Data Processing with Artificial Intelligence in Thermal Imagery)
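The score-combination idea can be sketched with an MSE reconstruction score plus a simple rule that flags a sample when any score exceeds its threshold (one plausible combination; the paper also derives SSIM- and KDE-based scores):

```python
def mse_score(original, reconstruction):
    """Per-sample reconstruction error from an autoencoder."""
    return sum((x - y) ** 2 for x, y in zip(original, reconstruction)) / len(original)

def is_defective(scores, thresholds):
    """Combine several anomaly scores: defective if ANY score exceeds its threshold."""
    return any(s > t for s, t in zip(scores, thresholds))
```

A sample that slips past the MSE threshold (e.g., a defect the autoencoder reconstructs well) can still be caught by the SSIM or KDE score, which is the benefit the abstract attributes to combining scores.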
13 pages, 3947 KiB  
Article
VeerNet: Using Deep Neural Networks for Curve Classification and Digitization of Raster Well-Log Images
by M. Quamer Nasim, Narendra Patwardhan, Tannistha Maiti, Stefano Marrone and Tarry Singh
J. Imaging 2023, 9(7), 136; https://doi.org/10.3390/jimaging9070136 - 6 Jul 2023
Cited by 2 | Viewed by 2382
Abstract
Raster logs are scanned representations of the analog data recorded in subsurface drilling. Geologists rely on these images to interpret well-log curves and deduce the physical properties of geological formations. Scanned images contain various artifacts, including hand-written text, brightness variability, scan defects, etc., and the manual effort involved in reading the data is substantial. To mitigate this, unsupervised computer vision techniques are employed to extract and interpret the curves digitally. Existing algorithms predominantly require manual intervention, are slow, and are error-prone. This research addresses these challenges by proposing VeerNet, a deep neural network architecture designed to semantically segment the raster images from the background grid in order to classify and digitize (i.e., extract the analytic formulation of the written curve) the well-log data. The proposed approach is based on a modified UNet-inspired architecture leveraging an attention-augmented read–process–write strategy to balance retaining key signals while dealing with the different input–output sizes. The reported results show that the proposed architecture efficiently classifies and digitizes the curves with an overall F1 score of 35% and Intersection over Union of 30%, achieving 97% recall and 0.11 Mean Absolute Error when compared with real data on binary segmentation of multiple curves. Finally, we analyzed VeerNet’s ability to predict Gamma-ray values, achieving a Pearson coefficient score of 0.62 when compared to measured data. Full article
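The Pearson coefficient used above to validate predicted Gamma-ray values against measured data is the standard normalized covariance; a minimal sketch:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)
```

A value of 0.62, as reported for VeerNet, indicates a moderately strong positive linear relationship between digitized and measured curves.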
11 pages, 2637 KiB  
Article
Fast and Efficient Evaluation of the Mass Composition of Shredded Electrodes from Lithium-Ion Batteries Using 2D Imaging
by Peter Bischoff, Alexandra Kaas, Christiane Schuster, Thomas Härtling and Urs Peuker
J. Imaging 2023, 9(7), 135; https://doi.org/10.3390/jimaging9070135 - 5 Jul 2023
Cited by 3 | Viewed by 1989
Abstract
With the increasing number of electrical devices, especially electric vehicles, the need for efficient recycling processes of electric components is on the rise. Mechanical recycling of lithium-ion batteries includes the comminution of the electrodes and sorting the particle mixtures to achieve the highest possible purities of the individual material components (e.g., copper and aluminum). An important part of recycling is the quantitative determination of the yield and recovery rate, which is required to adapt the processes to different feed materials. Since this is usually done by sorting individual particles manually before determining the mass of each material, we developed a novel method for automating this evaluation process. The method is based on detecting the different material particles in images based on simple thresholding techniques and analyzing the correlation of the area of each material in the field of view to the mass in the previously prepared samples. This can then be applied to further samples to determine their mass composition. Using this automated method, the process is accelerated, the accuracy is improved compared to a human operator, and the cost of the evaluation process is reduced. Full article
(This article belongs to the Topic Applications in Image Analysis and Pattern Recognition)
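The area-to-mass idea can be sketched as thresholding pixel intensities into material classes and scaling each material's pixel count by a calibration factor fitted on reference samples (the intensity bands and per-pixel masses below are illustrative placeholders, not the paper's calibration):

```python
def mass_fractions(pixels, intensity_ranges, mass_per_pixel):
    """Estimate the mass composition of a particle mixture from a grayscale image.

    pixels: flat list of grayscale values
    intensity_ranges: material -> (lo, hi) threshold band
    mass_per_pixel: material -> calibrated mass contribution per classified pixel
    """
    areas = {m: 0 for m in intensity_ranges}
    for p in pixels:
        for m, (lo, hi) in intensity_ranges.items():
            if lo <= p <= hi:
                areas[m] += 1
                break
    masses = {m: areas[m] * mass_per_pixel[m] for m in areas}
    total = sum(masses.values())
    return {m: masses[m] / total for m in masses}
```

The calibration step described in the abstract amounts to fitting the mass-per-pixel factors so that the image-derived masses match the manually sorted reference samples.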
23 pages, 2614 KiB  
Review
Diagnostic Applications of Intraoral Scanners: A Systematic Review
by Francesca Angelone, Alfonso Maria Ponsiglione, Carlo Ricciardi, Giuseppe Cesarelli, Mario Sansone and Francesco Amato
J. Imaging 2023, 9(7), 134; https://doi.org/10.3390/jimaging9070134 - 3 Jul 2023
Cited by 15 | Viewed by 7277
Abstract
In addition to their recognized value for obtaining 3D digital dental models, intraoral scanners (IOSs) have recently been proven to be promising tools for oral health diagnostics. In this work, the most recent literature on IOSs was reviewed with a focus on their applications as detection systems of oral cavity pathologies. Those applications of IOSs falling in the general area of detection systems for oral health diagnostics (e.g., caries, dental wear, periodontal diseases, oral cancer) were included, while excluding those works mainly focused on 3D dental model reconstruction for implantology, orthodontics, or prosthodontics. Three major scientific databases, namely Scopus, PubMed, and Web of Science, were searched and explored by three independent reviewers. The synthesis and analysis of the studies were carried out by considering the type and technical features of the IOS, the study objectives, and the specific diagnostic applications. From the synthesis of the twenty-five included studies, the main diagnostic fields where IOS technology applies were highlighted, ranging from the detection of tooth wear and caries to the diagnosis of plaques, periodontal defects, and other complications. This shows how additional diagnostic information can be obtained by combining the IOS technology with other radiographic techniques. Despite some promising results, the clinical evidence regarding the use of IOSs as oral health probes is still limited, and further efforts are needed to validate the diagnostic potential of IOSs over conventional tools. Full article
(This article belongs to the Topic Digital Dentistry)
15 pages, 8102 KiB  
Article
Ambiguity in Solving Imaging Inverse Problems with Deep-Learning-Based Operators
by Davide Evangelista, Elena Morotti, Elena Loli Piccolomini and James Nagy
J. Imaging 2023, 9(7), 133; https://doi.org/10.3390/jimaging9070133 - 30 Jun 2023
Cited by 1 | Viewed by 1826
Abstract
In recent years, large convolutional neural networks have been widely used as tools for image deblurring, because of their ability in restoring images very precisely. It is well known that image deblurring is mathematically modeled as an ill-posed inverse problem and its solution [...] Read more.
In recent years, large convolutional neural networks have been widely used as tools for image deblurring, because of their ability to restore images very precisely. It is well known that image deblurring is mathematically modeled as an ill-posed inverse problem and its solution is difficult to approximate when noise affects the data. Indeed, one limitation of neural networks for deblurring is their sensitivity to noise and other perturbations, which can lead to instability and produce poor reconstructions. In addition, networks do not necessarily take into account the numerical formulation of the underlying imaging problem when trained end-to-end. In this paper, we propose some strategies to improve stability without losing too much accuracy to deblur images with deep-learning-based methods. First, we suggest a very small neural architecture, which reduces the execution time for training, satisfying a green AI need, and does not extremely amplify noise in the computed image. Second, we introduce a unified framework where a pre-processing step balances the lack of stability of the following neural-network-based step. Two different pre-processors are presented. The former implements a strong parameter-free denoiser, and the latter is a variational-model-based regularized formulation of the latent imaging problem. This framework is also formally characterized by mathematical analysis. Numerical experiments are performed to verify the accuracy and stability of the proposed approaches for image deblurring when unknown or not-quantified noise is present; the results confirm that they improve the network stability with respect to noise. In particular, the model-based framework represents the most reliable trade-off between visual precision and robustness. Full article
(This article belongs to the Topic Computer Vision and Image Processing)
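The two-step idea, a stabilizing pre-processor followed by a small learned operator, can be caricatured in a few lines of NumPy. This is a minimal sketch, not the authors' models: a median filter stands in for the parameter-free denoiser, and a fixed sharpening convolution stands in for the trained network.

```python
import numpy as np

def median_denoise(img, k=3):
    """Parameter-free pre-processor: k x k median filter
    (an illustrative stand-in for the paper's strong denoiser)."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.empty_like(img)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = np.median(padded[i:i + k, j:j + k])
    return out

def tiny_network(img):
    """Stand-in for the small trained deblurring network: here just a
    fixed 3x3 sharpening convolution with clipping to [0, 1]."""
    kernel = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], float)
    padded = np.pad(img, 1, mode="edge")
    out = np.zeros_like(img)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * kernel)
    return np.clip(out, 0.0, 1.0)

def stabilized_deblur(noisy_blurred):
    """Unified framework: the pre-processing step balances the
    instability of the following network-based step."""
    return tiny_network(median_denoise(noisy_blurred))
```

The point of the composition is that the network step never sees the raw noisy data, which is where its instability shows.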
17 pages, 5941 KiB  
Article
Motion Vector Extrapolation for Video Object Detection
by Julian True and Naimul Khan
J. Imaging 2023, 9(7), 132; https://doi.org/10.3390/jimaging9070132 - 29 Jun 2023
Cited by 1 | Viewed by 2276
Abstract
Despite the continued successes of computationally efficient deep neural network architectures for video object detection, performance continually arrives at the great trilemma of speed versus accuracy versus computational resources (pick two). Current attempts to exploit temporal information in video data to overcome this [...] Read more.
Despite the continued successes of computationally efficient deep neural network architectures for video object detection, performance continually arrives at the great trilemma of speed versus accuracy versus computational resources (pick two). Current attempts to exploit temporal information in video data to overcome this trilemma are bottlenecked by the state of the art in object detection models. This work presents motion vector extrapolation (MOVEX), a technique which performs video object detection through the use of off-the-shelf object detectors alongside existing optical flow-based motion estimation techniques in parallel. This work demonstrates that this approach significantly reduces the baseline latency of any given object detector without sacrificing accuracy. Further latency reductions, up to 24 times lower than the original latency, can be achieved with minimal accuracy loss. MOVEX enables low-latency video object detection on common CPU-based systems, thus allowing for high-performance video object detection beyond the domain of GPU computing. Full article
(This article belongs to the Topic Visual Object Tracking: Challenges and Applications)
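The core idea of running a detector only occasionally and extrapolating its boxes with motion estimates in between can be sketched as follows. This toy assumes a dense flow field is already available per frame (in practice it would come from an optical-flow method or decoded motion vectors); the function names are illustrative.

```python
import numpy as np

def propagate_box(box, flow):
    """Shift a detection box by the mean motion vector inside it.
    box: (x1, y1, x2, y2) pixel coordinates; flow: HxWx2 array of (dx, dy)."""
    x1, y1, x2, y2 = box
    region = flow[y1:y2, x1:x2]                # motion vectors under the box
    dx, dy = region[..., 0].mean(), region[..., 1].mean()
    return (int(round(x1 + dx)), int(round(y1 + dy)),
            int(round(x2 + dx)), int(round(y2 + dy)))

def track_between_keyframes(box, flows):
    """Apply per-frame flow fields to a box until the detector runs again."""
    boxes = []
    for flow in flows:
        box = propagate_box(box, flow)
        boxes.append(box)
    return boxes
```

Because propagation is far cheaper than detection, the expensive detector only needs to run on a fraction of frames, which is where the latency reduction comes from.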
19 pages, 2421 KiB  
Article
Automated Vehicle Counting from Pre-Recorded Video Using You Only Look Once (YOLO) Object Detection Model
by Mishuk Majumder and Chester Wilmot
J. Imaging 2023, 9(7), 131; https://doi.org/10.3390/jimaging9070131 - 27 Jun 2023
Cited by 15 | Viewed by 6777
Abstract
Different techniques are being applied for automated vehicle counting from video footage, which is a significant subject of interest to many researchers. In this context, the You Only Look Once (YOLO) object detection model, which has been developed recently, has emerged as a [...] Read more.
Different techniques are being applied for automated vehicle counting from video footage, which is a significant subject of interest to many researchers. In this context, the You Only Look Once (YOLO) object detection model, which has been developed recently, has emerged as a promising tool. However, in terms of accuracy and flexible interval counting, existing research on employing the model for vehicle counting from video footage remains insufficient. The present study endeavors to develop computer algorithms for automated traffic counting from pre-recorded videos using the YOLO model with flexible interval counting. The study involves the development of algorithms aimed at detecting, tracking, and counting vehicles from pre-recorded videos. The YOLO model was applied using the TensorFlow API with the assistance of OpenCV. The developed algorithms implement the YOLO model for counting vehicles in both directions efficiently. The accuracy of the automated counting was evaluated against manual counts and found to be about 90 percent. The accuracy comparison also shows that the errors of automated counting consistently stem from undercounting caused by unsuitable videos. In addition, a benefit–cost (B/C) analysis shows that implementing the automated counting method returns 1.76 times the investment. Full article
(This article belongs to the Special Issue Visual Localization—Volume II)
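A common building block of such counting pipelines is a line-crossing test applied to tracked centroids, counting separately for each direction. The sketch below is a generic illustration of that step, not the authors' algorithm.

```python
def count_crossings(tracks, line_y):
    """Count vehicles crossing a horizontal count line in each direction.
    tracks: {vehicle_id: [(x, y), ...]} centroid history per tracked vehicle.
    Returns (downward_count, upward_count)."""
    down = up = 0
    for _vid, pts in tracks.items():
        for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
            if y0 < line_y <= y1:      # moved from above the line to below
                down += 1
                break                  # count each vehicle once
            if y1 < line_y <= y0:      # moved from below the line to above
                up += 1
                break
    return down, up
```

Interval counting then reduces to grouping the tracked crossings by the frame timestamps at which they occur.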
30 pages, 10861 KiB  
Article
Human Activity Recognition Using Cascaded Dual Attention CNN and Bi-Directional GRU Framework
by Hayat Ullah and Arslan Munir
J. Imaging 2023, 9(7), 130; https://doi.org/10.3390/jimaging9070130 - 26 Jun 2023
Cited by 14 | Viewed by 3188
Abstract
Vision-based human activity recognition (HAR) has emerged as one of the essential research areas in video analytics. Over the last decade, numerous advanced deep learning algorithms have been introduced to recognize complex human actions from video streams. These deep learning algorithms have shown [...] Read more.
Vision-based human activity recognition (HAR) has emerged as one of the essential research areas in video analytics. Over the last decade, numerous advanced deep learning algorithms have been introduced to recognize complex human actions from video streams. These deep learning algorithms have shown impressive performance for the video analytics task. However, these newly introduced methods either focus exclusively on model performance or on computational efficiency, resulting in a biased trade-off between robustness and efficiency when dealing with the challenging HAR problem. To enhance both the accuracy and computational efficiency, this paper presents a computationally efficient yet generic spatial–temporal cascaded framework that exploits the deep discriminative spatial and temporal features for HAR. For efficient representation of human actions, we propose an efficient dual attentional convolutional neural network (DA-CNN) architecture that leverages a unified channel–spatial attention mechanism to extract human-centric salient features in video frames. The dual channel–spatial attention layers together with the convolutional layers learn to be more selective in the spatial receptive fields having objects within the feature maps. The extracted discriminative salient features are then forwarded to a stacked bi-directional gated recurrent unit (Bi-GRU) for long-term temporal modeling and recognition of human actions using both forward and backward pass gradient learning. Extensive experiments are conducted on three publicly available human action datasets, where the obtained results verify the effectiveness of our proposed framework (DA-CNN+Bi-GRU) over the state-of-the-art methods in terms of model accuracy and inference runtime across each dataset. Experimental results show that the DA-CNN+Bi-GRU framework attains up to a 167× improvement in execution time, in terms of frames per second, compared to most contemporary action-recognition methods. Full article
(This article belongs to the Special Issue Image Processing and Computer Vision: Algorithms and Applications)
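The general shape of a channel–spatial attention mechanism can be illustrated in plain NumPy: each channel is gated by a function of its global response, then each spatial location by a function of its cross-channel response. This is a schematic gate for intuition only, not the DA-CNN architecture itself.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat):
    """feat: (C, H, W). Gate each channel by its global average activation."""
    gate = sigmoid(feat.mean(axis=(1, 2)))          # one gate per channel
    return feat * gate[:, None, None]

def spatial_attention(feat):
    """Gate each spatial location by its mean activation across channels."""
    gate = sigmoid(feat.mean(axis=0))               # one gate per (h, w)
    return feat * gate[None, :, :]

def dual_attention(feat):
    """Unified channel-spatial attention applied to a feature map."""
    return spatial_attention(channel_attention(feat))
```

In a trained network, the gates would of course be learned projections rather than raw means, but the multiplicative re-weighting of feature maps is the same.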
18 pages, 14748 KiB  
Article
A Joint De-Rain and De-Mist Network Based on the Atmospheric Scattering Model
by Linyun Gu, Huahu Xu and Xiaojin Ma
J. Imaging 2023, 9(7), 129; https://doi.org/10.3390/jimaging9070129 - 26 Jun 2023
Viewed by 1397
Abstract
Rain can have a detrimental effect on optical components, leading to the appearance of streaks and halos in images captured during rainy conditions. These visual distortions caused by rain and mist contribute significant noise information that can compromise image quality. In this paper, [...] Read more.
Rain can have a detrimental effect on optical components, leading to the appearance of streaks and halos in images captured during rainy conditions. These visual distortions caused by rain and mist contribute significant noise information that can compromise image quality. In this paper, we propose a novel approach for simultaneously removing both streaks and halos from the image to produce clear results. First, based on the principle of atmospheric scattering, a rain and mist model is proposed to perform an initial removal of the streaks and halos through image reconstruction. The Deep Memory Block (DMB) selectively extracts the rain layer transfer spectrum and the mist layer transfer spectrum from the rainy image to separate these layers. Then, the Multi-scale Convolution Block (MCB) receives the reconstructed images and extracts both structural and detailed features to enhance the overall accuracy and robustness of the model. Finally, extensive experiments demonstrate that our proposed model JDDN (Joint De-rain and De-mist Network) outperforms current state-of-the-art deep learning methods on synthetic datasets as well as real-world datasets, with an average improvement of 0.29 dB on the heavy-rainy-image dataset. Full article
(This article belongs to the Topic Computer Vision and Image Processing)
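The atmospheric scattering model underlying such approaches is I = J*t + A*(1 - t), where I is the observed image, J the scene radiance, t the per-pixel transmission, and A the atmospheric light. Given estimates of t and A, the clear image is recovered by inverting the model; a minimal sketch (the clamping threshold is a common heuristic, not a value from the paper):

```python
import numpy as np

def recover_scene(I, t, A, t_min=0.1):
    """Invert the atmospheric scattering model I = J*t + A*(1 - t)
    to recover scene radiance J from a hazy/misty observation I.
    t: per-pixel transmission map; A: global atmospheric light."""
    t = np.maximum(t, t_min)            # avoid division blow-up in dense mist
    return (I - A * (1.0 - t)) / t
```

The learned parts of such a network amount to estimating t and the rain/mist layers; the inversion itself is this one line.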
20 pages, 4934 KiB  
Article
Hybrid Classical–Quantum Transfer Learning for Cardiomegaly Detection in Chest X-rays
by Pierre Decoodt, Tan Jun Liang, Soham Bopardikar, Hemavathi Santhanam, Alfaxad Eyembe, Begonya Garcia-Zapirain and Daniel Sierra-Sosa
J. Imaging 2023, 9(7), 128; https://doi.org/10.3390/jimaging9070128 - 25 Jun 2023
Cited by 5 | Viewed by 4882
Abstract
Cardiovascular diseases are among the major health problems that are likely to benefit from promising developments in quantum machine learning for medical imaging. The chest X-ray (CXR), a widely used modality, can reveal cardiomegaly, even when performed primarily for a non-cardiological indication. Based [...] Read more.
Cardiovascular diseases are among the major health problems that are likely to benefit from promising developments in quantum machine learning for medical imaging. The chest X-ray (CXR), a widely used modality, can reveal cardiomegaly, even when performed primarily for a non-cardiological indication. Based on pre-trained DenseNet-121, we designed hybrid classical–quantum (CQ) transfer learning models to detect cardiomegaly in CXRs. Using Qiskit and PennyLane, we integrated a parameterized quantum circuit into a classical network implemented in PyTorch. We mined the CheXpert public repository to create a balanced dataset with 2436 posteroanterior CXRs from different patients distributed between cardiomegaly and the control. Using k-fold cross-validation, the CQ models were trained using a state vector simulator. The normalized global effective dimension allowed us to compare the trainability in the CQ models run on Qiskit. For prediction, ROC AUC scores up to 0.93 and accuracies up to 0.87 were achieved for several CQ models, rivaling the classical–classical (CC) model used as a reference. A trustworthy Grad-CAM++ heatmap with a hot zone covering the heart was visualized more often with the CQ option than with the CC option (94% vs. 61%, p < 0.001), which may boost the rate of acceptance by health professionals. Full article
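What a parameterized quantum circuit layer computes can be illustrated with a one-qubit state-vector simulation. The paper builds its circuits with Qiskit and PennyLane on DenseNet features; this self-contained toy only shows the underlying math of one rotation gate and one expectation-value readout.

```python
import numpy as np

def ry(theta):
    """Single-qubit RY rotation matrix (real-valued)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def quantum_layer(theta):
    """State-vector simulation of one parameterized circuit:
    prepare |0>, apply RY(theta), measure <Z>.
    Analytically, <Z> = cos(theta)."""
    state = ry(theta) @ np.array([1.0, 0.0])
    Z = np.array([[1.0, 0.0], [0.0, -1.0]])
    return float(state @ (Z @ state))
```

In a hybrid CQ model, theta would be a trainable parameter and the expectation value would feed the next classical layer, with gradients obtained via the parameter-shift rule.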
17 pages, 4491 KiB  
Article
Measuring Dental Enamel Thickness: Morphological and Functional Relevance of Topographic Mapping
by Armen V. Gaboutchian, Vladimir A. Knyaz, Evgeniy N. Maschenko, Le Xuan Dac, Anatoly A. Maksimov, Anton V. Emelyanov, Dmitry V. Korost and Nikita V. Stepanov
J. Imaging 2023, 9(7), 127; https://doi.org/10.3390/jimaging9070127 - 23 Jun 2023
Viewed by 3001
Abstract
The interest in the development of dental enamel thickness measurement techniques is connected to the importance of metric data in taxonomic assessments and evolutionary research as well as in other directions of dental studies. At the same time, advances in non-destructive imaging techniques [...] Read more.
The interest in the development of dental enamel thickness measurement techniques is connected to the importance of metric data in taxonomic assessments and evolutionary research as well as in other directions of dental studies. At the same time, advances in non-destructive imaging techniques and the application of scanning methods, such as micro-focus-computed X-ray tomography, have enabled researchers to study the internal morpho-histological layers of teeth with a greater degree of accuracy and detail. These tendencies have contributed to changes in established views in different areas of dental research, ranging from the interpretation of morphology to metric assessments. In fact, a significant amount of data have been obtained using traditional metric techniques, which now should be critically reassessed using current technologies and methodologies. Hence, we propose new approaches for measuring dental enamel thickness using palaeontological material from the territories of northern Vietnam by means of automated and manually operated techniques. We also discuss method improvements, taking into account their relevance for dental morphology and occlusion. As we have shown, our approaches demonstrate the potential to form closer links between the metric data and dental morphology and provide the possibility for objective and replicable studies on dental enamel thickness through the application of automated techniques. These features are likely to be effective in more profound taxonomic research and for the development of metric and analytical systems. Our technique provides scope for its targeted application in clinical methods, which could help to reveal functional changes in the masticatory system. However, this will likely require improvements in clinically applicable imaging techniques. Full article
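One common way to turn two segmented tomographic surfaces into a thickness map is a nearest-neighbour distance from each outer enamel surface point to the enamel-dentine junction. The brute-force sketch below illustrates that generic step only; the authors' topographic mapping protocol is more involved.

```python
import numpy as np

def thickness_map(enamel_pts, dentine_pts):
    """Approximate enamel thickness at each outer-surface point as the
    distance to the nearest enamel-dentine junction (EDJ) point.
    Both inputs: (N, 3) arrays of surface coordinates from segmentation."""
    diff = enamel_pts[:, None, :] - dentine_pts[None, :, :]
    dists = np.linalg.norm(diff, axis=2)        # all pairwise distances
    return dists.min(axis=1)                    # nearest EDJ point per vertex
```

For real micro-CT meshes with many vertices, a spatial index (e.g., a k-d tree) would replace the quadratic pairwise computation.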
15 pages, 7229 KiB  
Article
Fast Reservoir Characterization with AI-Based Lithology Prediction Using Drill Cuttings Images and Noisy Labels
by Ekaterina Tolstaya, Anuar Shakirov, Mokhles Mezghani and Sergey Safonov
J. Imaging 2023, 9(7), 126; https://doi.org/10.3390/jimaging9070126 - 21 Jun 2023
Cited by 1 | Viewed by 2251
Abstract
In this paper, we considered one of the problems that arise during drilling automation, namely the automation of lithology identification from drill cuttings images. Usually, this work is performed by experienced geologists, but this is a tedious and subjective process. Drill cuttings are [...] Read more.
In this paper, we considered one of the problems that arise during drilling automation, namely the automation of lithology identification from drill cuttings images. Usually, this work is performed by experienced geologists, but this is a tedious and subjective process. Drill cuttings are the cheapest source of rock formation samples; therefore, reliable lithology prediction can greatly reduce the cost of analysis during drilling. To predict the lithology content from images of cuttings samples, we used a convolutional neural network (CNN). For training a model with an acceptable generalization ability, we applied dataset-cleaning techniques, which help to reveal bad samples, as well as samples with uncertain labels. It was shown that the model trained on a cleaned dataset performs better in terms of accuracy. Data cleaning was performed using a cross-validation technique, as well as a clustering analysis of embeddings, where it is possible to identify clusters with distinctive visual characteristics and clusters where visually similar samples of rocks are attributed to different lithologies during the labeling process. Full article
(This article belongs to the Section AI in Imaging)
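The cross-validation cleaning step can be sketched as flagging samples whose out-of-fold predicted probability for their own label is low, which marks them as candidates for mislabeling or uncertain labels. The threshold and interface below are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def flag_label_noise(oof_probs, labels, threshold=0.2):
    """Flag samples whose out-of-fold (cross-validated) predicted
    probability for their assigned label falls below a threshold.
    oof_probs: (N, K) class probabilities from held-out folds;
    labels: (N,) integer class labels assigned by annotators."""
    own = oof_probs[np.arange(len(labels)), labels]   # prob of own label
    return np.where(own < threshold)[0]               # suspect sample indices
```

Retraining on the dataset with flagged samples removed or relabeled is what the abstract reports as improving accuracy.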
12 pages, 1692 KiB  
Article
Quantifying the Displacement of Data Matrix Code Modules: A Comparative Study of Different Approximation Approaches for Predictive Maintenance of Drop-on-Demand Printing Systems
by Peter Bischoff, André V. Carreiro, Christiane Schuster and Thomas Härtling
J. Imaging 2023, 9(7), 125; https://doi.org/10.3390/jimaging9070125 - 21 Jun 2023
Viewed by 1732
Abstract
Drop-on-demand printing using colloidal or pigmented inks is prone to the clogging of printing nozzles, which can lead to positional deviations and inconsistently printed patterns (e.g., data matrix codes, DMCs). However, if such deviations are detected early, they can be useful for determining [...] Read more.
Drop-on-demand printing using colloidal or pigmented inks is prone to the clogging of printing nozzles, which can lead to positional deviations and inconsistently printed patterns (e.g., data matrix codes, DMCs). However, if such deviations are detected early, they can be useful for determining the state of the print head and planning maintenance operations prior to reaching a printing state where the printed DMCs are unreadable. To realize this predictive maintenance approach, it is necessary to accurately quantify the positional deviation of individually printed dots from the actual target position. Here, we present a comparison of different methods based on affine transformations and clustering algorithms for calculating the target position from the printed positions and, subsequently, the deviation of both for complete DMCs. Hence, our method focuses on the evaluation of the print quality, not on the decoding of DMCs. We compare our results to a state-of-the-art decoding algorithm, adapted to return the target grid positions, and find that we can determine the occurring deviations with significantly higher accuracy, especially when the printed DMCs are of low quality. The results enable the development of decision systems for predictive maintenance and subsequently the optimization of printing systems. Full article
(This article belongs to the Topic Computer Vision and Image Processing)
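The grid-fitting idea, estimating the ideal module positions by a least-squares affine fit to the printed dot positions and then measuring per-dot residuals, can be sketched with NumPy. This is a simplification of the methods compared in the paper, shown only to make the deviation computation concrete.

```python
import numpy as np

def grid_deviation(printed, grid):
    """Fit an affine transform mapping ideal grid positions to printed dot
    positions by least squares, then report per-dot residual deviations.
    printed, grid: (N, 2) arrays of corresponding dot coordinates."""
    # Augment grid coordinates with ones: printed ~= [grid | 1] @ M, M is 3x2.
    G = np.hstack([grid, np.ones((len(grid), 1))])
    M, *_ = np.linalg.lstsq(G, printed, rcond=None)
    residuals = printed - G @ M
    return np.linalg.norm(residuals, axis=1)    # deviation per module
```

Because the affine fit absorbs global translation, rotation, scale, and shear of the whole code, the residuals isolate exactly the per-nozzle displacements relevant to predictive maintenance.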