J. Imaging, Volume 11, Issue 1 (January 2025) – 28 articles

Cover Story: The relationship between light spectra and color discrimination capabilities can be examined through carefully designed psychophysical experiments. This work specifically examines LED lighting, proposing a novel experimental methodology based on color pair comparisons. To overcome the limits of existing color systems and enable the creation of tailored color sample sets, a calibration process is presented to produce desired color patches using a commercial inkjet printer. While LEDs offer the advantage of generating a wide range of light spectra, their performance is affected by heat-induced instabilities. To address this issue, an active feedback algorithm is proposed to dynamically regulate the light output in real time, ensuring stability and minimizing potential biases in experimental results.
  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive the tables of contents of newly released issues.
  • PDF is the official format for papers published in both HTML and PDF forms. To view a paper in PDF format, click on the "PDF Full-text" link and use the free Adobe Reader to open it.
17 pages, 5156 KiB  
Article
Plant Detection in RGB Images from Unmanned Aerial Vehicles Using Segmentation by Deep Learning and an Impact of Model Accuracy on Downstream Analysis
by Mikhail V. Kozhekin, Mikhail A. Genaev, Evgenii G. Komyshev, Zakhar A. Zavyalov and Dmitry A. Afonnikov
J. Imaging 2025, 11(1), 28; https://doi.org/10.3390/jimaging11010028 - 20 Jan 2025
Abstract
Crop field monitoring using unmanned aerial vehicles (UAVs) is one of the most important technologies for plant growth control in modern precision agriculture. One of the important and widely used tasks in field monitoring is plant stand counting. The accurate identification of plants in field images provides estimates of plant number per unit area, detects missing seedlings, and predicts crop yield. Current methods are based on the detection of plants in images obtained from UAVs by means of computer vision algorithms and deep learning neural networks. These approaches depend on image spatial resolution and the quality of plant markup. The performance of automatic plant detection may affect the efficiency of downstream analysis of a field cropping pattern. In the present work, a method is presented for detecting the plants of five species in images acquired via a UAV on the basis of image segmentation by deep learning algorithms (convolutional neural networks). Twelve orthomosaics were collected and marked at several sites in Russia to train and test the neural network algorithms. Additionally, 17 existing datasets of various spatial resolutions and markup quality levels from the Roboflow service were used to extend the training image sets. Finally, we compared several texture features between manually evaluated and neural-network-estimated plant masks. It was demonstrated that adding images to the training sample (even those of lower resolution and markup quality) significantly improves plant stand counting. The work indicates how the accuracy of plant detection in field images may affect the evaluation of their cropping pattern by means of texture characteristics. For some of the characteristics (GLCM mean, GLRM long run, GLRM run ratio), the estimates from manually and automatically marked images are close; for others, the differences are large and may lead to erroneous conclusions about the properties of field cropping patterns. Nonetheless, overall, plant detection algorithms with a higher accuracy show better agreement with the estimates of texture parameters obtained from manually marked images. Full article
(This article belongs to the Special Issue Imaging Applications in Agriculture)
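As a concrete illustration of the texture comparison described above, the sketch below computes one of the named statistics, the GLCM mean, so that values from manually and automatically marked regions can be compared. It is a minimal sketch assuming scikit-image's graycomatrix; the quantization level and the random stand-in patches are illustrative, not the study's data.

```python
import numpy as np
from skimage.feature import graycomatrix

def glcm_mean(gray_patch, levels=64):
    # GLCM mean: sum_i i * p(i), where p is the row marginal of the
    # normalized gray-level co-occurrence matrix.
    q = (gray_patch / 256 * levels).astype(np.uint8)   # quantize to `levels`
    glcm = graycomatrix(q, distances=[1], angles=[0], levels=levels,
                        symmetric=True, normed=True)[:, :, 0, 0]
    p = glcm.sum(axis=1)
    return float((np.arange(levels) * p).sum())

rng = np.random.default_rng(0)
manual = rng.integers(0, 256, (128, 128))   # stand-in for a manually marked region
auto = np.clip(manual + rng.integers(-8, 8, manual.shape), 0, 255)  # perturbed
print(glcm_mean(manual), glcm_mean(auto))   # close values expected
```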

15 pages, 3743 KiB  
Article
Blink Detection Using 3D Convolutional Neural Architectures and Analysis of Accumulated Frame Predictions
by George Nousias, Konstantinos K. Delibasis and Georgios Labiris
J. Imaging 2025, 11(1), 27; https://doi.org/10.3390/jimaging11010027 - 19 Jan 2025
Abstract
Blink detection is considered a useful indicator of both clinical conditions and drowsiness state. In this work, we propose and compare deep learning architectures for the task of detecting blinks in video frame sequences. The first step is the training and application of an eye detector that extracts the eye regions from each video frame. The cropped eye regions are organized as three-dimensional (3D) input with the third dimension spanning a time of 300 ms. Two different 3D convolutional neural networks are utilized (a simple 3D CNN and a 3D ResNet), as well as a 3D autoencoder combined with a classifier coupled to the latent space. Finally, we propose the use of a frame prediction accumulator combined with morphological processing and watershed segmentation to detect blinks and determine their start and stop frames in previously unseen videos. The proposed framework was trained on ten different participants and tested on five different ones, with a total of 162,400 frames and 1172 blinks for each eye. The start and end frame of each blink in the dataset was annotated by a specialized ophthalmologist. Quantitative comparison with state-of-the-art blink detection methodologies provides favorable results for the proposed neural architectures coupled with the prediction accumulator, with the 3D ResNet being the best as well as the fastest performer. Full article
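A minimal sketch of the frame prediction accumulator idea: per-window blink scores are averaged onto the frames they cover, and contiguous supra-threshold runs yield start/stop frames. Plain thresholding stands in for the paper's morphological processing and watershed step, and the 9-frame window (roughly 300 ms at 30 fps) is an assumption.

```python
import numpy as np

def accumulate(window_scores, window_len, n_frames):
    # Average each window's blink probability onto every frame it covers.
    acc = np.zeros(n_frames)
    hits = np.zeros(n_frames)
    for start, score in window_scores:       # (start_frame, blink probability)
        acc[start:start + window_len] += score
        hits[start:start + window_len] += 1
    return acc / np.maximum(hits, 1)

def blink_intervals(frame_scores, thr=0.5):
    # Contiguous runs above threshold become (start, stop) frame pairs.
    above = np.r_[False, frame_scores > thr, False]
    edges = np.flatnonzero(np.diff(above.astype(int)))
    return list(zip(edges[::2], edges[1::2] - 1))

scores = accumulate([(0, 0.1), (4, 0.9), (7, 0.8), (20, 0.2)],
                    window_len=9, n_frames=40)
print(blink_intervals(scores))   # one detected blink interval
```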

17 pages, 7356 KiB  
Article
Increasing Neural-Based Pedestrian Detectors’ Robustness to Adversarial Patch Attacks Using Anomaly Localization
by Olga Ilina, Maxim Tereshonok and Vadim Ziyadinov
J. Imaging 2025, 11(1), 26; https://doi.org/10.3390/jimaging11010026 - 17 Jan 2025
Abstract
Object detection in images is a fundamental component of many safety-critical systems, such as autonomous driving, video surveillance systems, and robotics. Adversarial patch attacks, being easily implemented in the real world, provide effective counteraction to object detection by state-of-the-art neural-based detectors, which poses a serious danger in various fields of activity. Existing defense methods against patch attacks are insufficiently effective, which underlines the need to develop new, reliable solutions. In this manuscript, we propose a method which helps to increase the robustness of neural network systems to adversarial input images. The proposed method consists of a Deep Convolutional Neural Network to reconstruct a benign image from the adversarial one; a Calculating Maximum Error block to highlight the mismatches between input and reconstructed images; a Localizing Anomalous Fragments block to extract the anomalous regions using the Isolation Forest algorithm from histograms of image fragments; and a Clustering and Processing block to group and evaluate the extracted anomalous regions. The proposed method, based on anomaly localization, demonstrates high resistance to adversarial patch attacks while maintaining the high quality of object detection. The experimental results show that the proposed method is effective in defending against adversarial patch attacks. Using the YOLOv3 algorithm with the proposed defensive method for pedestrian detection on the INRIAPerson dataset under adversarial attacks, the mAP50 metric reaches 80.97%, compared to 46.79% without a defensive method. The results demonstrate that the proposed method is promising for improving the security of object detection systems. Full article
(This article belongs to the Section Image and Video Processing)
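The sketch below mirrors the Localizing Anomalous Fragments step under stated assumptions: per-fragment histograms of the input-versus-reconstruction error map are scored with scikit-learn's Isolation Forest, and fragments flagged as outliers are returned. Fragment size, bin count, and contamination rate are illustrative choices, not the paper's settings.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

def anomalous_fragments(error_map, frag=32, bins=16, contamination=0.05):
    h, w = error_map.shape
    feats, coords = [], []
    for y in range(0, h - frag + 1, frag):
        for x in range(0, w - frag + 1, frag):
            patch = error_map[y:y + frag, x:x + frag]
            hist, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0),
                                   density=True)
            feats.append(hist)
            coords.append((y, x))
    labels = IsolationForest(contamination=contamination,
                             random_state=0).fit_predict(np.array(feats))
    return [c for c, l in zip(coords, labels) if l == -1]  # -1 = outlier

err = np.random.rand(256, 256) * 0.1   # benign reconstruction error
err[64:96, 64:96] += 0.8               # a patch-like anomaly
print(anomalous_fragments(err))        # fragment(s) covering the patch
```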

23 pages, 22211 KiB  
Article
A Local Adversarial Attack with a Maximum Aggregated Region Sparseness Strategy for 3D Objects
by Ling Zhao, Xun Lv, Lili Zhu, Binyan Luo, Hang Cao, Jiahao Cui, Haifeng Li and Jian Peng
J. Imaging 2025, 11(1), 25; https://doi.org/10.3390/jimaging11010025 - 13 Jan 2025
Abstract
The increasing reliance on deep neural network-based object detection models in various applications has raised significant security concerns due to their vulnerability to adversarial attacks. In physical 3D environments, existing adversarial attacks that target object detection (3D-AE) face significant challenges: to maximize attack effectiveness, they typically apply large and dispersed camouflage to objects, which makes the camouflage conspicuous and reduces its visual stealth in real-world scenarios. The core issue is therefore how to use minimal, concentrated camouflage to maximize the attack effect. Addressing this, this paper proposes a local 3D attack method driven by a Maximum Aggregated Region Sparseness (MARS) strategy, which concentrates the adversarial modifications in specific areas to enhance effectiveness while maintaining stealth. To maximize the aggregation of attack-camouflaged regions, an aggregation regularization term is designed to constrain the mask aggregation matrix based on face-adjacency relationships. To minimize the attack camouflage regions, a sparseness regularization is designed to push the mask weights toward a U-shaped distribution and limit extreme values. Additionally, neural rendering is used to obtain gradient-propagating multi-angle augmented data and suppress the model's detection, locating universal critical decision regions from multiple angles; these strategies keep the adversarial modifications effective across different viewpoints and conditions. We test the attack effectiveness of different region selection strategies. On the CARLA dataset, the average attack efficiency against the YOLOv3 and YOLOv5 series networks reaches 1.724, an improvement of 0.986 (134%) over baseline methods, and the experimental results demonstrate that our attack method achieves both stealth and aggressiveness from different viewpoints. Furthermore, we explore the transferability of the decision regions: our method can be effectively combined with different texture optimization methods, with the average precision decreasing by 0.488 and 0.662 across different networks, indicating strong attack effectiveness. Full article
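A hedged PyTorch sketch of how the two regularizers described above could look: an aggregation term that penalizes weight differences across adjacent faces, and a sparseness term that pushes per-face mask weights toward 0 or 1 while keeping the attacked area small. This is one plausible reading of the paper's constraints, not its exact formulation; all tensors are toy data.

```python
import torch

def aggregation_loss(mask, adj):
    # Penalize weight differences across adjacent faces so attacked
    # regions clump together instead of scattering over the mesh.
    i, j = adj[:, 0], adj[:, 1]
    return ((mask[i] - mask[j]) ** 2).mean()

def sparseness_loss(mask, l1_weight=0.1):
    # Push weights toward 0 or 1 (a U-shaped distribution) and keep the
    # total attacked area small via an L1 term.
    return (mask * (1 - mask)).mean() + l1_weight * mask.abs().mean()

mask = torch.rand(500, requires_grad=True)   # one camouflage weight per face
adj = torch.randint(0, 500, (2000, 2))       # toy face-adjacency index pairs
loss = aggregation_loss(mask, adj) + sparseness_loss(mask)
loss.backward()                              # gradients flow to the mask
```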

22 pages, 11474 KiB  
Article
LittleFaceNet: A Small-Sized Face Recognition Method Based on RetinaFace and AdaFace
by Zhengwei Ren, Xinyu Liu, Jing Xu, Yongsheng Zhang and Ming Fang
J. Imaging 2025, 11(1), 24; https://doi.org/10.3390/jimaging11010024 - 13 Jan 2025
Abstract
For surveillance video management in university laboratories, issues such as occlusion and low-resolution face capture often arise. Traditional face recognition algorithms are typically static and rely heavily on clear images, resulting in inaccurate recognition for low-resolution, small-sized faces. To address the challenges of occlusion and low-resolution person identification, this paper proposes a new face recognition framework by reconstructing RetinaFace-ResNet and combining it with Quality-Adaptive Margin (AdaFace). Although there are currently many target detection algorithms, they all require a large amount of data for training, and datasets for low-resolution face detection are scarce, leading to poor detection performance. This paper aims to solve RetinaFace's weak face recognition capability in low-resolution scenarios and its potential inaccuracies in face bounding box localization when faces are at extreme angles or partially occluded. To this end, Spatial Depth-wise Separable Convolutions are introduced. RetinaFace-ResNet is designed for face detection and localization, while AdaFace is employed to address low-resolution face recognition by using feature norm approximation to estimate image quality and applying an adaptive margin function. Additionally, a multi-object tracking algorithm is used to solve the problem of moving occlusion. Experimental results demonstrate significant improvements, achieving an accuracy of 96.12% on the WiderFace dataset and a recognition accuracy of 84.36% in practical laboratory applications. Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)

20 pages, 7090 KiB  
Article
An Infrared and Visible Image Alignment Method Based on Gradient Distribution Properties and Scale-Invariant Features in Electric Power Scenes
by Lin Zhu, Yuxing Mao, Chunxu Chen and Lanjia Ning
J. Imaging 2025, 11(1), 23; https://doi.org/10.3390/jimaging11010023 - 13 Jan 2025
Abstract
In grid intelligent inspection systems, automatic registration of infrared and visible light images of power scenes is a crucial research technology. Since there are obvious differences in key attributes between visible and infrared images, direct alignment often fails to achieve the expected results. To overcome the difficulty of aligning infrared and visible light images, an image alignment method is proposed in this paper. First, we use the Sobel operator to extract the edge information of the image pair. Second, the feature points in the edges are recognised by a curvature scale space (CSS) corner detector. Third, the Histogram of Oriented Gradients (HOG) is extracted as the gradient distribution characteristic of the feature points, which is normalised with the Scale-Invariant Feature Transform (SIFT) algorithm to form feature descriptors. Finally, initial matching and accurate matching are achieved by the improved fast approximate nearest-neighbour matching method and adaptive thresholding, respectively. Experiments show that this method can robustly match the feature points of image pairs under rotation, scale, and viewpoint differences, and achieves excellent matching results. Full article
(This article belongs to the Topic Computer Vision and Image Processing, 2nd Edition)
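A condensed sketch of the front end of this pipeline: Sobel gradient magnitude as the edge map, corner detection on it, and a HOG descriptor around each corner. OpenCV's Shi–Tomasi detector stands in for the CSS corner detector, and a plain HOG for the SIFT-normalized descriptor; matching would then proceed with the approximate nearest-neighbour stage.

```python
import cv2
import numpy as np

def edge_corners_hog(gray, n_corners=200, win=32):
    # 1) Sobel gradient magnitude as the edge map
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    edges = cv2.convertScaleAbs(cv2.magnitude(gx, gy))
    # 2) corners on the edge map (Shi-Tomasi as a CSS stand-in)
    pts = cv2.goodFeaturesToTrack(edges, n_corners, 0.01, 10)
    if pts is None:
        return np.empty((0, 2)), np.empty((0, 0))
    # 3) HOG descriptor on a window around each corner
    hog = cv2.HOGDescriptor((win, win), (16, 16), (8, 8), (8, 8), 9)
    kept, desc = [], []
    for x, y in pts.reshape(-1, 2).astype(int):
        patch = gray[y - win // 2:y + win // 2, x - win // 2:x + win // 2]
        if patch.shape == (win, win):
            desc.append(hog.compute(patch).ravel())
            kept.append((x, y))
    return np.array(kept), np.array(desc)  # feed to FLANN-style matching
```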

19 pages, 4635 KiB  
Article
ZooCNN: A Zero-Order Optimized Convolutional Neural Network for Pneumonia Classification Using Chest Radiographs
by Saravana Kumar Ganesan, Parthasarathy Velusamy, Santhosh Rajendran, Ranjithkumar Sakthivel, Manikandan Bose and Baskaran Stephen Inbaraj
J. Imaging 2025, 11(1), 22; https://doi.org/10.3390/jimaging11010022 - 13 Jan 2025
Abstract
Pneumonia, a leading cause of mortality in children under five, is usually diagnosed through chest X-ray (CXR) images due to its efficiency and cost-effectiveness. However, the shortage of radiologists in the Least Developed Countries (LDCs) emphasizes the need for automated pneumonia diagnostic systems. This article presents a deep learning model, the Zero-Order Optimized Convolutional Neural Network (ZooCNN), a Zero-Order Optimization (Zoo)-based CNN model for classifying CXR images into three classes: Normal Lungs (NL), Bacterial Pneumonia (BP), and Viral Pneumonia (VP); the model utilizes the Adaptive Synthetic Sampling (ADASYN) approach to ensure class balance in the Kaggle CXR Images (Pneumonia) dataset. Conventional CNN models, though promising, face challenges such as overfitting and high computational costs. ZooPlatform (ZooPT), a hyperparameter fine-tuning strategy, is applied to a baseline CNN model to tune its hyperparameters, providing a modified architecture, ZooCNN, with a 72% reduction in weights. The model was trained, tested, and validated on the Kaggle CXR Images (Pneumonia) dataset. The ZooCNN achieved an accuracy of 97.27%, a sensitivity of 97.00%, a specificity of 98.60%, and an F1 score of 97.03%. The results were compared with contemporary models to highlight the efficacy of the ZooCNN in pneumonia classification (PC), offering a potential tool to aid physicians in clinical settings. Full article
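Zero-order here means gradient-free: hyperparameters are chosen purely by sampling and evaluating. The sketch below shows the general shape of such a search loop; the search space, budget, and toy scorer are placeholders, and the actual ZooPT procedure may differ.

```python
import random

space = {                       # illustrative hyperparameter grid
    "lr": [1e-4, 3e-4, 1e-3, 3e-3],
    "filters": [8, 16, 32, 64],
    "dropout": [0.0, 0.2, 0.4],
}

def sample():
    return {k: random.choice(v) for k, v in space.items()}

def zoo_search(score_fn, budget=30):
    # No gradients: only evaluate candidate configurations and keep the best.
    best_cfg, best_score = None, float("-inf")
    for _ in range(budget):
        cfg = sample()
        s = score_fn(cfg)
        if s > best_score:
            best_cfg, best_score = cfg, s
    return best_cfg, best_score

# Toy scorer standing in for "train briefly, return validation accuracy".
print(zoo_search(lambda c: -c["lr"] * c["filters"] + c["dropout"]))
```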

17 pages, 2650 KiB  
Article
Typical and Local Diagnostic Reference Levels for Chest and Abdomen Radiography Examinations in Dubai Health Sector
by Entesar Z. Dalah, Maitha M. Al Zarooni, Faryal Y. Binismail, Hashim A. Beevi, Mohammed Siraj and Subrahmanian Pottybindu
J. Imaging 2025, 11(1), 21; https://doi.org/10.3390/jimaging11010021 - 13 Jan 2025
Abstract
Chest and abdomen radiographs are the most common radiographic examinations conducted in the Dubai Health sector, and both involve exposure of several radiosensitive organs. Diagnostic reference levels (DRLs) are accepted as an effective safety, optimization, and auditing tool in clinical practice. The present work aims to establish a comprehensive projection- and weight-based structured DRL system that allows one to confidently highlight healthcare centers in need of urgent action. The data of a total of 5474 adult males and non-pregnant females who underwent chest and abdomen radiography examinations in five different healthcare centers were collected and retrospectively analyzed. The typical DRL (TDRL) for each healthcare center was established and defined per projection (chest: posterior–anterior (PA), anterior–posterior (AP), and lateral (LAT); abdomen: erect and supine) for a weight band (60–80 kg) and for the whole data (no weight band). Local DRL (LDRL) values were established per projection for the whole data (no weight band) and for the 60–80 kg population. Chest radiography data from 1755 (60–80 kg) images were used to build this comprehensive DRL system (PA: 1471, AP: 252, and LAT: 32). Similarly, 611 (60–80 kg) abdomen radiographs were used to establish a DRL system (erect: 286 and supine: 325). The LDRL values defined per chest and abdomen projection for the weight band group (60–80 kg) were as follows: chest—0.51 (PA), 2.46 (AP), and 2.13 (LAT) dGy·cm²; abdomen—8.08 (erect) and 5.95 (supine) dGy·cm². The LDRL defined per abdomen projection for the 60–80 kg weight band highlighted at least one healthcare center in need of optimization. Such a system is efficient, easy to use, and clinically very effective. Full article
(This article belongs to the Special Issue Tools and Techniques for Improving Radiological Imaging Applications)
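For readers unfamiliar with how DRL values are typically derived, the sketch below follows common ICRP-style practice, which this study's TDRL/LDRL split appears to mirror: a typical DRL is the median of a centre's dose distribution, and the local DRL is the 75th percentile of the centre medians. The KAP readings are made-up numbers.

```python
import numpy as np

kap_by_centre = {   # chest PA, 60-80 kg band; illustrative KAP in dGy*cm^2
    "centre_A": [0.42, 0.55, 0.48, 0.61],
    "centre_B": [0.71, 0.66, 0.82, 0.75],
    "centre_C": [0.39, 0.44, 0.52, 0.41],
}

tdrl = {c: float(np.median(v)) for c, v in kap_by_centre.items()}
ldrl = float(np.percentile(list(tdrl.values()), 75))
# Centres whose median exceeds the LDRL are flagged for optimization review.
flagged = [c for c, m in tdrl.items() if m > ldrl]
print(tdrl, ldrl, flagged)
```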

15 pages, 1946 KiB  
Article
Enhanced Image Retrieval Using Multiscale Deep Feature Fusion in Supervised Hashing
by Amina Belalia, Kamel Belloulata and Adil Redaoui
J. Imaging 2025, 11(1), 20; https://doi.org/10.3390/jimaging11010020 - 12 Jan 2025
Abstract
In recent years, deep-network-based hashing has gained prominence in image retrieval for its ability to generate compact and efficient binary representations. However, most existing methods predominantly focus on high-level semantic features extracted from the final layers of networks, often neglecting structural details that are crucial for capturing spatial relationships within images. Achieving a balance between preserving structural information and maximizing retrieval accuracy is the key to effective image hashing and retrieval. To address this challenge, we introduce Multiscale Deep Feature Fusion for Supervised Hashing (MDFF-SH), a novel approach that integrates multiscale feature fusion into the hashing process. The hallmark of MDFF-SH lies in its ability to combine low-level structural features with high-level semantic context, synthesizing robust and compact hash codes. By leveraging multiscale features from multiple convolutional layers, MDFF-SH ensures the preservation of fine-grained image details while maintaining global semantic integrity, achieving a harmonious balance that enhances retrieval precision and recall. Our approach demonstrated a superior performance on benchmark datasets, achieving significant gains in the Mean Average Precision (MAP) compared with the state-of-the-art methods: 9.5% on CIFAR-10, 5% on NUS-WIDE, and 11.5% on MS-COCO. These results highlight the effectiveness of MDFF-SH in bridging structural and semantic information, setting a new standard for high-precision image retrieval through multiscale feature fusion. Full article
(This article belongs to the Special Issue Recent Techniques in Image Feature Extraction)
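A hedged sketch of the fusion idea: pooled activations from several convolutional stages are concatenated and mapped to a K-bit code via tanh, binarized with sign at retrieval time. Layer choices and sizes are illustrative, not MDFF-SH's actual backbone.

```python
import torch
import torch.nn as nn

class MultiscaleHash(nn.Module):
    def __init__(self, bits=48):
        super().__init__()
        self.s1 = nn.Sequential(nn.Conv2d(3, 32, 3, 2, 1), nn.ReLU())
        self.s2 = nn.Sequential(nn.Conv2d(32, 64, 3, 2, 1), nn.ReLU())
        self.s3 = nn.Sequential(nn.Conv2d(64, 128, 3, 2, 1), nn.ReLU())
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.hash = nn.Linear(32 + 64 + 128, bits)

    def forward(self, x):
        f1 = self.s1(x)                      # low-level structure
        f2 = self.s2(f1)                     # mid-level patterns
        f3 = self.s3(f2)                     # high-level semantics
        fused = torch.cat([self.pool(f).flatten(1) for f in (f1, f2, f3)], 1)
        return torch.tanh(self.hash(fused))  # sign() gives the binary code

codes = MultiscaleHash()(torch.randn(2, 3, 64, 64))
print(torch.sign(codes).shape)               # torch.Size([2, 48])
```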

21 pages, 6639 KiB  
Article
Efficient Generative-Adversarial U-Net for Multi-Organ Medical Image Segmentation
by Haoran Wang, Gengshen Wu and Yi Liu
J. Imaging 2025, 11(1), 19; https://doi.org/10.3390/jimaging11010019 - 12 Jan 2025
Abstract
Manual labeling of lesions in medical image analysis presents a significant challenge due to its labor-intensive and inefficient nature, which ultimately strains essential medical resources and impedes the advancement of computer-aided diagnosis. This paper introduces a novel medical image-segmentation framework named Efficient Generative-Adversarial U-Net (EGAUNet), designed to facilitate rapid and accurate multi-organ labeling. To enhance the model’s capability to comprehend spatial information, we propose the Global Spatial-Channel Attention Mechanism (GSCA). This mechanism enables the model to concentrate more effectively on regions of interest. Additionally, we have integrated Efficient Mapping Convolutional Blocks (EMCB) into the feature-learning process, allowing for the extraction of multi-scale spatial information and the adjustment of feature map channels through optimized weight values. Moreover, the proposed framework progressively enhances its performance by utilizing a generative-adversarial learning strategy, which contributes to improvements in segmentation accuracy. Consequently, EGAUNet demonstrates exemplary segmentation performance on public multi-organ datasets while maintaining high efficiency. For instance, in evaluations on the CHAOS T2SPIR dataset, EGAUNet achieves approximately 2% higher performance on the Jaccard metric, 1% higher on the Dice metric, and nearly 3% higher on the precision metric in comparison to advanced networks such as Swin-Unet and TransUnet. Full article

14 pages, 7427 KiB  
Article
Spectral Bidirectional Reflectance Distribution Function Simplification
by Shubham Chitnis, Aditya Sole and Sharat Chandran
J. Imaging 2025, 11(1), 18; https://doi.org/10.3390/jimaging11010018 - 11 Jan 2025
Abstract
Non-diffuse materials (e.g., metallic inks, varnishes, and paints) are widely used in real-world applications. Accurate spectral rendering relies on the bidirectional reflectance distribution function (BRDF). Current methods of capturing the BRDFs have proven to be onerous in accomplishing quick turnaround time, from conception and design to production. We propose a multi-layer perceptron for compact spectral material representations, with 31 wavelengths for four real-world packaging materials. Our neural-based scenario reduces measurement requirements while maintaining significant saliency. Unlike tristimulus BRDF acquisition, this spectral approach has not, to our knowledge, been previously explored with neural networks. We demonstrate compelling results for diffuse, glossy, and goniochromatic materials. Full article
(This article belongs to the Special Issue Imaging Technologies for Understanding Material Appearance)
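The sketch below shows the kind of compact spectral representation the abstract describes: an MLP from incident/outgoing directions to a 31-band reflectance spectrum (e.g., 400–700 nm at 10 nm steps). The direction encoding and layer sizes are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

spectral_brdf = nn.Sequential(
    nn.Linear(6, 128), nn.ReLU(),      # (wi, wo) as two unit vectors
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 31), nn.Softplus()  # non-negative reflectance per band
)

wi = torch.tensor([[0.0, 0.0, 1.0]])       # normal incidence
wo = torch.tensor([[0.5, 0.0, 0.866]])     # ~30 degrees off normal
spectrum = spectral_brdf(torch.cat([wi, wo], dim=1))
print(spectrum.shape)                      # torch.Size([1, 31])
```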

23 pages, 10925 KiB  
Article
Supervised and Self-Supervised Learning for Assembly Line Action Recognition
by Christopher Indris, Fady Ibrahim, Hatem Ibrahem, Götz Bramesfeld, Jie Huo, Hafiz Mughees Ahmad, Syed Khizer Hayat and Guanghui Wang
J. Imaging 2025, 11(1), 17; https://doi.org/10.3390/jimaging11010017 - 10 Jan 2025
Abstract
The safety and efficiency of assembly lines are critical to manufacturing, but human supervisors cannot oversee all activities simultaneously. This study addresses this challenge by performing a comparative study to construct an initial real-time, semi-supervised temporal action recognition setup for monitoring worker actions on assembly lines. Various feature extractors and localization models were benchmarked using a new assembly dataset, with the I3D model achieving an average mAP@IoU=0.1:0.7 of 85% without optical flow or fine-tuning. The comparative study was extended to self-supervised learning via a modified SPOT model, which achieved a mAP@IoU=0.1:0.7 of 65% with just 10% of the data labeled, using extractor architectures from the fully supervised portion. Milestones include high scores for both fully and semi-supervised learning on this dataset and improved SPOT performance on ANet1.3. This study identified the particularities of the problem, which were leveraged and referenced to explain the results observed in semi-supervised scenarios. The findings highlight the potential for developing a scalable solution in the future, improving labour efficiency and safety compliance for manufacturers. Full article
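The mAP@IoU=0.1:0.7 figures above average AP over temporal IoU thresholds from 0.1 to 0.7. A minimal sketch of the underlying segment IoU:

```python
import numpy as np

def tiou(pred, gt):
    # Temporal IoU of two (start, end) segments, in seconds or frames.
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0

thresholds = np.arange(0.1, 0.8, 0.1)   # the 0.1:0.7 sweep
print(tiou((2.0, 5.0), (3.0, 6.0)))     # 0.5
# mAP@IoU=0.1:0.7 averages AP computed at each threshold in `thresholds`.
```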

8 pages, 3549 KiB  
Communication
Unmasking the Area Postrema on MRI: Utility of 3D FLAIR, 3D-T2, and 3D-DIR Sequences in a Case–Control Study
by Javier Lara-García, Jessica Romo-Martínez, Jonathan Javier De-La-Cruz-Cisneros, Marco Antonio Olvera-Olvera and Luis Jesús Márquez-Bejarano
J. Imaging 2025, 11(1), 16; https://doi.org/10.3390/jimaging11010016 - 10 Jan 2025
Abstract
The area postrema (AP) is a key circumventricular organ involved in the regulation of autonomic functions. Accurate identification of the AP via MRI is essential in neuroimaging but it is challenging. This study evaluated 3D FSE Cube T2WI, 3D FSE Cube FLAIR, and 3D DIR sequences to improve AP detection in patients with and without multiple sclerosis (MS). A case–control study included 35 patients with MS and 35 with other non-demyelinating central nervous system diseases (ND-CNSD). MRI images were acquired employing 3D DIR, 3D FSE Cube FLAIR, and 3D FSE Cube T2WI sequences. The evaluation of AP was conducted using a 3-point scale. Statistical analysis was performed with the chi-square test used to assess group homogeneity and differences between sequences. No significant differences were found in the visualization of the AP between the MS and ND-CNSD groups across the sequences or planes. The AP was not visible in 27.6% of the 3D FSE Cube T2WI sequences, while it was visualized in 99% of the 3D FSE Cube FLAIR sequences and 100% of the 3D DIR sequences. The 3D DIR sequence showed superior performance in identifying the AP. Full article
(This article belongs to the Section Medical Imaging)

28 pages, 4795 KiB  
Article
Skin Lesion Classification Through Test Time Augmentation and Explainable Artificial Intelligence
by Loris Cino, Cosimo Distante, Alessandro Martella and Pier Luigi Mazzeo
J. Imaging 2025, 11(1), 15; https://doi.org/10.3390/jimaging11010015 - 9 Jan 2025
Abstract
Despite significant advancements in the automatic classification of skin lesions using artificial intelligence (AI) algorithms, skepticism among physicians persists. This reluctance is primarily due to the lack of transparency and explainability inherent in these models, which hinders their widespread acceptance in clinical settings. The primary objective of this study is to develop a highly accurate AI-based algorithm for skin lesion classification that also provides visual explanations to foster trust and confidence in these novel diagnostic tools. By improving transparency, the study seeks to contribute to earlier and more reliable diagnoses. Additionally, the research investigates the impact of Test Time Augmentation (TTA) on the performance of six Convolutional Neural Network (CNN) architectures, which include models from the EfficientNet, ResNet (Residual Network), and ResNeXt (an enhanced variant of ResNet) families. To improve the interpretability of the models’ decision-making processes, techniques such as t-distributed Stochastic Neighbor Embedding (t-SNE) and Gradient-weighted Class Activation Mapping (Grad-CAM) are employed. t-SNE is utilized to visualize the high-dimensional latent features of the CNNs in a two-dimensional space, providing insights into how the models group different skin lesion classes. Grad-CAM is used to generate heatmaps that highlight the regions of input images that influence the model’s predictions. Our findings reveal that Test Time Augmentation enhances the balanced multi-class accuracy of CNN models by up to 0.3%, achieving a balanced accuracy rate of 97.58% on the International Skin Imaging Collaboration (ISIC 2019) dataset. This performance is comparable to, or marginally better than, more complex approaches such as Vision Transformers (ViTs), demonstrating the efficacy of our methodology. Full article
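A minimal sketch of Test Time Augmentation as used for classification: average the softmax output over several augmented views of the same image (square inputs assumed). The augmentation set is illustrative; the study's exact TTA policy may differ.

```python
import torch
import torch.nn.functional as F

def tta_predict(model, image):                 # image: (1, 3, H, W), H == W
    views = [
        image,
        torch.flip(image, dims=[3]),           # horizontal flip
        torch.flip(image, dims=[2]),           # vertical flip
        torch.rot90(image, 1, dims=[2, 3]),    # 90-degree rotation
    ]
    with torch.no_grad():
        probs = torch.stack([F.softmax(model(v), dim=1) for v in views])
    return probs.mean(dim=0)                   # averaged class probabilities
```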

24 pages, 1098 KiB  
Article
Face Boundary Formulation for Harmonic Models: Face Image Resembling
by Hung-Tsai Huang, Zi-Cai Li, Yimin Wei and Ching Yee Suen
J. Imaging 2025, 11(1), 14; https://doi.org/10.3390/jimaging11010014 - 8 Jan 2025
Abstract
This paper is devoted to numerical algorithms based on harmonic transformations with two goals: (1) face boundary formulation by blending techniques based on the known characteristic nodes and (2) some challenging examples of face resembling. The formulation of the face boundary is imperative for face recognition, transformation, and combination. Mapping between the source and target face boundaries with constituent pixels is explored by two approaches: cubic spline interpolation and ordinary differential equations (ODEs) using Hermite interpolation. The ODE approach is more flexible and suitable for handling different boundary conditions, such as the clamped and simple support conditions. The intrinsic relations between the cubic spline and ODE methods are explored for different face boundaries, and their combinations are developed. Face combination and resembling are performed by employing blending curves to generate the face boundary, and face images are converted by numerical methods for harmonic models, such as the finite difference method (FDM), the finite element method (FEM), and the finite volume method (FVM), and by the splitting–integrating method (SIM) for the resampling of constituent pixels. For the second goal, the age effects of facial appearance are explored, showing that face images of different ages can be produced by integrating the photos and images of the old and the young. Then, the following challenging task is targeted: based on the photos and images of parents and their children, can we obtain an integrated image that resembles a child's current image as closely as possible? Amazing examples of face combination and resembling are reported in this paper to give a positive answer. Furthermore, an optimal combination of the face images of parents and their children in the least-squares sense is introduced to greatly facilitate face resembling. Face combination and resembling may also be used for plastic surgery, finding missing children, and identifying criminals. The boundary and numerical techniques of face images in this paper can be used not only for pattern recognition but also for face morphing, morphing attack detection (MAD), and computer animation such as Sora, to greatly enhance further developments in AI. Full article
(This article belongs to the Special Issue Techniques and Applications in Face Image Analysis)
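The cubic-spline route mentioned above can be sketched in a few lines: fit periodic splines to boundary nodes as functions of an arc-length-like parameter and resample a dense closed curve. The node coordinates are illustrative, not facial landmarks from the paper.

```python
import numpy as np
from scipy.interpolate import CubicSpline

nodes = np.array([[0, 1], [0.9, 0.4], [0.7, -0.8],
                  [-0.7, -0.8], [-0.9, 0.4], [0, 1]])  # closed: last = first
t = np.linspace(0, 1, len(nodes))
sx = CubicSpline(t, nodes[:, 0], bc_type="periodic")
sy = CubicSpline(t, nodes[:, 1], bc_type="periodic")

tt = np.linspace(0, 1, 200)
boundary = np.column_stack([sx(tt), sy(tt)])   # dense closed boundary curve
print(boundary.shape)                          # (200, 2)
```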

13 pages, 1390 KiB  
Article
Combined Input Deep Learning Pipeline for Embryo Selection for In Vitro Fertilization Using Light Microscopic Images and Additional Features
by Krittapat Onthuam, Norrawee Charnpinyo, Kornrapee Suthicharoenpanich, Supphaset Engphaiboon, Punnarai Siricharoen, Ronnapee Chaichaowarat and Chanakarn Suebthawinkul
J. Imaging 2025, 11(1), 13; https://doi.org/10.3390/jimaging11010013 - 7 Jan 2025
Abstract
The current process of embryo selection in in vitro fertilization is based on morphological criteria; embryos are manually evaluated by embryologists under subjective assessment. In this study, a deep learning-based pipeline was developed to classify the viability of embryos using combined inputs, including microscopic images of embryos and additional features, such as patient age and developed pseudo-features, including a continuous interpretation of Istanbul grading scores obtained by predicting the embryo stage, inner cell mass, and trophectoderm. For viability prediction, convolution-based transfer learning models were employed, multiple pretrained models were compared, and image preprocessing techniques and hyperparameter optimization via Optuna were utilized. In addition, a custom weight was trained using a self-supervised learning framework known as the Simple Framework for Contrastive Learning of Visual Representations (SimCLR), together with images generated using generative adversarial networks (GANs). The best model was developed from the EfficientNet-B0 model using preprocessed images combined with pseudo-features generated using separate EfficientNet-B0 models, with hyperparameters tuned by Optuna. The designed model's F1 score, accuracy, sensitivity, and area under the curve (AUC) were 65.02%, 69.04%, 56.76%, and 66.98%, respectively. This study also showed an advantage in accuracy and a similar AUC when compared with a recent ensemble method. Full article

66 pages, 2123 KiB  
Systematic Review
Hybrid Quality-Based Recommender Systems: A Systematic Literature Review
by Bihi Sabiri, Amal Khtira, Bouchra El Asri and Maryem Rhanoui
J. Imaging 2025, 11(1), 12; https://doi.org/10.3390/jimaging11010012 - 7 Jan 2025
Abstract
As technology develops, consumer behavior and how people search for what they want are constantly evolving. Online shopping has fundamentally changed the e-commerce industry. Although there are more products available than ever before, only a small portion of them are noticed; as a result, a few items gain disproportionate attention. Recommender systems can help to increase the visibility of lesser-known products. Major technology businesses have adopted these technologies as essential offerings, resulting in better user experiences and more sales. As a result, recommender systems have achieved considerable economic, social, and global advancements. As these systems are a major research focus, companies are improving their algorithms with hybrid techniques that combine multiple recommendation methodologies. This review provides a thorough examination of several hybrid models by combining ideas from the current research and emphasizing their practical uses, strengths, and limits. The review identifies particular problems and opportunities for designing and implementing hybrid recommender systems by focusing on the unique aspects of big data, notably volume, velocity, and variety. Adhering to the Cochrane Handbook and the principles developed by Kitchenham and Charters ensures that the assessment process is transparent and of high quality. The aim is to conduct a systematic review of several recent developments in the area of hybrid recommender systems. The study covers the state of the art of the relevant research over the last four years across four knowledge bases (ACM, Google Scholar, Scopus, and Springer), as well as all Web of Science articles regardless of their date of publication. This study employs ASReview, an open-source application that uses active learning to help academics filter literature efficiently. The goal is to assess the progress achieved in the field of hybrid recommender systems, identify frequently used recommender approaches, explore the technical context, highlight gaps in the existing research, and position our future research in relation to the current studies. Full article
(This article belongs to the Section Document Analysis and Processing)

20 pages, 32805 KiB  
Article
Application of Generative Artificial Intelligence Models for Accurate Prescription Label Identification and Information Retrieval for the Elderly in Northern East of Thailand
by Parinya Thetbanthad, Benjaporn Sathanarugsawait and Prasong Praneetpolgrang
J. Imaging 2025, 11(1), 11; https://doi.org/10.3390/jimaging11010011 - 6 Jan 2025
Abstract
This study introduces a novel AI-driven approach to support elderly patients in Thailand with medication management, focusing on accurate drug label interpretation. Two model architectures were explored: a Two-Stage Optical Character Recognition (OCR) and Large Language Model (LLM) pipeline combining EasyOCR with Qwen2-72b-instruct and a Uni-Stage Visual Question Answering (VQA) model using Qwen2-72b-VL. Both models operated in a zero-shot capacity, utilizing Retrieval-Augmented Generation (RAG) with DrugBank references to ensure contextual relevance and accuracy. Performance was evaluated on a dataset of 100 diverse prescription labels from Thai healthcare facilities, using RAG Assessment (RAGAs) metrics to assess Context Recall, Factual Correctness, Faithfulness, and Semantic Similarity. The Two-Stage model achieved high accuracy (94%) and strong RAGAs scores, particularly in Context Recall (0.88) and Semantic Similarity (0.91), making it well-suited for complex medication instructions. In contrast, the Uni-Stage model delivered faster response times, making it practical for high-volume environments such as pharmacies. This study demonstrates the potential of zero-shot AI models in addressing medication management challenges for the elderly by providing clear, accurate, and contextually relevant label interpretations. The findings underscore the adaptability of AI in healthcare, balancing accuracy and efficiency to meet various real-world needs. Full article
(This article belongs to the Section AI in Imaging)

13 pages, 4901 KiB  
Article
A New Deep Learning-Based Method for Automated Identification of Thoracic Lymph Node Stations in Endobronchial Ultrasound (EBUS): A Proof-of-Concept Study
by Øyvind Ervik, Mia Rødde, Erlend Fagertun Hofstad, Ingrid Tveten, Thomas Langø, Håkon O. Leira, Tore Amundsen and Hanne Sorger
J. Imaging 2025, 11(1), 10; https://doi.org/10.3390/jimaging11010010 - 5 Jan 2025
Abstract
Endobronchial ultrasound-guided transbronchial needle aspiration (EBUS-TBNA) is a cornerstone in minimally invasive thoracic lymph node sampling. In lung cancer staging, precise assessment of lymph node position is crucial for clinical decision-making. This study aimed to demonstrate a new deep learning method to classify thoracic lymph nodes based on their anatomical location using EBUS images. Bronchoscopists labeled lymph node stations in real time according to the Mountain–Dressler nomenclature. EBUS images were then used to train and test a deep neural network (DNN) model, with the intraoperative labels as ground truth. In total, 28,134 EBUS images were acquired from 56 patients. The model achieved an overall classification accuracy of 59.5 ± 5.2%. The highest precision, sensitivity, and F1 score were observed in station 4L: 77.6 ± 13.1%, 77.6 ± 15.4%, and 77.6 ± 15.4%, respectively. The lowest precision, sensitivity, and F1 score were observed in station 10L. The average processing and prediction time for a sequence of ten images was 0.65 ± 0.04 s, demonstrating the feasibility of real-time applications. In conclusion, the new DNN-based model could be used to classify lymph node stations from EBUS images. The method's performance was promising, with potential for clinical use. Full article
(This article belongs to the Special Issue Advances in Medical Imaging and Machine Learning)

38 pages, 4397 KiB  
Article
Visual Impairment Spatial Awareness System for Indoor Navigation and Daily Activities
by Xinrui Yu and Jafar Saniie
J. Imaging 2025, 11(1), 9; https://doi.org/10.3390/jimaging11010009 - 4 Jan 2025
Abstract
The integration of artificial intelligence into daily life significantly enhances the autonomy and quality of life of visually impaired individuals. This paper introduces the Visual Impairment Spatial Awareness (VISA) system, designed to holistically assist visually impaired users in indoor activities through a structured, multi-level approach. At the foundational level, the system employs augmented reality (AR) markers for indoor positioning, neural networks for advanced object detection and tracking, and depth information for precise object localization. At the intermediate level, it integrates data from these technologies to aid in complex navigational tasks such as obstacle avoidance and pathfinding. The advanced level synthesizes these capabilities to enhance spatial awareness, enabling users to navigate complex environments and locate specific items. The VISA system exhibits an efficient human–machine interface (HMI), incorporating text-to-speech and speech-to-text technologies for natural and intuitive communication. Evaluations in simulated real-world environments demonstrate that the system allows users to interact naturally and with minimal effort. Our experimental results confirm that the VISA system efficiently assists visually impaired users in indoor navigation, object detection and localization, and label and text recognition, thereby significantly enhancing their spatial awareness and independence. Full article
(This article belongs to the Special Issue Image and Video Processing for Blind and Visually Impaired)

17 pages, 2944 KiB  
Article
Enhanced CATBraTS for Brain Tumour Semantic Segmentation
by Rim El Badaoui, Ester Bonmati Coll, Alexandra Psarrou, Hykoush A. Asaturyan and Barbara Villarini
J. Imaging 2025, 11(1), 8; https://doi.org/10.3390/jimaging11010008 - 3 Jan 2025
Abstract
The early and precise identification of a brain tumour is imperative for enhancing a patient’s life expectancy; this can be facilitated by quick and efficient tumour segmentation in medical imaging. Automatic brain tumour segmentation tools in computer vision have integrated powerful deep learning architectures to enable accurate tumour boundary delineation. Our study aims to demonstrate improved segmentation accuracy and higher statistical stability, using datasets obtained from diverse imaging acquisition parameters. This paper introduces a novel, fully automated model called Enhanced Channel Attention Transformer (E-CATBraTS) for Brain Tumour Semantic Segmentation; this model builds upon 3D CATBraTS, a vision transformer employed in magnetic resonance imaging (MRI) brain tumour segmentation tasks. E-CATBraTS integrates convolutional neural networks and Swin Transformer, incorporating channel shuffling and attention mechanisms to effectively segment brain tumours in multi-modal MRI. The model was evaluated on four datasets containing 3137 brain MRI scans. Through the adoption of E-CATBraTS, the accuracy of the results improved significantly on two datasets, outperforming the current state-of-the-art models by a mean DSC of 2.6% while maintaining a high accuracy that is comparable to the top-performing models on the other datasets. The results demonstrate that E-CATBraTS achieves both high segmentation accuracy and elevated generalisation abilities, ensuring the model is robust to dataset variation. Full article
(This article belongs to the Special Issue Advances in Medical Imaging and Machine Learning)

13 pages, 543 KiB  
Article
Fitting Geometric Shapes to Fuzzy Point Cloud Data
by Vincent B. Verhoeven, Pasi Raumonen and Markku Åkerblom
J. Imaging 2025, 11(1), 7; https://doi.org/10.3390/jimaging11010007 - 3 Jan 2025
Abstract
This article describes procedures and thoughts regarding the reconstruction of geometry given data and its uncertainty. The data are considered as a continuous fuzzy point cloud, instead of a discrete point cloud. Shape fitting is commonly performed by minimizing the discrete Euclidean distance; however, we propose the novel approach of using the expected Mahalanobis distance. The primary benefit is that it takes both the different magnitude and orientation of uncertainty for each data point into account. We illustrate the approach with laser scanning data of a cylinder and compare its performance with that of the conventional least squares method with and without random sample consensus (RANSAC). Our proposed method fits the geometry more accurately, albeit generally with greater uncertainty, and shows promise for geometry reconstruction with laser-scanned data. Full article
(This article belongs to the Special Issue Geometry Reconstruction from Images (2nd Edition))
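A toy contrast between the two residuals discussed above. Here d is a point's distance to a candidate surface and sigma a scalar stand-in for that point's uncertainty; the paper's expected Mahalanobis distance over a full 3x3 covariance generalizes this weighting.

```python
import numpy as np

def euclidean_cost(d):
    return np.sum(d ** 2)

def mahalanobis_cost(d, sigma):
    return np.sum((d / sigma) ** 2)   # uncertain points pull less

d = np.array([0.02, -0.01, 0.15, 0.00])      # point-to-surface residuals (m)
sigma = np.array([0.01, 0.01, 0.10, 0.01])   # per-point std dev (m)
print(euclidean_cost(d), mahalanobis_cost(d, sigma))
# The noisy third point dominates the Euclidean cost but is down-weighted
# in the Mahalanobis cost.
```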

17 pages, 2421 KiB  
Article
Exploring Multi-Pathology Brain Segmentation: From Volume-Based to Component-Based Deep Learning Analysis
by Ioannis Stathopoulos, Roman Stoklasa, Maria Anthi Kouri, Georgios Velonakis, Efstratios Karavasilis, Efstathios Efstathopoulos and Luigi Serio
J. Imaging 2025, 11(1), 6; https://doi.org/10.3390/jimaging11010006 - 31 Dec 2024
Abstract
Detection and segmentation of brain abnormalities using Magnetic Resonance Imaging (MRI) is an important task for which the role of AI algorithms as supporting tools is now well established at both the research and clinical-production levels. While the performance of state-of-the-art models keeps increasing, in many cases reaching the accuracy levels of radiologists and other experts, much research is still needed on the in-depth and transparent evaluation of correct results and failures, especially in relation to important aspects of radiological practice: abnormality position, intensity level, and volume. In this work, we focus on the analysis of the segmentation results of a pre-trained U-net model trained and validated on brain MRI examinations containing four different pathologies: Tumors, Strokes, Multiple Sclerosis (MS), and White Matter Hyperintensities (WMH). We present the segmentation results both for the whole abnormal volume and for each abnormal component inside the examinations of the validation set. In the first case, a Dice score coefficient (DSC), sensitivity, and precision of 0.76, 0.78, and 0.82, respectively, were found, while in the second case the model correctly detected and segmented (true positives) 48.8% (DSC ≥ 0.5) of the abnormal components, partially detected 27.1% (0.05 < DSC < 0.5), and missed (false negatives) 24.1%, while producing 25.1% false positives. Finally, we present an extended analysis of the true positives, false negatives, and false positives versus their position inside the brain, their intensity in three MRI modalities (FLAIR, T2, and T1ce), and their volume. Full article
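A sketch of the component-based scoring described above: label connected abnormal components in the ground truth, compute a per-component Dice score against the overlapping predicted components, and bin the results with the paper's DSC thresholds. The matching rule (union of touching predicted components) is an assumption.

```python
import numpy as np
from scipy import ndimage

def component_report(gt, pred, thr_hi=0.5, thr_lo=0.05):
    gt_lab, n_gt = ndimage.label(gt)
    pr_lab, _ = ndimage.label(pred)
    bins = {"correct": 0, "partial": 0, "missed": 0}
    for k in range(1, n_gt + 1):
        comp = gt_lab == k
        touching = np.unique(pr_lab[comp])            # predicted comps it meets
        match = np.isin(pr_lab, touching[touching > 0])
        inter = np.logical_and(comp, match).sum()
        dsc = 2 * inter / (comp.sum() + match.sum() + 1e-9)
        if dsc >= thr_hi:
            bins["correct"] += 1
        elif dsc > thr_lo:
            bins["partial"] += 1
        else:
            bins["missed"] += 1
    return bins

gt = np.zeros((64, 64), bool); gt[10:20, 10:20] = True
pred = np.zeros_like(gt); pred[12:20, 12:22] = True
print(component_report(gt, pred))   # {'correct': 1, 'partial': 0, 'missed': 0}
```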

28 pages, 7288 KiB  
Article
Geometric Feature Characterization of Apple Trees from 3D LiDAR Point Cloud Data
by Md Rejaul Karim, Shahriar Ahmed, Md Nasim Reza, Kyu-Ho Lee, Joonjea Sung and Sun-Ok Chung
J. Imaging 2025, 11(1), 5; https://doi.org/10.3390/jimaging11010005 - 31 Dec 2024
Abstract
The geometric feature characterization of fruit trees plays a key role in effective orchard management. LiDAR (light detection and ranging) technology for object detection enables the rapid and precise evaluation of geometric features. This study aimed to quantify the height, canopy volume, tree spacing, and row spacing in an apple orchard using a three-dimensional (3D) LiDAR sensor. A LiDAR sensor was used to collect 3D point cloud data from the apple orchard. Six samples of apple trees, representing a variety of shapes and sizes, were selected for data collection and validation. Commercial software and the Python programming language were utilized to process the collected data. The data processing steps involved data conversion, radius outlier removal, voxel grid downsampling, denoising by filtering erroneous points, segmentation of the region of interest (ROI), clustering using the density-based spatial clustering (DBSCAN) algorithm, data transformation, and the removal of ground points. Accuracy was assessed by comparing the estimated outputs from the point cloud with the corresponding measured values. The sensor-estimated and measured tree heights were 3.05 ± 0.34 m and 3.13 ± 0.33 m, respectively, with a mean absolute error (MAE) of 0.08 m, a root mean squared error (RMSE) of 0.09 m, a linear coefficient of determination (r²) of 0.98, a confidence interval (CI) of −0.14 to −0.02 m, and a high concordance correlation coefficient (CCC) of 0.96, indicating strong agreement and high accuracy. The sensor-estimated and measured canopy volumes were 13.76 ± 2.46 m³ and 14.09 ± 2.10 m³, respectively, with an MAE of 0.57 m³, an RMSE of 0.61 m³, an r² value of 0.97, and a CI of −0.92 to 0.26 m³, demonstrating high precision. For tree and row spacing, the sensor-estimated and measured distances were 3.04 ± 0.17 and 3.18 ± 0.24 m, and 3.35 ± 0.08 and 3.40 ± 0.05 m, respectively, with RMSE and r² values of 0.12 m and 0.92 for tree spacing, and 0.07 m and 0.94 for row spacing. The MAE and CI values were 0.09 m and −0.18 to 0.05 m for tree spacing, and 0.01 m and −0.1 to 0.002 m for row spacing, respectively. Although minor differences were observed, the sensor estimates were efficient, though specific measurements require further refinement. The results are based on a limited dataset of six measured values, providing initial insights into geometric feature characterization performance; a larger dataset would offer a more reliable accuracy assessment. The small sample size (six apple trees) limits the generalizability of the findings and necessitates caution in interpreting the results. Future studies should incorporate a broader and more diverse dataset to validate and refine the characterization, enhancing management practices in apple orchards. Full article
(This article belongs to the Special Issue Exploring Challenges and Innovations in 3D Point Cloud Processing)
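As an illustration of the processing chain described above, here is a minimal sketch using the open-source Open3D library. The paper reports using commercial software together with Python, so the library choice, the file name, and all parameter values below are assumptions for demonstration only, not the authors' pipeline.

```python
import numpy as np
import open3d as o3d  # open-source stand-in for the commercial software used in the paper

# Illustrative placeholder file name.
pcd = o3d.io.read_point_cloud("orchard_scan.pcd")

# Radius outlier removal: drop points with too few neighbors within 'radius'.
pcd, _ = pcd.remove_radius_outlier(nb_points=16, radius=0.05)

# Voxel grid downsampling to a uniform point density.
pcd = pcd.voxel_down_sample(voxel_size=0.02)

# Remove ground points by fitting the dominant plane with RANSAC.
_, ground_idx = pcd.segment_plane(distance_threshold=0.05, ransac_n=3,
                                  num_iterations=1000)
trees = pcd.select_by_index(ground_idx, invert=True)

# Cluster the remaining points into individual trees with DBSCAN.
labels = np.array(trees.cluster_dbscan(eps=0.5, min_points=50))

# Per-tree height from the vertical extent of each cluster (assumes z-up coordinates).
for k in range(labels.max() + 1):
    cluster = trees.select_by_index(np.where(labels == k)[0])
    z = np.asarray(cluster.points)[:, 2]
    print(f"tree {k}: height ~ {z.max() - z.min():.2f} m")
```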
21 pages, 7929 KiB  
Article
Experimental Protocol for Color Difference Evaluation Under Stabilized LED Light
by Sofiane Vernet, Eric Dinet, Alain Trémeau and Philippe Colantoni
J. Imaging 2025, 11(1), 4; https://doi.org/10.3390/jimaging11010004 - 30 Dec 2024
Viewed by 473
Abstract
There are two key factors to consider before implementing a color discrimination experiment. First, a set of color patches should be selected or designed for the specific purpose of the experiment to be carried out. Second, the lighting conditions should be controlled to [...] Read more.
There are two key factors to consider before implementing a color discrimination experiment. First, a set of color patches should be selected or designed for the specific purpose of the experiment to be carried out. Second, the lighting conditions should be controlled to eliminate the impact of lighting instability on the experiment. This paper addresses both of these challenges. It proposes a method to print pairs of color patches with unnoticeable color differences. It also proposes a method to stabilize the Spectral Power Distributions (SPDs) of a Light-Emitting Diode (LED) lighting system. Finally, it introduces an experimental protocol for a color discrimination study that will be carried out using the contributions presented in this paper. Full article
(This article belongs to the Special Issue Color in Image Processing and Computer Vision)
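The active stabilization of LED output described here is, at its core, a closed feedback loop that counteracts heat-induced drift. The sketch below shows the general pattern as a simple PI controller; `read_output`, `set_drive`, the target value, and the gains are all hypothetical placeholders for the actual spectrometer and driver interfaces, and the paper's own algorithm may differ.

```python
import time

TARGET = 1000.0          # desired radiometric output (arbitrary units); illustrative
KP, KI = 0.002, 0.0005   # controller gains; would be tuned for the real hardware

def read_output() -> float:
    """Hypothetical: read the LED channel's measured output (photodiode/spectrometer)."""
    raise NotImplementedError("hardware-specific")

def set_drive(level: float) -> None:
    """Hypothetical: set the LED channel's drive level (e.g., PWM duty cycle, 0..1)."""
    raise NotImplementedError("hardware-specific")

def stabilize(duration_s: float, dt: float = 0.1) -> None:
    """Simple PI feedback loop compensating heat-induced output drift in real time."""
    drive, integral = 0.5, 0.0
    t_end = time.monotonic() + duration_s
    while time.monotonic() < t_end:
        error = TARGET - read_output()        # deviation from the target output
        integral += error * dt                # accumulate slow thermal drift
        drive = min(max(drive + KP * error + KI * integral, 0.0), 1.0)
        set_drive(drive)
        time.sleep(dt)
```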
13 pages, 5410 KiB  
Article
Modified Center-Edge Angle in Children with Developmental Dysplasia of the Hip
by Katharina S. Gather, Fabian Sporer, Christos Tsagkaris, Marco Götze, Simone Gantz, Sebastien Hagmann and Thomas Dreher
J. Imaging 2025, 11(1), 3; https://doi.org/10.3390/jimaging11010003 - 27 Dec 2024
Viewed by 563
Abstract
Developmental dysplasia of the hip (DDH) is a prevalent developmental condition that necessitates early detection and treatment. Follow-up, as well as therapeutic decision-making in children younger than four years, is challenging because the center-edge (CE) angle of Wiberg is not reliable in this [...] Read more.
Developmental dysplasia of the hip (DDH) is a prevalent developmental condition that necessitates early detection and treatment. Follow-up, as well as therapeutic decision-making in children younger than four years, is challenging because the center-edge (CE) angle of Wiberg is not reliable in this age group. The authors propose a modification of the CE angle (MCE) to achieve reliability comparable to the CE among children younger than four and to set diagnostic thresholds for the diagnosis of DDH. A total of 952 anteroposterior pelvic radiographs were retrospectively reviewed. The MCE is defined on pelvic X-ray overview images as the angle between the line connecting the epiphyseal joint center with the outer edge of the acetabulum and the perpendicular to the Hilgenreiner line. The MCE angle exhibited high sensitivity and specificity, as well as intrarater variability comparable to that of the CE, among children both younger and older than four years. The authors recommend cut-off values for the MCE angle: for children under four years old, the angle should be equal to or greater than 15 degrees; for those under eight years old, equal to or greater than 20 degrees; and for those eight years old and older, equal to or greater than 25 degrees. However, the reliability of the MCE angle diminishes around the age of nine due to the curvature of the growth plate, which complicates accurate measurement. This study showed that the MCE angle can be used adequately in children under four years and could serve as a progression parameter in the diagnosis of DDH. Full article
(This article belongs to the Section Medical Imaging)
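From the geometric definition above, the MCE angle reduces to a small vector computation on annotated landmark coordinates. The sketch below is a minimal illustration of that definition; the landmark names, the 2D pixel-coordinate convention, and the example values are assumptions, not the authors' measurement software.

```python
import numpy as np

def mce_angle(joint_center, acetabular_edge, hilgenreiner_dir):
    """Modified center-edge (MCE) angle per the definition in the abstract:
    the angle between the line from the epiphyseal joint center to the outer
    acetabular edge and the perpendicular to the Hilgenreiner line.
    All inputs are 2D image coordinates/directions (illustrative convention)."""
    line = np.asarray(acetabular_edge, float) - np.asarray(joint_center, float)
    h = np.asarray(hilgenreiner_dir, float)
    perp = np.array([-h[1], h[0]])  # perpendicular to the Hilgenreiner line
    cos_a = abs(np.dot(line, perp)) / (np.linalg.norm(line) * np.linalg.norm(perp))
    return np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))

# Synthetic example (pixel coordinates; Hilgenreiner line assumed horizontal).
print(f"MCE ~ {mce_angle((120, 200), (150, 140), (1, 0)):.1f} degrees")
```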
54 pages, 5089 KiB  
Review
The Neural Frontier of Future Medical Imaging: A Review of Deep Learning for Brain Tumor Detection
by Tarek Berghout
J. Imaging 2025, 11(1), 2; https://doi.org/10.3390/jimaging11010002 - 24 Dec 2024
Viewed by 1095
Abstract
Brain tumor detection is crucial in medical research due to high mortality rates and treatment challenges. Early and accurate diagnosis is vital for improving patient outcomes; however, traditional methods, such as manual Magnetic Resonance Imaging (MRI) analysis, are often time-consuming and error-prone. The [...] Read more.
Brain tumor detection is crucial in medical research due to high mortality rates and treatment challenges. Early and accurate diagnosis is vital for improving patient outcomes; however, traditional methods, such as manual Magnetic Resonance Imaging (MRI) analysis, are often time-consuming and error-prone. The rise of deep learning has led to advanced models for automated brain tumor feature extraction, segmentation, and classification. Despite these advancements, comprehensive reviews synthesizing recent findings remain scarce. By analyzing over 100 research papers from the past half-decade (2019–2024), this review fills that gap, exploring the latest methods and paradigms, summarizing key concepts, challenges, and datasets, and offering insights into future directions for brain tumor detection using deep learning. This review also incorporates an analysis of previous reviews and targets three main aspects: feature extraction, segmentation, and classification. The results reveal that research primarily focuses on Convolutional Neural Networks (CNNs) and their variants, with a strong emphasis on transfer learning using pre-trained models. Other methods, such as Generative Adversarial Networks (GANs) and Autoencoders, are used for feature extraction, while Recurrent Neural Networks (RNNs) are employed for time-sequence modeling. Some models integrate Internet of Things (IoT) frameworks or federated learning for real-time diagnostics and privacy, often paired with optimization algorithms. However, the adoption of eXplainable AI (XAI) remains limited, despite its importance in building trust in medical diagnostics. Finally, this review outlines future opportunities, focusing on image quality, underexplored deep learning techniques, expanding datasets, and exploring deeper learning representations and model behavior, such as recurrent expansion, to advance medical imaging diagnostics. Full article
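The transfer-learning pattern that the review identifies as dominant can be summarized in a few lines. The following PyTorch sketch freezes an ImageNet-pretrained backbone and retrains only the classification head; the four-class head and all hyperparameters are illustrative assumptions, not values from any reviewed paper.

```python
import torch
import torch.nn as nn
from torchvision import models

# Reuse ImageNet-pretrained CNN features; retrain only the classifier head.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in model.parameters():
    p.requires_grad = False                    # freeze the pretrained backbone
model.fc = nn.Linear(model.fc.in_features, 4)  # illustrative 4-class tumor head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """One optimization step on a batch of MRI slices shaped (N, 3, 224, 224)."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```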

28 pages, 22965 KiB  
Review
Benchmarking of Multispectral Pansharpening: Reproducibility, Assessment, and Meta-Analysis
by Luciano Alparone and Andrea Garzelli
J. Imaging 2025, 11(1), 1; https://doi.org/10.3390/jimaging11010001 - 24 Dec 2024
Viewed by 469
Abstract
The term pansharpening denotes the process by which the geometric resolution of a multiband image is increased by means of a co-registered broadband panchromatic observation of the same scene at greater spatial resolution. Over time, the benchmarking of pansharpening methods has proven to be [...] Read more.
The term pansharpening denotes the process by which the geometric resolution of a multiband image is increased by means of a co-registered broadband panchromatic observation of the same scene at greater spatial resolution. Over time, the benchmarking of pansharpening methods has proven to be more challenging than the development of new methods, and the recent proliferation of methods in the literature is mostly due to the lack of a standardized assessment. In this paper, we draw up guidelines for the correct and fair comparative evaluation of pansharpening methods, focusing on the reproducibility of results and resorting to concepts of meta-analysis. As a major outcome of this study, an improved version of the additive wavelet luminance proportional (AWLP) pansharpening algorithm offers all of the favorable characteristics of an ideal benchmark, namely, performance, speed, absence of adjustable running parameters, reproducibility of results with varying datasets and landscapes, and automatic correction of the path radiance term introduced by the atmosphere. The proposed benchmarking protocol employs this haze-corrected AWLP-H and exploits meta-analysis for cross-comparisons among different experiments. After assessment on five different datasets, it was found to provide reliable and consistent results in ranking different fusion methods. Full article
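To make the idea behind the AWLP benchmark concrete, the sketch below shows proportional detail injection in its simplest form. It approximates the à trous wavelet lowpass of the original algorithm with a Gaussian filter, uses an equal-weight luminance, and omits the haze correction of AWLP-H, so it is an assumption-laden illustration of the principle rather than the authors' algorithm.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def awlp_sketch(ms: np.ndarray, pan: np.ndarray, ratio: int = 4) -> np.ndarray:
    """Illustrative AWLP-style fusion: inject panchromatic spatial detail into
    each multispectral band proportionally to the band's share of the luminance.
    'ms' is (bands, H, W), already upsampled to the pan grid; 'pan' is (H, W)."""
    luminance = ms.mean(axis=0) + 1e-9                 # equal-weight luminance (assumption)
    detail = pan - gaussian_filter(pan, sigma=ratio)   # high-frequency pan component
    return ms + (ms / luminance) * detail              # proportional detail injection
```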