New Advances and Applications in Image Processing and Computer Vision

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "Mathematics and Computer Science".

Deadline for manuscript submissions: closed (31 October 2024) | Viewed by 22323

Special Issue Editor


Dr. Samaneh Mazaheri
Guest Editor
Faculty of Business and Information Technology, Ontario Tech University, Oshawa, ON, Canada
Interests: artificial intelligence; computer vision; machine learning; deep learning; image processing; medical image processing; AI in healthcare

Special Issue Information

Dear Colleagues,

Computer vision adoption is growing steadily. Whether in video-detection systems for self-driving cars, 3D printing in manufacturing and healthcare, or advanced sensors in defense and logistics, the technology is playing its part. Additionally, there is a growing call to integrate theoretical research on image processing algorithms with the more applied research on image processing systems.

This Special Issue covers topics in image processing and computer vision, along with their applications in different domains, systems, and environments. Recent advances in image processing and computer vision have enabled researchers to accelerate the development of intelligent applications in many domains. These intelligent applications are deployed in real-world environments to collect data seamlessly and continuously and to perform image processing on the huge amounts of data gathered there. They are designed to adapt to unexpected conditions, which makes them highly useful in real-world settings. Vision is being used successfully in numerous challenging real-world applications, from specialized tasks such as image search and autonomous navigation to fun, consumer-level tasks applied to personal photos and videos.

This Special Issue aims to explore the variety of techniques used to analyze and interpret images, together with computer vision applications and advances, by providing a platform for researchers from both academia and industry to present their novel and unpublished work in the domains of computer vision and image processing. This will help to foster future research in the growing field of computer vision and its related areas.

Dr. Samaneh Mazaheri
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

Computer Vision:
  • 3D computer vision
  • adversarial attacks and robustness
  • bias, fairness and privacy
  • biometrics, face, gesture and pose
  • computational photography, image and video synthesis
  • image and video retrieval
  • learning and optimization for computer vision
  • low-level and physics-based vision
  • medical and biological imaging
  • motion and tracking
  • object detection and categorization
  • representation learning for vision
  • scene analysis and understanding
  • segmentation
  • video understanding and activity analysis
  • vision for robotics and autonomous driving
  • visual reasoning and symbolic representations
  • applications
Image Processing:
  • image sensing, modeling and representation
  • statistical modeling and estimation
  • image models (structure based, morphological, graph based)
  • image processing methods (linear and non-linear filtering, transforms, wavelets, etc.)
  • inverse imaging, compressive sensing
  • image acquisition, denoising, deblurring, reconstruction
  • machine learning and image processing algorithms
  • image segmentation and representation
  • image and video retrieval
  • image processing and understanding
  • knowledge representation and high-level vision
  • graph theoretic methods
  • animation, movies, advertising, video games
  • real-time image processing and learning
  • neural network applications
  • deep learning
  • human-centric self-supervised learning
  • machine vision
  • computer-generated imagery
  • biomedical image processing
  • biomedical and health informatics

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies is available on the MDPI website.

Published Papers (16 papers)

Research

Jump to: Review

17 pages, 396 KiB  
Article
Robust Classification via Finite Mixtures of Matrix Variate Skew-t Distributions
by Abbas Mahdavi, Narayanaswamy Balakrishnan and Ahad Jamalizadeh
Mathematics 2024, 12(20), 3260; https://doi.org/10.3390/math12203260 - 17 Oct 2024
Viewed by 610
Abstract
Analysis of matrix variate data is becoming increasingly common in the literature, particularly in the field of clustering and classification. It is well known that real data, including real matrix variate data, often exhibit high levels of asymmetry. To address this issue, one common approach is to introduce a tail or skewness parameter to a symmetric distribution. In this regard, we introduce here a new distribution called the matrix variate skew-t distribution (MVST), which provides flexibility in terms of heavy tails and skewness. We then conduct a thorough investigation of various characterizations and probabilistic properties of the MVST distribution. We also explore extensions of this distribution to a finite mixture model. To estimate the parameters of the MVST distribution, we develop an EM-type algorithm that computes maximum likelihood (ML) estimates of the model parameters. To validate the effectiveness and usefulness of the developed models and associated methods, we perform empirical experiments using simulated data as well as three real data examples, including an application in skin cancer detection. Our results demonstrate the efficacy of the developed approach in handling asymmetric matrix variate data. Full article
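
The EM machinery behind such mixture models follows a standard pattern. As a minimal, generic sketch (assuming nothing about the MVST itself, whose component densities and M-step are considerably more involved), the E-step responsibility update for any finite mixture can be written as:

```python
import numpy as np
from scipy.special import logsumexp

def e_step(log_dens, log_weights):
    """E-step for a finite mixture, in log space for numerical stability.

    log_dens:    (n, K) array of log f_k(x_i), the component log-densities.
    log_weights: (K,) array of log mixing proportions.
    Returns (n, K) responsibilities and the observed-data log-likelihood.
    """
    weighted = log_dens + log_weights            # log pi_k + log f_k(x_i)
    log_norm = logsumexp(weighted, axis=1)       # normalizer per observation
    resp = np.exp(weighted - log_norm[:, None])  # posterior P(z_i = k | x_i)
    return resp, log_norm.sum()

def m_step_weights(resp):
    """M-step update of the mixing proportions (the simplest M-step piece)."""
    return resp.mean(axis=0)
```
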
(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

24 pages, 6380 KiB  
Article
Multi-Type Self-Attention-Based Convolutional-Neural-Network Post-Filtering for AV1 Codec
by Woowoen Gwun, Kiho Choi and Gwang Hoon Park
Mathematics 2024, 12(18), 2874; https://doi.org/10.3390/math12182874 - 15 Sep 2024
Viewed by 716
Abstract
Over the past few years, there has been substantial interest and research activity surrounding the application of Convolutional Neural Networks (CNNs) for post-filtering in video coding. Most current research efforts have focused on using CNNs with various kernel sizes for post-filtering, primarily concentrating on High-Efficiency Video Coding/H.265 (HEVC) and Versatile Video Coding/H.266 (VVC). This narrow focus has limited the exploration and application of these techniques to other video coding standards such as AV1, developed by the Alliance for Open Media. AV1 offers excellent compression efficiency, reducing bandwidth usage and improving video quality, which makes it highly attractive for modern streaming and media applications. This paper introduces a novel approach that extends beyond traditional CNN methods by integrating three different self-attention layers into the CNN framework. Applied to the AV1 codec, the proposed method significantly improves video quality by incorporating these distinct self-attention layers. This enhancement demonstrates the potential of self-attention mechanisms to advance post-filtering techniques in video coding beyond the limitations of convolution-based methods. The experimental results show that the proposed network achieves average BD-rate reductions of 10.40% for the luma component and 19.22% and 16.52% for the chroma components compared to the AV1 anchor. Visual quality assessments further validated the effectiveness of our approach, showcasing substantial artifact reduction and detail enhancement in videos. Full article
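
As a toy illustration of mixing convolution with self-attention in a residual post-filter, here is a hedged PyTorch sketch; the layer sizes, the single attention layer, and the residual formulation are illustrative assumptions, not the paper's three-layer architecture:

```python
import torch
import torch.nn as nn

class AttnPostFilter(nn.Module):
    """Toy post-filter: conv features, spatial self-attention, residual output."""
    def __init__(self, channels=32, heads=4):
        super().__init__()
        self.head = nn.Conv2d(3, channels, 3, padding=1)
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.tail = nn.Conv2d(channels, 3, 3, padding=1)

    def forward(self, x):
        f = torch.relu(self.head(x))            # (B, C, H, W) conv features
        b, c, h, w = f.shape
        seq = f.flatten(2).transpose(1, 2)      # (B, H*W, C): one token per pixel
        out, _ = self.attn(seq, seq, seq)       # self-attention across positions
        f = out.transpose(1, 2).reshape(b, c, h, w)
        return x + self.tail(f)                 # residual: predict the correction

y = AttnPostFilter()(torch.randn(1, 3, 64, 64))  # same shape as the input
```
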
(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

12 pages, 4607 KiB  
Article
Depth Prior-Guided 3D Voxel Feature Fusion for 3D Semantic Estimation from Monocular Videos
by Mingyun Wen and Kyungeun Cho
Mathematics 2024, 12(13), 2114; https://doi.org/10.3390/math12132114 - 5 Jul 2024
Viewed by 611
Abstract
Existing 3D semantic scene reconstruction methods utilize the same set of features extracted from deep learning networks for both 3D semantic estimation and geometry reconstruction, ignoring the differing requirements of semantic segmentation and geometry construction tasks. Additionally, current methods allocate 2D image features to all voxels along camera rays during the back-projection process, without accounting for empty or occluded voxels. To address these issues, we propose separating the features for 3D semantic estimation from those for 3D mesh reconstruction. We use a pretrained vision transformer network for image feature extraction and depth priors estimated by a pretrained multi-view stereo network to guide the allocation of image features within 3D voxels during the back-projection process. The back-projected image features are aggregated within each 3D voxel via averaging, creating coherent voxel features. The resulting 3D feature volume, composed of unified voxel feature vectors, is fed into a 3D CNN with a semantic classification head to produce a 3D semantic volume. This volume can be combined with existing 3D mesh reconstruction networks to produce a 3D semantic mesh. Experimental results on real-world datasets demonstrate that the proposed method significantly increases 3D semantic estimation accuracy. Full article
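
The depth-guided allocation step can be illustrated compactly. Below is a simplified single-view NumPy sketch under an assumed pinhole model; the tolerance `eps` and the hard masking rule are placeholders for the paper's allocation scheme:

```python
import numpy as np

def backproject_features(voxels, feats, depth, K, eps=0.3):
    """Assign 2D features to 3D voxels for one view, guided by a depth prior.

    voxels: (N, 3) voxel centers in camera coordinates.
    feats:  (H, W, C) image feature map.
    depth:  (H, W) depth prior (e.g., from a multi-view stereo network).
    K:      (3, 3) pinhole intrinsics.
    Voxels that project outside the image, or whose depth disagrees with the
    prior by more than eps, receive zeros (treated as empty or occluded).
    """
    H, W, C = feats.shape
    z = voxels[:, 2]
    safe_z = np.where(z > 0, z, 1.0)                  # avoid divide-by-zero
    uv = voxels @ K.T                                 # perspective projection
    u = np.round(uv[:, 0] / safe_z).astype(int)
    v = np.round(uv[:, 1] / safe_z).astype(int)
    ok = (z > 0) & (u >= 0) & (u < W) & (v >= 0) & (v < H)
    near = np.zeros_like(ok)
    near[ok] = np.abs(depth[v[ok], u[ok]] - z[ok]) < eps
    out = np.zeros((len(voxels), C), dtype=feats.dtype)
    out[near] = feats[v[near], u[near]]
    return out, near   # average `out` over views to get coherent voxel features
```
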
(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

23 pages, 23372 KiB  
Article
Retinex Jointed Multiscale CLAHE Model for HDR Image Tone Compression
by Yu-Joong Kim, Dong-Min Son and Sung-Hak Lee
Mathematics 2024, 12(10), 1541; https://doi.org/10.3390/math12101541 - 15 May 2024
Cited by 1 | Viewed by 1162
Abstract
Tone-mapping algorithms aim to compress a wide dynamic range image into a narrower dynamic range image suitable for display on imaging devices. A representative tone-mapping algorithm, Retinex theory, reflects color constancy based on the human visual system and performs dynamic range compression. However, it may induce halo artifacts in some areas or degrade chroma and detail. Thus, this paper proposes a Retinex-jointed multiscale contrast-limited adaptive histogram equalization method. The proposed algorithm reduces localized halo artifacts and detail loss while maintaining the tone-compression effect via high-scale Retinex processing. A performance comparison of the experimental results between the proposed and existing methods confirms that the proposed method effectively reduces these problems and delivers better image quality. Full article
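
As a rough illustration of combining Retinex-style dynamic range compression with CLAHE, consider the following OpenCV sketch. It is a simplified single-scale stand-in for the paper's multiscale model, and all constants (sigma, clip limit, tile grid) are illustrative assumptions:

```python
import cv2
import numpy as np

def retinex_clahe(bgr, sigma=80, clip=2.0, grid=(8, 8)):
    """Single-scale Retinex on luminance followed by CLAHE.

    Retinex compresses the global dynamic range; CLAHE restores local contrast.
    """
    lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
    L = lab[:, :, 0].astype(np.float32) + 1.0
    illum = cv2.GaussianBlur(L, (0, 0), sigma)       # estimated illumination
    ssr = np.log(L) - np.log(illum)                  # reflectance, log domain
    ssr = cv2.normalize(ssr, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    clahe = cv2.createCLAHE(clipLimit=clip, tileGridSize=grid)
    lab[:, :, 0] = clahe.apply(ssr)                  # local contrast recovery
    return cv2.cvtColor(lab, cv2.COLOR_LAB2BGR)
```
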
(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

29 pages, 21511 KiB  
Article
Enhancing Surveillance Vision with Multi-Layer Deep Learning Representation
by Dong-Min Son and Sung-Hak Lee
Mathematics 2024, 12(9), 1313; https://doi.org/10.3390/math12091313 - 25 Apr 2024
Viewed by 688
Abstract
This paper aimed to develop a method for generating sand–dust-removed and dehazed images utilizing CycleGAN, facilitating object identification on roads under adverse weather conditions such as heavy dust or haze, which severely impair visibility. Initially, the study addressed the scarcity of paired image sets for training by employing unpaired CycleGAN training. The CycleGAN training module incorporates hierarchical single-scale Retinex (SSR) images with varying sigma sizes, facilitating multiple-scale training. Refining the training data into detailed hierarchical layers for virtual paired training enhances the performance of CycleGAN training. Conventional sand–dust removal or dehazing algorithms, alongside deep learning methods, encounter challenges in simultaneously addressing sand–dust removal and dehazing with a single algorithm. Such algorithms often necessitate resetting hyperparameters to process images from both scenarios. To overcome this limitation, we proposed a unified approach for removing sand–dust and haze phenomena using a single model, leveraging images processed hierarchically with SSR. The image quality and sharpness metrics used were BRISQUE, PIQE, CEIQ, MCMA, LPC-SI, and S3. In sand–dust environments, the proposed method achieved the best scores, with an average of 21.52 in BRISQUE, 0.724 in MCMA, and 0.968 in LPC-SI compared to conventional methods. For haze images, the proposed method outperformed conventional methods with an average of 3.458 in CEIQ, 0.967 in LPC-SI, and 0.243 in S3. The images generated via this proposed method demonstrated superior performance in image quality and sharpness evaluation compared to conventional algorithms. The outcomes of this study hold particular relevance for camera images utilized in automobiles, especially in the context of self-driving cars or CCTV surveillance systems. Full article
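
The hierarchical SSR decomposition mentioned above can be sketched in a few lines; the sigma values are illustrative assumptions, and the CycleGAN training stage itself is omitted:

```python
import cv2
import numpy as np

def ssr(gray, sigma):
    """Single-scale Retinex: log image minus log of Gaussian-blurred illumination."""
    g = gray.astype(np.float32) + 1.0
    return np.log(g) - np.log(cv2.GaussianBlur(g, (0, 0), sigma))

def hierarchical_layers(gray, sigmas=(15, 80, 250)):
    """Decompose an image into SSR layers of increasing sigma.

    Small sigmas keep fine detail, large sigmas keep global tone; feeding
    such layers to an unpaired translation model is the general idea here.
    """
    return [ssr(gray, s) for s in sigmas]
```
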
(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

22 pages, 3459 KiB  
Article
MDER-Net: A Multi-Scale Detail-Enhanced Reverse Attention Network for Semantic Segmentation of Bladder Tumors in Cystoscopy Images
by Chao Nie, Chao Xu and Zhengping Li
Mathematics 2024, 12(9), 1281; https://doi.org/10.3390/math12091281 - 24 Apr 2024
Cited by 1 | Viewed by 1014
Abstract
White light cystoscopy is the gold standard for the diagnosis of bladder cancer. Automatic and accurate tumor detection is essential to improve the surgical resection of bladder cancer and reduce tumor recurrence. At present, Transformer-based medical image segmentation algorithms face challenges in restoring fine-grained detail information and local boundary information of features and have limited adaptability to multi-scale features of lesions. To address these issues, we propose a new multi-scale detail-enhanced reverse attention network, MDER-Net, for accurate and robust bladder tumor segmentation. Firstly, we propose a new multi-scale efficient channel attention module (MECA) to process four different levels of features extracted by the PVT v2 encoder to adapt to the multi-scale changes in bladder tumors; secondly, we use the dense aggregation module (DA) to aggregate multi-scale advanced semantic feature information; then, the similarity aggregation module (SAM) is used to fuse multi-scale high-level and low-level features, complementing each other in position and detail information; finally, we propose a new detail-enhanced reverse attention module (DERA) to capture non-salient boundary features and gradually explore supplementing tumor boundary feature information and fine-grained detail information; in addition, we propose a new efficient channel space attention module (ECSA) that enhances local context and improves segmentation performance by suppressing redundant information in low-level features. Extensive experiments on the bladder tumor dataset BtAMU, established in this article, and five publicly available polyp datasets show that MDER-Net outperforms eight state-of-the-art (SOTA) methods in terms of effectiveness, robustness, and generalization ability. Full article
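
As background on the family of channel attention blocks named above (MECA, ECSA), here is a generic efficient-channel-attention-style block in PyTorch; the paper's modules are more elaborate, so treat this only as a sketch of the underlying idea:

```python
import torch
import torch.nn as nn

class ECABlock(nn.Module):
    """Efficient channel attention: a 1D conv over the pooled channel
    descriptor replaces a fully connected bottleneck."""
    def __init__(self, k=3):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)

    def forward(self, x):                      # x: (B, C, H, W)
        w = x.mean(dim=(2, 3))                 # global average pool -> (B, C)
        w = self.conv(w.unsqueeze(1))          # local cross-channel interaction
        w = torch.sigmoid(w).squeeze(1)        # per-channel gate in (0, 1)
        return x * w[:, :, None, None]
```
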
(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

10 pages, 318 KiB  
Article
Tiling Rectangles and the Plane Using Squares of Integral Sides
by Bahram Sadeghi Bigham, Mansoor Davoodi Monfared, Samaneh Mazaheri and Jalal Kheyrabadi
Mathematics 2024, 12(7), 1027; https://doi.org/10.3390/math12071027 - 29 Mar 2024
Viewed by 813
Abstract
We study the problem of perfect tiling in the plane and explore the possibility of tiling a rectangle using distinct integral squares. Assume a set of distinguishable squares (or, equivalently, a set of distinct natural numbers) is given, and one has to decide whether or not it can tile the plane or a rectangle. Previously, it has been proved that tiling the plane is not feasible using a set of odd numbers or an infinite sequence of natural numbers including exactly two odd numbers. The problem remains open for situations in which the number of odd numbers is arbitrary. In addition to providing a solution to this special case, we discuss some open problems on tiling the plane and rectangles. Full article
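
For intuition, deciding whether a given set of distinct integral squares tiles a given rectangle can be attacked by exact backtracking, as in the sketch below (exponential in general, fine for small instances). The example is Moroń's classic 33 × 32 rectangle tiled by nine distinct squares:

```python
def tile_rectangle(w, h, squares):
    """Return square placements (x, y, size) tiling a w x h rectangle, or None."""
    if sum(s * s for s in squares) != w * h:
        return None                               # area must match exactly
    grid = [[0] * w for _ in range(h)]
    placed = []

    def first_empty():
        for y in range(h):
            for x in range(w):
                if not grid[y][x]:
                    return x, y
        return None

    def solve(remaining):
        cell = first_empty()
        if cell is None:
            return not remaining                  # full grid, nothing left over
        x, y = cell
        for s in sorted(remaining, reverse=True): # try each square at the gap
            if x + s <= w and y + s <= h and all(
                not grid[y + dy][x + dx] for dy in range(s) for dx in range(s)
            ):
                for dy in range(s):
                    for dx in range(s):
                        grid[y + dy][x + dx] = s
                placed.append((x, y, s))
                if solve(remaining - {s}):
                    return True
                placed.pop()                      # backtrack
                for dy in range(s):
                    for dx in range(s):
                        grid[y + dy][x + dx] = 0
        return False

    return placed if solve(set(squares)) else None

# Moroń's squared rectangle: nine distinct squares tiling 33 x 32.
print(tile_rectangle(33, 32, [1, 4, 7, 8, 9, 10, 14, 15, 18]))
```
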
(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

27 pages, 5593 KiB  
Article
Ear-Touch-Based Mobile User Authentication
by Jalil Nourmohammadi Khiarak, Samaneh Mazaheri and Rohollah Moosavi Tayebi
Mathematics 2024, 12(5), 752; https://doi.org/10.3390/math12050752 - 2 Mar 2024
Viewed by 1361
Abstract
Mobile devices have become integral to daily life, necessitating robust user authentication methods to safeguard personal information. In this study, we present a new approach to mobile user authentication utilizing ear-touch interactions. Our novel system employs an analytical algorithm to authenticate users based on features extracted from ear-touch images. We conducted extensive evaluations on a dataset comprising ear-touch images from 92 subjects, achieving an average equal error rate of 0.04, indicative of high accuracy and reliability. Our results suggest that ear-touch-based authentication is a feasible and effective method for securing mobile devices. Full article
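
The equal error rate reported above is the standard operating point where false accepts equal false rejects; a minimal sketch of estimating it from genuine and impostor score samples (synthetic scores stand in for real ear-touch matching scores):

```python
import numpy as np

def equal_error_rate(genuine, impostor):
    """EER from similarity scores: sweep thresholds until the false accept
    rate (impostors passing) meets the false reject rate (genuine failing)."""
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    far = np.array([(impostor >= t).mean() for t in thresholds])
    frr = np.array([(genuine < t).mean() for t in thresholds])
    i = np.argmin(np.abs(far - frr))
    return (far[i] + frr[i]) / 2.0

rng = np.random.default_rng(0)
print(equal_error_rate(rng.normal(1.0, 0.3, 500), rng.normal(0.0, 0.3, 500)))
```
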
(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

19 pages, 9187 KiB  
Article
Light “You Only Look Once”: An Improved Lightweight Vehicle-Detection Model for Intelligent Vehicles under Dark Conditions
by Tianrui Yin, Wei Chen, Bo Liu, Changzhen Li and Luyao Du
Mathematics 2024, 12(1), 124; https://doi.org/10.3390/math12010124 - 29 Dec 2023
Cited by 3 | Viewed by 1432
Abstract
Vehicle detection is crucial for traffic surveillance and assisted driving. To overcome the loss of efficiency, accuracy, and stability in low-light conditions, we propose a lightweight “You Only Look Once” (YOLO) detection model. A polarized self-attention-enhanced aggregation feature pyramid network is used to improve feature extraction and fusion in low-light scenarios, and enhanced “Swift” spatial pyramid pooling is used to reduce model parameters and enhance real-time nighttime detection. To address imbalanced low-light samples, we integrate an anchor mechanism with a focal loss to improve network stability and accuracy. Ablation experiments show the superior accuracy and real-time performance of our Light-YOLO model. Compared with EfficientNetv2-YOLOv5, Light-YOLO boosts mAP@0.5 and mAP@0.5:0.95 by 4.03% and 2.36%, respectively, cuts parameters by 44.37%, and increases recognition speed by 20.42%. Light-YOLO competes effectively with advanced lightweight networks and offers a solution for efficient nighttime vehicle detection. Full article
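
The focal loss mentioned above is the standard remedy for such imbalance; here is a minimal PyTorch sketch of the usual binary formulation (Lin et al.), not the paper's exact anchor-integrated variant:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """Binary focal loss: down-weights easy examples so the scarce
    low-light positives dominate the gradient. targets in {0, 1}."""
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)       # prob of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * bce).mean()

logits = torch.randn(8, 100)
targets = (torch.rand(8, 100) < 0.1).float()          # imbalanced labels
print(focal_loss(logits, targets))
```
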
(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

16 pages, 7068 KiB  
Article
Enhancing Focus Volume through Perceptual Focus Factor in Shape-from-Focus
by Khurram Ashfaq and Muhammad Tariq Mahmood
Mathematics 2024, 12(1), 102; https://doi.org/10.3390/math12010102 - 27 Dec 2023
Cited by 1 | Viewed by 1044
Abstract
Shape From Focus (SFF) reconstructs a scene’s shape using a series of images with varied focus settings. However, the effectiveness of SFF largely depends on the Focus Measure (FM) used, which is prone to noise-induced inaccuracies in focus values. To address these issues, we introduce a perception-influenced factor to refine the traditional Focus Volume (FV) derived from a traditional FM. Owing to the strong relationship between the Difference of Gaussians (DoG) and how the visual system perceives edges in a scene, we apply it to local areas of the image sequence by segmenting the image sequence into non-overlapping blocks. This process yields a new metric, the Perceptual Focus Factor (PFF), which we combine with the traditional FV to obtain an enhanced FV and, ultimately, an enhanced depth map. Intensive experiments are conducted using fourteen synthetic and six real-world datasets. The performance of the proposed method is evaluated using quantitative measures such as Root Mean Square Error (RMSE) and correlation. For the fourteen synthetic datasets, an average RMSE of 6.88 and a correlation of 0.65 are obtained, improved through the PFF from an RMSE of 7.44 and a correlation of 0.56, respectively. Experimental results and comparative analysis demonstrate that the proposed approach outperforms traditional state-of-the-art FMs in extracting depth maps. Full article
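
A simplified reading of the PFF idea, block-wise Difference-of-Gaussians energy, can be sketched as follows; the block size and sigmas are illustrative assumptions, and the combination with the traditional focus volume is only indicated:

```python
import cv2
import numpy as np

def perceptual_focus_factor(img, block=16, s1=1.0, s2=2.0):
    """Block-wise DoG energy as a perceptual focus cue for a 2D grayscale frame."""
    g = img.astype(np.float32)
    dog = cv2.GaussianBlur(g, (0, 0), s1) - cv2.GaussianBlur(g, (0, 0), s2)
    h, w = dog.shape
    h, w = h - h % block, w - w % block              # crop to whole blocks
    tiles = dog[:h, :w].reshape(h // block, block, w // block, block)
    return (tiles ** 2).mean(axis=(1, 3))            # one score per block

# One plausible combination with a traditional focus volume slice `fv`:
#   fv_enhanced = fv[:h, :w] * np.repeat(np.repeat(pff, block, 0), block, 1)
```
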
(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

34 pages, 9382 KiB  
Article
LRFID-Net: A Local-Region-Based Fake-Iris Detection Network for Fake Iris Images Synthesized by a Generative Adversarial Network
by Jung Soo Kim, Young Won Lee, Jin Seong Hong, Seung Gu Kim, Ganbayar Batchuluun and Kang Ryoung Park
Mathematics 2023, 11(19), 4160; https://doi.org/10.3390/math11194160 - 3 Oct 2023
Viewed by 1662
Abstract
Iris recognition is a biometric method that uses the pattern of the iris, seated between the pupil and the sclera, for recognizing people. It is widely applied in various fields owing to its high recognition accuracy and high security. A spoof detection method for discriminating spoof attacks is essential in biometric recognition systems that include iris recognition. However, previous studies have mainly investigated spoofing attack detection based on printed or photographed images, video replaying, artificial eyes, and patterned contact lenses fabricated using iris images obtained through information leakage. There have been only a few studies on spoof attack detection using iris images generated through a generative adversarial network (GAN), a method that has drawn considerable research interest with the recent development of deep learning, and the improvements in spoof detection accuracy achieved by previously proposed methods are limited. Motivated by this problem, we investigated the possibility of attacking a conventional iris recognition system with spoofed iris images generated using cycle-consistent adversarial networks (CycleGAN). In addition, a local-region-based fake-iris detection network (LRFID-Net) was developed. It provides a novel method for discriminating fake iris images by segmenting the iris image into three regions based on the iris region. Experimental results using two open databases, the Warsaw LiveDet-Iris-2017 and the Notre Dame Contact Lens Detection LiveDet-Iris-2017 datasets, showed that the average classification error rate of spoof detection by the proposed method was 0.03% for the Warsaw dataset and 0.11% for the Notre Dame Contact Lens Detection dataset. The results confirmed that the proposed method outperformed the state-of-the-art methods. Full article
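
The local-region idea can be illustrated with a toy concentric split of an eye image; how LRFID-Net actually defines its three regions follows the paper, and the center and radii here are assumed inputs:

```python
import numpy as np

def iris_regions(gray, cx, cy, r_pupil, r_iris):
    """Split a grayscale eye image into three concentric regions."""
    h, w = gray.shape
    yy, xx = np.mgrid[0:h, 0:w]
    r = np.hypot(xx - cx, yy - cy)
    inner = gray * (r < r_pupil)                      # pupil / inner boundary
    middle = gray * ((r >= r_pupil) & (r < r_iris))   # iris texture band
    outer = gray * (r >= r_iris)                      # sclera / outer context
    return inner, middle, outer
```
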
(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

21 pages, 6458 KiB  
Article
CAGNet: A Multi-Scale Convolutional Attention Method for Glass Detection Based on Transformer
by Xiaohang Hu, Rui Gao, Seungjun Yang and Kyungeun Cho
Mathematics 2023, 11(19), 4084; https://doi.org/10.3390/math11194084 - 26 Sep 2023
Cited by 2 | Viewed by 1099
Abstract
Glass plays a vital role in several fields, making its accurate detection crucial. Proper detection prevents misjudgments, reduces noise from reflections, and ensures optimal performance in other computer vision tasks. However, the prevalent use of glass in daily applications poses unique challenges for computer vision. This study introduces a novel convolutional attention glass segmentation network (CAGNet) predicated on a transformer architecture customized for image glass detection. Building on our prior study, CAGNet minimizes the number of training cycles and iterations, resulting in enhanced performance and efficiency. CAGNet is built upon the strategic design and integration of two types of convolutional attention mechanisms coupled with a transformer head applied for comprehensive feature analysis and fusion. To further augment segmentation precision, the network incorporates a custom edge-weighting scheme to optimize glass detection within images. Comparative studies and rigorous testing demonstrate that CAGNet outperforms several leading methodologies in glass detection, exhibiting robustness across a diverse range of conditions. Specifically, the IOU metric improves by 0.26% compared to that in our previous study and presents a 0.92% enhancement over those of other state-of-the-art methods. Full article
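
For reference, the IOU metric cited above can be computed for binary segmentation masks as follows:

```python
import numpy as np

def iou(pred, gt):
    """Intersection over union of two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    return np.logical_and(pred, gt).sum() / union if union else 1.0
```
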
(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

21 pages, 44259 KiB  
Article
Cervical Precancerous Lesion Image Enhancement Based on Retinex and Histogram Equalization
by Yuan Ren, Zhengping Li and Chao Xu
Mathematics 2023, 11(17), 3689; https://doi.org/10.3390/math11173689 - 28 Aug 2023
Viewed by 2127
Abstract
Cervical cancer is a prevalent chronic malignant tumor in gynecology, necessitating high-quality images of cervical precancerous lesions to enhance detection rates. Addressing the challenges of low contrast, uneven illumination, and indistinct lesion details in such images, this paper proposes an enhancement algorithm based on Retinex and histogram equalization. First, the algorithm solves the color deviation problem by modifying the quantization formula of Retinex theory. Then, the contrast-limited adaptive histogram equalization algorithm is selectively conducted on the blue and green channels to avoid the reduction in visual quality caused by drastic darkening of local dark areas. Next, a multi-scale detail enhancement algorithm is used to further sharpen the details. Finally, the problem of noise amplification and image distortion during enhancement is alleviated by dynamic weighted fusion. The experimental results confirm the effectiveness of the proposed algorithm in optimizing brightness, enhancing contrast, sharpening details, and suppressing noise in cervical precancerous lesion images. The proposed algorithm shows superior performance compared to other traditional methods on objective indicators such as peak signal-to-noise ratio, detail variance–background variance, gray square mean deviation, contrast improvement index, and enhancement quality index. Full article
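
The selective equalization step can be sketched with OpenCV as follows; the clip limit and tile grid are illustrative defaults, and the other pipeline stages (Retinex correction, multi-scale detail enhancement, weighted fusion) are omitted:

```python
import cv2

def enhance_bg_channels(bgr, clip=2.0, grid=(8, 8)):
    """Apply CLAHE selectively to the blue and green channels,
    leaving the red channel (dominant in cervical images) untouched."""
    b, g, r = cv2.split(bgr)
    clahe = cv2.createCLAHE(clipLimit=clip, tileGridSize=grid)
    return cv2.merge([clahe.apply(b), clahe.apply(g), r])
```
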
(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

12 pages, 20257 KiB  
Article
Directional Ring Difference Filter for Robust Shape-from-Focus
by Khurram Ashfaq and Muhammad Tariq Mahmood
Mathematics 2023, 11(14), 3056; https://doi.org/10.3390/math11143056 - 11 Jul 2023
Cited by 2 | Viewed by 1353
Abstract
In the shape-from-focus (SFF) method, the quality of the 3D shape generated relies heavily on the focus measure operator (FM) used. Unfortunately, most FMs are sensitive to noise and provide inaccurate depth maps. Among recent FMs, the ring difference filter (RDF) has demonstrated excellent robustness against noise and reasonable performance in computing accurate depth maps. However, it also suffers from the response cancellation problem (RCP) encountered in multidimensional kernel-based FMs. To address this issue, we propose an effective and robust FM called the directional ring difference filter (DRDF). In DRDF, the focus quality is computed by aggregating responses of RDF from multiple kernels in different directions. We conducted experiments using synthetic and real image datasets and found that the proposed DRDF method outperforms traditional FMs in terms of noise handling and producing a higher quality 3D shape estimate of the object. Full article
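
One illustrative way to realize "RDF responses from multiple kernels in different directions" is sketched below; the kernel geometry and aggregation rule are assumptions for illustration, not the paper's exact DRDF definition:

```python
import cv2
import numpy as np

def directional_ring_kernels(radius=2, n_dirs=4):
    """Half-ring kernels: center pixel minus the mean of one ring sector."""
    size = 2 * radius + 1
    yy, xx = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    ring = np.hypot(xx, yy) >= radius - 0.5
    angles = np.mod(np.arctan2(yy, xx), 2 * np.pi)
    step = 2 * np.pi / n_dirs
    kernels = []
    for d in range(n_dirs):
        sector = ring & (angles >= d * step) & (angles < (d + 1) * step)
        k = np.zeros((size, size), np.float32)
        k[sector] = -1.0 / sector.sum()          # ring-sector mean (negative)
        k[radius, radius] = 1.0                  # minus the center value
        kernels.append(k)
    return kernels

def drdf_response(gray, radius=2, n_dirs=4):
    """Aggregate absolute responses over directions so they cannot cancel."""
    g = gray.astype(np.float32)
    return sum(np.abs(cv2.filter2D(g, -1, k))
               for k in directional_ring_kernels(radius, n_dirs))
```
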
(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

15 pages, 1329 KiB  
Article
Optimal Robot Pose Estimation Using Scan Matching by Turning Function
by Bahram Sadeghi Bigham, Omid Abbaszadeh, Mazyar Zahedi-Seresht, Shahrzad Khosravi and Elham Zarezadeh
Mathematics 2023, 11(6), 1449; https://doi.org/10.3390/math11061449 - 16 Mar 2023
Viewed by 1720
Abstract
The turning function is a tool in image processing that measures the difference between two polygonal shapes. We propose a localization algorithm for the optimal pose estimation of autonomous mobile robots using the scan-matching method based on the turning function algorithm. Several methodologies aim to move robots correctly and carry out their missions well, which requires the integration of localization and control. In the proposed method, the localization problem is formulated as an optimization problem. The turning function algorithm and the simplex method are then applied to estimate the location and orientation of the robot. The proposed algorithm first receives the polygons extracted from two sensor scans and then allocates a histogram to each scan. The algorithm maximizes the similarity of the two histograms by converting them to a unified coordinate system, and in this way estimates the difference between the two poses. In more detail, the main objective of this study is to provide an algorithm that reduces errors in the localization and orientation of mobile robots. Experimental results on simulated and real datasets show that the proposed algorithm achieves better results in terms of both position and orientation metrics. Full article
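
A compact sketch of the turning function and a naive distance between two polygons follows; a full matching would additionally minimize over rotation and starting point, which is omitted here:

```python
import numpy as np

def turning_function(poly, samples=256):
    """Sample the turning function (cumulative tangent angle vs. normalized
    arc length) of a closed polygon given as an (n, 2) array of vertices."""
    edges = np.roll(poly, -1, axis=0) - poly
    lengths = np.hypot(edges[:, 0], edges[:, 1])
    arc = np.cumsum(lengths) / lengths.sum()          # step right-endpoints
    headings = np.arctan2(edges[:, 1], edges[:, 0])
    turns = np.diff(headings)
    turns = (turns + np.pi) % (2 * np.pi) - np.pi     # wrap turns to [-pi, pi)
    theta = headings[0] + np.concatenate([[0.0], np.cumsum(turns)])
    s = (np.arange(samples) + 0.5) / samples          # sample points in (0, 1)
    return theta[np.searchsorted(arc, s)]

def turning_distance(p1, p2):
    """Naive L2 distance between two turning functions."""
    t1, t2 = turning_function(p1), turning_function(p2)
    return np.sqrt(np.mean((t1 - t2) ** 2))

square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], float)
wide = np.array([[0, 0], [2, 0], [2, 1], [0, 1]], float)
print(turning_distance(square, wide))  # small: same turns, different arc split
```
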
(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

Review

Jump to: Research

26 pages, 5886 KiB  
Review
Progress in Blind Image Quality Assessment: A Brief Review
by Pei Yang, Jordan Sturtz and Letu Qingge
Mathematics 2023, 11(12), 2766; https://doi.org/10.3390/math11122766 - 19 Jun 2023
Cited by 5 | Viewed by 2369
Abstract
As a fundamental research problem, blind image quality assessment (BIQA) has attracted increasing interest in recent years. Although great progress has been made, BIQA still remains a challenge. To better understand the research progress and challenges in this field, we review BIQA methods in this paper. First, we introduce the BIQA problem definition and related methods. Second, we provide a detailed review of the existing BIQA methods in terms of representative hand-crafted features, learning-based features and quality regressors for two-stage methods, as well as one-stage DNN models with various architectures. Moreover, we also present and analyze the performance of competing BIQA methods on six public IQA datasets. Finally, we conclude our paper with possible future research directions based on a performance analysis of the BIQA methods. This review will provide valuable references for researchers interested in the BIQA problem. Full article
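
BIQA methods on such datasets are conventionally scored by correlation with human mean opinion scores; a minimal evaluation sketch with synthetic stand-in data:

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

def evaluate_biqa(predicted, mos):
    """PLCC measures prediction accuracy; SROCC measures monotonic consistency
    between predicted quality scores and mean opinion scores (MOS)."""
    plcc, _ = pearsonr(predicted, mos)
    srocc, _ = spearmanr(predicted, mos)
    return plcc, srocc

rng = np.random.default_rng(1)
mos = rng.uniform(1, 5, 200)              # hypothetical subjective scores
pred = mos + rng.normal(0, 0.4, 200)      # a noisy stand-in BIQA predictor
print(evaluate_biqa(pred, mos))
```
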
(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)
