Artificial Intelligence and Machine Learning in Sensing and Image Processing

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Sensing and Imaging".

Deadline for manuscript submissions: closed (20 September 2023) | Viewed by 28814

Special Issue Editors


Guest Editor
Institute of Information Science and Engineering, Huaqiao University, Quanzhou 362021, China
Interests: control theory and control engineering

Guest Editor
School of Information Engineering, Shenzhen University, Shenzhen 518052, China
Interests: computer vision and machine learning; image/video processing

Special Issue Information

Dear Colleagues,

Visual data are growing in volume faster than ever before. Artificial intelligence and machine learning technologies are turning massive amounts of redundant visual data into valuable information with unprecedented capabilities. Image sensors have enabled smartphone, automotive, computing, industrial and IoT devices to redefine the way they compress, recover, analyze, search, encrypt and assess videos and images. Integrated image sensors, with their built-in processing power and memory, allow machine- and human-vision applications to run faster and to be more energy efficient and more secure, without sending any data to remote servers.

This Special Issue presents a forum for the publication of articles describing the use of classical and modern artificial intelligence methods in image processing applications.

The purpose of this special collection is to present state-of-the-art developments in image processing combined with artificial intelligence and machine learning for the vision tasks in the areas mentioned above. This Special Issue aims to provide a useful collection for researchers in both academia and industry. We are pleased to invite our colleagues to contribute original research papers as well as review papers that focus on the applications of artificial intelligence methods, including traditional machine learning methods, advanced deep learning approaches, metaheuristic optimization algorithms and other AI-based methods for solving image processing problems.

Topics include but are not limited to:

  • Machine learning-based image compression methods;
  • Machine learning-based image restoration methods;
  • Machine learning-based image classification/recognition/segmentation methods;
  • Machine learning-based image retrieval methods;
  • Machine learning-based image forensics methods;
  • Machine learning-based image quality assessment methods;
  • Machine learning for image processing;
  • Metaheuristic optimization algorithms for image processing;
  • Remote sensing image classification;
  • Medical image classification;
  • Neural computing for image processing;
  • Evolutionary algorithms for image processing.

Dr. Jing Chen
Dr. Miaohui Wang
Prof. Dr. Chih-Hsien Hsia
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (13 papers)

Research

23 pages, 7096 KiB  
Article
Resampling-Detection-Network-Based Robust Image Watermarking against Scaling and Cutting
by Hao-Lai Li, Xu-Qing Zhang, Zong-Hui Wang, Zhe-Ming Lu and Jia-Lin Cui
Sensors 2023, 23(19), 8195; https://doi.org/10.3390/s23198195 - 30 Sep 2023
Viewed by 1083
Abstract
Watermarking is an excellent solution to protect multimedia privacy but will be damaged by attacks such as noise adding, image filtering, compression, and especially scaling and cutting. In this paper, we propose a watermarking scheme to embed the watermark in the DWT-DCT composite transform coefficients, which is robust against normal image processing operations and geometric attacks. To make our scheme robust to scaling operations, a resampling detection network is trained to detect the scaling factor and then rescale the scaling-attacked image before watermark detection. To make our scheme robust to cutting operations, a template watermark is embedded in the Y channel to locate the cutting position. Experiments for various low- and high-resolution images reveal that our scheme has excellent performance in terms of imperceptibility and robustness. Full article

16 pages, 6429 KiB  
Article
Qualitative Classification of Proximal Femoral Bone Using Geometric Features and Texture Analysis in Collected MRI Images for Bone Density Evaluation
by Mojtaba Najafi, Tohid Yousefi Rezaii, Sebelan Danishvar and Seyed Naser Razavi
Sensors 2023, 23(17), 7612; https://doi.org/10.3390/s23177612 - 2 Sep 2023
Viewed by 1550
Abstract
The aim of this study was to use geometric features and texture analysis to discriminate between healthy and unhealthy femurs and to identify the most influential features. We scanned proximal femoral bone (PFB) of 284 Iranian cases (21 to 83 years old) using different dual-energy X-ray absorptiometry (DEXA) scanners and magnetic resonance imaging (MRI) machines. Subjects were labeled as “healthy” (T-score > −0.9) and “unhealthy” based on the results of DEXA scans. Based on the geometry and texture of the PFB in MRI, 204 features were retrieved. We used support vector machine (SVM) with different kernels, decision tree, and logistic regression algorithms as classifiers and the Genetic algorithm (GA) to select the best set of features and to maximize accuracy. There were 185 participants classified as healthy and 99 as unhealthy. The SVM with radial basis function kernels had the best performance (89.08%) and the most influential features were geometrical ones. Even though our findings show the high performance of this model, further investigation with more subjects is suggested. To our knowledge, this is the first study that investigates qualitative classification of PFBs based on MRI with reference to DEXA scans using machine learning methods and the GA. Full article

21 pages, 8948 KiB  
Article
Automatic Liver Tumor Segmentation from CT Images Using Graph Convolutional Network
by Maryam Khoshkhabar, Saeed Meshgini, Reza Afrouzian and Sebelan Danishvar
Sensors 2023, 23(17), 7561; https://doi.org/10.3390/s23177561 - 1 Sep 2023
Cited by 10 | Viewed by 3362
Abstract
Segmenting the liver and liver tumors in computed tomography (CT) images is an important step toward quantifiable biomarkers for a computer-aided decision-making system and precise medical diagnosis. Radiologists and specialized physicians use CT images to diagnose and classify liver organs and tumors. Because these organs have similar characteristics in form, texture, and light intensity values, other internal organs such as the heart, spleen, stomach, and kidneys confuse visual recognition of the liver and tumor division. Furthermore, visual identification of liver tumors is time-consuming, complicated, and error-prone, and incorrect diagnosis and segmentation can hurt the patient’s life. Many automatic and semi-automatic methods based on machine learning algorithms have recently been suggested for liver organ recognition and tumor segmentation. However, there are still difficulties due to poor recognition precision and speed and a lack of dependability. This paper presents a novel deep learning-based technique for segmenting liver tumors and identifying liver organs in computed tomography maps. Based on the LiTS17 database, the suggested technique comprises four Chebyshev graph convolution layers and a fully connected layer that can accurately segment the liver and liver tumors. Thus, the accuracy, Dice coefficient, mean IoU, sensitivity, precision, and recall obtained based on the proposed method according to the LiTS17 dataset are around 99.1%, 91.1%, 90.8%, 99.4%, 99.4%, and 91.2%, respectively. In addition, the effectiveness of the proposed method was evaluated in a noisy environment, and the proposed network could withstand a wide range of environmental signal-to-noise ratios (SNRs). Thus, at SNR = −4 dB, the accuracy of the proposed method for liver organ segmentation remained around 90%. The proposed model has obtained satisfactory and favorable results compared to previous research. 
According to the positive results, the proposed model is expected to be used to assist radiologists and specialist doctors in the near future. Full article
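
The Dice coefficient and IoU figures quoted in the abstract above are standard overlap metrics for segmentation masks. As a minimal sketch of how such metrics are usually computed (not the authors' evaluation code; the function name `dice_and_iou` and the flat 0/1 mask representation are assumptions made for illustration):

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two binary masks given as flat sequences of 0/1."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels labeled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_area = sum(1 for p in pred if p)
    target_area = sum(1 for t in target if t)
    union = pred_area + target_area - intersection
    if union == 0:
        # Both masks are empty; treating this as perfect agreement is a convention.
        return 1.0, 1.0
    dice = 2.0 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


# Hypothetical 6-pixel masks, for illustration only.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
dice, iou = dice_and_iou(pred, target)
```

For a single pair of masks the two metrics are related by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU. Scores averaged over a dataset, such as a mean IoU, need not satisfy that identity exactly.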

15 pages, 14997 KiB  
Article
Data Augmentation of X-ray Images for Automatic Cargo Inspection of Nuclear Items
by Haneol Jang, Chansuh Lee, Hansol Ko and KyungTae Lim
Sensors 2023, 23(17), 7537; https://doi.org/10.3390/s23177537 - 30 Aug 2023
Cited by 1 | Viewed by 1832
Abstract
As part of establishing a management system to prevent the illegal transfer of nuclear items, automatic nuclear item detection technology is required during customs clearance. However, it is challenging to acquire X-ray images of major nuclear items (e.g., nuclear fuel and gas centrifuges) loaded in cargo with which to train a cargo inspection model. In this work, we propose a new means of data augmentation to alleviate the lack of X-ray training data. The proposed augmentation method generates synthetic X-ray images for the training of semantic segmentation models combining the X-ray images of nuclear items and X-ray cargo background images. To evaluate the effectiveness of the proposed data augmentation technique, we trained representative semantic segmentation models and performed extensive experiments to assess its quantitative and qualitative performance capabilities. Our findings show that multiple item insertions to respond to actual X-ray cargo inspection situations and the resulting occlusion expressions significantly affect the performance of the segmentation models. We believe that this augmentation research will enhance automatic cargo inspections to prevent the illegal transfer of nuclear items at airports and ports. Full article

16 pages, 3781 KiB  
Article
Cascaded Degradation-Aware Blind Super-Resolution
by Ding Zhang, Ni Tang, Dongxiao Zhang and Yanyun Qu
Sensors 2023, 23(11), 5338; https://doi.org/10.3390/s23115338 - 5 Jun 2023
Cited by 2 | Viewed by 1712
Abstract
Image super-resolution (SR) usually synthesizes degraded low-resolution images with a predefined degradation model for training. Existing SR methods inevitably perform poorly when the true degradation does not follow the predefined degradation, especially in the case of the real world. To tackle this robustness issue, we propose a cascaded degradation-aware blind super-resolution network (CDASRN), which not only eliminates the influence of noise on blur kernel estimation but also can estimate the spatially varying blur kernel. With the addition of contrastive learning, our CDASRN can further distinguish the differences between local blur kernels, greatly improving its practicality. Experiments in various settings show that CDASRN outperforms state-of-the-art methods on both heavily degraded synthetic datasets and real-world datasets. Full article

22 pages, 1233 KiB  
Article
A Hybrid Rule-Based and Machine Learning System for Arabic Check Courtesy Amount Recognition
by Irfan Ahmad
Sensors 2023, 23(9), 4260; https://doi.org/10.3390/s23094260 - 25 Apr 2023
Cited by 1 | Viewed by 1629
Abstract
Courtesy amount recognition from bank checks is an important application of pattern recognition. Although much progress has been made on isolated digit recognition for Indian digits, there is no work reported in the literature on courtesy amount recognition for Arabic checks using Indian digits. Arabic check courtesy amount recognition comes with its own unique challenges that are not seen in isolated digit recognition tasks and, accordingly, need specific approaches to deal with them. This paper presents an end-to-end system for courtesy amount recognition starting from check images as input to recognizing amounts as a sequence of digits. The system is a hybrid system, combining rule-based modules as well as machine learning modules. For the amount recognition system, both segmentation-based and segmentation-free approaches were investigated and compared. We evaluated our system on the CENPARMI dataset of real bank checks in Arabic. We achieve 67.4% accuracy at the amount level and 87.15% accuracy at the digit level on the test set consisting of 626 check images. The results are presented with detailed analysis, and some possible future work is identified. This work can be used as a baseline to benchmark future research in Arabic check courtesy amount recognition. Full article

16 pages, 2331 KiB  
Article
Automatic Modulation Classification Using Hybrid Data Augmentation and Lightweight Neural Network
by Fan Wang, Tao Shang, Chenhan Hu and Qing Liu
Sensors 2023, 23(9), 4187; https://doi.org/10.3390/s23094187 - 22 Apr 2023
Cited by 9 | Viewed by 2311
Abstract
Automatic modulation classification (AMC) plays an important role in intelligent wireless communications. With the rapid development of deep learning in recent years, neural network-based automatic modulation classification methods have become increasingly mature. However, the high complexity and large number of parameters of neural networks make them difficult to deploy in scenarios and receiver devices with strict requirements for low latency and storage. Therefore, this paper proposes a lightweight neural network-based AMC framework. To improve classification performance, the framework combines complex convolution with residual networks. To achieve a lightweight design, depthwise separable convolution is used. To compensate for any performance loss resulting from a lightweight design, a hybrid data augmentation scheme is proposed. The simulation results demonstrate that the lightweight AMC framework reduces the number of parameters by approximately 83.34% and the FLOPs by approximately 83.77%, without a degradation in performance. Full article
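
To illustrate why depthwise separable convolution yields a lighter design, the following sketch compares weight counts for a single layer. It assumes a real-valued 2D convolution with a square kernel and no bias terms; the function names and the layer sizes are hypothetical, and the arithmetic does not reproduce the framework-level 83.34% figure reported in the abstract above.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard convolution: one kernel x kernel x c_in filter per output channel."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel, c_in, c_out):
    """Weights in a depthwise separable convolution.

    Depthwise step: one kernel x kernel filter per input channel.
    Pointwise step: a 1 x 1 convolution mixing c_in channels into c_out.
    """
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
standard = standard_conv_params(3, 64, 128)
separable = separable_conv_params(3, 64, 128)
reduction = 1.0 - separable / standard
```

Under these assumptions the ratio of separable to standard weights is 1/c_out + 1/kernel², so the saving grows with both the kernel size and the number of output channels.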

17 pages, 2614 KiB  
Article
Intraclass Image Augmentation for Defect Detection Using Generative Adversarial Neural Networks
by Vignesh Sampath, Iñaki Maurtua, Juan José Aguilar Martín, Ander Iriondo, Iker Lluvia and Gotzone Aizpurua
Sensors 2023, 23(4), 1861; https://doi.org/10.3390/s23041861 - 7 Feb 2023
Cited by 12 | Viewed by 3817
Abstract
Surface defect identification based on computer vision algorithms often leads to inadequate generalization ability due to large intraclass variation. Diversity in lighting conditions, noise components, defect size, shape, and position make the problem challenging. To solve the problem, this paper develops a pixel-level image augmentation method that is based on image-to-image translation with generative adversarial neural networks (GANs) conditioned on fine-grained labels. The GAN model proposed in this work, referred to as Magna-Defect-GAN, is capable of taking control of the image generation process and producing image samples that are highly realistic in terms of variations. Firstly, the surface defect dataset based on the magnetic particle inspection (MPI) method is acquired in a controlled environment. Then, the Magna-Defect-GAN model is trained, and new synthetic image samples with large intraclass variations are generated. These synthetic image samples artificially inflate the training dataset size in terms of intraclass diversity. Finally, the enlarged dataset is used to train a defect identification model. Experimental results demonstrate that the Magna-Defect-GAN model can generate realistic and high-resolution surface defect images up to the resolution of 512 × 512 in a controlled manner. We also show that this augmentation method can boost accuracy and be easily adapted to any other surface defect identification models. Full article

17 pages, 1463 KiB  
Article
Margin-Based Modal Adaptive Learning for Visible-Infrared Person Re-Identification
by Qianqian Zhao, Hanxiao Wu and Jianqing Zhu
Sensors 2023, 23(3), 1426; https://doi.org/10.3390/s23031426 - 27 Jan 2023
Cited by 3 | Viewed by 2306
Abstract
Visible-infrared person re-identification (VIPR) has great potential for intelligent transportation systems for constructing smart cities, but it is challenging to utilize due to the huge modal discrepancy between visible and infrared images. Although visible and infrared data can appear to be two domains, VIPR is not identical to domain adaptation as it can massively eliminate modal discrepancies. Because VIPR has complete identity information on both visible and infrared modalities, once the domain adaption is overemphasized, the discriminative appearance information on the visible and infrared domains would drain. For that, we propose a novel margin-based modal adaptive learning (MMAL) method for VIPR in this paper. On each domain, we apply triplet and label smoothing cross-entropy functions to learn appearance-discriminative features. Between the two domains, we design a simple yet effective marginal maximum mean discrepancy (M3D) loss function to avoid an excessive suppression of modal discrepancies to protect the features’ discriminative ability on each domain. As a result, our MMAL method could learn modal-invariant yet appearance-discriminative features for improving VIPR. The experimental results show that our MMAL method acquires state-of-the-art VIPR performance, e.g., on the RegDB dataset in the visible-to-infrared retrieval mode, the rank-1 accuracy is 93.24% and the mean average precision is 83.77%. Full article

18 pages, 7501 KiB  
Article
Joint Cross-Consistency Learning and Multi-Feature Fusion for Person Re-Identification
by Danping Ren, Tingting He and Huisheng Dong
Sensors 2022, 22(23), 9387; https://doi.org/10.3390/s22239387 - 1 Dec 2022
Cited by 2 | Viewed by 1518
Abstract
To solve the problem of inadequate feature extraction by the model due to factors such as occlusion and illumination in person re-identification tasks, this paper proposed a model with a joint cross-consistency learning and multi-feature fusion person re-identification. The attention mechanism and the mixed pooling module were first embedded in the residual network so that the model adaptively focuses on the more valid information in the person images. Secondly, the dataset was randomly divided into two categories according to the camera perspective, and a feature classifier was trained for the two types of datasets respectively. Then, two classifiers with specific knowledge were used to guide the model to extract features unrelated to the camera perspective for the two types of datasets so that the obtained image features were endowed with domain invariance by the model, and the differences in the perspective, attitude, background, and other related information of different images were alleviated. Then, the multi-level features were fused through the feature pyramid to concern the more critical information of the image. Finally, a combination of Cosine Softmax loss, triplet loss, and cluster center loss was proposed to train the model to address the differences of multiple losses in the optimization space. The first accuracy of the proposed model reached 95.9% and 89.7% on the datasets Market-1501 and DukeMTMC-reID, respectively. The results indicated that the proposed model has good feature extraction capability. Full article

17 pages, 4793 KiB  
Article
Single-Shot Object Detection via Feature Enhancement and Channel Attention
by Yi Li, Lingna Wang and Zeji Wang
Sensors 2022, 22(18), 6857; https://doi.org/10.3390/s22186857 - 10 Sep 2022
Cited by 3 | Viewed by 1858
Abstract
Features play a critical role in computer vision tasks. Deep learning methods have resulted in significant breakthroughs in the field of object detection, but it is still an extremely challenging obstacle when an object is very small. In this work, we propose a feature-enhancement- and channel-attention-guided single-shot detector called the FCSSD with four modules to improve object detection performance. Specifically, inspired by the structure of atrous convolution, we built an efficient feature-extraction module (EFM) in order to explore contextual information along the spatial dimension, and then pyramidal aggregation module (PAM) is presented to explore the semantic features of deep layers, thus reducing the semantic gap between multi-scale features. Furthermore, we construct an effective feature pyramid refinement fusion (FPRF) to refine the multi-scale features and create benefits for richer object knowledge. Finally, an attention-guided module (AGM) is developed to balance the channel weights and optimize the final integrated features on each level; this alleviates the aliasing effects of the FPN with negligible computational costs. The FCSSD exploits richer information of shallow layers and higher layers by using our designed modules, thus accomplishing excellent detection performance for multi-scale object detection and reaching a better tradeoff between accuracy and inference time. Experiments on PASCAL VOC and MS COCO datasets were conducted to evaluate the performance, showing that our FCSSD achieves competitive detection performance compared with existing mainstream object detection methods. Full article

14 pages, 13894 KiB  
Article
Scale-Space Feature Recalibration Network for Single Image Deraining
by Pengpeng Li, Jiyu Jin, Guiyue Jin and Lei Fan
Sensors 2022, 22(18), 6823; https://doi.org/10.3390/s22186823 - 9 Sep 2022
Cited by 3 | Viewed by 1742
Abstract
Computer vision technology is increasingly being used in areas such as intelligent security and autonomous driving. Users need accurate and reliable visual information, but the images obtained under severe weather conditions are often disturbed by rainy weather, causing image scenes to look blurry. Many current single image deraining algorithms achieve good performance but have limitations in retaining detailed image information. In this paper, we design a Scale-space Feature Recalibration Network (SFR-Net) for single image deraining. The proposed network improves the image feature extraction and characterization capability of a Multi-scale Extraction Recalibration Block (MERB) using dilated convolution with different convolution kernel sizes, which results in rich multi-scale rain streaks features. In addition, we develop a Subspace Coordinated Attention Mechanism (SCAM) and embed it into MERB, which combines coordinated attention recalibration and a subspace attention mechanism to recalibrate the rain streaks feature information learned from the feature extraction phase and eliminate redundant feature information to enhance the transfer of important feature information. Meanwhile, the overall SFR-Net structure uses dense connection and cross-layer feature fusion to repeatedly utilize the feature maps, thus enhancing the understanding of the network and avoiding gradient disappearance. Through extensive experiments on synthetic and real datasets, the proposed method outperforms the recent state-of-the-art deraining algorithms in terms of both the rain removal effect and the preservation of image detail information. Full article

15 pages, 4577 KiB  
Article
Multi-Level Cycle-Consistent Adversarial Networks with Attention Mechanism for Face Sketch-Photo Synthesis
by Danping Ren, Jiajun Yang and Zhongcheng Wei
Sensors 2022, 22(18), 6725; https://doi.org/10.3390/s22186725 - 6 Sep 2022
Cited by 2 | Viewed by 1587
Abstract
The synthesis between face sketches and face photos has important application values in law enforcement and digital entertainment. In cases of a lack of paired sketch-photo data, this paper proposes an unsupervised model to solve the problems of missing key facial details and a lack of realism in the synthesized images of existing methods. The model is built on the CycleGAN architecture. To retain more semantic information in the target domain, a multi-scale feature extraction module is inserted before the generator. In addition, the convolutional block attention module is introduced into the generator to enhance the ability of the model to extract important feature information. Via CBAM, the model improves the quality of the converted image and reduces the artifacts caused by image background interference. Next, in order to preserve more identity information in the generated photo, this paper constructs the multi-level cycle consistency loss function. Qualitative experiments on CUFS and CUFSF public datasets show that the facial details and edge structures synthesized by our model are clearer and more realistic. Meanwhile the performance indexes of structural similarity and peak signal-to-noise ratio in quantitative experiments are also significantly improved compared with other methods. Full article
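
The peak signal-to-noise ratio mentioned in the abstract above has a simple closed form, PSNR = 10·log10(MAX² / MSE). A minimal sketch, assuming 8-bit images flattened to sequences of pixel values; the function name `psnr` and the sample values are hypothetical, and this is not the authors' evaluation code:

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(test):
        raise ValueError("images must have the same number of pixels")
    if not reference:
        raise ValueError("images must not be empty")
    # Mean squared error over all pixels.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 4-pixel images, for illustration only.
reference = [0, 64, 128, 255]
synthesized = [0, 60, 130, 250]
score = psnr(reference, synthesized)
```

Higher values indicate a synthesized image closer to the reference. Structural similarity, the other index named in the abstract, compares local luminance, contrast and structure and is not covered by this sketch.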