Computational Intelligence in Image and Video Analysis

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (20 September 2023) | Viewed by 26362

Special Issue Editors


Guest Editor
Department of Computer Science and Information Engineering, National Taichung University of Science and Technology, Taichung City 404, Taiwan
Interests: video and image analysis; digital watermarking; data compression

Guest Editor
School of Information Engineering, Jimei University, Xiamen 361021, China
Interests: wireless sensor networks; security; intelligent information processing

Special Issue Information

Dear Colleagues,

An enormous number of images and videos can now be obtained from sensor measurements or image acquisition devices. These images and videos can be used to detect, identify, or track targets of interest, and the findings can be applied to plant monitoring, face recognition, ecological conservation, and more. With the development of computational intelligence, a variety of effective computational techniques have emerged, including expert systems, machine learning, deep learning, and other intelligent algorithms. These techniques can be used for image and video analysis, including feature extraction, data mining, and decision making. However, properly analyzing and interpreting the obtained data remains a challenging task. Proper application or design of computational intelligence algorithms can help improve the effectiveness and speed of image and video processing.

The purpose of this Special Issue is to apply computational intelligence to image and video analysis. This Special Issue will provide new insights into related research and broaden the horizons of computational intelligence researchers. We welcome original research or review articles.

Potential topics include, but are not limited to, the following:

  • Computational intelligence in video detection / tracking;
  • Deep learning for image and video analysis;
  • Image and video compression;
  • Image sampling and quantization;
  • Image and video segmentation and texture models;
  • Algorithms for boundary and region segmentation from images and video;
  • Intelligent information processing;
  • Signal processing for image and video analysis;
  • Image and video watermarking;
  • Data hiding or authentication for images and videos.

Prof. Dr. Wien Hong
Prof. Dr. Guangsong Yang
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • computational intelligence
  • images
  • video
  • algorithms
  • watermarking
  • authentication

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (12 papers)

Research

36 pages, 12592 KiB  
Article
A Novel Gradient-Weighted Voting Approach for Classical and Fuzzy Circular Hough Transforms and Their Application in Medical Image Analysis—Case Study: Colonoscopy
by Raneem Ismail and Szilvia Nagy
Appl. Sci. 2023, 13(16), 9066; https://doi.org/10.3390/app13169066 - 8 Aug 2023
Cited by 1 | Viewed by 1108
Abstract
Classical circular Hough transform was proven to be effective for some types of colorectal polyps. However, polyps are very rarely perfectly circular, so some tolerance is needed, which can be ensured by applying the fuzzy Hough transform instead of the classical one. In addition, the edge detection method used as a preprocessing step of the Hough transforms was changed from the commonly used Canny method to Prewitt, which detects fewer edge points outside of the polyp contours and also yields a smaller number of points to be transformed, based on statistical data from three colonoscopy databases. According to the statistical study we performed, in colonoscopy images the polyp contours usually belong to a gradient domain of neither too large nor too small gradients, though they can also have stronger or weaker segments. In order to prioritize the gradient domain typical for the polyps, relative gradient-based thresholding as well as gradient-weighted voting were introduced in this paper. For evaluating the improvement of the shape deviation tolerance of the classical and fuzzy Hough transforms, the maximum radial displacement and the average radius were used to characterize the roundness of the objects to be detected. The gradient thresholding proved to decrease the calculation time to less than 50% of the full Hough transforms, and the number of resulting circles outside the polyp’s environment also decreased, especially for low resolution images. Full article
(This article belongs to the Special Issue Computational Intelligence in Image and Video Analysis)
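
The relative gradient thresholding and gradient-weighted voting mentioned in the abstract above can be illustrated with a minimal sketch for a single, known radius. This is not the authors' implementation: the function names, the `low` and `high` band limits, and the choice of the raw gradient magnitude as the vote weight are all assumptions made for illustration, and the fuzzy variant is not covered.

```python
from collections import defaultdict


def circle_offsets(radius):
    """Integer offsets (dx, dy) lying exactly on a circle of the given radius."""
    r2 = radius * radius
    return [(dx, dy)
            for dx in range(-radius, radius + 1)
            for dy in range(-radius, radius + 1)
            if dx * dx + dy * dy == r2]


def gradient_weighted_hough(edge_points, radius, low=0.2, high=0.8):
    """Accumulate circle-centre votes weighted by gradient magnitude.

    edge_points: list of (x, y, gradient_magnitude) tuples.
    low, high:   hypothetical relative thresholds, as fractions of the
                 largest gradient; points outside the band cast no vote.
    """
    if not edge_points:
        return {}
    g_max = max(g for _, _, g in edge_points)
    if g_max == 0:
        return {}
    accumulator = defaultdict(float)
    for x, y, g in edge_points:
        relative = g / g_max
        if relative < low or relative > high:
            continue  # gradient too weak or too strong: skip this point
        for dx, dy in circle_offsets(radius):
            # Each candidate centre receives a vote proportional to g.
            accumulator[(x - dx, y - dy)] += g
    return dict(accumulator)


# Synthetic example: a circle of radius 5 centred at (10, 10),
# plus two outliers whose gradients fall outside the relative band.
points = [(10 + dx, 10 + dy, 5.0) for dx, dy in circle_offsets(5)]
points += [(2, 3, 10.0), (17, 1, 0.5)]
acc = gradient_weighted_hough(points, radius=5)
best = max(acc, key=acc.get)
```

Because the two outliers are filtered out before voting, they add no work to the accumulation loop, which is the kind of saving the abstract attributes to gradient thresholding.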

11 pages, 1893 KiB  
Article
Video-Based Recognition of Human Activity Using Novel Feature Extraction Techniques
by Obada Issa and Tamer Shanableh
Appl. Sci. 2023, 13(11), 6856; https://doi.org/10.3390/app13116856 - 5 Jun 2023
Cited by 7 | Viewed by 1852
Abstract
This paper proposes a novel approach to activity recognition where videos are compressed using video coding to generate feature vectors based on compression variables. We propose to eliminate the temporal domain of feature vectors by computing the mean and standard deviation of each variable across all video frames. Thus, each video is represented by a single feature vector of 67 variables. As for the motion vectors, we eliminated their temporal domain by projecting their phases using PCA, thus representing each video by a single feature vector with a length equal to the number of frames in a video. Consequently, complex classifiers such as LSTM can be avoided and classical machine learning techniques can be used instead. Experimental results on the JHMDB dataset resulted in average classification accuracies of 68.8% and 74.2% when using the projected phases of motion vectors and video coding feature variables, respectively. The advantage of the proposed solution is the use of FVs with low dimensionality and simple machine learning techniques. Full article
(This article belongs to the Special Issue Computational Intelligence in Image and Video Analysis)
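
The idea of removing the temporal dimension by taking the mean and standard deviation of each variable across frames can be sketched as follows. This is an illustrative assumption rather than the authors' code: the function name and input layout are hypothetical, the abstract does not say whether the population or sample standard deviation is used (the population form is chosen here), and no attempt is made to reproduce the 67-variable vector.

```python
from statistics import mean, pstdev


def video_feature_vector(frame_features):
    """Collapse per-frame feature rows into one fixed-length vector.

    frame_features: list of rows, one row per frame, each row holding
    one value per compression variable (hypothetical layout).
    Returns the per-variable means followed by the per-variable
    population standard deviations.
    """
    if not frame_features:
        raise ValueError("need at least one frame")
    columns = list(zip(*frame_features))  # one tuple per variable
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return means + stds


# Three frames, two variables per frame.
frames = [[1.0, 10.0], [3.0, 10.0], [5.0, 10.0]]
fv = video_feature_vector(frames)
```

The output length depends only on the number of variables, not on the number of frames, which is what allows a classical classifier to be used in place of a sequence model.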

12 pages, 9709 KiB  
Article
Enhancement of Underwater Images with Retinex Transmission Map and Adaptive Color Correction
by Erkang Chen, Tian Ye, Qianru Chen, Bin Huang and Yendo Hu
Appl. Sci. 2023, 13(3), 1973; https://doi.org/10.3390/app13031973 - 3 Feb 2023
Cited by 3 | Viewed by 2231
Abstract
Underwater images often suffer from low contrast, low visibility, and color deviation. In this work, we propose a hybrid underwater enhancement method consisting of addressing an inverse problem with novel Retinex transmission map estimation and adaptive color correction. Retinex transmission map estimation does not rely on channel priors and aims to decouple from the unknown background light, thus avoiding the error accumulation problem. To this end, global white balance is performed before estimating the transmission map using multi-scale Retinex. To further improve the enhancement performance, we design an adaptive color correction that chooses between two color correction procedures and prevents channel stretching imbalance. Quantitative and qualitative comparisons of our method and state-of-the-art underwater image enhancement methods demonstrate the superiority of the proposed method. It achieves the best performance in terms of full-reference image quality assessment. In addition, it also achieves superior performance in the non-reference evaluation. Full article
(This article belongs to the Special Issue Computational Intelligence in Image and Video Analysis)

16 pages, 4670 KiB  
Article
An Authentication Method for AMBTC Compressed Images Using Dual Embedding Strategies
by Xiaoyu Zhou, Jeanne Chen, Guangsong Yang, Zheng-Feng Lin and Wien Hong
Appl. Sci. 2023, 13(3), 1402; https://doi.org/10.3390/app13031402 - 20 Jan 2023
Cited by 1 | Viewed by 1383
Abstract
In this paper, we proposed an efficient authentication method with dual embedding strategies for absolute moment block truncation coding (AMBTC) compressed images. Prior authentication works did not take the smoothness of blocks into account and only used a single embedding strategy, thereby limiting image quality. In the proposed method, blocks were classified as either smooth or complex, and dual embedding strategies were used for embedment. Both bitmaps and quantized values could carry authentication codes, but embedment in bitmaps of complex blocks resulted in higher distortion than in smooth blocks. Therefore, authentication codes were embedded into bitmaps of smooth blocks and quantized values of complex blocks. In addition, our method exploited to-be-protected contents to generate authentication codes, thereby providing satisfactory detection results. Experimental results showed that some special types of tampering, undetected by prior works, were detected by the proposed method and the averaged image quality was significantly improved by at least 1.39 dB. Full article
(This article belongs to the Special Issue Computational Intelligence in Image and Video Analysis)
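
For readers unfamiliar with AMBTC, the following minimal sketch shows how a block is reduced to two quantized values and a bitmap, and how a smooth/complex decision could route the embedding as the abstract above describes. It is an assumption-laden illustration, not the authors' method: the smoothness criterion (difference between the two quantized values), the `threshold` value, and the function names are hypothetical, and the actual embedding of authentication codes is not shown.

```python
def ambtc_encode(block):
    """Compress one block (flat list of pixel values) into (low, high, bitmap)."""
    avg = sum(block) / len(block)
    # Bitmap marks pixels at or above the block mean.
    bitmap = [1 if p >= avg else 0 for p in block]
    highs = [p for p, b in zip(block, bitmap) if b]
    lows = [p for p, b in zip(block, bitmap) if not b]
    high = round(sum(highs) / len(highs))
    # A perfectly flat block has no pixel below the mean.
    low = round(sum(lows) / len(lows)) if lows else high
    return low, high, bitmap


def ambtc_decode(low, high, bitmap):
    """Rebuild the block from its two quantized values and bitmap."""
    return [high if b else low for b in bitmap]


def embedding_target(low, high, threshold=8):
    """Hypothetical rule: smooth blocks use the bitmap, complex blocks
    use the quantized values. The threshold is an assumed value."""
    return "bitmap" if high - low < threshold else "quantized values"


smooth_block = [100] * 8 + [104] * 8
complex_block = [20] * 8 + [200] * 8
smooth_code = ambtc_encode(smooth_block)
complex_code = ambtc_encode(complex_block)
```

In a smooth block the two quantized values are close, so flipping bitmap bits changes reconstructed pixels only slightly; in a complex block the same flip would swap values that are far apart, which is consistent with the higher distortion the abstract reports.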

18 pages, 3825 KiB  
Article
A Novel Video Transmission Latency Measurement Method for Intelligent Cloud Computing
by Yiliang Wu, Xue Bai, Yendo Hu and Minghong Chen
Appl. Sci. 2022, 12(24), 12884; https://doi.org/10.3390/app122412884 - 15 Dec 2022
Cited by 3 | Viewed by 2569
Abstract
Low latency video transmission is gaining importance in time-critical applications using real-time cloud-based systems. Cloud-based Virtual Reality (VR), remote control, and AI response systems are emerging use cases that demand low latency and good reliability. Although there are many video transmission schemes that claim low latency, they vary over different network conditions. Therefore, it is necessary to develop methods that can accurately measure end-to-end latency online, continuously, without any content modification. This research brings these applications one step closer to addressing these next generation use cases. This paper analyzes the cause of end-to-end latency within a video transmission system, and then proposes three methods to measure the latency: timecode, remote online, and lossless remote video online. The corresponding equipment was designed and implemented. The actual measurement of the three methods using related equipment proved that our proposed method can accurately and effectively measure the end-to-end latency of the video transmission system. Full article
(This article belongs to the Special Issue Computational Intelligence in Image and Video Analysis)

15 pages, 2880 KiB  
Article
Privacy-Preserved Image Protection Supporting Different Access Rights
by Ya-Fen Chang, Wei-Liang Tai and Yu-Tzu Huang
Appl. Sci. 2022, 12(23), 12335; https://doi.org/10.3390/app122312335 - 2 Dec 2022
Cited by 3 | Viewed by 1625
Abstract
The boom in cloud computing and social networking has led to a large number of online users in the networks. It is necessary to use appropriate privacy protection mechanisms to prevent personal privacy leakage. In general, image privacy protection techniques proceed with the whole image. However, the image may contain multiple users’ information, and different people’s data may require different privacy protection. In addition, cloud servers are semi-trusted instead of trusted service providers due to unknown security vulnerabilities and the possibility of grabbing sensitive data from the cloud server. In order to provide image protection with different privacy-preserved access rights and to prevent the cloud server from retrieving sensitive information from the image, we propose a privacy-preserved scheme that supports different privacy-preserved access rights in a single image based on the difficulty of solving the factorization problem and the discrete logarithm problem. The cloud server performed the image management method without knowing the privacy information contained in the protected images. By working out the proposed scheme in detail, we experimentally validate the scheme and discuss various options of access rights in practice. Full article
(This article belongs to the Special Issue Computational Intelligence in Image and Video Analysis)

18 pages, 1541 KiB  
Article
No-Reference Quality Assessment of Transmitted Stereoscopic Videos Based on Human Visual System
by Md Mehedi Hasan, Md. Ariful Islam, Sejuti Rahman, Michael R. Frater and John F. Arnold
Appl. Sci. 2022, 12(19), 10090; https://doi.org/10.3390/app121910090 - 7 Oct 2022
Cited by 2 | Viewed by 1695
Abstract
Provisioning stereoscopic 3D (S3D) video transmission services of admissible quality in a wireless environment is an immense challenge for video service providers. Unlike for 2D videos, a widely accepted No-reference objective model for assessing transmitted 3D videos that explores the Human Visual System (HVS) appropriately has not been developed yet. Distortions perceived in 2D and 3D videos are significantly different due to the sophisticated manner in which the HVS handles the dissimilarities between the two different views. In real-time video transmission, viewers only have the distorted or receiver end content of the original video acquired through the communication medium. In this paper, we propose a No-reference quality assessment method that can estimate the quality of a stereoscopic 3D video based on HVS. By evaluating perceptual aspects and correlations of visual binocular impacts in a stereoscopic movie, the approach creates a way for the objective quality measure to assess impairments similarly to a human observer who would experience the same material. Firstly, the disparity is measured and quantified by the region-based similarity matching algorithm, and then, the magnitude of the edge difference is calculated to delimit the visually perceptible areas of an image. Finally, an objective metric is approximated by extracting these significant perceptual image features. Experimental analysis with standard S3D video datasets demonstrates lower computational complexity for the video decoder, and comparison with the state-of-the-art algorithms shows the efficiency of the proposed approach for 3D video transmission at different quantization (QP 26 and QP 32) and loss rate (1% and 3% packet loss) parameters along with the perceptual distortion features. Full article
(This article belongs to the Special Issue Computational Intelligence in Image and Video Analysis)

27 pages, 7410 KiB  
Article
Green Care Achievement Based on Aquaponics Combined with Human–Computer Interaction
by Wei-Ling Lin, Shu-Ching Wang, Li-Syuan Chen, Tzu-Ling Lin and Jian-Le Lee
Appl. Sci. 2022, 12(19), 9809; https://doi.org/10.3390/app12199809 - 29 Sep 2022
Viewed by 2649
Abstract
According to the “World Population Prospects 2022” released by the United Nations in August 2022, the world will officially enter an “aging society”. In order to provide the elderly with an improved quality of daily life, “health promotion” and “prevention of disease” will be important. With respect to care of the elderly, the concepts of “therapeutic environment” and “green care” have been explored and developed. Therefore, in this study, we integrate the currently popular Internet of Things (IoT) into an aquaponics system and propose a smart green care system (SGCS). The proposed system uses face recognition technology to record the labor and rehabilitation history of the elderly, in combination with environmental data analysis, to enable automatic control decisions for equipment in conjunction with a voice control system to reduce the obstacles faced by the elderly in operating the information system. It also uses image recognition technology to monitor and notify about plant diseases and insect pests to achieve automatic management and enhance the interaction between the elderly and the SGCS through human–computer interaction. The SGCS allows the elderly to guide it to participate in appropriate activities through direct contact with the natural environment, thereby enhancing the quality of green healing life. In this study, taking long-term care institutions as an example, we verified proof of concept (PoC), proof of service (PoS), and proof of business (PoB), confirming the feasibility of the SGCS. The SGCS proposed in this study can be successfully used in long-term care institutions and various other environments, such as medical units and home care contexts. It can take full advantage of the functions associated with the concepts of “healing environment” and “green care” widely recognized by users. Therefore, it can be widely used in the field of long-term care in the future. Full article
(This article belongs to the Special Issue Computational Intelligence in Image and Video Analysis)

17 pages, 7676 KiB  
Article
FAECCD-CNet: Fast Automotive Engine Components Crack Detection and Classification Using ConvNet on Images
by Michael Abebe Berwo, Yong Fang, Jabar Mahmood, Nan Yang, Zhijie Liu and Yimeng Li
Appl. Sci. 2022, 12(19), 9713; https://doi.org/10.3390/app12199713 - 27 Sep 2022
Cited by 3 | Viewed by 1807
Abstract
Crack inspections of automotive engine components are usually conducted manually; this is often tedious, with a high degree of subjectivity and cost. Therefore, establishing a robust and efficient method will improve the accuracy and minimize the subjectivity of the inspection. This paper presents a robust approach towards crack classification, using transfer learning and fine-tuning to train a pre-trained ConvNet model. Two deep convolutional neural network (DCNN) approaches to training a crack classifier—namely, via (1) a Light ConvNet architecture from scratch, and (2) fine-tuning and transfer learning of the top layers of the ConvNet architectures of AlexNet, InceptionV3, and MobileNet—are investigated. Data augmentation was utilized to minimize over-fitting caused by an imbalanced and inadequate training sample. Data augmentation improved the accuracy index by 4%, 5%, 7%, and 4%, respectively, for the proposed four approaches. The transfer learning and fine-tuning approach achieved better recall and precision scores. The transfer learning approach using the fine-tuned features of MobileNet attained better classification accuracy and is thus proposed for the training of crack classifiers. Moreover, we employed an up-to-date YOLOv5s object detector with transfer learning to detect the crack region. We obtained a mean average precision (mAP) of 91.20% on the validation set, indicating that the model effectively distinguished diverse engine part cracks. Full article
(This article belongs to the Special Issue Computational Intelligence in Image and Video Analysis)

16 pages, 2576 KiB  
Article
Breast Geometry Characterization of Young American Females Using 3D Image Analysis
by Minyoung Suh and Jung Hyun Park
Appl. Sci. 2022, 12(17), 8578; https://doi.org/10.3390/app12178578 - 27 Aug 2022
Cited by 2 | Viewed by 2864
Abstract
The current research deals with the characterization of breast geometries in young American populations. Breast measurements using 3D image analysis tools are focused on spatial assessments, such as quadrant evaluations of angle, surface area, and volume, together with traditional linear measurements. Through the statistical analysis, different types of breast shapes and placements are clustered, and characteristic breast anthropometry was identified for each cluster. The research findings indicate that there are four shape clusters and three placement clusters. Among the American females aged 26 to 35, four different breast shapes are identified: droopy breasts (31%), small/flat breasts (19%), upward breasts (24%), and large/inward breasts (26%). Taking 36%, 44%, and 20% of the population, respectively, their breast placement characteristics are either high, medium, or low/open. Breast shapes and placement are highly associated with each other. Larger breasts are located relatively lower, while most smaller/flat breasts are positioned relatively high. Full article
(This article belongs to the Special Issue Computational Intelligence in Image and Video Analysis)

19 pages, 3184 KiB  
Article
One-Stage Disease Detection Method for Maize Leaf Based on Multi-Scale Feature Fusion
by Ying Li, Shiyu Sun, Changshe Zhang, Guangsong Yang and Qiubo Ye
Appl. Sci. 2022, 12(16), 7960; https://doi.org/10.3390/app12167960 - 9 Aug 2022
Cited by 27 | Viewed by 3250
Abstract
Plant diseases such as drought stress and pest diseases significantly impact crops’ growth and yield levels. By detecting the surface characteristics of plant leaves, we can judge the growth state of plants and whether diseases occur. Traditional manual detection methods are limited by the professional knowledge and practical experience of operators. In recent years, detection methods based on deep learning have been applied to improve detection accuracy and reduce detection time. In this paper, we propose a disease detection method using a convolutional neural network (CNN) with multi-scale feature fusion (MFF-CNN) for maize leaf disease detection. Based on the one-stage plant disease network YoLov5s, the coordinate attention (CA) module is added, along with a key feature weight to enhance the effective information of the feature map, and the spatial pyramid pooling (SPP) module is modified by data augmentation to reduce the loss of feature information. Three experiments are conducted under complex conditions such as overlapping occlusion, sparse distribution of detection targets, and similar textures and backgrounds of disease areas. The experimental results show that the average accuracy of the MFF-CNN is higher than that of currently used methods such as YoLov5s, Faster RCNN, CenterNet, and DETR, and the detection time is also reduced. The proposed method provides a feasible solution not only for the diagnosis of maize leaf diseases, but also for the detection of other plant diseases. Full article
(This article belongs to the Special Issue Computational Intelligence in Image and Video Analysis)

13 pages, 8962 KiB  
Article
A Study on the Application of Walking Posture for Identifying Persons with Gait Recognition
by Yu-Shiuan Tsai and Si-Jie Chen
Appl. Sci. 2022, 12(15), 7909; https://doi.org/10.3390/app12157909 - 7 Aug 2022
Cited by 1 | Viewed by 2005
Abstract
In person identification, face recognition is currently the most commonly used technology with high accuracy. However, an image does not necessarily contain a face, and face recognition cannot be used if there is no face at all. Even when we cannot obtain facial information, we still want to know the person’s identity. Thus, we must use information other than facial features to identify the person. Since each person’s behavior is somewhat different, we hope to learn the difference between one specific human body and others and use this behavior to identify the person, an idea that advances in deep learning technology make practical. Therefore, we used OpenPose along with LSTM for personal identification. We found that using people’s walking posture is feasible for identifying them. Presently, the environment for making judgments is limited in terms of height, and there are restrictions on distance. In the future, using various angles and distances will be explored. This method can also solve the problem of half-body identification and is also helpful for finding people. Full article
(This article belongs to the Special Issue Computational Intelligence in Image and Video Analysis)