Hierarchical Image Quality Improvement Based on Illumination, Resolution, and Noise Factors for Improving Object Detection

Wang, Tae-su; Kim, Gi-Tae; Shin, Jungpil; Jang, Si-Woong

doi:10.3390/electronics13224438

Open AccessArticle

Hierarchical Image Quality Improvement Based on Illumination, Resolution, and Noise Factors for Improving Object Detection

¹

Department of Computer Engineering, Dong-Eui University, Busan 47340, Republic of Korea

²

School of Computer Science and Engineering, The University of Aizu, Aizuwakamatsu, Fukushima 965-8580, Japan

³

Institute of Smart Automobile Research, Dong-Eui University, Busan 47340, Republic of Korea

^*

Author to whom correspondence should be addressed.

Electronics 2024, 13(22), 4438; https://doi.org/10.3390/electronics13224438

Submission received: 7 October 2024 / Revised: 29 October 2024 / Accepted: 5 November 2024 / Published: 12 November 2024

(This article belongs to the Special Issue Machine Learning and Deep Learning Based Pattern Recognition)

Download

Browse Figures

Versions Notes

Abstract

:

Object detection performance is significantly impacted by image quality factors such as illumination, resolution, and noise. This paper proposes a hierarchical image quality improvement process that dynamically prioritizes these factors based on severity, enhancing detection accuracy in diverse conditions. The process evaluates each factor—illumination, resolution, and noise—using discriminators that analyze brightness, edge strength, and noise levels. Improvements are applied iteratively with an adaptive weight update mechanism that adjusts factor importance based on improvement effectiveness. Following each improvement, a quality assessment is conducted, updating weights to fine-tune subsequent adjustments. This allows the process to learn optimal parameters for varying conditions, enhancing adaptability. The image improved through the proposed process shows improved quality through quality index (PSNR, SSIM) evaluation, and the object detection accuracy is significantly improved when the performance is measured using deep learning models called YOLOv8 and RT-DETR. The detection rate is improved by 7% for the ‘Bottle’ object in a high-light environment, and by 4% and 2.5% for the ‘Bicycle’ and ‘Car’ objects in a low-light environment, respectively. Additionally, segmentation accuracy saw a 9.45% gain, supporting the effectiveness of this method in real-world applications.

Keywords:

hierarchical improvement; image quality; illumination; resolution; noise

1. Introduction

In recent years, object detection has played a crucial role in various applications such as autonomous driving, surveillance systems, and industrial automation. As a result, it has become a central focus of ongoing research in the fields of computer vision and artificial intelligence [1,2,3,4,5,6,7,8,9,10]. Object detection models increasingly rely on high-quality input images, making image quality factors such as illumination, resolution, and noise even more critical in determining detection accuracy [11]. These factors significantly impact the effectiveness of detection algorithms, as issues in illumination, resolution, and noise can distort object features and degrade overall performance [12,13].

Illumination is a crucial factor that defines the shape and features of objects in an image. Variations in illumination, such as low-light conditions or overexposure, can blur object features, reducing detection accuracy. For instance, surveillance systems face significant detection challenges due to lighting differences between day and night. Resolution also plays an important role, as low-resolution images cause objects to appear blurry, reducing the detectability of fine details and small or distant objects. In the case of autonomous vehicles, inadequate resolution can severely hinder detection of distant obstacles. Noise, often caused by image acquisition devices or environmental factors, distorts pixel values, making it challenging to differentiate between objects and backgrounds, especially in low-light environments [14,15,16,17].

While existing image enhancement methods often address individual aspects, such as illumination correction or resolution enhancement, these single-factor approaches may overlook the complex, interdependent nature of image quality factors [18,19]. Recent studies have attempted to tackle these interdependencies with multi-factor enhancement methods. For example, dual-encoding Y-ResNet has been introduced to mitigate issues like lens flare, which disrupts object boundaries, while GAN-based approaches have been used to generate synthetic images with controlled defects, training models with diverse data. These advancements indicate the ongoing need for adaptive image enhancement that addresses multiple quality factors to meet the demands of diverse environments [20,21].

This paper proposes a hierarchical image quality improvement process that dynamically prioritizes and adjusts the enhancement order of illumination, resolution, and noise based on their severity levels. The process evaluates each factor’s necessity for improvement through discriminators that analyze brightness, edge strength, and noise distribution, and applies enhancements with intensity levels that are dynamically adapted. Additionally, the process incorporates an iterative learning mechanism, which adjusts weights based on quality assessment results, allowing for continuous optimization across diverse image conditions.

The contributions of this study are as follows: (1) A dynamic prioritization mechanism that determines the improvement order of illumination, resolution, and noise based on calculated severity levels, addressing the complex interplay between these factors. (2) An iterative improvement framework with adaptive weight adjustments, enabling the process to fine-tune the importance of each factor according to its impact on overall image quality and detection accuracy. (3) Extensive validation through experiments, demonstrating enhanced detection performance, especially in challenging environments with varying illumination and noise levels.

This hierarchical approach aims to improve object detection performance by addressing the most severe quality issues first, leading to more robust detection results in real-world applications.

2. Literature Review

Object detection through image quality enhancement has been an important field of research for a long time. Various techniques have been proposed to improve factors that degrade image quality, such as illumination, resolution, and noise. This section reviews major studies addressing the impact of these three factors on object detection performance and introduces recent advancements in multi-factor image enhancement techniques.

2.1. Illumination Correction for Object Detection

Illumination variation is one of the factors that most significantly affect object detection. Irregular lighting conditions such as insufficient illumination, shadows, or overexposure can greatly degrade the performance of detection systems. Various methods have been proposed to address this issue. Histogram Equalization (HE) and Adaptive Histogram Equalization (AHE) improve the contrast of an image by redistributing brightness levels, effectively correcting illumination variations [22,23]. However, these techniques often excessively enhance contrast, resulting in unnatural outcomes [24]. Contrast Limited Adaptive Histogram Equalization (CLAHE) limits the amount of contrast enhancement in local areas, producing more natural and visually pleasing results, thus overcoming the drawbacks of traditional methods [25]. Recently, deep learning-based illumination correction techniques have emerged and gained significant attention. For example, Retinex-based models work by separating the reflectance and illumination of an image, correcting the illumination while preserving details [26,27]. Additionally, methods utilizing Generative Adversarial Networks (GANs) are used to convert low-light images into bright images, thereby enhancing object detection performance [28,29]. However, these deep learning approaches primarily focus on illumination issues and have limitations in simultaneously considering other factors such as resolution and noise [30].

2.2. Resolution Enhancement for Object Detection

Resolution directly affects the ability to detect and distinguish the fine details of objects. Low-resolution images lose important structural information, making it difficult for detection algorithms to extract meaningful features. Traditional resolution enhancement methods include bicubic interpolation, Lanczos resampling, and bilinear interpolation, which are techniques that up-sample images based on neighboring pixel information [31,32,33]. However, these methods often fail to restore sharp edges and the essential high-frequency details of objects. Recent studies have focused on deep learning-based super-resolution techniques to generate high-resolution images from low-resolution ones. Techniques like SRCNN (Super-Resolution Convolutional Neural Network) and EDSR (Enhanced Deep Super-Resolution Network) have been applied to object detection, demonstrating high performance [34,35]. However, these models also tend to focus solely on resolution, often neglecting other quality issues such as noise and illumination.

2.3. Noise Reduction Techniques

Noise in images can distort pixel values and negatively affect object detection performance. Various noise reduction techniques have been developed, with several notable methods among them. Traditional noise reduction techniques, such as Gaussian smoothing, median filtering, and Wiener filtering, reduce noise by averaging pixel values or removing high-frequency components [36]. However, these methods can lead to the loss of important details. More recent techniques, like Non-Local Means (NLM) filtering and Block Matching and 3D Filtering (BM3D), utilize the self-similarity within the image to reduce noise while preserving details effectively [37,38]. Additionally, deep learning-based methods such as DnCNN (Denoising Convolutional Neural Network) and FFDNet learn complex noise patterns, demonstrating excellent noise reduction performance [39,40]. However, these techniques also have limitations in simultaneously addressing other issues such as illumination and resolution.

2.4. Multi-Factor Image Enhancement

In real-world environments, issues like insufficient illumination, reduced resolution, and noise often occur simultaneously. To address this, multi-factor image enhancement techniques have emerged, aiming to resolve multiple quality issues at once. For instance, the multi-scale Retinex algorithm has gained attention for addressing contrast enhancement and illumination issues simultaneously [41]. Some studies have also attempted to integrate noise reduction techniques to achieve better performance [42]. Frameworks that improve both resolution and noise simultaneously have also been developed. These approaches, often based on GANs, upscale low-resolution images while reducing noise [43,44]. However, these methods still have limitations, such as the need for high-quality training images and not dynamically considering the combined impact of each factor [45,46,47].

2.5. Recent Advances in Image Quality Assessment and Object Detection

In addition to traditional and multi-factor enhancement approaches, recent studies have proposed innovative methods for image quality assessment and object detection, offering new perspectives relevant to our work. For example, No-Reference Image Quality Assessment by Hallucinating Pristine Features [48] introduces a no-reference IQA method that hallucinates pristine features to evaluate degraded images without requiring original reference images. This technique could complement our approach by offering an alternative means of assessing quality, particularly in cases where original images are unavailable for comparison. Integrating this method could potentially enhance the assessment accuracy of noise and distortion in our framework.

Further, object detection models have seen significant advancements, such as the recent findings in DETRs Beat YOLOs on Real-time Object Detection [49]. This study highlights the strengths of DETR models over YOLO models in specific contexts, particularly in terms of accuracy and robustness. While our study utilizes YOLOv8 due to its established performance in real-time applications, comparing our method’s performance on DETR-based models could offer additional insights into the applicability of our hierarchical improvement process across different detection architectures.

Lastly, Pixel-inconsistency Modeling for Image Manipulation Localization [50] provides a technique to detect pixel-level inconsistencies for identifying manipulated content. Although this work primarily addresses image manipulation, the concept of pixel-level analysis aligns with our goal of detecting subtle artifacts introduced by noise and resolution adjustments. Incorporating these techniques could further refine our process, minimizing unintended artifacts during enhancement.

These studies underscore the rapid progress in image quality assessment and object detection, providing valuable reference points for our hierarchical improvement model. Our proposed method is positioned to extend these advances by dynamically addressing the simultaneous challenges posed by illumination, resolution, and noise, prioritizing improvement based on severity to achieve optimal image quality for object detection applications.

3. Environmental Factors and Clustering

To optimize object recognition performance, understanding the environmental factors that impact image quality is crucial. Each factor influences the visual quality of an image and, consequently, affects object recognition performance. This section explains the key environmental factors that impact image quality and discusses a conceptual clustering approach to group these factors systematically, providing a structured foundation for the image quality enhancement strategy.

3.1. Environmental Factors Affecting Image Quality

The main environmental factors that degrade image quality include illumination, resolution, noise, color distortion, contrast, blur, compression artifacts, and other factors like weather, viewing angle, and object size. Each factor uniquely affects the visual quality of an image and plays a critical role in determining the accuracy of object detection.

Illumination: Illumination affects the brightness and color of an image. Depending on the lighting conditions, the appearance of an object may vary, and in dark environments, the object may not be visible enough. Excessively bright or dark lighting can cause the loss of object details or create shadows, making object recognition difficult. Uneven illumination can cause specific areas to be overly bright or dark, confusing recognition performance.
Resolution: Resolution refers to the number of pixels in an image and indicates how well it can represent the details of the image, determining its sharpness and detail. Low resolution causes the image to appear blurry, making pixels noticeable, which hinders the representation of object details and lowers object recognition performance. Conversely, higher resolution results in a sharper image that can capture fine details.
Noise: Noise is unnecessary signals or information caused by imperfections in the image sensor, compression/transmission processes, or shooting environment. Noise degrades the image’s clarity and makes the object’s boundaries unclear, distorting the original image data and reducing object recognition accuracy. Especially in low-light environments, noise occurs more severely, further degrading image quality.
Color Distortion: Color distortion refers to a phenomenon where the original color tone of an image is damaged and displayed differently. This can occur due to various factors such as illumination, camera settings, lens quality, and compression processes. When colors are distorted, the performance of color-based object recognition is reduced. For example, errors can occur in classification tasks based on specific colors.
Contrast: Contrast refers to the difference between the brightest and darkest parts of an image. Low contrast reduces the difference between light and dark areas, making the image look flat and lacking in detail. This makes it difficult to distinguish between the object and the background, reducing object recognition performance. On the other hand, too much contrast can create excessive boundaries, causing the loss of important information in the image.
Blur: Blur is the phenomenon where an image appears blurry, which can occur for various reasons such as movement, focus mismatch, and lens quality issues. Severe blur makes the object’s boundaries and details unclear, reducing recognition performance. This is especially problematic in the recognition of small objects or in boundary-based recognition.
Compression Artifacts: Compression artifacts refer to the phenomenon where data loss occurs during the image compression process, damaging the original image. This mainly occurs in lossy compression formats such as JPEG. Compression artifacts can cause blocky artifacts or noise in the image, distorting the shape of the object and reducing recognition performance. Especially in highly compressed images, boundaries may become smeared, or color distortion may occur.
Distortion: Distortion refers to abnormal deformation that occurs due to lens asphericity, incorrect focus, uneven illumination, or image processing. It causes the shape of the object to appear differently than it actually is. When an object is distorted, the recognition system has difficulty perceiving the object’s actual form, leading to performance degradation, especially in shape-based classification.
Exposure: Exposure refers to the amount of light that the image sensor is exposed to. Too much exposure causes overexposure, while too little causes underexposure. Overexposure can result in areas that are too bright, losing detail in those pixels, while underexposure can turn dark areas black, losing detail. This can cause important features of the object to be missed, making information indistinguishable and reducing recognition performance.
Viewing Angle: Viewing angle refers to the angle at which the camera captures the object. Images captured from the front, side, or oblique angles can all look different. When the viewing angle changes, the shape, size, and even the identifiable features of the object can change. In particular, objects captured from angles not included in the training data are difficult to recognize.
Background Complexity: Background complexity refers to how complex and varied the background around the object is. A complex background makes it difficult to distinguish between the object and the background, reducing recognition performance. The problem is exacerbated when the color or texture of the object is similar to the background.
Object Position and Size: The position and size of the object in the image are important factors for object recognition. If the object is at the edge of the image or very small, it is difficult for the recognition system to accurately detect and classify the object. If the object is too large, some parts may be cut off, leading to the loss of important features.
Camera Settings: Camera settings include focus, shutter speed, and ISO sensitivity, which directly affect image quality. Incorrect camera settings can degrade image quality, negatively affecting object recognition performance. For example, if the focus is incorrect, the image will appear blurry, and too-high ISO sensitivity increases noise.
Object Characteristics: The characteristics of the object itself, such as reflectivity, transparency, color, and texture, affect recognition performance. Reflective objects can reflect light, distorting their shape, and transparent objects are difficult to distinguish from the background, making recognition difficult. A complex texture can make it difficult to distinguish the object’s boundaries.
Weather: Weather conditions affect the environment in which the image is captured. For example, rain, snow, fog, or sunlight can affect image quality. Rain or snow obstructs the view, fog reduces image sharpness, and strong sunlight can cause overexposure. All these factors can reduce object recognition performance.
Movement Speed: The movement speed of the object or the camera can cause blur in the image. Fast-moving objects cause blur in the image, blurring the boundaries and reducing object recognition performance.
Occlusion: Occlusion refers to the phenomenon where an object is partially covered by another object, making only part of the object visible. When only part of an object is visible, it becomes difficult for the object recognition system to accurately recognize the entire object. Problems arise especially when important features are obscured.
Reflection: Reflection refers to the phenomenon where light or images of other objects around the object are reflected on the surface. Reflected light or images can distort the shape or color of the object, causing confusion in recognition performance. This issue can arise especially when the surface of glass or objects is shiny.
Specular Highlighting: Specular highlighting refers to the phenomenon where light is strongly reflected on the surface of a glossy object, causing specific areas to appear very bright. This can distort the color information or shape of the object. The recognition system may mistake these bright spots or areas as part of the object.
Contour Confusion: Contour confusion refers to the phenomenon where the object’s contours appear ambiguous or similar to the background, especially in complex backgrounds. When the boundary between the object and the background becomes unclear, it becomes difficult to recognize the object’s shape. This can lead to performance degradation, especially in boundary-based recognition algorithms.
Object Surface Texture: The surface texture of an object is an important factor in forming the object’s features in an image. If the surface texture is complex or irregular, the shape of the object may become unclear, making recognition difficult. Conversely, a simple texture is advantageous for object recognition.

These environmental factors each have a significant impact on image quality and, as a result, directly or indirectly affect the performance of deep learning-based object detection systems. Understanding and controlling these factors are key to achieving high detection performance.

3.2. Selection of Environmental Factors (Clustering)

Improving image quality for object detection requires a targeted approach to address relevant environmental factors based on their impact. Due to the interdependencies among these factors, clustering similar factors together can help in systematically addressing them and prioritizing those with the most substantial influence on recognition performance.

The clustering process here is conceptual rather than algorithmic, aiming to categorize environmental factors into five groups (or “clusters”) based on their functional impact on image quality and object recognition. This allows us to focus enhancement efforts on clusters with the highest influence on quality, specifically the Image Quality Factors cluster.

In this study, the factors are grouped into the following clusters (illustrated in Figure 1).

Cluster 1: Image Quality Factors

This cluster includes the factors that most directly affect the visual quality of an image. Specifically, it includes illumination, resolution, noise, contrast, blur, compression artifacts, color distortion, exposure, distortion, specular highlighting, and object surface texture. These factors are crucial in determining the overall visual quality of an image and play a key role in the process by which an object recognition system processes images. Illumination adjusts the brightness and color of the image to clarify the shape and boundaries of objects, while resolution expresses the details of the image. Noise distorts the boundaries or details of objects. Contrast and color distortion affect the clarity and color representation of the image, and compression artifacts can degrade quality due to losses occurring during the image compression process. Specular highlighting and object surface texture define the visual characteristics of objects. These factors work together to determine the overall quality of an image.

This cluster is a group of factors that need to be directly improved to enhance image quality, and improving these factors can significantly improve object recognition performance.

Cluster 2: Shooting Conditions and Settings Factors

This cluster includes factors related to the physical environment and camera settings at the time of image capture. Specifically, it includes illumination, camera settings, exposure, viewing angle, weather, reflection, and specular highlighting. In this cluster, illumination determines the amount of light in the shooting environment, and camera settings include factors such as focus, shutter speed, and ISO sensitivity. These settings have a significant impact on image quality, and appropriate settings are necessary to prevent issues such as blur or overexposure. Weather and reflection represent the impact of the external environment on image quality, and weather conditions like strong sunlight, rain, or snow can significantly degrade image quality.

The factors in this cluster are mainly variables directly related to the shooting environment, and strategies are needed to minimize image quality degradation caused by environmental factors.

Cluster 3: Scene Composition and Object-Related Factors

This cluster consists of factors related to the composition of objects and the background within an image. It includes background complexity, object position and size, object characteristics, occlusion, contour confusion, object surface texture, and reflection. Background complexity is a crucial factor in distinguishing between the object and the background; the more complex the background, the lower the object recognition performance. The position and size of the object determine the difficulty of recognition depending on how the object is positioned in the image and its size. Small objects are difficult to detect, and large objects can be partially cut off, hindering recognition.

This cluster focuses on scene composition and the physical characteristics of objects, and improving these factors can contribute to more clearly distinguishing between the object and the background, thereby enhancing recognition performance.

Cluster 4: Movement and Dynamism Factors

This cluster consists of factors that affect movement or dynamic situations within an image. It includes movement speed, blur, occlusion, viewing angle, and distortion. Movement speed refers to the speed at which the camera or object moves; fast movement causes blur, making object recognition difficult. Occlusion represents the impact on recognition performance when an object is obscured by another object. If an important part is obscured, the overall recognition performance of the object may be reduced.

This cluster plays an important role in developing strategies to improve object recognition in environments with a lot of movement or change.

Cluster 5: Environmental and External Condition Factors

Lastly, this cluster focuses on how image quality is affected by external environments and shooting conditions. It includes weather, illumination, background complexity, color distortion, noise, and compression artifacts. Weather and illumination are variables caused by the external environment. Lighting conditions can vary depending on the weather, which has a significant impact on the quality of the image. Noise and compression artifacts are factors that can be caused by external conditions and shooting equipment, which can reduce the reliability of the image.

This cluster focuses on controlling and optimizing factors caused by the environment to improve shooting conditions.

By organizing factors into these clusters, we enable a targeted improvement approach. For instance, focusing on the Image Quality Factors cluster directly addresses visual quality, which can offset or amplify the effects of other factors, improving recognition performance in diverse conditions.

Importance of Adopting the Image Quality Factors Cluster

The Image Quality Factors cluster directly impacts visual quality and is thus prioritized for improvement. This cluster’s elements (illumination, resolution, noise, etc.) are the most visually apparent and foundational for clear and accurate object recognition. By focusing on this cluster, we can significantly enhance recognition performance while reducing the impact of other clustered factors, such as movement or environmental conditions.

4. Hierarchical Improvement Process Design for Environmental Factors

The hierarchical improvement proposed in this paper refers to the method of sequentially enhancing various factors according to their priority to improve image quality. This means that the three main environmental factors—illumination, resolution, and noise—are each independently evaluated and then sequentially improved in a hierarchical structure that considers their interactivity. Since the environmental factors of illumination, resolution, and noise interact with each other and have low independence, it is crucial to clearly define the criteria for determining whether each factor needs improvement. This chapter explains the criteria for each factor and proposes specific methods to determine whether improvement is needed. By doing so, a process is designed to identify and improve environmental factors in the input image.

Figure 2 illustrates the hierarchical improvement process for environmental factors, representing the overall procedure for dynamically enhancing the quality of the original image. The process is divided into three main parts: “Environmental Factor Identification”, “Hierarchical Image Enhancement”, and “Quality Assessment and Weight Update”.

Figure 3 provides an overall flowchart of the improvement process, breaking down the content of the diagram into detailed steps. It covers the process from the beginning to the output of the final improved image. The improvement process starts with the analysis of the input image data. This data goes through the Environmental Factor Identification phase, where three discriminators (illumination, resolution, and noise) perform a characteristic analysis of each environmental factor.

In the Environmental Factor Identification phase, the Illumination Discriminator calculates the brightness deviation and contrast of the image to determine whether illumination improvement is needed. The Resolution Discriminator analyzes the image size (number of pixels), edge strength, and texture uniformity to assess the need for resolution enhancement. The Noise Discriminator calculates the noise distribution and high-frequency noise ratio to evaluate the necessity of noise reduction. The result of each discriminator extracts whether improvement is needed and its severity.

For factors deemed “necessary for improvement”, severity is calculated and used to set the priority and intensity of the improvement process. For instance, if the Illumination Discriminator determines that brightness or contrast enhancement is necessary, the severity of that factor is calculated, setting the illumination severity. Similarly, the severity of resolution and noise is calculated based on their respective discriminators. If “improvement is unnecessary”, no enhancement is performed for that factor, and the severity is set to 0.

The next step is Hierarchical Image Enhancement. In this phase, the improvement priority and intensity for each factor are set based on the necessity and severity obtained from the discriminators. The higher the severity, the higher the priority of the factor. The severity values are generally designed to range between 0 and 1 for most images, enabling a balanced response across a wide range of typical image conditions. However, in extreme cases—such as images with very low brightness, significant noise, or exceptionally low resolution—severity values may exceed this range, triggering the highest level of intensity for improvement. The intensity of improvement is divided into three levels as follows:

Intensity 1: [0 < severity < 0.3]
Intensity 2: [0.3 ≤ severity < 0.6]
Intensity 3: [0.6 ≤ severity]

This structured intensity division ensures that minor issues receive lighter processing, while severe degradation in extreme cases activates more intensive enhancement procedures, optimizing image quality across varying conditions.

Improvements are performed sequentially, starting with the factor with the highest priority, and the improvement process is applied in order to illumination, resolution, and noise. For example, if illumination has the highest severity among the three factors, illumination enhancement is performed first. The Illumination Enhancement phase optimizes the brightness and contrast of the image using illumination correction techniques. The Resolution Enhancement phase applies resolution correction algorithms to better represent image details. The Noise Reduction phase applies noise removal algorithms to reduce noise in the image.

The subsequent step involves Quality Assessment and Weight Update. In this phase, the Quality Assessment Module measures the degree of improvement after each enhancement step, and the weights are dynamically updated based on the results. Based on the quality assessment results, it is determined whether the improvement was successful, and if necessary, learning parameter values are adjusted to reflect the dynamically changed weight values in the next improvement process. This enhances the efficiency of future improvement processes and applies the optimal improvement strategy by adjusting the learning parameter values, thereby improving image quality through enhancement. If there are no more factors requiring improvement or the improvement process exceeds the specified number of iterations, the final improved image is output, and the process terminates.

4.1. Criteria for Improvement Necessity of Each Factor

In situations where the independence among environmental factors is low, it is crucial to establish clear criteria for accurately determining the need for improvement. Illumination, resolution, and noise are key factors that affect the pixel environment of the entire image. Although they do not directly cause the loss of physical information in the image, they can indirectly degrade object recognition performance. For this reason, it is essential to establish clear criteria for determining the necessity of improvement for each factor. This study proposes the following specific criteria to determine the necessity for improvement of each factor:

4.1.1. Criteria for Illumination Improvement Determination

Illumination has a significant impact on the overall brightness and contrast of an image, and improper illumination can reduce the accuracy of object recognition. To determine the need for improvement of the illumination factor, the following criteria are proposed:

Brightness Deviation

The brightness deviation of an image is measured by calculating the average brightness of the entire image and then determining the deviation from the ideal brightness value (the median value, 128). A larger deviation indicates that the image is either too dark or too bright, which is considered a signal that improvement is needed. To calculate the brightness deviation, first compute the average brightness value of all pixels, and then find the absolute difference between this value and the median (128). For example, if the average brightness is 100, the brightness deviation is “|100 − 128| = 28”. If this deviation value exceeds a certain threshold, it can be determined that illumination improvement is necessary. Equation (1) represents the formula for calculating brightness deviation, where μ_B is the mean value of image brightness (the median), and L is the maximum brightness value (typically 256 for an 8-bit image). Equation (2) is the formula for determining the necessity of brightness improvement. By using a user-defined brightness deviation threshold, T_b, we can consider the central concentration of the brightness histogram to determine the need for improvement.

B r i g h t n e s s D e v i a t i o n = |μ_{B} - \frac{L}{2}|

(1)

N e e d I m p r o v e B r i g h t n e s s = \{\begin{matrix} 1 i f B r i g h t n e s s D e v i a t i o n > T_{b} \\ 0 i f B r i g h t n e s s D e v i a t i o n \leq T_{b} \end{matrix}

(2)

Contrast

Contrast is an indicator that represents the difference between the bright and dark areas of an image. By analyzing the overall contrast level of the image, we can determine whether there is an illumination imbalance. Low contrast may indicate improper illumination or excessive shadows. On the other hand, excessively high contrast can lead to the loss of image details. To evaluate the contrast level of an image, a specific threshold is set, and if the contrast deviates from this threshold, it is considered a signal that improvement is needed. Equation (3) is the formula for measuring contrast, where I_max and I_min represent the maximum and minimum brightness values of the image, respectively. Equation (4) is the formula for determining the necessity of contrast improvement. Using a user-defined contrast threshold, T_c, we can assess the need for improvement.

C o n t r a s t = \frac{I_{m a x} - I_{m i n}}{I_{m a x} + I_{m i n}}

(3)

N e e d I m p r o v e C o n t r a s t = \{\begin{matrix} 1 i f C o n t r a s t < T_{c} \\ 0 i f C o n t r a s t \geq T_{c} \end{matrix}

(4)

In the end, for the illumination factor, an “OR” gate is applied to the results of the brightness deviation and contrast improvement necessity determinations (True/False) to ultimately decide whether improvement is required.

4.1.2. Criteria for Resolution Improvement Determination

Resolution plays a crucial role in representing the details of an image, and low-resolution images can negatively impact object recognition performance. The criteria for determining the necessity of resolution improvement are as follows:

Image Size (Pixel Count)

Image size or pixel count is a direct indicator of resolution and is used to detect low-resolution images. If the horizontal or vertical resolution (pixel count) is lower than the set minimum resolution threshold, it is considered low resolution and requires improvement. For example, if the horizontal or vertical resolution of an image falls below 256 pixels, it is considered low resolution, signaling the need for upscaling. Equation (5) is the formula for measuring the size of the input image, where W and H represent the width and height of the image, respectively. Equation (6) is the formula for determining the necessity of image size improvement. Using a user-defined image size threshold, T_i, we can assess the need for improvement.

I m a g e S i z e = W \times H

(5)

N e e d I m p r o v e I m a g e S i z e = \{\begin{matrix} 1 i f I m a g e S i z e < T_{i} \\ 0 i f I m a g e S i z e \geq T_{i} \end{matrix}

(6)

Edge Strength

Edges in the image are detected to evaluate sharpness (strength). If edges appear indistinct and blurry, it can be determined that the resolution is low. To achieve this, edge detection algorithms like the Laplacian filter are used to quantify the sharpness of the image. A specific threshold is set, and if the sharpness value falls below this threshold, it is decided that sharpness enhancement is necessary. Equation (7) is the formula for measuring edge strength, where I represents the input image. Equation (8) is the formula for determining the necessity of edge strength improvement. Using a user-defined edge strength threshold, T_e, we can assess the need for improvement.

E d g e S t r e n g t h = m e a n (|L a p l a c i a n (I)|)

(7)

N e e d I m p r o v e E d g e = \{\begin{matrix} 1 i f E d g e S t r e n g t h < T_{e} \\ 0 i f E d g e S t r e n g t h \geq T_{e} \end{matrix}

(8)

Texture Uniformity

Texture uniformity is an indicator that evaluates how consistently and evenly textures are distributed within an image. In cases of low resolution, the details of the texture may appear unclear or blurred, resulting in decreased texture uniformity. To assess this, the Local Binary Pattern (LBP) method is used to analyze the texture of the image. If the texture is found to be non-uniform and irregularly distributed, it is determined that resolution enhancement is necessary [51]. This serves as an important criterion for determining the need for resolution improvement and helps to express image details more clearly. Equation (9) is the formula for measuring texture uniformity. Equation (10) is the formula for determining the necessity of texture improvement. Using a user-defined texture uniformity threshold, T_t, we can assess the need for improvement.

T e x t u r e U n i f o r m i t y = m e a n (L B P (I))

(9)

N e e d I m p r o v e T e x t u r e = \{\begin{matrix} 1 i f T e x t u r e U n i f o r m i t y < T_{t} \\ 0 i f T e x t u r e U n i f o r m i t y \geq T_{t} \end{matrix}

(10)

In the end, for the resolution factor, an “AND” gate is applied to the results of the image size, edge strength, and texture improvement necessity determinations (True/False) to ultimately decide whether improvement is required. Specifically, the resolution factor applies an “AND” gate instead of an “OR” gate. This is to prevent indiscriminate image upscaling and to avoid overhead caused by a drastic increase in computational load during the improvement process.

4.1.3. Criteria for Noise Improvement Determination

Noise is a major factor that degrades image quality, and a high level of noise can negatively impact object recognition performance. The criteria for determining the necessity of noise improvement are as follows:

Noise Distribution (BRISQUE Score)

To quantitatively evaluate noise distribution, the BRISQUE (Blind Referenceless Image Spatial Quality Evaluator) algorithm is used [52]. BRISQUE is an objective metric designed to evaluate the visual quality of an image without any reference to the original image. It is sensitive to various quality degradation factors, especially noise, making it effective for measuring image noise. This algorithm extracts features based on local contrast normalization and the Natural Scene Statistics (NSS) of the image and predicts an image quality score through a Support Vector Regression (SVR) model. It ultimately quantifies the degree of image distortion to produce a quality score. Generally, a higher BRISQUE score indicates lower image quality and a higher amount of noise. In this paper, the BRISQUE score is used as an indicator for measuring noise, and it is compared against a specific threshold. If the score exceeds this threshold, it is determined that noise removal is necessary. Equation (11) represents the BRISQUE score for measuring noise distribution, and Equation (12) is the formula for determining the necessity of noise improvement. Using a user-defined noise distribution threshold, T_n, we can assess the need for improvement.

B R I S Q U E S c o r e = B R I S Q U E (I)

(11)

N e e d I m p r o v e N o i s e D i s t r i b u t i o n = \{\begin{matrix} 1 i f B R I S Q U E S c o r e > T_{n} \\ 0 i f B R I S Q U E S c o r e \leq T_{n} \end{matrix}

(12)

High-Frequency Noise Ratio

In the frequency domain of the image, high-frequency noise components are detected to evaluate the level of noise. High-frequency components include not only the details of the image but also noise, and the amount of noise in the high-frequency region can be measured through frequency analysis. Using the Fourier transform, the frequency spectrum of the image is analyzed, and if an excessive amount of noise is detected in the high-frequency band, it is determined that noise improvement is needed [53]. In this case, if the detected noise exceeds the threshold, it is considered necessary for improvement. Equation (13) is the formula for measuring the high-frequency noise ratio based on the Fourier transform, where f_c represents the cutoff frequency for the high-frequency band. Equation (14) is the formula for determining the necessity of high-frequency noise improvement. Using a user-defined high-frequency noise ratio threshold, T_f, we can assess the need for improvement.

H i g h F r e q N o i s e = \frac{\sum ({|F o u r i e r T r a n s f o r m (I)|}_{> f_{c}})}{\sum (|F o u r i e r T r a n s f o r m (I)|)}

(13)

N e e d I m p r o v e H i g h F r e q N o i s e = \{\begin{matrix} 1 i f H i g h F r e q N o i s e > T_{f} \\ 0 i f H i g h F r e q N o i s e \leq T_{f} \end{matrix}

(14)

In the end, for the noise factor, an “OR” gate is applied to the results of the noise distribution and high-frequency noise improvement necessity determinations (True/False) to ultimately decide whether improvement is required.

Based on the above determination criteria, each discriminator performs a quantitative evaluation of the environmental factors to measure severity.

4.2. Environmental Factor Discriminators: Severity Calculation

Environmental factor discriminators evaluate factors that directly affect image quality, such as illumination, resolution, and noise, to determine if improvement is needed and to quantitatively calculate their severity. Severity is a key element that determines the priority and intensity of improvement tasks for each factor, making it an essential process for efficient image quality enhancement. This section provides a detailed explanation of the methods for calculating the severity of major factors like illumination, resolution, and noise.

4.2.1. Illumination Severity

Illumination severity analyzes the lighting conditions related to the overall brightness and contrast of an image to determine the necessity of improvement and calculate its severity. It directly impacts image visibility and object recognition performance, and appropriate illumination can enhance image quality.

S_{I l l u m i n a t i o n} = α_{I} \times (\frac{B r i g h t n e s s D e v i a t i o n}{T_{b}}) + β_{I} \times (\frac{T_{c} - C o n t r a s t}{T_{c}})

(15)

Equation (15) is the formula for calculating the severity of the illumination factor. The severity of the illumination factor, S_Illumination, is expressed as a weighted sum of brightness deviation and contrast. Here, α_I and β_I are the weights that reflect the importance of each element in the illumination factor. These weights are used to relatively evaluate the impact of brightness deviation and contrast on illumination severity. Through this weighted sum, the necessity for illumination improvement is quantified, and the priority and intensity of illumination enhancement tasks are determined based on the severity.

4.2.2. Resolution Severity

Resolution severity evaluates resolution-related elements such as image sharpness, size, and texture uniformity to determine whether resolution improvement is necessary and to calculate its severity. Resolution is a crucial factor in representing the details of an image and identifying objects.

S_{R e s o l u t i o n} = α_{R} \times (\frac{T_{i} - I m a g e S i z e}{T_{i}}) + β_{R} \times (\frac{T_{e} - E d g e S t r e n g t h}{T_{e}}) + γ_{R} \times (\frac{T_{t} - T e x t u r e U n i f o r m i t y}{T_{t}})

(16)

Equation (16) is the formula for calculating the severity of the resolution factor. The severity of the resolution factor, S_Resolution, is expressed as a weighted sum of the analysis results of image size, edge strength, and texture uniformity. Here, α_R, β_R, and γ_R are the weights that reflect the importance of each element of the resolution factor. These weights are used to relatively evaluate the impact of image size, edge strength, and texture uniformity on resolution severity. Through this weighted sum, the necessity and severity of resolution improvement are quantified, and based on this value, the priority and intensity of the improvement tasks are determined.

4.2.3. Noise Severity

Noise severity evaluates the level of noise present in the image to measure the necessity and severity of improvement. Noise degrades the visual quality of the image and can negatively impact object detection.

S_{N o i s e} = α_{N} \times (\frac{T_{n} - B R I S Q U E S c o r e}{T_{n}}) + β_{N} \times (\frac{T_{f} - H i g h F r e q N o i s e}{T_{f}})

(17)

Equation (17) is the formula for calculating the severity of the noise factor. The severity of the noise factor, S_Noise, is expressed as a weighted sum of the analysis results of noise distribution (BRISQUE score) and high-frequency noise ratio. Here, α_N and β_N are the weights that reflect the importance of each element in the noise factor. These weights are used to relatively evaluate the impact of noise distribution and high-frequency noise ratio on severity. This weighted sum quantifies the severity of noise, and based on this value, the priority and intensity of the improvement tasks are determined.

The Role of Weights in Severity Calculation Formulas
By using weights in the severity calculation formulas for each factor, the relative importance of each element affecting the severity can be reflected.

Reflecting the Importance of Factors:
Not all factors may have the same level of importance. Weights adjust the impact of each element on the overall severity, helping to determine which element plays a more critical role. For instance, in the case of resolution, if edge strength is more important than texture uniformity, a higher weight can be assigned to β_R.

2.: Establishing a Flexible Improvement Strategy:
By adjusting the weights, a flexible improvement strategy that can adapt to various image characteristics and situations can be developed. This helps identify which element needs more focused improvement in a specific environment. For example, in certain images, the proportion of darkness may be a more significant issue than brightness deviation in illumination. By adjusting the weights, these differences can be accounted for.

4.3. Image Improvement Strategy

4.3.1. Setting Thresholds for Each Element

Thresholds for each element are crucial criteria for determining the need for improvement of each factor. These thresholds are set based on general image quality assessments and empirical data and have a direct impact on the characteristics of the image and the accuracy of object detection. Setting appropriate thresholds is vital for enhancing the effectiveness of the improvement process, allowing it to adapt to various image environments.

For illumination, the following two elements were considered. First, the threshold T_b for brightness deviation is set to 64. This value indicates how much the overall brightness of the image deviates from the median value of 128. A value of 64 is chosen to detect noticeable illumination imbalances, and if the deviation exceeds this range, illumination improvement is deemed necessary. Secondly, the threshold T_c for contrast is set to 50. Contrast represents the difference between the bright and dark areas of an image, and 50 is set as a criterion to distinguish between images with sufficient contrast and those without. This value is empirically derived from various images, and if the contrast is below 50, illumination improvement is considered necessary.

For resolution, the threshold T_i for image size is set to 4,000,000 pixels. This corresponds to a resolution of approximately 2000 × 2000, and images below this resolution are more likely to lose the detailed features of objects, so this serves as the basis. The threshold T_e for edge strength is set to 15, which is derived as the average edge strength calculated using the Laplacian filter. If the edge strength is below 15, it is determined that the image lacks sharpness, indicating a need for resolution enhancement. Lastly, the threshold T_t for texture uniformity is set to 15, based on an analysis using the Local Binary Pattern (LBP). This value is set as a criterion for determining whether the texture of the image is sufficiently expressed.

For noise, thresholds are set based on two elements. First, the threshold T_n for the BRISQUE score is set to 30. The BRISQUE score evaluates the overall quality of the image, with higher scores indicating lower quality. The value of 30 is chosen as a criterion that can objectively detect quality degradation in various images. Secondly, the threshold T_f for the high-frequency noise ratio is set to 0.1. High-frequency noise appears in the high-frequency components of an image and is measured through frequency analysis. This value indicates the proportion of high-frequency noise in the total frequency, and if it exceeds this value, noise improvement is considered necessary.

4.3.2. Initial Weight Setting

Initial weights are set to reflect the relative importance of each element. These weights quantify the impact of each element on image quality and play a crucial role in enhancing the effectiveness of the improvement process. They are established based on various experiments and empirical data and can be adjusted depending on the situation.

In the illumination severity formula, the weights for Brightness Deviation and Contrast were set equally at 0.5. Brightness Deviation (α_I = 0.5) indicates how much the brightness state of the image deviates from the median value. Since maintaining illumination balance is important, a high weight was assigned to this element. Contrast (β_I = 0.5) represents the difference between the bright and dark areas of an image, and sufficient contrast helps in more clearly recognizing objects. Therefore, this element was also given an equal weight when considering illumination improvement.

For the resolution severity formula, weights were assigned to three elements. Image Size (α_R = 0.4) was assigned a weight of 0.4 because the size of the image is important for representing the details of objects. This means that while resolution has a significant impact on image quality, it must be considered in balance with other elements. Edge Strength (β_R = 0.4) reflects the sharpness and detail of the image and plays a crucial role in object identification, so it was also assigned a weight of 0.4. Texture Uniformity (γ_R = 0.2) is important for expressing the texture of the image, but it is relatively less important than the other elements, so a weight of 0.2 was set to control its influence on texture improvement.

In the noise severity formula, weights were set for the BRISQUE Score (α_N = 0.6) and High-Frequency Noise (β_N = 0.4). The BRISQUE Score evaluates the overall quality of the image, and images with high noise have higher scores. Thus, a weight of 0.6 was assigned to this element to increase sensitivity to noise. High-Frequency Noise, which appears in the high-frequency components of the image and can distort the details of the image, was assigned a weight of 0.4. This allows for the improvement process to consider various aspects of noise.

The initially set weights can be dynamically adjusted during the image improvement process and are designed to achieve optimal improvement results by applying them to various image environments.

4.3.3. Setting Improvement Intensity and Image Enhancement

In the image enhancement stage, the improvement intensity is set according to the severity of each factor, and based on this, appropriate enhancement algorithms are applied. The intensity of improvement is flexibly adjusted depending on the severity, and various algorithms are employed to achieve the optimal improvement for each level of intensity. Although more efficient or cutting-edge algorithms exist, the specific algorithms for illumination, resolution, and noise improvement in this paper are chosen for the following reasons. First, the selected algorithms have relatively low computational complexity and fast processing speeds, making them suitable for real-time image enhancement applications. Second, these algorithms provide stability and reliability across various image environments and have been proven effective in many use cases. Third, since the parameters of the selected algorithms can be adjusted, they allow fine-tuned enhancements in response to different image conditions, enabling flexible improvement strategies.

For illumination enhancement, three algorithms are applied depending on the severity level. The first intensity level (severity < 0.3) uses Histogram Equalization (HE), which improves the overall contrast by equalizing the brightness distribution. This is a simple yet effective initial enhancement method, targeting images where the lighting is relatively evenly distributed without causing significant distortion in contrast. The second intensity level (0.3 ≤ severity < 0.6) applies gamma correction, which balances the bright and dark areas of the image by adjusting mid-tone brightness, enhancing contrast without significant detail loss. A gamma value of 0.5 is used to maintain the image’s details. Finally, for the third intensity level (severity ≥ 0.6), Contrast-Limited Adaptive Histogram Equalization (CLAHE) is used, which adjusts local contrast to suppress noise while improving overall lighting. CLAHE is particularly useful in images with severely unbalanced lighting, with a clip limit of 2.0 to prevent noise amplification and achieve natural enhancement.

In resolution improvement, three levels of algorithms are applied to enhance image sharpness and detail representation depending on the severity. The first intensity (severity < 0.3) uses linear interpolation, which enlarges the image by a (1.3×) factor in width and height, adding soft details with minimal computation. Linear interpolation is well-suited for real-time applications due to its speed and smooth results. For the second intensity (0.3 ≤ severity < 0.6), bicubic interpolation is applied to enlarge the image by a (1.6×) factor. Bicubic interpolation, using information from 16 surrounding pixels, produces smoother and more natural results than linear interpolation while maintaining some level of original detail. Lastly, for the third intensity (severity ≥ 0.6), B-spline interpolation is used to enlarge the image by a (2×) factor, providing even smoother results, especially in curved or edge areas, while maximizing detail preservation. This method offers high-quality enlargement but requires more computation, making it suitable for cases of high severity.

For noise reduction, three algorithms are applied based on the severity of the noise. The first intensity level (severity < 0.3) uses Gaussian blur with a (1 × 1) kernel to gently reduce fine noise by averaging neighboring pixels. This method is simple and fast, suitable for minimal noise. At the second intensity (0.3 ≤ severity < 0.6), the non-local means filter is applied, which effectively reduces noise while preserving texture and detail by referencing the entire image. A low filter strength of three is set to accommodate the iterative improvement process. Finally, for the third intensity (severity ≥ 0.6), a stronger non-local means filter with a strength of seven is used to aggressively mitigate severe noise that significantly hinders detail, focusing on restoring critical image information.

4.3.4. Image Quality Assessment

After image enhancement, it is necessary to evaluate the quality of the enhanced image to quantitatively measure the improvement effect and adjust the weights and learning rate based on this evaluation. Quality assessment is conducted by comparing the quality changes before and after enhancement in terms of illumination, resolution, and noise.

For illumination improvement, the changes in brightness deviation and contrast are evaluated. By measuring whether the brightness deviation has decreased and the contrast has improved after enhancement, the effectiveness of the illumination improvement is confirmed. This allows us to determine whether the illumination of the image has actually improved, and the results are reflected in the weight adjustment.

In resolution improvement, the changes in edge strength and texture uniformity before and after enhancement are evaluated. Since the goal of resolution improvement is to enhance the sharpness and detail representation of the image, we measure whether edge strength has been increased and texture uniformity has been improved to verify the effect. This helps assess how well the resolution improvement has preserved the image’s details.

For noise reduction, the BRISQUE score and high-frequency noise ratio are used to measure noise reduction before and after enhancement. If the BRISQUE score decreases and the high-frequency noise ratio is reduced, it indicates that noise reduction has been performed effectively. This confirms the positive impact of noise removal on the visual quality of the image.

The results obtained through quality assessment are used to monitor the effectiveness of the improvement tasks and are employed in subsequent steps such as learning parameter adjustment and weight update. Through this process, the enhancement tasks can be iteratively learned and evolved, enabling optimization of the improvement for the characteristics of each image.

4.3.5. Weights Update: Characteristic Analysis-Based Learning Parameter Adjustment

In this session, we propose a method for updating weight values after quality assessment when image improvement tasks are repeated. This process dynamically adjusts the importance of each element and is designed to adapt to various image characteristics.

α_{I} (t + 1) = α_{I} (t) + η_{b} \times (\frac{I m p r o v e m e n t B r i g h t n e s s}{T_{b}})

(18)

Equation (18) is the formula for adjusting the weight for brightness deviation. η_b represents the learning rate for brightness deviation, and T_b denotes the threshold for brightness deviation.

β_{I} (t + 1) = β_{I} (t) + η_{c} \times (\frac{I m p r o v e m e n t C o n t r a s t}{T_{c}})

(19)

Equation (19) is the formula for adjusting the weight for contrast. η_c represents the learning rate for contrast, and T_c denotes the threshold for contrast.

α_{R} (t + 1) = α_{R} (t) + η_{i} \times (\frac{I m p r o v e m e n t I m a g e S i z e}{T_{i}})

(20)

Equation (20) is the formula for adjusting the weight for image size. η_i represents the learning rate for image size, and T_i denotes the threshold for image size.

β_{R} (t + 1) = β_{R} (t) + η_{e} \times (\frac{I m p r o v e m e n t E d g e S t r e n g t h}{T_{e}})

(21)

Equation (21) is the formula for adjusting the weight for edge strength. η_e represents the learning rate for edge strength, and T_e denotes the threshold for edge strength.

γ_{R} (t + 1) = γ_{R} (t) + η_{t} \times (\frac{I m p r o v e m e n t T e x t u r e U n i f o r m i t y}{T_{t}})

(22)

Equation (22) is the formula for adjusting the weight for texture uniformity. η_t represents the learning rate for texture uniformity, and T_t denotes the threshold for texture uniformity.

α_{N} (t + 1) = α_{N} (t) + η_{n} \times (\frac{I m p r o v e m e n t B R I S Q U E S c o r e}{T_{n}})

(23)

Equation (23) is the formula for adjusting the weight for noise distribution (BRISQUE score). η_n represents the learning rate for the BRISQUE score, and T_n denotes the threshold for the BRISQUE score.

β_{N} (t + 1) = β_{N} (t) + η_{f} \times (\frac{I m p r o v e m e n t H i g h F r e q N o i s e}{T_{f}})

(24)

Equation (24) is the formula for adjusting the weight for the high-frequency noise ratio. η_f represents the learning rate for high-frequency noise, and T_f denotes the threshold for high-frequency noise.

Weight updates are carried out after the image enhancement process based on the quality assessment results. In this update process, weights and learning rates are adjusted within a specific range to prevent them from becoming too large or too small, allowing for appropriate updating of learning parameters at a proper pace. For example, if both brightness deviation and contrast improve after performing illumination enhancement, the weights α_I and β_I for these elements are adjusted to decrease their importance. In this case, the weights are adjusted by multiplying the existing values by the learning rate, and the learning rate is decreased by multiplying the current learning rate by 0.9.

If the quality indicators do not improve after the enhancement process, the weight for the corresponding element is increased. This is done by adding the product of the current weight and the learning rate to the existing weight. The learning rate is increased by multiplying the current learning rate by 1.5. In other words, if the improvement of a specific element is achieved properly, the weight for that element is reduced to decrease its importance in future enhancement tasks. Conversely, if the improvement is insufficient, the weight for that element is increased to raise its importance.

During this adjustment process, weights and learning parameters are restricted from exceeding their upper and lower bounds. For weights, the upper limit is set to 1.0 and the lower limit to 0.1 to prevent them from becoming excessively large or small. Similarly, the learning rate is adjusted within an upper limit of 0.05 and a lower limit of 0.001.

This weight adjustment process continues until there are no more factors requiring improvement or until the improvement process exceeds the specified number of iterations. The aim is to allow the weights of each element to gradually approach their optimal values through iterative improvement tasks, building a flexible improvement process that can adapt to various images. As a result, the weights of each element dynamically change according to the situation, and with repeated enhancement tasks, the weights are optimized to achieve better image quality. This dynamic weight adjustment and iterative improvement process play a crucial role in deriving optimal enhancement results tailored to various image conditions.

Algorithm 1 concisely and systematically summarizes the hierarchical image quality improvement process proposed in this paper and intuitively expresses the iterative improvement process, quality evaluation, and weight update steps described in this paper.

Algorithm 1: Hierarchical Image Quality Improvement Process

Input: Original Image

Output: Improved Image

1: Initialize weights and learning rates for illumination, resolution, and noise factors

2: Define threshold values for severity levels and improvement intensities

3: Repeat

//Step 1: Calculate Severity for Each Factor
4: Calculate illumination severity:

5: Brightness Deviation (compare with threshold)

6: Contrast (compare with threshold)

7: Calculate resolution severity:
8: Image Size (compare with threshold)
9: Edge Strength (compare with threshold)
10: Texture Uniformity (compare with threshold)
11: Calculate noise severity:
12: BRISQUE Score (compare with threshold)
13: High-Frequency Noise Ratio (compare with threshold)
//Step 2: Sort Factors by Severity and Determine Improvement Order
14: Sort illumination, resolution, and noise factors based on calculated severity in
descending order
15: Set improvement intensity based on severity levels
//Step 3: Apply Improvement Functions Based on Severity
16: for each factor in order of severity do
17: if factor is Illumination then
18: Apply illumination improvement based on severity:
19: Intensity 1: Histogram Equalization
20: Intensity 2: Gamma Correction
21: Intensity 3: CLAHE

22: else if factor is Resolution then

23: Apply resolution improvement based on severity:

24: Intensity 1: Linear Interpolation

25: Intensity 2: Bicubic Interpolation

26: Intensity 3: B-Spline Interpolation
27: else if factor is Noise then
28: Apply noise reduction based on severity:
29: Intensity 1: Gaussian Blur
30: Intensity 2: Non-local Means Filter (low strength)
31: Intensity 3: Non-local Means Filter (high strength)
32: end if
33: end for
//Step 4: Perform Quality Assessment for Each Factor
34: Measure improvements in brightness deviation, contrast, edge strength,
texture uniformity, BRISQUE score, and high-frequency noise ratio

//Step 5: Adjust Weights and Learning Rates Based on Quality Assessment
35: Update weights for each factor based on improvement results:
36: Increase weight if improvement is insufficient
37: Decrease weight if improvement is effective
38: Adjust learning rates based on weight updates
39: Until convergence or maximum iterations reached
40: Return Improved Image

5. Experimental Results

In this section, the performance of the hierarchical improvement process is verified by applying it to various images and analyzing the results [54,55,56,57].

Figure 4 shows the result images after 10 iterations of the hierarchical improvement process.

(A) Tulips Image: For this image, the illumination was evenly distributed, so the illumination severity converged to 0, indicating that no lighting improvement was needed. For the resolution, however, the severity was high, leading to the application of the second intensity improvement (bicubic interpolation) once, followed by the third intensity improvement (B-spline interpolation) once, resulting in an increase in resolution. Initially, the noise severity was 0, so no improvement was applied, but due to the resolution enhancement process increasing the image size, the noise severity increased. After applying the first intensity improvement (Gaussian blur) once, the severity converged to 0.

(B) Rope Image: Similar to the Tulips image, the illumination severity converged to 0, as no lighting improvement was necessary. In terms of resolution, the initial severity was higher than the noise severity, so the third intensity improvement (B-spline interpolation) was applied first, leading to a reduction in severity and eventual convergence to 0. Noise severity increased significantly during the resolution enhancement process, but after applying the first intensity improvement (Gaussian blur) twice, the second intensity improvement (Non-local Means) twice, and the third intensity improvement (Stronger Non-local Means) once, the severity decreased and finally converged to 0.

(C) Apple Image: The initial noise level was low, so noise improvement was not deemed necessary. However, the illumination and resolution improvements caused an increase in noise severity. After applying the first intensity improvement (Gaussian blur) once, the noise severity decreased and converged to 0. Unlike images (A) and (B), image (C) shows an overall increase in brightness due to the illumination improvement compared to the original.

(D) Bike and (E) Car Images: These images are from low-light environments. For the Bike image, the severity was high for resolution, illumination, and noise, leading to the application of hierarchical improvements for all three factors. The severities gradually decreased and converged to 0. The improved image demonstrates enhanced visibility of the Bike object. The Car image was taken at night, making it difficult to recognize the object with the naked eye, and its initial illumination severity was extremely high due to the very low light level. The third intensity improvement (CLAHE) was applied 10 times for the illumination factor, reducing the severity, but due to the extremely low initial lighting level, the severity remained relatively high after improvement. For resolution, the third intensity improvement (B-spline interpolation) was applied once initially. The noise severity increased during the lighting and resolution improvement processes, but after repeated noise improvements, it stabilized at a lower level compared to the initial severity. The final improved image shows a dramatic increase in visibility compared to the original, allowing the Car object to be recognized by the naked eye. However, some of the original red color of the object was lost during the enhancement process, indicating the need for additional correction measures, such as chromacity preservation, to address color distortion issues in low-light image improvements.

Table 1 provides the results of the individual factors after applying the hierarchical improvement process. It confirms that the proposed process generally improves the numerical values across each factor.

Figure 5 and Figure 6 show the edge density difference for resolution and noise factors between the original images and the various algorithm results. The clearer the edges in the difference images, the higher the edge density, indicating clearer boundaries. By examining the edge density values in Table 2, it is clear that the values obtained through the proposed process are higher compared to other algorithms.

Moreover, when comparing the values of PSNR (Peak Signal-to-Noise Ratio) and SSIM (Structure Similarity Index Map), which are key image quality metrics, the proposed hierarchical improvement process consistently ranks in the upper tier of performance [58,59].

Table 2. Resolution and noise factor improvement algorithm performance comparison table.

Algorithms	Figure 5			Figure 6
Algorithms	PSNR	SSIM	Edge Density (0.02302)	PSNR	SSIM	Edge Density (0.00692)
Linear Interpolation [60]	48.53	0.9987	0.02188	49.69	0.9982	0.00662
Bicubic Interpolation [31]	48.56	0.9981	0.02263	49.43	0.998	0.00676
B-spline Interpolation [61]	46.49	0.9975	0.02274	47.96	0.9964	0.00684
Gaussian Blur (5 × 5) [62]	43.87	0.9952	0.01811	49.57	0.9974	0.0058
NLM [37]	38.08	0.966	0.0209	38.65	0.997	0.00687
S-NLM	35.48	0.9353	0.0197	35.84	0.9502	0.00716
WNNM [63]	48.42	0.9993	0.02305	48.39	0.9991	0.00682
BM3D [38]	35.81	0.9449	0.02136	37.39	0.9711	0.0074
Ours	46.38	0.9957	0.03592	43.21	0.9922	0.00698

Figure 7, Figure 8 and Figure 9 illustrate the improvement results for illumination factors using various algorithms. CLAHE, based on the default cliplimit parameter value of 40, significantly increased brightness but introduced considerable noise and artifacts during the enhancement process. Similarly, the Retinex-based MSRCR (Multi-Scale Retinex with Color Restoration) and LIME algorithms improved brightness but generated noticeable noise and artifacts throughout the image. However, Jeon and Eom, GCP-MCS and our method can be seen to produce clean images with less noise.

When reviewing the performance of the lighting enhancement algorithms in Table 3, the proposed process, along with the traditional and lated algorithms, ranks among the top-performing methods.

Table 3. Illumination factor improvement algorithm performance comparison table.

Algorithms	Figure 7		Figure 8		Figure 9
Algorithms	PSNR	SSIM	PSNR	SSIM	PSNR	SSIM
CLAHE [25]	27.42	0.4867	28	0.6112	28.01	0.2957
MSRCR [64]	27.73	0.4087	27.82	0.5606	27.73	0.3887
LIME [65]	28.03	0.5729	27.83	0.5798	28.13	0.5741
Jeon and Eom [66]	28.58	0.5349	27.9	0.5475	27.97	0.5423
GCP-MCS [67]	27.71	0.5458	28.08	0.5886	29.1	0.5958
Ours	28.11	0.5475	28.06	0.6159	28.01	0.5612

The hierarchical improvement process proposed in this paper is ultimately designed to be applied to object detection tasks, with the aim of enhancing detection performance. To measure the improvement in object detection performance, the YOLOv8 “Medium” model [68] and the RT-DETR model [49] pretrained on the COCO (Common Objects in Context) dataset [69] were used.

Figure 10, Figure 11, Figure 12 and Figure 13 show images comparing the object detection task performance. Figure 10 depicts a high-illumination image. After applying the hierarchical improvement process, the detection accuracy of the YOLOv8 model for the ‘Bottle’ object improved by 0.12, 0.02, and 0.07, respectively, while the prediction accuracy of the RT-DETR model improved by 0.01, −0.01, and 0.02, respectively.

Figure 11, Figure 12 and Figure 13 depict images in low-light conditions, where the objects are difficult to distinguish with the naked eye. After applying the hierarchical improvement process, the detection accuracy of the YOLOv8 model for the ‘Bicycle’ object in Figure 11 improved by 0.04, while the detection accuracy of the RT-DETR model improved by 0.01. Additionally, an additional ‘Bicycle’ object was detected for both models with improved low-light conditions. In Figure 12, for the ‘Car’ object, the detection accuracy of the YOLOv8 model was improved by 0.02 and 0.03, and the detection accuracy of the RT-DETR model was improved by 0.02 and 0.06. In Figure 13, we compare the detection and segmentation performance for the ‘Motorcycle’ object. After applying the hierarchical improvement process, the detection accuracy of the YOLOv8 model for the ‘Motorcycle’ object in Figure 13 was improved by 0.06, and the RT-DETR model detection accuracy was improved by 0.05.

Finally, we compared the segmentation performance for the ‘Motorcycle’ object through additional experiments, as shown in Figure 14. After applying the hierarchical improvement process, the segmentation of the object The area is much larger than the original image. For a segmentation performance comparison, the deep-learning-based feature-matching algorithm LoFTR (Detector-Free Local Feature Matching with Transformers) [70] was used to extract the matching accuracy. The feature-matching accuracy for the segmentation area in the original image was 87.07%, while it improved to 96.52% in the enhanced image, a 9.45% increase.

In conclusion, the hierarchical improvement process proposed in this paper enables quality enhancements for illumination, resolution, and noise factors. And the enhanced images contribute to better object detection performance. However, the results also show that in some cases, such as shadow areas near object boundaries, local improvements were insufficient, potentially causing false detections or segmentation errors in object related tasks. Therefore, additional improvement processes targeting such areas may be necessary.

6. Conclusions

In this paper, we proposed a hierarchical improvement process targeting key image quality factors—illumination, resolution, and noise—to enhance object detection performance. The experimental results demonstrated that the proposed hierarchical process improved image quality metrics, such as PSNR and SSIM, across various images, and also yielded tangible improvements in object detection performance.

In experiments using the YOLOv8 model, detection rates for the ‘Bottle’ object increased by an average of 7% in high-light environments, while in low-light environments, detection rates for ‘Bicycle’ and ‘Car’ objects improved by 4% and an average of 2.5%, respectively. And the detection rate of the ‘Motorcycle’ object was improved by 6% in low-light environments.

In addition, in the experiments using the RT-DETR model, the detection rate of the ‘Bottle’ object increased by an average of 0.67% in a high-light environment, and the detection rates of the ‘Bicycle’ and ‘Car’ objects improved by an average of 1% and 4% in a low-light environment, respectively. And the detection rate of the ‘Motorcycle’ object improved by 5% in a low-light environment.

Additionally, segmentation performance for the ‘Motorcycle’ object showed a 9.45% improvement in matching accuracy in enhanced images, confirming that hierarchical image enhancement can contribute to better object recognition.

While the proposed process effectively improves environmental factors such as illumination imbalance, low resolution, and noise, limitations remain, such as false detections in shadowed areas and some degree of color distortion. These limitations are particularly pronounced in complex backgrounds and low-light conditions, where they can reduce object detection accuracy. To address these issues, future research should incorporate additional processes for shadow recognition and color correction, as well as explore methods to maintain high detection performance in complex backgrounds.

Moving forward, we plan to explore real-time optimization of the proposed hierarchical improvement process and validate its performance across diverse object detection models and datasets. Specifically, we aim to enhance processing speed for real-time applications, such as autonomous driving and surveillance systems, and to systematize responses to environmental factors. Through these efforts, we anticipate that this approach will further improve object detection performance across a wide range of real-world applications.

Author Contributions

Conceptualization, methodology, software, writing—original draft preparation, data curation, formal analysis, T.-s.W.; validation, investigation, visualization, T.-s.W. and G.-T.K.; writing—review and editing, J.S.; resources, supervision, project administration, funding acquisition, S.-W.J. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Innovative Human Resource Development for Local Intellectualization program through the Institute of Information and Communications Technology Planning and Evaluation (IITP) grant funded by the Korean government (MSIT) (IITP-2024-RS-2020-II201791).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The image data used to measure the performance of the proposed improved process are available in the link below. AI-Hub Dataset (marine sediment debris, low-light environment images). Available online: https://aihub.or.kr/ (accessed on 14 June 2023); RealSR dataset. Available online: https://github.com/csjcai/RealSR (accessed on 17 June 2019); Low-Light dataset (LOL). Available online: https://daooshee.github.io/BMVC2018website/ (accessed on 5 September 2018); AIM 2020 Real Image Super-Resolution Challenge Dataset. Available online: https://competitions.codalab.org/competitions/24680 (accessed on 28 August 2020).

Conflicts of Interest

The authors declare no conflicts of interest.

References

Zou, Z.; Chen, K.; Shi, Z.; Guo, Y.; Ye, J. Object detection in 20 years: A survey. Proc. IEEE 2023, 111, 257–276. [Google Scholar] [CrossRef]
Feng, D.; Haase-Schütz, C.; Rosenbaum, L.; Hertlein, H.; Glaeser, C.; Timm, F.; Wiesbeck, W.; Dietmayer, K. Deep multi-modal object detection and semantic segmentation for autonomous driving: Datasets, methods, and challenges. IEEE Trans. Intell. Transp. Syst. 2021, 22, 1341–1360. [Google Scholar] [CrossRef]
Zhu, W.; Zhou, J.; Wang, Z.; Zhou, X.; Zhou, F.; Sun, J.; Song, M.; Zhou, Z. Three-Dimensional Object Detection Network Based on Multi-Layer and Multi-Modal Fusion. Electronics 2024, 13, 3512. [Google Scholar] [CrossRef]
Sun, P.; Qi, X.; Zhong, R. A Roadside Precision Monocular Measurement Technology for Vehicle-to-Everything (V2X). Sensors 2024, 24, 5730. [Google Scholar] [CrossRef]
Monteiro, G.; Camelo, L.; Aquino, G.; Fernandes, R.d.A.; Gomes, R.; Printes, A.; Torné, I.; Silva, H.; Oliveira, J.; Figueiredo, C. A Comprehensive Framework for Industrial Sticker Information Recognition Using Advanced OCR and Object Detection Techniques. Appl. Sci. 2023, 13, 7320. [Google Scholar] [CrossRef]
Jha, S.; Seo, C.; Yang, E.; Joshi, G.P. Real-time object detection and tracking system for video surveillance system. Multimed. Tools Appl. 2021, 80, 3981–3996. [Google Scholar] [CrossRef]
Shokri, D.; Larouche, C.; Homayouni, S. Proposing an Efficient Deep Learning Algorithm Based on Segment Anything Model for Detection and Tracking of Vehicles through Uncalibrated Urban Traffic Surveillance Cameras. Electronics 2024, 13, 2883. [Google Scholar] [CrossRef]
Ouardirhi, Z.; Mahmoudi, S.A.; Zbakh, M. Enhancing Object Detection in Smart Video Surveillance: A Survey of Occlusion-Handling Approaches. Electronics 2024, 13, 541. [Google Scholar] [CrossRef]
Khanam, R.; Hussain, M.; Hill, R.; Allen, P. A comprehensive review of convolutional neural networks for defect detection in industrial applications. IEEE Access 2024, 12, 94250–94295. [Google Scholar] [CrossRef]
Xia, K.; Saidy, C.; Kirkpatrick, M.; Anumbe, N.; Sheth, A.; Harik, R. Towards Semantic Integration of Machine Vision Systems to Aid Manufacturing Event Understanding. Sensors 2021, 21, 4276. [Google Scholar] [CrossRef]
He, C.; Li, K.; Xu, G.; Yan, J.; Tang, L.; Zhang, Y.; Wang, Y.; Li, X. HQG-Net: Unpaired medical image enhancement with high-quality guidance. IEEE Trans. Neural Netw. Learn. Syst. 2023, 1–15. [Google Scholar] [CrossRef] [PubMed]
Yu, X.; Li, H.; Yang, H. Two-stage image decomposition and color regulator for low-light image enhancement. Vis. Comput. 2023, 39, 4165–4175. [Google Scholar] [CrossRef]
Hao, Y.; Pei, H.; Lyu, Y.; Yuan, Z.; Rizzo, J.-R.; Wang, Y.; Fang, Y. Understanding the impact of image quality and distance of objects to object detection performance. In Proceedings of the 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Detroit, MI, USA, 1–5 October 2023; pp. 11436–11442. [Google Scholar] [CrossRef]
Tian, Z.; Qu, P.; Li, J.; Sun, Y.; Li, G.; Liang, Z.; Zhang, W. A Survey of Deep Learning-Based Low-Light Image Enhancement. Sensors 2023, 23, 7763. [Google Scholar] [CrossRef] [PubMed]
Feng, X.; Li, J.; Hua, Z.; Zhang, F. Low-light image enhancement based on multi-illumination estimation. Appl. Intell. 2021, 51, 5111–5131. [Google Scholar] [CrossRef]
Li, M.; Zhao, Y.; Gui, G.; Zhang, F.; Luo, B.; Yang, C.; Gui, W.; Chang, K.; Wang, H. Object detection on low-resolution images with two-stage enhancement. Knowl.-Based Syst. 2024, 299, 111985. [Google Scholar] [CrossRef]
Al Mudhafar, R.A.; El Abbadi, N.K. Comprehensive Approach for Image Noise Analysis: Detection, Classification, Estimation, and Denoising. In Proceedings of the International Conference on Micro-Electronics and Telecommunication Engineering (ICMETE 2023), Ghaziabad, India, 22–23 September 2023; pp. 601–616. [Google Scholar] [CrossRef]
More, S.; Singla, J. Machine Learning Approaches for Image Quality Improvement. In Proceedings of the Second International Conference on Image Processing and Capsule Networks (ICIPCN 2021), Bangkok, Thailand, 27–28 May 2021; pp. 44–55. [Google Scholar] [CrossRef]
Liang, Z.; Ruan, R.; Wang, C.; Zhuang, P. Single Image Quality Improvement via Joint Local Structure Dehazing and Local Texture Enhancement. IEEE Trans. Geosci. Remote Sens. 2024, 62, 4210117. [Google Scholar] [CrossRef]
Polap, D.; Jaszcz, A.; Srivastava, G. Dual-Encoding Y-ResNet for generating a lens flare effect in images. In Proceedings of the 2024 International Joint Conference on Neural Networks (IJCNN), Yokohama, Japan, 30 June–5 July 2024; pp. 1–7. [Google Scholar] [CrossRef]
Niu, S.; Li, B.; Wang, X.; Lin, H. Defect Image Sample Generation with GAN for Improving Defect Recognition. IEEE Trans. Autom. Sci. Eng. 2020, 17, 1611–1622. [Google Scholar] [CrossRef]
Trahanias, P.E.; Venetsanopoulos, A.N. Color image enhancement through 3-D histogram equalization. In Proceedings of the 11th IAPR International Conference on Pattern Recognition, Vol. III. Conference C: Image, Speech and Signal Analysis, The Hague, The Netherlands, 30 August–1 September 1992; pp. 545–548. [Google Scholar] [CrossRef]
Pizer, S.M.; Amburn, E.P.; Austin, J.D.; Cromartie, R.; Geselowitz, A.; Greer, T.; ter Haar Romeny, B.; Zimmerman, J.B.; Zuiderveld, K. Adaptive histogram equalization and its variations. Comput. Vis. Graph. Image Process. 1987, 39, 355–368. [Google Scholar] [CrossRef]
Kadhum, Z.A. Equalize the histogram equalization for Image enhancement. J. Kufa Math. Comput. 2012, 1, 14–21. [Google Scholar] [CrossRef]
Zuiderveld, K. Contrast limited adaptive histogram equalization. In Graphics Gems IV; Academic Press: Cambridge, MA, USA, 1994; pp. 474–485. [Google Scholar] [CrossRef]
Fan, M.; Wang, W.; Yang, W.; Liu, J. Integrating semantic segmentation and retinex model for low-light image enhancement. In Proceedings of the 28th ACM International Conference on Multimedia, Seattle, WA, USA, 12–16 October 2020; pp. 2317–2325. [Google Scholar] [CrossRef]
Ma, Q.; Wang, Y.; Zeng, T. Retinex-Based Variational Framework for Low-Light Image Enhancement and Denoising. IEEE Trans. Multimed. 2022, 25, 5580–5588. [Google Scholar] [CrossRef]
Nath, N.; Behzadan, A.H. Deep Generative Adversarial Network to Enhance Image Quality for Fast Object Detection in Construction Sites. In Proceedings of the 2020 Winter Simulation Conference (WSC), Orlando, FL, USA, 14–18 December 2020; pp. 2447–2459. [Google Scholar] [CrossRef]
Weligampola, H.; Jayatilaka, G.; Sritharan, S.; Godaliyadda, R.; Ekanayaka, P.; Ragel, R.; Herath, V. A Retinex based GAN Pipeline to Utilize Paired and Unpaired Datasets for Enhancing Low Light Images. In Proceedings of the 2020 Moratuwa Engineering Research Conference (MERCon), Moratuwa, Sri Lanka, 28–30 July 2020; pp. 224–229. [Google Scholar] [CrossRef]
Cai, Y.; Liu, X.; Li, H.; Lu, F.; Gu, X.; Qin, K. Research on Unsupervised Low-Light Railway Fastener Image Enhancement Method Based on Contrastive Learning GAN. Sensors 2024, 24, 3794. [Google Scholar] [CrossRef] [PubMed]
Keys, R. Cubic convolution interpolation for digital image processing. IEEE Trans. Acoust. Speech Signal Process. 1981, 29, 1153–1160. [Google Scholar] [CrossRef]
Duchon, C.E. Lanczos Filtering in One and Two Dimensions. J. Appl. Meteorol. Climatol. 1979, 18, 1016–1022. [Google Scholar] [CrossRef]
Mastyło, M. Bilinear interpolation theorems and applications. J. Funct. Anal. 2013, 265, 185–207. [Google Scholar] [CrossRef]
Dong, C.; Loy, C.C.; He, K.; Tang, X. Image Super-Resolution Using Deep Convolutional Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 38, 295–307. [Google Scholar] [CrossRef]
Lim, B.; Son, S.; Kim, H.; Nah, S.; Lee, K.M. Enhanced Deep Residual Networks for Single Image Super-Resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA, 21–26 July 2017; pp. 1132–1140. [Google Scholar] [CrossRef]
Abdulah, C.S.K.; Rohani, M.N.K.H.; Ismail, B.; Isa, M.A.M.; Rosmi, A.S.; Mustafa, W.A. Comparison of Image Restoration using Median, Wiener, and Gaussian Filtering Techniques based on Electrical Tree. In Proceedings of the 2021 IEEE Industrial Electronics and Applications Conference (IEACon), Penang, Malaysia, 22–23 November 2021; pp. 163–168. [Google Scholar] [CrossRef]
Buades, A.; Coll, B.; Morel, J.-M. A non-local algorithm for image denoising. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), San Diego, CA, USA, 20–25 June 2005; pp. 60–65. [Google Scholar] [CrossRef]
Dabov, K.; Foi, A.; Katkovnik, V.; Egiazarian, K. Image Denoising by Sparse 3-D Transform-Domain Collaborative Filtering. IEEE Trans. Image Process. 2007, 16, 2080–2095. [Google Scholar] [CrossRef]
Zhang, K.; Zuo, W.; Chen, Y.; Meng, D.; Zhang, L. Beyond a Gaussian Denoiser: Residual Learning of Deep CNN for Image Denoising. IEEE Trans. Image Process. 2017, 26, 3142–3155. [Google Scholar] [CrossRef]
Zhang, K.; Zuo, W.; Zhang, L. FFDNet: Toward a Fast and Flexible Solution for CNN-Based Image Denoising. IEEE Trans. Image Process. 2018, 27, 4608–4622. [Google Scholar] [CrossRef]
Li, J.; Hao, S.; Li, T.; Zhuo, L.; Zhang, J. RDMA: Low-light image enhancement based on retinex decomposition and multi-scale adjustment. Int. J. Mach. Learn. Cyber 2024, 15, 1693–1709. [Google Scholar] [CrossRef]
Gödrich, A.; König, D.; Eilertsen, G.; Teutsch, M. Joint tone mapping and denoising of thermal infrared images via multi-scale Retinex and multi-task learning. Infrared Technol. Appl. XLIX 2023, 12534, 275–291. [Google Scholar] [CrossRef]
Geetha, R.; Jebamalar, G.B.; Shiney, S.A.; Dao, N.N.; Moon, H.; Cho, S. Enhancing Upscaled Image Resolution Using Hybrid Generative Adversarial Network-Enabled Frameworks. IEEE Access 2024, 12, 27784–27793. [Google Scholar] [CrossRef]
Li, P.; Li, Z.; Pang, X.; Wang, H.; Lin, W.; Wu, W. Multi-scale residual denoising GAN model for producing super-resolution CTA images. J. Ambient Intell. Humaniz. Comput. 2022, 13, 1515–1524. [Google Scholar] [CrossRef]
Anderegg, J.; Zenkl, R.; Walter, A.; Hund, A.; McDonald, B.A. Combining High-Resolution Imaging, Deep Learning, and Dynamic Modeling to Separate Disease and Senescence in Wheat Canopies. Plant Phenomics 2023, 5, 0053. [Google Scholar] [CrossRef]
Wang, Y.; Guo, J.; Gao, H.; Yue, H. UIEC^2-Net: CNN-based underwater image enhancement using two color space. Signal Process. Image Commun. 2021, 96, 116250. [Google Scholar] [CrossRef]
Liu, P.; Zhang, C.; Qi, H.; Wang, G.; Zheng, H. Multi-Attention DenseNet: A Scattering Medium Imaging Optimization Framework for Visual Data Pre-Processing of Autonomous Driving Systems. IEEE Trans. Intell. Transp. Syst. 2022, 23, 25396–25407. [Google Scholar] [CrossRef]
Chen, B.; Zhu, L.; Kong, C.; Zhu, H.; Wang, S.; Zhu, L. No-Reference Image Quality Assessment by Hallucinating Pristine Features. IEEE Trans. Image Process. 2022, 31, 6139–6151. [Google Scholar] [CrossRef]
Zhao, Y.; Lv, W.; Xu, S.; Wei, J.; Wang, G.; Dang, Q. DETRs Beat YOLOs on Real-time Object Detection. In Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 16–22 June 2024; pp. 16965–16974. [Google Scholar] [CrossRef]
Kong, C.; Luo, A.; Wang, S.; Li, H.; Rocha, A.; Kot, A.C. Pixel-Inconsistency Modeling for Image Manipulation Localization. arXiv 2023, arXiv:2310.00234. [Google Scholar]
Zhou, Z.; Xu, Y.; Wan, X.; Quan, Y.; Xu, R.; Li, J.; Callet, P.L. No-Reference Image Quality Assessment Using Local Binary Patterns: A Comprehensive Performance Evaluation. In Proceedings of the 3rd Workshop on Quality of Experience in Visual Multimedia Applications (QoEVMA’24), Melbourne, Australia, 28 October–1 November 2024; pp. 2–11. [Google Scholar] [CrossRef]
Shim, J.; Lee, Y. No-Reference-Based and Noise Level Evaluations of Cinematic Rendering in Bone Computed Tomography. Bioengineering 2024, 11, 563. [Google Scholar] [CrossRef]
Zhang, K.; Long, M.; Chen, J.; Liu, M.; Li, J. CFPNet: A Denoising Network for Complex Frequency Band Signal Processing. IEEE Trans. Multimed. 2023, 25, 8212–8224. [Google Scholar] [CrossRef]
AI-Hub Dataset (Marine Sediment Debris, Low-Light Environment Images). Available online: https://aihub.or.kr/ (accessed on 14 June 2023).
Cai, J.; Zeng, H.; Yong, H.; Cao, Z.; Zhang, L. Toward real-world single image super-resolution: A new benchmark and a new model. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 3086–3095. [Google Scholar] [CrossRef]
Wei, C.; Wang, W.; Yang, W.; Liu, J. Deep retinex decomposition for low-light enhancement. arXiv 2018, arXiv:1808.04560. [Google Scholar]
Wei, P.; Lu, H.; Timofte, R.; Lin, L.; Zuo, W.; Pan, Z.; Debeir, O. AIM 2020 challenge on real image super-resolution: Methods and results. In Proceedings of the Computer Vision–ECCV 2020 Workshops, Glasgow, UK, 23–28 August 2020; pp. 392–422. [Google Scholar] [CrossRef]
Sara, U.; Akter, M.; Uddin, M.S. Image quality assessment through FSIM, SSIM, MSE and PSNR—A comparative study. J. Comput. Commun. 2019, 7, 8–18. [Google Scholar] [CrossRef]
Lone, M.R.; Sandhu, A.K. Enhancing image quality: A nearest neighbor median filter approach for impulse noise reduction. Multimed. Tools Appl. 2024, 83, 56865–56881. [Google Scholar] [CrossRef]
Cuevas, E.; Luque, A.; Escobar, H. Interpolation and Polynomials. Comput. Methods MATLAB^® 2023, 1, 77–101. [Google Scholar] [CrossRef]
Zhang, Y.; Wang, L.; Zhao, J.; Yao, W. Python-based cubic B-spline interpolation algorithm for pump characteristic curves. In Proceedings of the Third International Conference on Mechanical Design and Simulation (MDS 2023), Xi’an, China, 3–5 March 2023; pp. 395–400. [Google Scholar] [CrossRef]
Hummel, R.A.; Kimia, B.; Zucker, S.W. Deblurring gaussian blur. Comput. Vis. Graph. Image Process. 1987, 38, 66–80. [Google Scholar] [CrossRef]
Gu, S.; Zhang, L.; Zuo, W.; Feng, X. Weighted nuclear norm minimization with application to image denoising. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, USA, 23–28 June 2014; pp. 2862–2869. [Google Scholar] [CrossRef]
Petro, A.B.; Sbert, C.; Morel, J.M. Multiscale retinex. Image Process. Line 2014, 1, 71–88. [Google Scholar] [CrossRef]
Guo, X.; Li, Y.; Ling, H. LIME: Low-Light Image Enhancement via Illumination Map Estimation. IEEE Trans. Image Process. 2017, 26, 982–993. [Google Scholar] [CrossRef]
Jeon, J.J.; Eom, I.K. Low-light image enhancement using inverted image normalized by atmospheric light. Signal Process. 2022, 196, 108523. [Google Scholar] [CrossRef]
Jeon, J.J.; Park, J.Y.; Eom, I.K. Low-light image enhancement using gamma correction prior in mixed color spaces. Pattern Recognit. 2024, 146, 110001. [Google Scholar] [CrossRef]
Ultralytics YOLO (Version 8.0.0). Available online: https://github.com/ultralytics/ultralytics (accessed on 5 September 2023).
Lin, T.-Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft Coco: Common Objects in Context. In European Conference on Computer Vision; Springer: Berlin/Heidelberg, Germany, 2014; pp. 740–755. [Google Scholar] [CrossRef]
Sun, J.; Shen, Z.; Wang, Y.; Bao, H.; Zhou, X. LoFTR: Detector-Free Local Feature Matching with Transformers. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 8918–8927. [Google Scholar] [CrossRef]

Figure 1. Environmental factor clustering.

Figure 2. Hierarchical improvement process diagram.

Figure 3. Flowchart of the hierarchical improvement process.

Figure 4. Image improvement performance results (10 iterations of improvement): (A) Tulips image; (B) Rope image; (C) Apple image; (D) Bike image; (E) Car image.

Figure 5. Image 1 of edge density difference from original by resolution and noise factor algorithm.

Figure 6. Image 2 of edge density difference from original by resolution and noise factor algorithm.

Figure 7. Image 1 of improvement results by illumination factor algorithm.

Figure 8. Image 2 of improvement results by illumination factor algorithm.

Figure 9. Image 3 of improvement results by illumination factor algorithm.

Figure 10. Object detection task performance comparison image 1.

Figure 11. Object detection task performance comparison image 2.

Figure 12. Object detection task performance comparison image 3.

Figure 13. Object detection task performance comparison image 4.

Figure 14. Object segmentation task performance comparison image 5.

Table 1. Element-by-element results table according to image improvement.

Image	Brightness Deviation		Contrast		Image Size		Edge Strength		Texture Uniformity		BRISQUE Score		High Freq Noise Ratio
Image	First	Last	First	Last	First	Last	First	Last	First	Last	First	Last	First	Last
(A) Tulips	23.82	24.25	60.69	61.19	393,216	4,026,204	12.12	5.83	4.82	4.85	13.22	45.44	1.303 × 10⁻⁴	0.896 × 10⁻⁴
(B) Rope	1.94	1.06	21.59	21.66	2,073,600	8,294,400	1.51	7.57	5.91	5.96	51.68	47.98	4.553 × 10⁻⁴	1.795 × 10⁻⁴
(C) Apple	67.06	47.74	40.66	49.82	409,600	4,194,304	12.69	6.38	4.71	4.64	21.81	45.6	1.293 × 10⁻⁴	0.745 × 10⁻⁴
(D) Bike	87	41.08	62.72	65.72	2,073,600	8,294,400	3.96	84.65	6.4	6.03	63.62	110.26	1.077 × 10⁻⁴	0.129 × 10⁻⁴
(E) Car	126.07	97.82	8.77	30.95	2,073,600	8,294,400	0.87	2.46	7.26	7.36	89.16	82.37	3.578 × 10⁻⁴	0.911 × 10⁻⁴

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Wang, T.-s.; Kim, G.-T.; Shin, J.; Jang, S.-W. Hierarchical Image Quality Improvement Based on Illumination, Resolution, and Noise Factors for Improving Object Detection. Electronics 2024, 13, 4438. https://doi.org/10.3390/electronics13224438

AMA Style

Wang T-s, Kim G-T, Shin J, Jang S-W. Hierarchical Image Quality Improvement Based on Illumination, Resolution, and Noise Factors for Improving Object Detection. Electronics. 2024; 13(22):4438. https://doi.org/10.3390/electronics13224438

Chicago/Turabian Style

Wang, Tae-su, Gi-Tae Kim, Jungpil Shin, and Si-Woong Jang. 2024. "Hierarchical Image Quality Improvement Based on Illumination, Resolution, and Noise Factors for Improving Object Detection" Electronics 13, no. 22: 4438. https://doi.org/10.3390/electronics13224438

APA Style

Wang, T. -s., Kim, G. -T., Shin, J., & Jang, S. -W. (2024). Hierarchical Image Quality Improvement Based on Illumination, Resolution, and Noise Factors for Improving Object Detection. Electronics, 13(22), 4438. https://doi.org/10.3390/electronics13224438

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Hierarchical Image Quality Improvement Based on Illumination, Resolution, and Noise Factors for Improving Object Detection

Abstract

1. Introduction

2. Literature Review

2.1. Illumination Correction for Object Detection

2.2. Resolution Enhancement for Object Detection

2.3. Noise Reduction Techniques

2.4. Multi-Factor Image Enhancement

2.5. Recent Advances in Image Quality Assessment and Object Detection

3. Environmental Factors and Clustering

3.1. Environmental Factors Affecting Image Quality

3.2. Selection of Environmental Factors (Clustering)

Importance of Adopting the Image Quality Factors Cluster

4. Hierarchical Improvement Process Design for Environmental Factors

4.1. Criteria for Improvement Necessity of Each Factor

4.1.1. Criteria for Illumination Improvement Determination

4.1.2. Criteria for Resolution Improvement Determination

4.1.3. Criteria for Noise Improvement Determination

4.2. Environmental Factor Discriminators: Severity Calculation

4.2.1. Illumination Severity

4.2.2. Resolution Severity

4.2.3. Noise Severity

4.3. Image Improvement Strategy

4.3.1. Setting Thresholds for Each Element

4.3.2. Initial Weight Setting

4.3.3. Setting Improvement Intensity and Image Enhancement

4.3.4. Image Quality Assessment

4.3.5. Weights Update: Characteristic Analysis-Based Learning Parameter Adjustment

5. Experimental Results

6. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI