1. Introduction
Although shipping losses have fallen by 50% over the past decade, fires onboard vessels remain one of the most serious safety concerns in the maritime industry, and addressing them requires both comprehensive safety measures and new technical solutions. A recent report by the international insurer Allianz documents 200 fire incidents on large vessels in 2022 alone, the highest annual total in a decade; 43 of these occurred on cargo or container ships. Computer vision algorithms for fire and smoke detection, monitoring, and response offer a comprehensive approach to improving ship safety. Such algorithms can be trained to recognize the visual patterns associated with smoke. By analyzing video streams from cameras installed on board, they can detect smoke rapidly and precisely in different sections of the vessel, including smoke produced by fuel combustion, lubricating oil leaks, and malfunctioning pipes and fittings [1]. The algorithms combine image processing and machine learning to analyze the visual data, which lets them recognize smoke under varying lighting and environmental conditions. This capability is crucial for detecting ship fires accurately, because it allows the system to alert the crew or trigger automated responses before a fire escalates. Integrating smoke detection algorithms therefore strengthens fire prevention measures overall.
Computer vision algorithms, as part of an alert system, play a critical role in providing timely notifications about potential fire incidents. These alerts can be sent to relevant personnel, both onboard and onshore, through various communication channels. The system can differentiate between normal activities and emergency situations, ensuring that alarms are triggered only in response to genuine threats. The rapid dissemination of alerts enables quick decision-making and response coordination, contributing to effective firefighting efforts. Computer vision algorithms can be integrated with the ship’s fire suppression systems. In the event of smoke or fire detection, the system can automatically activate fire extinguishing mechanisms, such as sprinklers or suppressant agents. This seamless integration ensures a swift and targeted response, minimizing the potential damage caused by fires. Such automation is crucial for situations where immediate human intervention might be challenging or delayed. The machine learning component of computer vision systems enables continuous improvement over time. As the algorithms process more data and encounter various scenarios, they can adapt and refine their capabilities. This self-learning aspect contributes to the system’s accuracy in detecting smoke patterns and anomalies, reducing false alarms and enhancing overall reliability. The potential for fire danger zones to manifest onboard ships is heightened due to various factors. The intricate machinery and systems within a vessel, coupled with the presence of combustible materials, create an environment susceptible to the initiation and rapid spread of fires. The engine room, being the nucleus of a ship’s power generation, is particularly prone to fire incidents due to the intricate network of components and the inherent combustibility of certain materials present. 
Additionally, electrical systems, machinery malfunction, and human error can act as catalysts for the emergence of fire danger zones, further emphasizing the need for robust detection and prevention mechanisms in the maritime setting. The engine room is the compartment that powers the vessel and keeps it operating. Owing to its intricate structure and the presence of flammable materials, 75% of all ship fires originate in the engine room, and nearly two-thirds of these fires occur in the main and auxiliary engines or in closely associated components such as turbochargers [2]. The detection of engine room fires is therefore of paramount importance. Swift and precise identification of fires in the engine room is crucial for limiting harm to people and property in maritime accidents. It can also contribute to the improvement of ship damage control systems and to advances in ship fire prevention and control technology.
An alternative to traditional fire alarm systems is AI-driven fire detection. Deep learning algorithms have recently been widely adopted for identifying fires in visual data, and current research substantiates the efficacy of computer vision and deep learning methods for fire detection [3,4,5,6,7]. Deep learning can automatically extract object features from images, which helps it acquire generalized information, and these methods show strong learning capacity and adaptability. Prominent among the common deep learning algorithms is YOLO [8,9]. Deep-learning-based target detection automates the extraction of intricate details and features from images, which overcomes the redundancy and interference associated with manual feature extraction in fire detection [10]. Traditional fire detection technology combines data from various indoor sensors and generates an alert when the value measured by a sensor exceeds a predefined threshold. Early sensor technology centered on the “point sensor”, which relies on particle activation related to essential fire characteristics, including heat, gas, flames, and smoke [11]. In recent years, the rapid advancement of computer vision and image processing, the continuous improvement of hardware computing capabilities, and the widespread adoption of video surveillance networks have shifted attention towards new fire detection technologies. Notably, video fire detection based on deep learning has emerged as a prominent research area, characterized by fast response times and high accuracy. This transition is supported by the increasing intelligence and automation of modern ships and the maturation of video surveillance systems, which together make monitoring combined with deep learning a viable way to detect fires in engine rooms.
Moreover, the successful application of video-based fire detection in settings ranging from indoor offices to outdoor environments such as forests lays a solid foundation for its adaptation to the maritime domain. In line with these advances, this paper proposes applying the YOLO algorithm to ship fire detection, with the aim of improving fire detection in engine rooms through the analysis of real-time video surveillance feeds. YOLO is a highly efficient real-time object detection method. It divides an image into a grid, and each grid cell is responsible for detecting the objects within its own area. What distinguishes YOLO is that it performs real-time inference while demanding comparatively modest computational resources.
The threat of onboard fires remains a significant concern in the maritime industry despite the reduction in overall shipping losses over the past decade. Engine rooms are vital to vessels and particularly susceptible to fire, which underlines the need for robust detection and prevention measures. Computer vision algorithms, particularly deep learning models such as YOLO, offer a transformative approach to fire detection and response on ships. The move from traditional sensor-based methods to computer vision marks a paradigm shift in maritime safety: YOLO and other deep learning models enable real-time, accurate detection of fire and smoke in complex environments such as engine rooms, as shown in Figure 1. This transition is aligned with the increasing intelligence and automation of modern ships and the maturity of video surveillance systems, which create a favorable environment for adopting these technologies.
The main contributions of this paper are as follows:
Introduction of YOLO Algorithm for Ship Fire Detection: This paper significantly contributes to the field by introducing the application of the YOLO algorithm for ship fire detection. The utilization of YOLO’s real-time object detection capabilities marks a pivotal step in enhancing the methodology employed for fire detection on ships.
Enhancing Effectiveness through Real-time Analysis: By leveraging YOLO’s capabilities, the research aims to elevate the overall effectiveness of fire detection systems. The emphasis on real-time analysis of video surveillance data signifies a critical advancement, allowing for swift and accurate detection of fire incidents as they unfold on ships.
Promise of YOLO-Based Algorithms in Maritime Safety: The application of YOLO-based algorithms holds immense promise for advancing safety measures on ships. This contribution introduces a novel and sophisticated approach to maritime fire detection, highlighting the potential for YOLO algorithms to redefine safety protocols within the maritime industry.
Utilization of Custom Datasets for Robust Algorithm Performance: The incorporation of custom datasets in the research is a strategic move to further contribute to the robustness and adaptability of the proposed YOLO algorithm in real-world scenarios.
Application of the Histogram Equalization Technique: Equalizing the image histogram brings out subtle details and features in 2D images of sea transports, which improves the detection of smoke and fire when the air in the oceanic environment carries a high level of water vapor.
In this study, we propose to enhance maritime safety by utilizing the YOLO algorithm for detecting fires on ships. By fine-tuning the real-time detection strengths of YOLO, following image equalization, this research seeks to significantly improve the efficiency of fire detection. The promise held by YOLO-based algorithms represents a significant leap forward in enhancing safety measures, paving the way for the future integration of advanced computer vision techniques in maritime security. Furthermore, the strategic use of custom datasets underscores the commitment to robustness and adaptability, offering valuable insights for ongoing improvements in maritime safety protocols. As the maritime industry embraces these technological advancements, this paper serves as a pivotal contribution to the evolution of safety standards, ensuring a safer and more resilient maritime environment.
2. Related Work
Over the past decade, fire detection technology has shifted notably with the emergence of deep learning techniques; the YOLO algorithms in particular have proved instrumental in addressing significant challenges in object detection. The development of the YOLO framework, from the initial YOLOv1 [12] to the latest YOLOv8, reflects key innovations that improve its detection performance, and these advances are closely tied to the shift in fire detection technology. Object detection and recognition algorithms rely primarily on deep neural networks (DNNs), and on convolutional neural networks (CNNs) in particular. These learnable networks comprise multiple layers, each with a distinct function such as area analysis, feature extraction, data identification, or anomaly detection, which together achieve precise object detection. Noteworthy work in this field includes the early fire warning mechanism proposed by Chen et al. [13], which uses video processing to detect fire and smoke pixels through chromaticity and disorder measurements in the RGB model. A typical ship’s fire detection system combines fire and smoke sensors, heat detectors, and gas detectors with an alarm panel [14]. These detectors provide both visible and audible alerts and indicate the precise location of a fire on the vessel. The network of detectors spanning the ship is linked to a central fire control panel, which not only issues visual and auditory alarms but can also trigger alarms in other sections of the vessel. The progression towards more advanced methods is exemplified by the work of Foggia et al. [15], who analyze surveillance camera videos by combining color, shape variation, and motion analysis through multiple expert systems. Furthermore, the research of Arthur K et al. [16] on video flame segmentation and recognition explores further techniques for fire detection. In a parallel development, Wu et al. [17] introduced a dynamic fire detection algorithm for surveillance videos that incorporates radiation domain feature models.
In fire detection through image segmentation, the fundamental task is to assign each pixel of an image to a category, separating the pixels of the fire region from those of the background. This is done with semantic segmentation networks, which are trained end-to-end to produce segmentation masks directly from the original image. Noteworthy instances of this approach include frameworks such as GAN [18]. Instance segmentation goes further: in addition to classifying each pixel into a category, it distinguishes the individual instances of that category. The original U-Net architecture was introduced for biomedical image segmentation, particularly for segmenting neuronal structures in electron microscopy images. Applied to fire, this paradigm lets the network classify fire-related elements at the level of individual pixels, so that fire regions can be identified precisely against the rest of the image. While U-Net provides a strong foundation, dedicated architectures exist for instance segmentation, such as Mask R-CNN [19], a Region-based Convolutional Neural Network that explicitly addresses the segmentation and separation of individual instances within a class. Guan et al. [20] proposed an instance segmentation approach, MaskSU R-CNN, designed for the early detection and segmentation of forest fires. Related research has addressed forest fire detection using UAV-captured video frames from the FLAME dataset, proposing solutions for binary image classification (fire vs. no fire) and fire instance segmentation. The semantic segmentation method for fire smoke that leverages global information and the U-Net network is designed to accurately delineate regions of fire smoke in images. Semantic segmentation classifies each pixel in an image into a category, here fire smoke or background, and integrating global information with the U-Net architecture improves the model’s ability to capture the contextual details and spatial relationships needed for effective segmentation [21]. Zheng et al. [22] introduced such an approach, which combines global contextual information with the U-Net architecture. Their algorithm uses Multi-Scale Residual Group Attention (MRGA) to exploit contextual understanding and spatial relationships in the image data at the same time. By combining MRGA with the U-Net framework, the model captures multi-scale smoke features and becomes better at detecting and segmenting small-scale smoke regions. The authors of [23] introduced an algorithm called “fire-YOLO”, which extends YOLOv4 with depthwise separable convolution. This extension reduces the computational cost of the model while enlarging the receptive field of the feature layer. A dilated (atrous) convolution method further improves the model’s efficiency.
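To make the cost argument concrete, the following minimal sketch compares the number of weights in a standard convolution with that of a depth-separable (depthwise separable) convolution, and shows how a cavity (dilated) convolution widens the area a kernel covers. The layer sizes are hypothetical and are not taken from fire-YOLO; bias terms are ignored.

```python
def standard_conv_params(kernel_size, in_channels, out_channels):
    """Weights in a standard 2D convolution (bias terms ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels


def depthwise_separable_params(kernel_size, in_channels, out_channels):
    """Weights in a depthwise convolution followed by a 1x1 pointwise convolution."""
    depthwise = kernel_size * kernel_size * in_channels  # one k x k filter per input channel
    pointwise = in_channels * out_channels               # 1 x 1 convolution mixes the channels
    return depthwise + pointwise


def effective_kernel_size(kernel_size, dilation):
    """Spatial extent along one axis covered by a dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


if __name__ == "__main__":
    k, c_in, c_out = 3, 64, 128  # hypothetical layer sizes, for illustration only
    print("standard conv weights:      ", standard_conv_params(k, c_in, c_out))
    print("depthwise separable weights:", depthwise_separable_params(k, c_in, c_out))
    print("3x3 kernel, dilation 2 covers:", effective_kernel_size(k, 2), "pixels per axis")
```

For this hypothetical layer the separable variant needs roughly one-eighth of the weights, while a dilation of 2 lets a 3 × 3 kernel span 5 pixels per axis without adding any weights.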
Areas near the ocean often have high humidity because of the large amount of water evaporating nearby, which makes object detection less accurate. Humidity can affect the sensors used in imaging systems and degrade the quality of the images that cameras capture. Contrast is therefore instrumental for visual processing and for understanding the information content of images in various environmental settings [24]. Chen et al. [25] introduced a histogram equalization (HE)-based approach, called quadrant dynamic histogram equalization (QDHE), for images captured by devices. The method was mainly applied to images captured in low-light environments; the QDHE algorithm enhances images without intensity saturation, noise amplification, or over-enhancement.
3. Proposed Methods, Model Architecture
Our primary goal is to detect fire on ships effectively by training a model that detects ships, smoke, and fire, with the main focus on ships that are on fire or that emit smoke without visible fire. We created a custom dataset of various sea transports with fire, without fire, and with smoke, and fine-tuned YOLOv8, a state-of-the-art single-shot detector, on it.
Real-time object detection is challenging because object sizes and aspect ratios vary, and because inference speed and image noise significantly affect detection. Objects in real-world scenarios often exhibit diverse aspect ratios, meaning they can appear elongated or compressed in various ways. In addition, high humidity in the marine environment can cause haze and reduce visibility, which results in images with reduced contrast and clarity and makes it difficult to distinguish objects such as sea transports. Object detection algorithms often rely on well-defined features and patterns. Therefore, we apply histogram equalization to enhance the ship images and avoid a narrow range of intensity values.
Traditional object detection models can struggle with such variability. Moreover, real-time object detection demands fast processing to keep pace with a continuous stream of input data, such as video frames. Slow inference causes latency, that is, a lag between the occurrence of an event and the model’s response. Given these obstacles, YOLO is a well-suited approach.
3.1. YOLO Architecture
YOLOv1 was introduced in 2016 as the first step towards real-time object detection in the YOLO family. Its network, which consists of 24 convolutional layers, is shown in Figure 2.
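YOLOv1 takes an input image of fixed size (e.g., 448 × 448 pixels). The input image is divided into an A × A grid, where each grid cell is responsible for predicting the objects whose centers fall within it. Each grid cell predicts B bounding boxes and a confidence score for each box. The final output is a tensor of dimensions (A, A, B × 5 + C), where B is the number of bounding boxes predicted per grid cell, 5 corresponds to the four bounding box coordinates plus the confidence score, and C is the number of classes. YOLOv1 was trained with stochastic gradient descent, using a loss function designed to penalize both localization and classification errors. The regularization coefficients $\lambda_{\text{coord}}$ and $\lambda_{\text{noobj}}$ regulate the magnitude of the different parts of the loss; in the original YOLOv1 formulation they were set to 5 and 0.5, respectively. The loss is given by the equation below.
$$
\begin{aligned}
\mathcal{L} ={}& \lambda_{\text{coord}} \sum_{i=0}^{A^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{obj}} \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right] \\
&+ \lambda_{\text{coord}} \sum_{i=0}^{A^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{obj}} \left[ \left(\sqrt{w_i} - \sqrt{\hat{w}_i}\right)^2 + \left(\sqrt{h_i} - \sqrt{\hat{h}_i}\right)^2 \right] \\
&+ \sum_{i=0}^{A^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{obj}} \left( C_i - \hat{C}_i \right)^2 \\
&+ \lambda_{\text{noobj}} \sum_{i=0}^{A^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{noobj}} \left( C_i - \hat{C}_i \right)^2 \\
&+ \sum_{i=0}^{A^2} \mathbb{1}_{i}^{\text{obj}} \sum_{c \in \text{classes}} \left( p_i(c) - \hat{p}_i(c) \right)^2
\end{aligned}
$$
where $\mathbb{1}_{i}^{\text{obj}}$ denotes that an object appears in cell $i$, $\mathbb{1}_{ij}^{\text{obj}}$ denotes that the $j$th bounding box in cell $i$ is responsible for the prediction, and $\mathbb{1}_{ij}^{\text{noobj}}$ is its complement. Here $(x_i, y_i, w_i, h_i)$ are the box center, width, and height, $C_i$ is the box confidence score (not to be confused with the number of classes C), $p_i(c)$ is the probability of class $c$, and hatted symbols are the corresponding predictions.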
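As a quick check of the output dimensions (A, A, B × 5 + C) and of the grid assignment described above, the following minimal sketch computes the tensor shape and the grid cell that contains an object's center. The values A = 7, B = 2, and C = 20 are those commonly quoted for YOLOv1 on PASCAL VOC and are used here only as an illustration; the function names are hypothetical.

```python
def yolo_v1_output_shape(grid_size, boxes_per_cell, num_classes):
    """Shape (A, A, B * 5 + C) of the YOLOv1 output tensor.

    Each box contributes 5 values: x, y, w, h and a confidence score.
    """
    return (grid_size, grid_size, boxes_per_cell * 5 + num_classes)


def responsible_cell(x_center, y_center, image_width, image_height, grid_size):
    """Row and column of the grid cell that contains the object's center."""
    col = min(int(x_center / image_width * grid_size), grid_size - 1)
    row = min(int(y_center / image_height * grid_size), grid_size - 1)
    return row, col


if __name__ == "__main__":
    # Illustrative values: A = 7, B = 2, C = 20 on a 448 x 448 input.
    print(yolo_v1_output_shape(7, 2, 20))
    print(responsible_cell(224, 100, 448, 448, 7))
```

The `min(...)` clamp only matters for a center lying exactly on the right or bottom image border, which would otherwise index one cell past the grid.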
3.2. The Model Structure of Yolov8 Network
YOLOv8 stands as the latest iteration in the YOLO object detection model series, retaining the foundational architecture of its predecessors while introducing a myriad of enhancements. In the context of ship fire detection, the importance of YOLOv8 in real-time applications becomes evident. Utilizing a custom collection of ship images depicting both fire and non-fire scenarios, YOLOv8 proves instrumental in swiftly and accurately identifying instances of ship fires. This capability is particularly crucial for maritime safety, where timely detection of ship fires can significantly contribute to effective emergency response and disaster mitigation.
Figure 3 shows the YOLOv8 architecture, which is built on the open-source PyTorch machine learning library. The backbone of the model applies 2D convolution (conv2d) and batch normalization (U) with the same parameters, followed by a leaky Rectified Linear Unit (ReLU) activation function, which handles negative inputs by allowing small non-zero gradients to propagate through the network. C denotes concatenation, and P (3, 4, 5) are the names of the detection models.
3.3. Histogram Equalization Technique Application for Detection Enhancement
Applying the histogram equalization (HE) technique to enhance ship fire images is a crucial part of our study. As mentioned, the marine environment is humid most of the time. Our objective is to improve precision in ship fire detection and to categorize images by the presence or absence of fire. Adverse weather conditions associated with high humidity, such as fog, mist, or heavy rainfall, can significantly degrade image quality and compromise the effectiveness of object detection. Moisture in the air can reduce visibility, distort images, and alter the appearance of surfaces, making it difficult for algorithms to identify and locate objects accurately. Therefore, we combined the trained ship fire detection model with HE. The proposed HE technique adjusts the brightness of ship images by distributing it evenly over the RGB channels. However, HE often produces unrealistic effects in photographs. To address this, we also apply an image upscaling technique that preserves image identity. The brightness distribution can be inspected through the cumulative distribution function (cdf), which is plotted as a line over the 0 to 255 intensity range of the color channels.
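$$
h(v) = \mathrm{round}\left( \frac{\mathrm{cdf}(v) - \mathrm{cdf}_{\min}}{(M \times N) - \mathrm{cdf}_{\min}} \times (L - 1) \right)
$$
where $h(v)$ represents the output value after applying HE to the input intensity $v$, $\mathrm{round}(\cdot)$ rounds the expression inside the parentheses to the nearest integer, $\mathrm{cdf}(v)$ is the cumulative distribution at $v$, and $\mathrm{cdf}_{\min}$ is a constant equal to the minimum value of the cumulative distribution. $M \times N$ are the dimensions of the input image, and $L$ is the number of intensity levels (256), because pixel intensity is generally expressed between 0 and 255.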
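The equalization mapping described above can be sketched in a few lines of plain Python. This is a minimal single-channel illustration under stated assumptions, not the implementation used in this study: the image is a small list of rows of integer intensities, the function name `equalize_histogram` is hypothetical, and a color image would need the mapping applied per channel.

```python
from collections import Counter


def equalize_histogram(image, levels=256):
    """Equalize a grayscale image given as a list of rows of ints in [0, levels - 1].

    Applies h(v) = round((cdf(v) - cdf_min) * (L - 1) / (M * N - cdf_min)).
    """
    pixels = [v for row in image for v in row]
    total = len(pixels)            # M x N
    counts = Counter(pixels)

    # Cumulative distribution over the intensities that actually occur.
    cdf = {}
    running = 0
    for v in sorted(counts):
        running += counts[v]
        cdf[v] = running
    cdf_min = min(cdf.values())    # cdf at the darkest occurring intensity

    if total == cdf_min:           # constant image: there is nothing to spread out
        return [row[:] for row in image]

    def h(v):
        return round((cdf[v] - cdf_min) * (levels - 1) / (total - cdf_min))

    return [[h(v) for v in row] for row in image]


if __name__ == "__main__":
    hazy = [[100, 101], [102, 103]]  # hypothetical low-contrast patch
    print(equalize_histogram(hazy))
```

In this toy patch the four intensities occupy only the range 100 to 103; after equalization they are spread over the full 0 to 255 range, which is the contrast gain that HE is meant to provide for hazy maritime images.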
3.4. Data Distribution
In maritime safety and emergency response, the swift and accurate detection of ship fires plays a pivotal role in mitigating potential disasters. Leveraging advancements in computer vision and deep learning, this methodology outlines a comprehensive approach to training a model specifically designed for the detection of ship fires. The process involves the meticulous collection of a diverse dataset.
To augment the dataset’s size and variability, video frames depicting ship fires are extracted, enriching the training material. The chosen model architecture, a variant of the widely used YOLO family, is tailored to facilitate real-time detection capabilities. Pre-trained on a large dataset, the model is fine-tuned to discern between two crucial classes: ships on fire and ships not on fire. The methodology encompasses critical steps, including dataset organization, data augmentation, model selection, and training parameter optimization, ensuring the development of a robust and reliable ship fire detection system.
This methodology not only emphasizes the technical intricacies of model training but also underscores the importance of continuous improvement. Regular updates, fine-tuning, and adaptation to evolving scenarios contribute to the model’s ongoing effectiveness in safeguarding maritime environments. Through a systematic and well-documented approach, this method serves as a valuable resource for those aiming to deploy advanced technologies for ship fire detection, ultimately enhancing safety measures within maritime operations.
Figure 4 depicts ship images from our dataset in different scenarios. (a) shows clearly visible burning ships, where the fire is plainly shown. (b) is an example of ships on fire where smoke is the dominant feature for classification. (c) is an example of the no-fire class. Overall, if a ship image contains smoke, we labeled it as the “fire” class. “Not-fire” class images are clear, without any smoke and with only evaporation visible.
Table 1 represents a comprehensive dataset we used to train our model, encompassing images representative of both fire and non-fire scenarios. The dataset comprised a total of 19,781 images, with meticulous attention given to balancing the representation of fire-related instances alongside non-fire instances, a critical consideration for ensuring model robustness and generalization.
Within the dataset, the category of “Fire” images encompassed a substantial count of 16,261 instances, indicative of the emphasis placed on capturing the diverse manifestations of fire-related occurrences. Correspondingly, “Non-Fire” images were meticulously curated to provide a complementary representation, totaling 5735 instances, thereby facilitating a comprehensive assessment of the model’s discriminative capabilities across varied scenarios.
For the purpose of model training and evaluation, the dataset was partitioned into distinct subsets, namely “Training Images” and “Validation Images”. The “Training Images” subset, consisting of both fire and non-fire instances, comprised 19,372 images, serving as the foundational corpus upon which the model’s learning process was predicated. In parallel, the “Validation Images” subset, comprising 6144 images, was employed to gauge the model’s performance and generalization ability on unseen data, thereby ensuring a rigorous evaluation framework.
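The split proportions implied by the training and validation counts quoted above can be checked with a few lines of arithmetic. This is a minimal sketch; the helper name `split_fractions` is hypothetical, and only the two subset sizes stated in the text are used.

```python
def split_fractions(train_count, val_count):
    """Return the total image count and the training/validation fractions."""
    total = train_count + val_count
    return total, train_count / total, val_count / total


if __name__ == "__main__":
    # Subset sizes as stated in the text.
    total, train_frac, val_frac = split_fractions(19_372, 6_144)
    print(f"total images: {total}")
    print(f"training share: {train_frac:.1%}, validation share: {val_frac:.1%}")
```

The two subsets together account for 25,516 images, which corresponds to a split of roughly three-quarters training to one-quarter validation.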
The dataset comprises over 25,000 high-resolution images, capturing a wide array of maritime environments. Each image is meticulously labeled to indicate the presence or absence of a fire on the ship. The inclusion of both positive and negative instances aims to challenge object detection models to discern subtle details amidst the complex maritime backdrop. A notable challenge in this dataset is the variability in ship sizes and orientations, and the dynamic nature of fire occurrences. The aspect ratio of ships, combined with the unpredictable nature of fires, necessitates robust algorithms capable of handling these variations for accurate detection. The dataset is partitioned into training and validation sets to facilitate the development and evaluation of our ship fire detection model. We employ state-of-the-art object detection architectures and fine-tune them on our dataset. The training process involves optimizing for both accuracy and inference speed, balancing the need for precision with the demand for real-time performance.
5. Conclusions
In conclusion, this research has addressed the escalating challenge posed by onboard fires on container ships within the maritime industry. The discernible surge in incidents over recent years necessitates a proactive approach to enhance maritime safety and mitigate the associated risks. Leveraging the YOLO object detection algorithm, our study successfully developed an efficient and reliable system to detect ship fires. The model, trained on a comprehensive dataset comprising over 25,000 ship images, achieved an impressive accuracy exceeding 99%, underscoring its robust performance.
The multifaceted nature of ship fires, stemming from diverse causes such as electrical faults and human error, emphasizes the significance of advanced detection systems. Beyond immediate safety concerns, the implications of ship fires extend to environmental impacts, including the release of pollutants and greenhouse gases, contributing to global warming. Recognizing the urgency of addressing these challenges, our research advocates for the integration of state-of-the-art fire detection technologies into maritime safety systems. By implementing advanced detection systems, not only can we safeguard human lives and protect valuable cargo, but we also contribute to minimizing the ecological footprint associated with maritime disasters. This paper highlights the critical importance of proactive measures in preventing and responding to ship fires effectively. To enhance the accuracy of our ship fire detection model, we combined the trained model with HE algorithms to preprocess ship images, because the presence of moisture in the air, such as fog, mist, or heavy rainfall, can significantly influence the quality of images. The application of HE is set to increase object detection accuracy. As the maritime industry faces evolving challenges, embracing cutting-edge technologies becomes imperative to ensure the resilience and sustainability of maritime operations. Our research serves as a foundation for further advancements in ship fire detection and underscores the pivotal role of technology in shaping the future of maritime safety.