Fusing XGBoost and SHAP Models for Maritime Accident Prediction and Causality Interpretability Analysis
Abstract
1. Introduction
2. Database
2.1. Data Sources
2.2. Statistical Analysis
3. Methodology
3.1. XGBoost Model
3.2. Interpretable Machine Learning Framework SHAP
4. Results and Analysis
4.1. Feature Variable Selection and Parameter Tuning
4.2. Feature Variable Selection and Parameter Tuning
4.3. Analysis of the Model Results
4.3.1. Analysis of the XGBoost Classification Prediction Results
- 1. Evaluation of the Prediction Accuracy Results
- 2. Analysis of Feature Importance
4.3.2. SHAP Interpretability Analysis
- 1. |SHAP| Value Analysis of the Features
- 2. SHAP Value Analysis of the Features
- (1) Ship Collision Accidents. Figure 10 shows a comprehensive evaluation of the influence of each factor on ship collision accidents. The larger the feature values of ‘poor lookout’, ‘misjudgment’ and ‘failure to use safe speed’ (red), the more positive the corresponding SHAP values, indicating that these factors increase the likelihood of a collision. These three factors are used below to analyze the specific causes of collision accidents.
- (i) On land, a driver’s field of view is relatively comprehensive; at sea, by contrast, there are many uncertain factors. Before an accident, the officer on duty may have failed to make full use of all available means suited to the prevailing environment and conditions, such as radar and AIS, to maintain uninterrupted, systematic observation. Such irregular practice leaves the captain unable to assess the surrounding waters in time, which leads to confusion and a high risk in encounters with other maritime traffic.
- (ii) At sea, the ability to improvise is invaluable to an experienced captain. It is human nature to panic when faced with unexpected events, and learning to avoid errors of judgment is essential to a captain’s development. Zhejiang’s navigation area, as the world’s busiest sea area, not only carries a large number of merchant ships but, given the province’s strong fishing industry, also tens of thousands of fishing boats. For economic reasons, many small unlicensed fishing boats take the risk of operating during the closed season, which creates a considerable collision risk for ships navigating normally.
- (iii) As with traffic on land, proceeding at a safe speed is an important factor in avoiding accidents at sea. As mentioned earlier, ship density in the Zhejiang navigation area is very high. At such density, even if crew members do their best, maritime accidents are sometimes unavoidable. Keeping the ship’s equipment and facilities in good condition also helps the captain navigate with confidence; otherwise, the captain may be left powerless in an emergency.
- (2) Allision Accidents. Figure 11 shows a comprehensive evaluation of the influence of the factors on ship allision accidents. Among the most important are ‘misjudgment’, ‘insufficient crew’ and ‘failure to use safe speed’. The greater the feature values of these factors (red), the more positive the corresponding SHAP values, indicating that these factors increase the likelihood of an allision accident. In the following, the three factors ‘misjudgment’, ‘insufficient crew’ and ‘failure to use safe speed’ are used to analyze the specific causes of allision accidents.
- (i) All the allision accidents in this dataset are the responsibility of a single ship and can almost be called unilateral accidents. In some of them, the captain did not arrange a safe watch during berthing, and ocean currents acting on the ship caused the cable to break and the ship to go out of control. In others, the captain did not check the actual height of the cargo and blindly steered the ship under a bridge. Still others involved temporary water traffic deregulation, ships lying out of the anchor’s range, routes not fixed as early as possible, missing indications on the chart, or captains who did not fully consider water traffic control or traffic flow density and blindly planned a route through congested areas. Given such misjudgments, these accidents are unsurprising.
- (ii) Poor visibility is also an important factor in allision accidents. Ships sailing at night are prone to impaired night vision, especially in complex sailing environments where background light from the shore interferes, which makes it very easy to strike a dock or bridge. If a ship encounters poor visibility and the captain fails to take measures appropriate to the conditions, such as sounding fog signals, proceeding at a safe speed and posting regular lookouts, the probability of collision and allision accidents increases significantly.
- (iii) The meteorological and hydrological conditions at the quay also affect allision accidents. In some cases, shore-based support was insufficient, yet captains were blindly confident that their ships could berth and ignored the lack of berthing space. Some large-tonnage ships attempted to berth at small quays. All of these situations occurred frequently in Zhejiang’s waters, and each of these risk factors may lead to maritime accidents.
- (3) Capsizing Accidents. Figure 12 shows a comprehensive evaluation of the effects of the factors on capsizing accidents. Among the factors, ‘unreasonable loading mode’, ‘improper operation’, ‘strong wind’ and ‘high tidal current’ were the most important. The higher the feature value (red), the higher the corresponding SHAP value, indicating that these factors increase the likelihood of capsizing accidents. In the following, the SHAP values of ‘unreasonable loading mode’, ‘improper operation’, ‘strong wind’ and ‘high tidal current’ are used to analyze the specific causes of capsizing accidents.
- (i) An unreasonable loading mode is arguably the primary factor in capsizing accidents: the more unreasonably a vessel’s cargo is loaded, the more likely the vessel is to capsize. This relates not only to the properties of the cargo itself but also to captains cutting corners to save time, effort and cost, including through illegal operations. For example, after loading had been completed on one ship, the bulky cargo rose above the ship’s cargo hold hatch, the hold was not closed with a hatch cover, and the cargo was instead covered with canvas that was tied down. No calculation of lashing force or stability was made for the ship’s loading condition. When the ship met strong wind or a fast current, the cargo was jolted and shifted, the ship lost its balance, water entered the cargo hold, and the ship listed. As the ship’s roll increased, more water entered the hold, and the ship sank rapidly in the strong wind and waves.
- (ii) In addition, the captain did not fully consider the ship’s maneuverability and its ability to withstand wind and waves, given the high speed of some ships and the harsh meteorological conditions in the waters where the accident occurred. Because the danger was underestimated, the ship ventured close to islands and reefs, the bow wave flooded the hull, and the ship then lost stability and capsized [40].
- (iii) Maritime operations are shaped by multiple interacting factors. For example, contrary to common perception, the SHAP values show that a rapid tide does not necessarily increase the risk of maritime accidents. This may be because crews are more alert and pay closer attention to navigational safety during a rapid tide. Focusing on a single feature cannot fully explain what drives the orderliness of maritime navigation, so the combined effects of multiple factors need further investigation.
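To make the reading of signed SHAP values and mean |SHAP| values concrete, the sketch below computes exact Shapley values for a toy risk score over three binary factors. Everything in it is hypothetical: the scoring function `toy_risk`, its weights, the all-zero baseline and the factor names are chosen for illustration and are not the fitted XGBoost model or the study's data. It applies the textbook Shapley formula by enumerating coalitions, which is practical only for a handful of features.

```python
from itertools import combinations
from math import factorial

FEATURES = ["poor_lookout", "misjudgment", "unsafe_speed"]


def toy_risk(x):
    """Hypothetical risk score with one interaction term (not the paper's model)."""
    return (0.10 + 0.30 * x["poor_lookout"] + 0.20 * x["misjudgment"]
            + 0.10 * x["unsafe_speed"]
            + 0.20 * x["poor_lookout"] * x["unsafe_speed"])


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(FEATURES)

    def value(coalition):
        # Features in the coalition take their observed value, the rest the baseline.
        mixed = {f: (x[f] if f in coalition else baseline[f]) for f in FEATURES}
        return model(mixed)

    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                # Weight of a coalition of size k: k! (n - k - 1)! / n!
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (value(set(subset) | {f}) - value(set(subset)))
        phi[f] = total
    return phi


baseline = {f: 0 for f in FEATURES}
samples = [
    {"poor_lookout": 1, "misjudgment": 0, "unsafe_speed": 1},
    {"poor_lookout": 0, "misjudgment": 1, "unsafe_speed": 0},
]
all_phi = [shapley_values(toy_risk, s, baseline) for s in samples]

# Signed values explain one sample; the mean absolute value ranks features globally.
mean_abs = {f: sum(abs(p[f]) for p in all_phi) / len(all_phi) for f in FEATURES}

for s, p in zip(samples, all_phi):
    # Additivity: attributions sum to the gap between the prediction and the baseline.
    assert abs(sum(p.values()) - (toy_risk(s) - toy_risk(baseline))) < 1e-9

print(all_phi[0])
print(mean_abs)
```

In the first sample, the interaction between the two active factors is split evenly between them, so a factor's attribution depends on which other factors are present. That is one reason a single feature viewed in isolation can be misleading.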
5. Conclusions
- (i) There are few maritime accident data samples. The data adopted in this study came only from Zhejiang Province, China, so the relationships between maritime accidents and their characteristics analyzed here apply to the maritime situation of Zhejiang Province. These relationships can be analyzed further by obtaining more data from ports in other coastal provinces. The possible causes of maritime accidents in different regions could then be compared, and more targeted preventive measures proposed.
- (ii) The summary of risk characteristics of maritime accidents in Table 1 is based on accident reports and work experience, and it lacks a specific methodology and theoretical support.
- (i) The SHAP interpretability analysis adopted in this study can visualize and quantify the risk characteristics of accidents, but it is a post hoc analysis. We are committed to developing a methodological model that provides real-time risk quantification during maritime operations.
- (ii) Relevant studies indicate that human factors are highly likely to be a leading cause of maritime accidents. In future research, we will draw on the theory of human factor reliability analysis to analyze human-factor accidents.
Author Contributions
Funding
Conflicts of Interest
References
- IMO. Statistics. Available online: https://www.imo.org/en/OurWork/IIIS/Pages/Statistics.aspx (accessed on 2 July 2022).
- INTERCARGO. Bulk Carrier Casualty Report 2012–2021. Available online: https://www.intercargo.org/bulk-carrier-casualty-report-2012-2021/.pdf (accessed on 2 July 2022).
- EMSA. Annual Overview of Marine Casualties and Incidents 2021; EMSA: Griffin, Germany, 2022.
- Chauvin, C.; Lardjane, S.; Morel, G.; Clostermann, J.-P.; Langard, B. Human and organisational factors in maritime accidents: Analysis of collisions at sea using the HFACS. Accid. Anal. Prev. 2013, 59, 26–37.
- Zhang, J.; Teixeira, Â.P.; Soares, C.G.; Yan, X. Quantitative assessment of collision risk influence factors in the Tianjin port. Saf. Sci. 2018, 110, 363–371.
- Heij, C.; Knapp, S. Effects of wind strength and wave height on ship incident risk: Regional trends and seasonality. Transp. Res. Part D Transp. Environ. 2015, 37, 29–39.
- Ståhlberg, K.; Goerlandt, F.; Ehlers, S.; Kujala, P. Impact scenario models for probabilistic risk-based design for ship-ship collision. Mar. Struct. 2013, 33, 238–264.
- Deng, J.; Liu, S.; Xie, C.; Liu, K. Risk Coupling Characteristics of Maritime Accidents in Chinese Inland and Coastal Waters Based on NK Model. J. Mar. Sci. Eng. 2021, 10, 4.
- Erol, S.; Demir, M.; Çetişli, B.; Eyüboğlu, E. Analysis of ship accidents in the Istanbul Strait using neuro-fuzzy and genetically optimised fuzzy classifiers. J. Navig. 2018, 71, 419–436.
- Xue, J.; Papadimitriou, E.; Reniers, G.; Wu, C.; Jiang, D.; van Gelder, P. A comprehensive statistical investigation framework for characteristics and causes analysis of ship accidents: A case study in the fluctuating backwater area of Three Gorges Reservoir region. Ocean Eng. 2021, 229, 108981.
- Faghih-Roohi, S.; Xie, M.; Ng, K.M. Accident risk assessment in marine transportation via Markov modelling and Markov Chain Monte Carlo simulation. Ocean Eng. 2014, 91, 363–370.
- Roberts, S.E.; Pettit, S.J.; Marlow, P.B. Casualties and loss of life in bulk carriers from 1980 to 2010. Mar. Policy 2013, 42, 223–235.
- Li, G.; Weng, J.; Hou, Z. Impact analysis of external factors on human errors using the ARBN method based on small-sample ship collision records. Ocean Eng. 2021, 236, 109533.
- Wu, B.; Zhao, C.; Yip, T.L.; Jiang, D. A novel emergency decision-making model for collision accidents in the Yangtze River. Ocean Eng. 2021, 223, 108622.
- Fan, L.; Zheng, L.; Luo, M. Effectiveness of port state control inspection using Bayesian network modelling. Marit. Policy Manag. 2022, 49, 261–278.
- Maceiras, C.; Pérez-Canosa, J.; Vergara, D.; Orosa, J. A Detailed Identification of Classificatory Variables in Ship Accidents: A Spanish Case Study. J. Mar. Sci. Eng. 2021, 9, 192.
- Jiang, M.; Lu, J.; Yang, Z.; Li, J. Risk analysis of maritime accidents along the main route of the Maritime Silk Road: A Bayesian network approach. Marit. Policy Manag. 2020, 47, 815–832.
- Wu, B.; Tang, Y.; Yan, X.; Soares, C.G. Bayesian network modelling for safety management of electric vehicles transported in RoPax ships. Reliab. Eng. Syst. Saf. 2021, 209, 107466.
- Wynants, L.; Van Calster, B.; Collins, G.S.; Riley, R.D.; Heinze, G.; Schuit, E.; Bonten, M.M.J.; Dahly, D.L.; Damen, J.A.; Debray, T.P.A.; et al. Prediction models for diagnosis and prognosis of covid-19: Systematic review and critical appraisal. BMJ 2020, 369, m1328.
- Pacoureau, N.; Rigby, C.L.; Kyne, P.M.; Sherley, R.B.; Winker, H.; Carlson, J.K.; Fordham, S.V.; Barreto, R.; Fernando, D.; Francis, M.P.; et al. Half a century of global decline in oceanic sharks and rays. Nature 2021, 589, 567–571.
- Kou, G.; Chao, X.; Peng, Y.; Alsaadi, F.E.; Herrera-Viedma, E. Machine learning methods for systemic risk analysis in financial sectors. Technol. Econ. Dev. Econ. 2019, 25, 716–742.
- Kumar, N.; Poonia, V.; Gupta, B.; Goyal, M.K. A novel framework for risk assessment and resilience of critical infrastructure towards climate change. Technol. Forecast. Soc. Change 2021, 165, 120532.
- Berk, R.; Heidari, H.; Jabbari, S.; Kearns, M.; Roth, A. Fairness in criminal justice risk assessments: The state of the art. Sociol. Methods Res. 2021, 50, 3–44.
- Zhang, W.; Wu, C.; Zhong, H.; Li, Y.; Wang, L. Prediction of undrained shear strength using extreme gradient boosting and random forest based on Bayesian optimization. Geosci. Front. 2021, 12, 469–477.
- Zhang, D.; Chen, H.-D.; Zulfiqar, H.; Yuan, S.-S.; Huang, Q.-L.; Zhang, Z.-Y.; Deng, K.-J. iBLP: An XGBoost-based predictor for identifying bioluminescent proteins. Comput. Math. Methods Med. 2021, 2021, 6664362.
- Zhao, D.; Wang, J.; Zhao, X.; Triantafilis, J. Clay content mapping and uncertainty estimation using weighted model averaging. Catena 2022, 209, 105791.
- Yuan, C.; Li, Y.; Huang, H.; Wang, S.; Sun, Z.; Wang, H. Application of explainable machine learning for real-time safety analysis toward a connected vehicle environment. Accid. Anal. Prev. 2022, 171, 106681.
- Qi, H.; Yao, Y.; Zhao, X.; Guo, J.; Zhang, Y.; Bi, C. Applying an interpretable machine learning framework to the traffic safety order analysis of expressway exits based on aggregate driving behavior data. Phys. A Stat. Mech. Its Appl. 2022, 597, 127277.
- Parsa, A.B.; Movahedi, A.; Taghipour, H.; Derrible, S.; Mohammadian, A.K. Toward safer highways, application of XGBoost and SHAP for real-time accident detection and feature analysis. Accid. Anal. Prev. 2020, 136, 105405.
- Zhejiang Provincial Bureau of Statistics. Statistical Bulletin of National Economic and Social Development of Zhejiang Province in 2021; Zhejiang Provincial Bureau of Statistics: Hangzhou, China, 2022.
- Zhejiang Maritime Safety Administration. Statistics of Ships of Zhejiang Maritime Safety Administration 2016–2021; Zhejiang Maritime Safety Administration: Hangzhou, China, 2022.
- Zhejiang Maritime Safety Administration. Analysis Report on Water Safety Situation of Zhejiang Maritime Safety Administration in 2021 and the Fourth Quarter; Zhejiang Maritime Safety Administration: Hangzhou, China, 2022.
- Onyshchenko, S.; Shibaev, O.; Melnyk, O. Assessment of potential negative impact of the system of factors on the ship’s operational condition during transportation of oversized and heavy cargoes. Trans. Marit. Sci. 2021, 10, 126–134.
- Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
- Fan, S.; Blanco-Davis, E.; Yang, Z.; Zhang, J.; Yan, X. Incorporation of human factors into maritime accident analysis using a data-driven Bayesian network. Reliab. Eng. Syst. Saf. 2020, 203, 107070.
- Qiao, W.; Liu, Y.; Ma, X.; Liu, Y. A methodology to evaluate human factors contributed to maritime accident by mapping fuzzy FT into ANN based on HFACS. Ocean Eng. 2020, 197, 106892.
- Ahn, S.I.; Kurt, R.E.; Akyuz, E. Application of a SPAR-H based framework to assess human reliability during emergency response drill for man overboard on ships. Ocean Eng. 2022, 251, 111089.
- Lv, P.; Zhen, R.; Shao, Z. A Novel Method for Navigational Risk Assessment in Wind Farm Waters Based on the Fuzzy Inference System. Math. Probl. Eng. 2021, 2021, 4588333.
- Szlapczynski, R. Evolutionary sets of safe ship trajectories: A new approach to collision avoidance. J. Navig. 2011, 64, 169–181.
- Zhejiang Provincial Bureau of Statistics. Water Safety Accident. Available online: https://www.zj.msa.gov.cn/ZJ/zwgk/gkml/xzqz/index.html (accessed on 4 July 2022).
General Feature Classification | Second-Level Feature Classification | Symbol | Number | Elements |
---|---|---|---|---|
Management characteristics | Company, ship | P1–P5 | 5 | Inadequate implementation of SMS, inadequate shore-based support, poor duty arrangement, personnel training and equipment maintenance |
Environmental characteristics | Weather, navigation environment | O1–O10 | 10 | Poor visibility, strong wind, high tidal current, high ship density, narrow waterways, fishing areas, natural environment (e.g., intermediate obstacles and channel curvature), no shipping marks set or damaged, poor channel maintenance and no chart mark |
Personnel characteristics | Competency, lack of lookout, operational errors, insufficient crew, fishing boats, failure to use safe speed, negligence of good sailing practice, lack of crew responsibility, hit and run | V1–V15 | 15 | Inadequate training and drills, poor psychological quality, poor physical quality, poor lookout, misjudgment, improper operation, insufficient crew, fishing boat crew grabbing the bow, fishing boat crew fatigue, failure to use safe speeds, negligence of good sailing practice, failure to equip or correct charts, crew in poor condition (drunk and/or fatigued), failure to stand by the engine and anchor, hitting and running |
Ship characteristics | Equipment, maintenance, goods, fishing boats | T1–T9 | 9 | Incomplete equipment, unreasonable design, equipment damage, loss of function, unseaworthiness, unreasonable loading mode, poor ship condition, fishing boat operation mode, fishing boat sailing without a license |
Pilotage characteristics | Inaccurate pilotage scheme, improper pilot operation, communication and cooperation between pilot and pilot, physical and mental state of pilot | R1–R4 | 4 | Unsuitable pilotage plan, improper pilot operation, communication and cooperation between pilot and pilot, physical and mental state of pilot |
Wharf characteristics | Improper berthing arrangements and over-standard berthing | S1–S4 | 4 | Lack of berthing space, poor navigation of surrounding ships, interference of frontier meteorological and hydrological conditions, and over-standard berthing |
Parameters | Set Value | Parameters | Set Value |
---|---|---|---|
max_depth | 5 | n_estimators | 100 |
min_child_weight | 1 | scale_pos_weight | 1 |
gamma | 0.1 | alpha | 0.1 |
subsample | 0.8 | lambda | 1 |
colsample_bytree | 0.8 | learning_rate | 0.1 |
booster (general parameter) | gbtree | num_class (task parameter) | 2 |
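The settings in the table above could be passed to XGBoost's scikit-learn style constructor roughly as follows. This is a sketch under assumptions: the source does not show its code, the keyword spellings follow the `xgboost` Python package (`reg_alpha` and `reg_lambda` for the table's alpha and lambda), and the helper name `build_classifier` is hypothetical. The class count of 2 is not passed explicitly because the wrapper infers it from the training labels.

```python
# Hyperparameters transcribed from the table above. Keys use the spellings
# accepted by xgboost's scikit-learn wrapper (an assumption about the API used;
# the source does not show its code).
XGB_PARAMS = {
    "booster": "gbtree",
    "max_depth": 5,
    "n_estimators": 100,
    "min_child_weight": 1,
    "scale_pos_weight": 1,
    "gamma": 0.1,
    "reg_alpha": 0.1,        # "alpha" in the table (L1 regularisation)
    "subsample": 0.8,
    "reg_lambda": 1,         # "lambda" in the table (L2 regularisation)
    "colsample_bytree": 0.8,
    "learning_rate": 0.1,
}


def build_classifier(params=None):
    """Hypothetical helper: return an XGBClassifier, or None if xgboost is not installed."""
    try:
        from xgboost import XGBClassifier
    except ImportError:
        return None
    # The number of classes (2 in the table) is inferred from the labels at fit time.
    return XGBClassifier(**(XGB_PARAMS if params is None else params))
```

With `xgboost` installed, `model = build_classifier()` followed by `model.fit(X_train, y_train)` would train with these settings, where `X_train` and `y_train` stand in for the encoded risk features and accident labels.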
Evaluation Indicators | XGBoost | RF | LR |
---|---|---|---|
Acc | 97.14 | 94.28 | 91.42 |
Pre | 97.14 | 94.28 | 91.42 |
Rec | 97.14 | 94.28 | 91.42 |
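The three indicators in the table above are identical within each model column. One setting in which that always happens is micro-averaging over single-label predictions, where precision and recall both reduce to accuracy. Whether the authors used micro-averaging is an assumption, not something the source states. The labels below are made up for illustration, and the function name `micro_metrics` is hypothetical.

```python
def micro_metrics(y_true, y_pred):
    """Return (accuracy, micro-precision, micro-recall) for single-label data."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    classes = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    # Pool true positives, false positives and false negatives over all classes.
    for c in classes:
        for t, p in zip(y_true, y_pred):
            if p == c and t == c:
                tp += 1      # predicted c, truly c
            elif p == c:
                fp += 1      # predicted c, truly another class
            elif t == c:
                fn += 1      # truly c, predicted another class
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return accuracy, precision, recall


# Illustrative labels only (not data from the study).
y_true = ["collision", "collision", "allision", "capsizing", "allision"]
y_pred = ["collision", "allision", "allision", "capsizing", "allision"]
acc, pre, rec = micro_metrics(y_true, y_pred)
print(acc, pre, rec)  # prints 0.8 0.8 0.8
```

Each wrong prediction counts once as a false positive (for the predicted class) and once as a false negative (for the true class), so the pooled precision and recall share the same denominator as accuracy. Per-class or macro-averaged figures would generally differ from one another.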
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhang, C.; Zou, X.; Lin, C. Fusing XGBoost and SHAP Models for Maritime Accident Prediction and Causality Interpretability Analysis. J. Mar. Sci. Eng. 2022, 10, 1154. https://doi.org/10.3390/jmse10081154