Ordinal Random Tree with Rank-Oriented Feature Selection (ORT-ROFS): A Novel Approach for the Prediction of Road Traffic Accident Severity
Abstract
1. Introduction
- Novel ordinal classification method: Ordinal random tree (ORT) is introduced as an innovative approach for traffic accident severity prediction for the first time in the literature, addressing the limitations of traditional nominal classification methods.
- Handling ordinal complexity with binary decomposition: ORT uses binary decomposition (BD) to transform the multi-class ordinal task into simpler two-class problems, making it easier to model the progression in accident severity levels and boosting classification accuracy.
- Selecting features based on class orderings: It incorporates rank-oriented feature selection (ROFS), a new technique that chooses features based on the ordered progression of accident severity levels. Such selection advances the model’s ability to differentiate between severity classes appropriately.
- Addressing class imbalance: Incident severity datasets often exhibit a class imbalance, where the number of fatal incidents is substantially lower than non-fatal ones. ORT-ROFS uses the synthetic minority over-sampling technique (SMOTE) to address this imbalance by augmenting the minority fatal classes, achieving more robust predictions.
- Providing explainability with a tree structure: ORT-ROFS builds a tree-based model that humans can readily interpret and explain while maintaining high predictive accuracy. Because the tree model reads like a flowchart, ORT-ROFS can be regarded as an explainable artificial intelligence (XAI) method.
- Enhanced prediction accuracy: Experimental results showed that ORT-ROFS achieved an average improvement of 10.81% over state-of-the-art methods. In addition, it demonstrated an average improvement of 4.58% in accuracy over traditional methods by considering orders among class labels both in the feature selection and prediction stages.
2. Related Works
3. Material and Methods
3.1. General Description of the Proposed Method
3.2. Formal Description of the Proposed Method
3.2.1. Class Imbalance Handling with SMOTE
3.2.2. Ordinal Classification
3.2.3. Binary Decomposition
3.2.4. Rank-Oriented Feature Selection with Pearson Correlation
3.2.5. Random Tree Classifier
3.3. ORT-ROFS Algorithm
Algorithm 1: Ordinal Random Tree with Rank-Oriented Feature Selection (ORT-ROFS)
Inputs:
  D: the ordinal dataset with n instances such that D = {(x1, y1), (x2, y2), …, (xn, yn)}
  Y: ordinal class labels y ∈ {c1, c2, …, ck} with an order c1 < c2 < … < ck
  DMinor: the minority class(es)
  T: new instances to be predicted
Outputs:
  predicted class labels for the inputs in T
Begin
  // Step 1: Synthetic Minority Oversampling Technique (SMOTE)
  foreach (xj, yj) in D
    counts[yj] += 1
  end foreach
  for i = 1 to k do
    if counts[ci] < threshold
      DMinor.Add(ci)
    end if
  end for
  foreach class in DMinor
    syntheticSamples = SMOTE(class)
    D.Add(syntheticSamples)
  end foreach
  // Step 2: Construction of Binary Datasets
  for i = 1 to k − 1 do
    foreach (xj, yj) in D
      if (yj <= ci)
        Di.Add(xj, 0)
      else
        Di.Add(xj, 1)
      end if
    end foreach
  end for
  // Step 3: Rank-Oriented Feature Selection (ROFS)
  for i = 1 to k − 1 do
    FS = FeatureAnalysis(Di)
    ROFS = ROFS ∪ FS
  end for
  D.Apply(ROFS)
  // Step 4: Construction of Models
  for i = 1 to k − 1 do
    Mi = RandomTree(Di)
    M* = M* ∪ Mi
  end for
  // Step 5: Classification of New Samples in T
  foreach x in T
    P(c1) = 1 − P(y > c1)
    for i = 2 to k − 1 do
      P(ci) = P(y > ci−1) − P(y > ci)
    end for
    P(ck) = P(y > ck−1)
    y = M*(x) = the class ci with MAX P(ci)
  end foreach
End Algorithm
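Steps 2 and 5 of Algorithm 1 can be illustrated with a short Python sketch. It assumes the class labels are encoded as integer ranks 1..k and that each binary model returns an estimate of P(y > ci); the function names are hypothetical, and clipping negative differences to zero is an added safeguard rather than something stated in the algorithm.

```python
from typing import List, Sequence


def binary_targets(labels: Sequence[int], num_classes: int) -> List[List[int]]:
    """Step 2: build k-1 binary targets, where target i is 1 if y > c_i else 0.

    Labels are assumed to be integer ranks 1..k (an encoding chosen for this sketch).
    """
    return [[1 if y > i else 0 for y in labels] for i in range(1, num_classes)]


def class_probabilities(p_greater: Sequence[float]) -> List[float]:
    """Step 5: combine the k-1 estimates P(y > c_i) into k class probabilities."""
    k = len(p_greater) + 1
    probs = [1.0 - p_greater[0]]                      # P(c_1)
    for i in range(1, k - 1):                         # P(c_2) .. P(c_{k-1})
        # Clip at zero (assumption) in case the binary models disagree.
        probs.append(max(0.0, p_greater[i - 1] - p_greater[i]))
    probs.append(p_greater[-1])                       # P(c_k)
    return probs


def predict_rank(p_greater: Sequence[float]) -> int:
    """Return the rank (1..k) with the highest combined probability."""
    probs = class_probabilities(p_greater)
    return probs.index(max(probs)) + 1


# Illustrative values only: three classes, so two binary models.
targets = binary_targets([1, 1, 2, 3], num_classes=3)
rank = predict_rank([0.7, 0.2])
```

With three severity levels, the first binary target separates the lowest class from the other two, and the second separates the highest class from the rest, so the ordering of the labels is encoded in the targets themselves.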
3.4. Dataset Description
4. Experimental Studies
- Accuracy: It measures the proportion of correctly classified instances among the total instances. It is a simple and widely used metric to evaluate the overall performance of a machine-learning model.
- Precision: It evaluates the proportion of true positive predictions among all positive predictions made by the model. It is particularly useful in imbalanced datasets.
- Recall: It is also known as sensitivity or true positive rate, and measures the proportion of actual positive instances correctly identified.
- F-measure: It is the harmonic mean of precision and recall metrics, providing a balance between the two. It is particularly useful when the dataset is imbalanced.
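The four metrics above can be computed directly from confusion counts. The following is a minimal Python sketch; the function names and the example counts are illustrative assumptions and are not taken from the experiments.

```python
from typing import List, Tuple


def accuracy(confusion: List[List[int]]) -> float:
    """Share of correctly classified instances: diagonal sum over total."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


def precision_recall_f(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall, and F-measure for one class from its counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f_measure


# Hypothetical 3-class confusion matrix (rows: actual, columns: predicted).
example = [[50, 5, 0],
           [10, 20, 5],
           [0, 5, 5]]
acc = accuracy(example)
p, r, f = precision_recall_f(tp=8, fp=2, fn=8)
```

In this made-up example the class has high precision but low recall, and the F-measure falls between the two, closer to the lower value, which is why it is informative on imbalanced data.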
- SMOTE was used to address the class imbalance inherent in the road traffic accidents dataset, particularly for the “serious injury” and “fatal injury” classes. SMOTE generates new samples from the nearest neighbors of each minority instance, and the key parameter “nearestNeighbors” determines how many neighbors are considered when creating synthetic examples. To determine the best value, we systematically tested a range of “nearestNeighbors” values (k = 5, 6, 7, 8, 9, 10). As shown in Table 3, k = 5 achieved the highest accuracy for the ORT-ROFS method, so this parameter was set to 5 in our experiments. The other key parameters of SMOTE are “ClassValue”, which specifies the target class for oversampling, and “Percentage”, which defines the percentage increase in instances for that class. Here, SMOTE was applied twice: first with “ClassValue” set to 2 and “Percentage” set to 400 to augment the “serious injury” class, and then with “ClassValue” set to 3 and “Percentage” set to 200 to augment the “fatal injury” class. These configurations were selected to balance the dataset while maintaining meaningful relationships between the classes.
- For the ordinal classification process, the “batchSize” was set to 100, which is the default value. We tested various values for the “batchSize” parameter and observed no significant changes in the results. The classifier was configured as “Random Tree” because of its ability to model nonlinear relationships in the data. Binary decomposition was employed to transform the ordinal problem into multiple binary classification problems, which allows the random tree classifier to handle ordinal relationships effectively.
- ROFS was employed to select the most relevant features while preserving the ordinal relationships in the dataset. In our method, the attribute evaluator was set to “CorrelationAttributeEval” to measure the correlation between each feature and the target class, and the search method was configured as “Ranker” to rank and select attributes based on their correlation scores. The evaluator calculates the Pearson correlation coefficient for numeric attributes, while nominal attributes are evaluated by treating each value as an indicator variable and computing an overall correlation via a weighted average. Under these settings, key features including Hour, Day_of_week, Age_band_of_driver, Types_of_junction, Light_conditions, Weather_conditions, Number_of_vehicles_involved, and Number_of_casualties were chosen as the most influential in predicting traffic accident severity. These selected features were then used for classification, leading to effective prediction results.
- The random tree classifier served as the base classifier for the proposed ORT-ROFS method because of its capability to model complex relationships. The “maxDepth” parameter, which determines the maximum depth of the tree, was tested for values ranging from 1 to 20, as illustrated in Figure 2. Setting “maxDepth” to “NaN” allows the tree to grow without a depth limit. The results showed that accuracy steadily increased with depth and reached 87.08% at a depth of 20. The unrestricted setting achieved an accuracy of 87.19%, slightly higher than the best accuracy observed within the tested range. This indicates that deeper trees generally lead to better accuracy and that letting the tree grow without restrictions yields the best result among the settings tested.
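The rank-oriented feature selection described in the list above can be sketched in Python as follows. This is an illustration under stated assumptions, not the configuration used in the experiments: features are taken to be numeric, labels are integer ranks 1..k, and the cutoff `top_n` and all function names are hypothetical (the source does not state how many ranked attributes are kept).

```python
import math
from typing import Dict, Sequence, Set


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0.0 or sd_y == 0.0:
        return 0.0
    return cov / (sd_x * sd_y)


def rank_oriented_selection(features: Dict[str, Sequence[float]],
                            labels: Sequence[int],
                            num_classes: int,
                            top_n: int) -> Set[str]:
    """Rank features against each binary target (y > c_i) and take the union."""
    selected: Set[str] = set()
    for i in range(1, num_classes):
        target = [1 if y > i else 0 for y in labels]
        ranked = sorted(features,
                        key=lambda name: abs(pearson(features[name], target)),
                        reverse=True)
        selected.update(ranked[:top_n])   # hypothetical cutoff
    return selected


# Toy data: "a" tracks y > 1, "b" tracks y > 2, "noise" tracks neither.
toy_labels = [1, 1, 2, 2, 3, 3]
toy_features = {
    "a": [0, 0, 1, 1, 1, 1],
    "b": [0, 0, 0, 0, 1, 1],
    "noise": [1, 0, 1, 0, 1, 0],
}
chosen = rank_oriented_selection(toy_features, toy_labels, num_classes=3, top_n=1)
```

On the toy data, each binary target picks a different top feature, so the union keeps both. Scoring against a single multi-class target could miss a feature that only matters at one severity threshold.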
- Random tree (RT) is the baseline method, showing an accuracy of 84.09%. This method does not apply any specific handling of ordinal relationships or feature selection, offering a solid but unspecialized approach to the dataset.
- Ordinal random tree (ORT), which handles ordinal data through binary decomposition of the target classes, achieved a slightly higher accuracy of 84.65%, compared with 84.09% for RT. By explicitly considering the ordered nature of the target variable, ORT demonstrates that using the class ordinality can enhance classification performance, even without feature selection.
- Ordinal random tree with feature selection (ORT-FS), which applies Pearson correlation for feature selection on the multi-class target, achieved an accuracy of 81.44%, the lowest of the four methods. Despite combining ordinal classification with feature selection, this approach may not capture the nuances of ordinal relationships as effectively as ORT and ORT-ROFS, resulting in lower performance.
- Ordinal random tree with rank-oriented feature selection (ORT-ROFS) achieved the best performance across all metrics, with an accuracy of 87.19%. This method, which applies the ROFS technique on binary class targets derived from the dataset, yielded the highest precision (87.20%), recall (87.19%), and F-measure (87.16%). It showed a superior ability to handle the ordinal nature of the data while also selecting the most relevant features, leading to a noticeable improvement in classification accuracy.
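As a worked check on the accuracies listed above, the short Python sketch below computes the relative gain of ORT-ROFS over each of the other three variants. Averaging these relative gains gives about 4.58%, which matches the average improvement over traditional methods quoted in the contributions. Reading that figure as a mean relative gain is an assumption, since the source does not spell out the formula.

```python
# Accuracies (%) reported for the compared variants.
baseline_accuracy = {"RT": 84.09, "ORT": 84.65, "ORT-FS": 81.44}
proposed_accuracy = 87.19  # ORT-ROFS

# Relative gain of ORT-ROFS over each variant, in percent (assumed formula).
relative_gain = {
    name: (proposed_accuracy - acc) / acc * 100
    for name, acc in baseline_accuracy.items()
}
mean_relative_gain = sum(relative_gain.values()) / len(relative_gain)

for name, gain in relative_gain.items():
    print(f"{name}: {gain:.2f}%")
print(f"mean: {mean_relative_gain:.2f}%")
```

In absolute terms the gains are 3.10, 2.54, and 5.75 percentage points, so the relative and absolute views tell the same story: the largest gap is against ORT-FS.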
5. Discussion
6. Conclusions and Future Works
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
AdaBoost | Adaptive boosting |
ANN | Artificial neural network |
AUC | Area under curve |
BD | Binary decomposition |
CART | Classification and regression trees |
CatBoost | Categorical boosting |
CNN | Convolutional neural network |
DT | Decision tree |
FCM | Fuzzy c-means |
FPR | False positive rate |
FS | Feature selection |
GB | Gradient boosting |
GPC | Gaussian process classifier |
KNN | K-nearest neighbors |
LGBM | Light gradient boosting machine |
LR | Logistic regression |
MBE | Mean bias error |
ML | Machine learning |
MPE | Mean percentage error |
MSE | Mean square error |
NB | Naive Bayes |
OC | Ordinal classification |
ORT | Ordinal random tree |
ORT-FS | Ordinal random tree with feature selection |
ORT-ROFS | Ordinal random tree with rank-oriented feature selection |
RBF | Radial basis function |
RF | Random forest |
RMSE | Root mean square error |
ROC | Receiver operating characteristic |
ROFS | Rank-oriented feature selection |
RT | Random tree |
RTA | Road traffic accidents |
SMOTE | Synthetic minority oversampling technique |
SVM | Support vector machines |
TNR | True negative rate |
TPR | True positive rate |
XGB | Extreme gradient boosting |
References
1. Ali, Y.; Hussain, F.; Haque, M.M. Advances, Challenges, and Future Research Needs in Machine Learning-Based Crash Prediction Models: A Systematic Review. Accid. Anal. Prev. 2024, 194, 107378.
2. Hee, L.V.; Khamis, N.; Noor, R.M.; Abdul Karim, S.A.; Puspitasari, P. Predicting Fatality in Road Traffic Accidents: A Review on Techniques and Influential Factors. In Intelligent Systems Modeling and Simulation III; Abdul Karim, S.A., Ed.; Studies in Systems, Decision and Control; Springer: Cham, Switzerland, 2024; Volume 553.
3. Chai, A.B.Z.; Lau, B.T.; Tee, M.K.T.; McCarthy, C. Enhancing Road Safety with Machine Learning: Current Advances and Future Directions in Accident Prediction Using Non-Visual Data. Eng. Appl. Artif. Intell. 2024, 137, 109086.
4. Wen, X.; Xie, Y.; Jiang, L.; Pu, Z.; Ge, T. Applications of Machine Learning Methods in Traffic Crash Severity Modelling: Current Status and Future Directions. Transp. Rev. 2021, 41, 855–879.
5. Wang, J.; Zhao, C.; Liu, Z. Can Historical Accident Data Improve Sustainable Urban Traffic Safety? A Predictive Modeling Study. Sustainability 2024, 16, 9642.
6. Qi, Z.; Yao, J.; Zou, X.; Pu, K.; Qin, W.; Li, W. Investigating Factors Influencing Crash Severity on Mountainous Two-Lane Roads: Machine Learning Versus Statistical Models. Sustainability 2024, 16, 7903.
7. Pourroostaei Ardakani, S.; Liang, X.; Mengistu, K.T.; So, R.S.; Wei, X.; He, B.; Cheshmehzangi, A. Road Car Accident Prediction Using a Machine-Learning-Enabled Data Analysis. Sustainability 2023, 15, 5939.
8. Frank, E.; Hall, M. A Simple Approach to Ordinal Classification. In Machine Learning: ECML 2001; De Raedt, L., Flach, P., Eds.; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2001; Volume 2167.
9. Fürnkranz, J.; Hüllermeier, E.; Vanderlooy, S. Binary Decomposition Methods for Multipartite Ranking. In Machine Learning and Knowledge Discovery in Databases; Buntine, W., Grobelnik, M., Mladenić, D., Shawe-Taylor, J., Eds.; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2009; Volume 5781.
10. Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic minority over-sampling technique. J. Artif. Intell. Res. 2002, 16, 321–357.
11. Chandrashekar, G.; Sahin, F. A survey on feature selection methods. Comput. Electr. Eng. 2018, 66, 31–47.
12. Dhal, P.; Azad, C. A comprehensive survey on feature selection in the various fields of machine learning. Appl. Intell. 2022, 52, 4543–4581.
13. Frank, E.; Kirkby, R. Random Tree. Available online: http://weka.sourceforge.net/doc.dev/weka/classifiers/trees/RandomTree.html (accessed on 22 November 2024).
14. Arciniegas-Ayala, C.; Marcillo, P.; Valdivieso Caraguay, Á.L.; Hernández-Álvarez, M. Prediction of Accident Risk Levels in Traffic Accidents Using Deep Learning and Radial Basis Function Neural Networks Applied to a Dataset with Information on Driving Events. Appl. Sci. 2024, 14, 6248.
15. Muktar, B.; Fono, V. Toward Safer Roads: Predicting the Severity of Traffic Accidents in Montreal Using Machine Learning. Electronics 2024, 13, 3036.
16. Obasi, I.; Benson, C. Evaluating the effectiveness of machine learning techniques in forecasting the severity of traffic accidents. Heliyon 2023, 9, e18812.
17. Gatarić, D.; Ruškić, N.; Aleksić, B.; Đurić, T.; Pezo, L.; Lončar, B.; Pezo, M. Predicting Road Traffic Accidents—Artificial Neural Network Approach. Algorithms 2023, 16, 257.
18. Aldhari, I.; Almoshaogeh, M.; Jamal, A.; Alharbi, F.; Alinizzi, M.; Haider, H. Severity Prediction of Highway Crashes in Saudi Arabia Using Machine Learning Techniques. Appl. Sci. 2023, 13, 233.
19. Yan, M.; Shen, Y. Traffic Accident Severity Prediction Based on Random Forest. Sustainability 2022, 14, 1729.
20. Islam, M.K.; Reza, I.; Gazder, U.; Akter, R.; Arifuzzaman, M.; Rahman, M.M. Predicting Road Crash Severity Using Classifier Models and Crash Hotspots. Appl. Sci. 2022, 12, 11354.
21. Khattak, A.; Almujibah, H.; Elamary, A.; Matara, C.M. Interpretable Dynamic Ensemble Selection Approach for the Prediction of Road Traffic Injury Severity: A Case Study of Pakistan’s National Highway N-5. Sustainability 2022, 14, 12340.
22. Dong, S.; Khattak, A.; Ullah, I.; Zhou, J.; Hussain, A. Predicting and Analyzing Road Traffic Injury Severity Using Boosting-Based Ensemble Learning Models with SHAPley Additive exPlanations. Int. J. Environ. Res. Public Health 2022, 19, 2925.
23. Santos, D.; Saias, J.; Quaresma, P.; Nogueira, V.B. Machine Learning Approaches to Traffic Accident Analysis and Hotspot Prediction. Computers 2021, 10, 157.
24. Boo, Y.; Choi, Y. Comparison of Prediction Models for Mortality Related to Injuries from Road Traffic Accidents after Correcting for Undersampling. Int. J. Environ. Res. Public Health 2021, 18, 5604.
25. Fiorentini, N.; Losa, M. Handling Imbalanced Data in Road Crash Severity Prediction by Machine Learning Algorithms. Infrastructures 2020, 5, 61.
26. Assi, K.; Rahman, S.M.; Mansoor, U.; Ratrout, N. Predicting Crash Injury Severity with Machine Learning Algorithm Synergized with Clustering Technique: A Promising Protocol. Int. J. Environ. Res. Public Health 2020, 17, 5497.
27. Lee, J.; Yoon, T.; Kwon, S.; Lee, J. Model Evaluation for Forecasting Traffic Accident Severity in Rainy Seasons Using Machine Learning Algorithms: Seoul City Study. Appl. Sci. 2020, 10, 129.
28. Assi, K. Traffic Crash Severity Prediction—A Synergy by Hybrid Principal Component Analysis and Machine Learning Models. Int. J. Environ. Res. Public Health 2020, 17, 7598.
29. Benesty, J.; Chen, J.; Huang, Y.; Cohen, I. Pearson Correlation Coefficient. In Noise Reduction in Speech Processing; Springer: Berlin/Heidelberg, Germany, 2009; pp. 1–4.
30. Dagum, C. Decomposition and interpretation of Gini and the generalized entropy inequality measures. Statistica 1997, 57, 295–308.
31. Bedane, T.T. Road Traffic Accident Dataset of Addis Ababa City; Mendeley Data. Available online: https://data.mendeley.com/datasets/xytv86278f/1 (accessed on 22 November 2024).
32. Shahane, S. Road Traffic Accident Dataset of Addis Ababa City; Kaggle. Available online: https://www.kaggle.com/datasets/saurabhshahane/road-traffic-accidents (accessed on 22 November 2024).
33. Witten, I.H.; Frank, E.; Hall, M.A.; Pal, C.J. Data Mining: Practical Machine Learning Tools and Techniques, 4th ed.; Morgan Kaufmann: Cambridge, MA, USA, 2016; pp. 1–664.
34. Xiao, Y.; Duan, Z. Improving crash injury severity prediction via ensemble models and sampling techniques for addressing data imbalance. SSRN-Soc. Sci. Res. Netw. 2024, 4730366.
35. Obaid, M.A.A. Driver’s Accident Behavioral Analytics Using AI. Master’s Thesis, Rochester Institute of Technology, Dubai, United Arab Emirates, 2024. Available online: https://repository.rit.edu/theses/11748/ (accessed on 1 January 2025).
36. Ramya, A.; Eswari, M.S. Road accident severity in India: A machine learning approach. Int. J. Progress. Res. Eng. Manag. Sci. 2024, 4, 372–375.
37. Endalie, D.; Abebe, W.T. Analysis and detection of road traffic accident severity via data mining techniques: Case study Addis Ababa, Ethiopia. Math. Probl. Eng. 2023, 2023, 6536768.
38. Kodepogu, K.; Manjeti, V.B.; Siriki, A.B. Machine learning for road accident severity prediction. Mechatron. Intell. Transp. Syst. 2023, 2, 211–226.
39. Adeliyi, T.T.; Oluwadele, D.; Igwe, K.; Aroba, O.J. Analysis of road traffic accidents severity using a pruned tree-based model. Int. J. Transp. Dev. Integr. 2023, 7, 131–138.
40. Alhosani, M. Traffic Accidents Analysis & Prediction in UAE. Master’s Thesis, Rochester Institute of Technology, Dubai, United Arab Emirates, 2022. Available online: https://repository.rit.edu/theses/11380/ (accessed on 1 January 2025).
Reference | Year | Region | Method | C/R | Period | Metric | Ordinal Classification |
---|---|---|---|---|---|---|---|
Arciniegas-Ayala et al. [14] | 2024 | Ecuador | CNN, CNN-RF, GPC-RBF, SVM-RBF, ANN | √ * | Unspecified | ACC, SPE, SEN, ET | X |
Muktar and Fono [15] | 2024 | Canada | XGB, CatBoost, RF, GB | √ | 2012–2021 | ACC, P, R, F | X |
Obasi and Benson [16] | 2023 | UK | NB, RF, LR, ANN | √ | 2005–2014 | ACC, P, R, F | X |
Gatarić et al. [17] | 2023 | Serbia Srpska | ANN | √ | Unspecified | RMSE, MBE, MPE, χ², r² | X |
Aldhari et al. [18] | 2022 | Saudi Arabia | XGB, RF, LR | √ | 2017–2019 | ACC, AUC, ROC, P, R, F | X |
Yan and Shen [19] | 2022 | USA | ANN, KNN, SVM, RF | √ | 2016–2019 | AUC, P, R, F | X |
Islam et al. [20] | 2022 | Saudi Arabia | LR, RF, XGB | √ | 2009–2016 | ACC, SPE, SEN, P, R, F | X |
Khattak et al. [21] | 2022 | Pakistan | RF, CART, LR, AdaBoost | √ | 2015–2019 | ACC, P, R, F | X |
Dong et al. [22] | 2022 | Pakistan | Natural GB, CatBoost, LGBM, AdaBoost | √ | 2015–2019 | ACC, AUC, P, R, F | X |
Santos et al. [23] | 2021 | Portugal | DT, RF, LR, NB | √ | 2016–2019 | ACC, AUC, P, R | X |
Boo and Choi [24] | 2021 | Korea | LR, RF, SVM | √ | 2013–2017 | ACC, ROC, P, R, F | X |
Fiorentini and Losa [25] | 2020 | UK | RT, KNN, LR, RF | √ | Unspecified | ACC, TPR, FPR, TNR, F, P | X |
Assi et al. [26] | 2020 | UK | ANN, FCM, SVM | √ | 2011–2016 | ACC, SEN, F, P | X |
Lee et al. [27] | 2020 | Korea | RF, ANN, DT | √ | 2007–2015 | MSE, RMSE | X |
Assi et al. [28] | 2020 | Australia | ANN, SVM | √ | 2014–2019 | ACC, SEN, P, F | X |
Proposed Method | | Ethiopia | ORT-ROFS | √ | 2017–2020 | ACC, P, R, F | √ |
No | Feature Name | Description | Values |
---|---|---|---|
1 | Hour | The time the accident occurred | Numeric |
2 | Day_of_week | The day the accident occurred | Monday, Sunday, Friday, Wednesday, Saturday, Thursday, Tuesday |
3 | Age_band_of_driver | The age range of the driver | ‘Under 18’, 18–30, 31–50, ‘Over 51’ |
4 | Sex_of_driver | The gender of the driver | Female, Male |
5 | Educational_level | The education level of the driver | ‘Junior high school’, ‘Above high school’, ‘Elementary school’, ‘High school’, Illiterate, ‘Writing and reading’ |
6 | Vehicle_driver_relation | The driver’s relationship to the vehicle involved in the crash | Employee, Owner, Other |
7 | Driving_experience | The driving experience of the driver involved in the accident | 1–2 year, ‘Above 10 yr’, 5–10 yr, 2–5 yr, ‘No License’, ‘Below 1 yr’ |
8 | Type_of_vehicle | Type of vehicle involved in the accident | Automobile, ‘Public (>45 seats)’, ‘Lorry (41–100 Q)’, ‘Public (13–45 seats)’, ‘Lorry (11–40 Q)’, ‘Long lorry’, ‘Public (12 seats)’, Taxi, ‘Ridden horse’, ‘Pick up to 10 Q’, ‘Station wagon’, Turbo, Bajaj, Motorcycle, ‘Special vehicle’, Bicycle, Other |
9 | Owner_of_vehicle | The ownership type of vehicle | Owner, Governmental, Organization, Other |
10 | Service_year_of_vehicle | The time elapsed since the vehicle’s last service before the accident | ‘Above 10 yr’, 5–10 yrs, 1–2 yr, 2–5 yrs, ‘Below 1 yr’ |
11 | Defect_of_vehicle | The defect status of the vehicle before the accident | ‘No defect’, 7, 5 |
12 | Area_accident_occured | The area where the accident occurred | ‘Office areas’, ‘Residential areas’, ‘Recreational areas’, ‘ Industrial areas’, ‘Industrial areas’, ‘Church areas’, ‘Market areas’, ‘Rural village areas’, ‘Hospital areas’, ‘Outside rural areas’, ‘School areas’, ‘Recreational areas’, ‘Rural village areas Office areas’, Other |
13 | Lanes_or_Medians | The type of lane in which the vehicle was traveling at the time of the accident | ‘Double carriageway (median)’, ‘Undivided Two way’, ‘One way’, ‘Two-way (divided with broken lines road marking)’, ‘Two-way (divided with solid lines road marking)’, Other |
14 | Road_allignment | The terrain of the road where the accident occurred | ‘Tangent road with flat terrain’, ‘Tangent road with mild grade and flat terrain’, Escarpments, ‘Tangent road with rolling terrain’, ‘Gentle horizontal curve’, ‘Tangent road with mountainous terrain and’, ‘Steep grade downward with mountainous terrain’, ‘Sharp reverse curve’, ‘Steep grade upward with mountainous terrain’ |
15 | Types_of_Junction | The type of road junction where the accident occurred | ‘No junction’, ‘Y Shape’, Crossing, ‘O Shape’, ‘T Shape’, ‘X Shape’, Other |
16 | Road_surface_type | The type of road surface on which the accident occurred | ‘Earth roads’, ‘Asphalt roads’, ‘Gravel roads’, ‘Asphalt roads with some distress’, Other |
17 | Road_surface_conditions | The condition of the road surface | Dry, Snow, ‘Wet or damp’, ‘Flood over 3 cm. deep’ |
18 | Light_conditions | The lighting conditions when the accident occurred | Daylight, ‘Darkness—lights lit’, ‘Darkness—no lighting’, ‘Darkness—lights unlit’ |
19 | Weather_conditions | Weather conditions at the time of the accident | Normal, Raining, Cloudy, ‘Raining and Windy’, Windy, Snow, ‘Fog or mist’, Other |
20 | Type_of_collision | The manner in which the vehicles collided | ‘Collision with roadside-parked vehicles’, ‘Vehicle with vehicle collision’, ‘Collision with roadside objects’, ‘Collision with animals’, Rollover, ‘Fall from vehicles’, ‘Collision with pedestrians’, ‘With Train’, Other |
21 | Number_of_vehicles_involved | The number of vehicles involved in the accident | Numeric |
22 | Number_of_casualties | The number of accident-related deaths | Numeric |
23 | Vehicle_movement | The driver’s behavior just before the accident | ‘Going straight’, U-Turn, ‘Moving Backward’, Turnover, ‘Waiting to go’, ‘Getting off’, Reversing, Parked, Stopping, Overtaking, ‘Entering a junction’, Other |
24 | Casualty_class | The classification of the person injured or killed in the accident | ‘Driver or rider’, Pedestrian, Passenger |
25 | Sex_of_casualty | Gender of the person injured or killed in the accident | Male, Female |
26 | Age_band_of_casualty | The age range of the person injured or killed in the accident | 31–50, 18–30, ‘Under 18’, ‘Over 51’, 5 |
27 | Casualty_severity | The numeric value that indicates seriousness of the injury | Numeric |
28 | Work_of_casualty | The employment status of the person injured or killed in the accident | Driver, Unemployed, Employee, Self-employed, Student, Other |
29 | Fitness_of_casualty | The health condition of the person injured or killed in the accident before the accident | Normal, Deaf, Blind, Other |
30 | Pedestrian_movement | If a pedestrian was involved, the pedestrian’s movement and location at the time of the accident | ‘Not a Pedestrian’, ‘Crossing from drivers nearside’, ‘Crossing from nearside—masked by parked or stationary vehicle’, ‘Unknown or other’, ‘Crossing from offside—masked by parked or stationary vehicle’, ‘In carriageway, stationary—not crossing (standing or playing)’, ‘Walking along in carriageway, back to traffic’, ‘Walking along in carriageway, facing traffic’, ‘In carriageway, stationary—not crossing (standing or playing)—masked by parked or stationary vehicle’ |
31 | Cause_of_accident | Cause of the accident | ‘Moving Backward’, Overtaking, ‘Changing Lane to the left’, ‘Changing Lane to the right’, Overloading, ‘No priority to vehicle’, ‘No priority to pedestrian’, ‘No distancing’, ‘Getting off the vehicle improperly’, ‘Improper parking’, Overspeed, ‘Driving carelessly’, ‘Driving at high speed’, ‘Driving to the left’, Overturning, Turnover, ‘Driving under the influence of drugs’, ‘Drunk driving’, Other |
32 | Accident_severity | Severity of the accident | ‘Slight Injury’, ‘Serious Injury’, ‘Fatal Injury’ |
k (nearestNeighbors) | ORT-ROFS Accuracy (%) |
---|---|
5 | 87.19 |
6 | 86.95 |
7 | 86.90 |
8 | 86.53 |
9 | 86.10 |
10 | 85.87 |
Dataset | RT Accuracy (%) | ORT Accuracy (%) | ORT-FS Accuracy (%) | ORT-ROFS (Proposed) Accuracy (%) |
---|---|---|---|---|
Road Traffic Accident (RTA) | 84.09 | 84.65 | 81.44 | 87.19 |
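Reference | Year | Method | Split Ratio | Accuracy (%) |
---|---|---|---|---|
Xiao and Duan [34] | 2024 | Light Gradient Boosting Machine + SMOTE | NA | 84.00 |
Obaid [35] | 2024 | Random Forest | 80:20 | 84.49 |
Obaid [35] | 2024 | Decision Tree | 80:20 | 83.06 |
Ramya and Eswari [36] | 2024 | Logistic Regression | NA | 87.00 |
Ramya and Eswari [36] | 2024 | Extreme Gradient Boosting | NA | 86.00 |
Ramya and Eswari [36] | 2024 | Decision Tree | NA | 74.00 |
Ramya and Eswari [36] | 2024 | Random Forest | NA | 84.00 |
Endalie and Abebe [37] | 2023 | Support Vector Machines | 80:20 | 85.00 |
Kodepogu et al. [38] | 2023 | Decision Tree | 80:20 | 83.30 |
Kodepogu et al. [38] | 2023 | Random Forest | 80:20 | 77.40 |
Kodepogu et al. [38] | 2023 | K-Nearest Neighbors | 80:20 | 82.20 |
Kodepogu et al. [38] | 2023 | Naive Bayes | 80:20 | 85.30 |
Kodepogu et al. [38] | 2023 | Adaptive Boost Classifier | 80:20 | 85.30 |
Adeliyi et al. [39] | 2023 | J48 Pruned Tree | 10-fold cross-validation | 85.47 |
Adeliyi et al. [39] | 2023 | Naive Bayes | 10-fold cross-validation | 83.53 |
Adeliyi et al. [39] | 2023 | Bagging | 10-fold cross-validation | 84.29 |
Adeliyi et al. [39] | 2023 | K-Nearest Neighbors | 10-fold cross-validation | 77.58 |
Adeliyi et al. [39] | 2023 | Logistic Model Tree | 10-fold cross-validation | 84.74 |
Adeliyi et al. [39] | 2023 | Decision Tree | 10-fold cross-validation | 84.52 |
Adeliyi et al. [39] | 2023 | Random Tree | 10-fold cross-validation | 84.51 |
Adeliyi et al. [39] | 2023 | Logistic Regression | 10-fold cross-validation | 84.51 |
Alhosani [40] | 2022 | Gradient Boosting | 70:30 | 77.75 |
Alhosani [40] | 2022 | Random Forest | 70:30 | 79.78 |
Alhosani [40] | 2022 | Logistic Regression | 70:30 | 69.47 |
Alhosani [40] | 2022 | Decision Tree | 70:30 | 53.10 |
Alhosani [40] | 2022 | Support Vector Classifier | 70:30 | 56.67 |
Alhosani [40] | 2022 | Extra Trees | 70:30 | 81.35 |
Proposed Approach | | ORT-ROFS | 80:20 | 86.69 |
Proposed Approach | | ORT-ROFS | 70:30 | 86.88 |
Proposed Approach | | ORT-ROFS | 10-fold cross-validation | 87.19 |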
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Ghasemkhani, B.; Balbal, K.F.; Birant, K.U.; Birant, D. Ordinal Random Tree with Rank-Oriented Feature Selection (ORT-ROFS): A Novel Approach for the Prediction of Road Traffic Accident Severity. Mathematics 2025, 13, 310. https://doi.org/10.3390/math13020310