1. Introduction
Drowsiness, manifested by drooping eyes, mind wandering, eye rubbing, inability to concentrate, and yawning, is a state of fatigue that poses a substantial danger, especially to road safety. Recent investigations highlight the seriousness of the problem, revealing that 30% of the 1 million deaths caused by road accidents can be attributed to driver weariness or drowsiness [
1,
2]. The likelihood of a collision increases threefold when the driver is fatigued, emphasizing the importance of preventative measures. The American Automobile Association (AAA) has found that approximately 328,000 crashes are caused by drowsy driving each year [
3]. These crashes have had a significant impact on society, costing almost 109 billion USD, not including property damage [
3]. This staggering figure encompasses immediate and long-term medical expenses, productivity losses in both workplace and household contexts, legal and court costs, insurance administration expenses, and the economic impact of travel delays. Specific demographic groups are particularly susceptible to drowsiness while driving. Night-shift male workers and individuals with sleep apnea syndrome emerge as high-risk categories [
4]. Several research studies have been published, suggesting strategies to mitigate or notify drivers about possible indications of drowsiness [
5,
6,
7,
8,
9,
10,
11,
12,
13,
14]. These measures are important steps in tackling the critical issue of drowsy driving and improving road safety.
Drowsiness detection systems can be classified into three main categories: vehicle dynamics, physiological signals, and recognition of driver facial characteristics [
11,
12,
15,
16]. Nevertheless, the efficacy of vehicle dynamics-based systems is hindered by unpredictable variables such as road geometry, slow processing speeds, traffic conditions, and head movement [
15,
16,
17]. In contrast, detecting yawning and blinking by analyzing facial images of the driver has shown potential in controlled or virtual environments [
16,
17]. However, the performance of these systems often decreases when used in real-world settings due to factors including changes in lighting, differences in skin color, and temperature fluctuations [
16,
17]. Conversely, systems relying on physiological signals have demonstrated a high level of accuracy, establishing them as a dependable approach for real-world applications. Physiological measures such as electroencephalography (EEG) [
6,
18,
19,
20,
21,
22,
23,
24], electrooculography (EOG) [
25,
26,
27,
28,
29,
30], respiration rate [
12,
31,
32,
33,
34,
35], electrocardiography (ECG) [
34,
36,
37,
38], and electromyography (EMG) [
39,
40,
41,
42,
43] are commonly used in systems designed to identify driver drowsiness. Although the sensors used to capture these signals are effective, their invasive nature poses a significant obstacle, making them difficult to integrate or use practically in real-world contexts.
Among these physiological signals, the respiration rate is especially noteworthy because it changes significantly between wakefulness and sleep and varies across numerous physiological conditions. In addition, the respiratory system undergoes modifications during sleep, influenced by decreased muscle tone and shifts in chemical and non-chemical responses [
44]. Notably, a decline in breathing rate is frequently observed before a driver falls asleep [
45,
46]. This study aims to address the challenge of accurately detecting driver drowsiness in real time using Ultra-Wideband (UWB) radar signals and advanced machine learning (ML) techniques. The primary objectives are to develop robust feature extraction methods, design efficient ensemble models, and validate their effectiveness against existing methods. In this manuscript, the proposed system employs the non-invasive acquisition of chest movement through UWB radar to distinguish between the drowsy and non-drowsy states of the driver. UWB radar offers notable benefits such as fast data rates and low power transmission levels [
47]. This is achieved by transmitting very short-duration pulses, resulting in signals with a wide bandwidth. The technology raises no privacy concerns, is not affected by ambient conditions such as lighting or skin color, and emits very little power, ensuring human safety [
48,
49,
50]. Furthermore, the system maintains its resilience even when exposed to Wi-Fi and mobile phone transmissions. The UWB radar’s ability to penetrate different materials or obstructions, combined with its non-intrusive nature [
51,
52], makes it an excellent option for this drowsiness detection system. The chest readings obtained are subsequently transformed into grayscale images, as illustrated in [
53], and these images are used as input to deep learning (DL) models. The features extracted from these models are then employed to train and test ML algorithms. The contributions of this study are as follows:
The system utilizes the dataset from [
12] and transforms it into grayscale images for analysis.
The system employs a Convolutional Neural Network (CNN) architecture to extract features from these images (a minimal sketch of this step follows this list).
These features are input into various machine learning (ML) algorithms, and the performance of these algorithms is assessed on a test set.
Hybrid ensemble models, RF-MLP and RF-XGB-SVM, are developed to combine the complementary strengths of multiple algorithms.
The models are evaluated using metrics such as accuracy, precision, recall, and F1-score. Finally, a comparative analysis is conducted to determine which deep learning-based features yield superior results.
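As a brief illustration of the image-conversion and feature-extraction step referenced above, the following is a minimal sketch rather than the authors' exact implementation: the segment length, image size, and network depth are assumptions, and the random array is a placeholder for a real one-minute UWB chest-movement segment.

```python
# Hypothetical sketch: convert a 1D UWB chest-movement segment into a grayscale
# image and extract a CNN feature vector. Sizes and layer choices are assumed.
import numpy as np
import tensorflow as tf

def segment_to_grayscale(segment, size=(64, 64)):
    """Min-max scale a 1D radar segment and fold it into a 2D uint8 image."""
    seg = np.asarray(segment, dtype=np.float32)
    seg = (seg - seg.min()) / (seg.max() - seg.min() + 1e-8)
    img = np.resize(seg, size)          # repeat/truncate samples to fill the 2D grid
    return (img * 255).astype(np.uint8)

# Small 2D CNN used purely as a feature extractor (no classification head).
feature_extractor = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(64, 64, 1)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),   # yields a 32-dimensional feature vector
])

segment = np.random.randn(4096)                 # placeholder for a radar chest-movement segment
img = segment_to_grayscale(segment)
features = feature_extractor(img[None, ..., None] / 255.0)  # shape (1, 32)
```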
This paper is organized into several sections.
Section 2 presents the literature review of the study, while
Section 3 describes the methodology of the proposed approach.
Section 4 presents the results, and finally,
Section 5 contains the study’s conclusion.
2. Literature Review
This literature review examines prominent studies that investigate the identification and categorization of drowsy and alert conditions in drivers. The classification of drowsy and non-drowsy states is accomplished using non-invasive IR-UWB radar to measure the breathing rate, as reported in [
12]. The chest motions of 40 individuals were collected, and the Support Vector Machine algorithm achieved an accuracy rate of 87%. The study demonstrates the efficacy of UWB in detecting driver drowsiness by analyzing breathing rates. The paper introduces an EEG-based spatial-temporal CNN (ESTCNN) in [
54] to detect driver fatigue. The network automatically learns features from EEG inputs, achieving a classification accuracy of 97.37%. The experiments involved collecting EEG signals from eight participants in both alert and fatigued states. The research presented by [
55] focuses on two distinct categories of videos: alert and drowsy. The study utilizes a thorough dataset consisting of 60 individuals who have been classified into three groups: alert, low vigilant, and drowsy. Two separate models are created, utilizing computer vision and deep learning to analyze temporal and spatial features. Ref. [
56] proposes a non-intrusive method of evaluating fatigue by analyzing physiological signals such as heart rate variability (HRV) and ECG data. ECG data are collected during sleep periods, and the continuous wavelet transform is used to extract features. The average accuracy achieved via ensemble logistic regression is 92.5%, with a processing time of 21 s. Ref. [
57] improves drowsiness detection by combining ECG and EEG features. Data collected from 22 participants in a driving simulator reveal features that differentiate between alert and fatigued states. Combining the modalities enhances Support Vector Machine (SVM) classification performance, while channel reduction maintains accuracy using only two electrodes.
The Intelligent Drowsiness Detection System (DDS) described in [
58] uses Deep Convolutional Neural Networks (DCNNs), specifically VGG16, InceptionV3, and Xception, to address driver fatigue. The Xception model delivers the best performance, with an accuracy of 93.6%, surpassing both VGG16 and InceptionV3 on a dataset of facial recordings depicting drowsy and non-drowsy states. In [
59], a two-phase approach addresses the challenges of intelligent transportation systems by presenting an improved DenseNet-based fatigue detection system. The system consists of a model-representation module and a sophisticated channel attention mechanism. The second phase uses a guided policy search (GPS) algorithm for collaborative decision-making, adapting to the driver's current fatigue level in real time. Empirical validation on datasets such as YawDD, RLDD, and DROZY shows substantial improvements, with an average accuracy of 89.62%. The fatigue detection method implemented in [
60] utilizes powerful CNN models to specifically target yawning and achieves a remarkable accuracy of 96.69% on the YawDD dataset. The analysis shows that data augmentation trades a modest decrease in accuracy for improved model robustness. In [
61], a novel deep learning approach for driver drowsiness identification utilizes a MobileNet CNN with the Single Shot Detector (SSD) technique. Trained on a diverse dataset of 6000 images, the model achieves a substantial mean average precision (mAP) of 0.84, prioritizing computational efficiency for real-time processing on mobile devices. The methodology incorporates a unique dataset from various sources, ensuring diverse representation. Experimental results demonstrate the model's resilience, achieving high mAP values for closed eyes (0.776), open eyes (0.763), and outstanding face detection (0.971).
In a study conducted by researchers [
62], a Regularized Extreme Learning Machine (RELM) showed exceptional performance in identifying driver drowsiness. The RELM achieved an accuracy rate of 99% using a dataset consisting of 4500 pictures. The combination of video surveillance, image processing, and ML [
63] results in a sleepiness detection system that achieves a 93% accuracy rate. This accuracy is determined by analyzing eye blink patterns from the YawDD dataset. The system described in [
64] utilizes the PERCLOS algorithm, Python modules, and ML techniques to evaluate eye movements. It achieves a high accuracy rate of 93% in the real-time detection of driver drowsiness. The utilization of mmWave FMCW radar enables [
65] to reach an accuracy of 82.9% in detecting drowsiness. This is accomplished by collecting chest motions and employing ML methods. Ref. [
66] integrates MTCNN facial detection with GSR sensor-based physiological data, resulting in an accuracy of 91% in the real-time detection of driver drowsiness. The study [
67] combines behavioral metrics and physiological data, utilizing Raspberry Pi and SVM classifiers, to achieve a commendable accuracy rate of 91% in detecting driver tiredness. The study [
68] uses a histogram of oriented gradients (HOG) and linear SVM to achieve outstanding precision. The DDS in [
69] uses CNN to extract features, resulting in an accuracy rate of 86.05% on a dataset of 48,000 photographs. The study [
70], conducted in Zimbabwe, addresses road safety and achieves a detection accuracy of over 95% in identifying drowsiness by applying the principal component analysis (PCA) dimensionality reduction technique together with classifiers such as XGBoost and Linear Discriminant Analysis. The implementation of a real-time drowsiness detection system on an Nvidia Jetson Nano, as described in [
71], achieves an accuracy rate of 94.05% and excels particularly in detecting yawning. The paper [
72] presents a DDS that uses webcam-based surveillance to detect drowsiness in real time. The system achieves over 97% in multiple metrics, including precision, sensitivity, and F1-score. Ref. [
55] presents a DDS that operates in real time, utilizing the Viola–Jones algorithm, an audible alert mechanism, and the computed distance between the lips. This combination of technologies provides scalability and cost-effectiveness, ultimately improving road safety.
Although these investigations contribute substantially to the field of driver drowsiness detection, it is important to highlight several limitations. While video-based methods are successful in controlled environments, they can face difficulties in real-world situations due to inconsistent lighting conditions, which could affect the precision of drowsiness detection. Furthermore, the real-time adoption of systems centered on physiological data is hindered by practical problems arising from the invasive nature of on-body sensors, regardless of their effectiveness. This not only raises privacy concerns but also obstructs the smooth incorporation of such technologies into ordinary driving situations. Hence, the application of these techniques in actual driving scenarios requires careful consideration of these limitations.
4. Results and Discussion
This section provides a comprehensive analysis and discussion of the results obtained from the experiments carried out during this research. The objective is to analyze the results thoroughly while clarifying their importance within the context of this study, and to discuss their broader academic and practical implications.
4.1. Experiment Setup
The experimental analyses were conducted on an HP EliteBook x360 1040 G6 (HP Inc., Lahore, Pakistan), which served as the primary computing platform. The system is equipped with an Intel Core i5-8365U processor with a base clock of 1.60 GHz and a boost speed of 1.90 GHz, supported by 16.0 GB of RAM for efficient multitasking and data handling. The machine runs 64-bit Windows 11 Pro, providing a stable and flexible computing environment throughout the experimentation phase. Data preprocessing was performed using MATLAB R2020a. The subsequent experiments, including feature extraction and model training, were implemented in Python using Jupyter Notebook 6.5.2. This environment allowed for the seamless integration of code, visualizations, and documentation, facilitating an interactive and iterative workflow. The software environment comprised Python 3.8, TensorFlow 2.4, and scikit-learn 0.24. Hyperparameter tuning was performed using grid search to identify optimal configurations for each model. Software debugging and iterative refinements were managed using Jupyter Notebook's real-time monitoring and visualization tools, which allowed for dynamic adjustments during the training process.
4.2. Data Splitting
The dataset comprises recordings obtained from forty male participants, encompassing both drowsy and alert states. By segmenting each file at one-minute intervals, the total number of files within each category rises to 200. The dataset is divided into training and test sets in a 70%/30% proportion. Additionally, a Generative Adversarial Network (GAN) is employed to augment the dataset, resulting in 1200 samples per class. The augmented dataset is divided into training and testing sets in an 80/20 split, ensuring a robust and comprehensive evaluation of model performance. This division guarantees an equitable distribution of drowsy and non-drowsy instances throughout the training and testing stages, thereby supporting the development and assessment of robust models.
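The two splits described above can be reproduced schematically as follows; this is a minimal sketch in which placeholder matrices stand in for the extracted CNN features, and the random seed and stratification are assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder feature matrices standing in for the extracted CNN features (illustrative only).
rng = np.random.default_rng(0)
X, y = rng.normal(size=(400, 32)), np.repeat([0, 1], 200)            # original: 200 files per class
X_aug, y_aug = rng.normal(size=(2400, 32)), np.repeat([0, 1], 1200)  # GAN-augmented: 1200 samples per class

# Original dataset: 70% training / 30% testing, stratified on the class label.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42)

# GAN-augmented dataset: 80% / 20% split.
Xa_train, Xa_test, ya_train, ya_test = train_test_split(
    X_aug, y_aug, test_size=0.20, stratify=y_aug, random_state=42)
```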
4.3. Classification Results
In this study, a diverse array of machine learning classifiers, encompassing SVM, Random Forest (RF), XGBoost (XGB), and Multi-Layer Perceptron (MLP), was employed for the classification task. Furthermore, ensemble classifiers were implemented in two configurations: an RF-MLP ensemble and an RF-XGB-SVM ensemble. To improve the performance of the models, rigorous hyperparameter tuning was performed using the grid search technique. The selected hyperparameters are provided in
Table 3.
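As an illustration of the grid search step, the snippet below tunes the RF classifier; the parameter grid shown is an assumption rather than the grid reported in Table 3, and X_train/y_train refer to the training split sketched in Section 4.2.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Illustrative grid only; the actual search space is given in Table 3.
param_grid = {
    "n_estimators": [100, 200, 500],
    "max_depth": [None, 10, 20],
    "min_samples_split": [2, 5],
}
grid = GridSearchCV(RandomForestClassifier(random_state=42),
                    param_grid, cv=5, scoring="accuracy", n_jobs=-1)
grid.fit(X_train, y_train)        # training split from Section 4.2
best_rf = grid.best_estimator_
```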
The training phase included the use of the training dataset, followed by testing on an independent test set.
Table 4 presents the classification performance of these models on the test set, enabling a comparative evaluation of their efficacy.
The results in
Table 4 show that the ensemble models RF-MLP and RF-XGB-SVM achieved accuracies of 95% and 96.6%, respectively. This strong result underscores the efficacy of combining several learning techniques to improve predictive accuracy. Among the individual classifiers, RF performed best across all measures, attaining an accuracy of 93.33% and an F1-score of 0.94. SVM and XGBoost performed similarly, both achieving roughly 91% accuracy with F1-scores of 0.91 and 0.92, respectively; although effective, they marginally trailed RF. The MLP performed noticeably worse, with an accuracy of 74.6% and an F1-score of 0.75, highlighting the limitations of this architecture for the drowsiness detection task on the original dataset. Overall, the ensemble model RF-XGB-SVM emerges as the most promising classifier for accurate drowsiness detection. The confusion matrix is shown in
Figure 8.
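The manuscript does not state the exact combination rule used by the ensembles; a soft-voting combination is one plausible reading, sketched below for RF-XGB-SVM with illustrative hyperparameters (the xgboost package is assumed to be installed, and the data variables come from the earlier sketches).

```python
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.svm import SVC
from xgboost import XGBClassifier
from sklearn.metrics import accuracy_score, classification_report

# Soft-voting ensemble of the three base classifiers (illustrative settings).
rf_xgb_svm = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=200, random_state=42)),
        ("xgb", XGBClassifier(eval_metric="logloss", random_state=42)),
        ("svm", SVC(kernel="rbf", probability=True, random_state=42)),  # probabilities needed for soft voting
    ],
    voting="soft",
)
rf_xgb_svm.fit(X_train, y_train)
y_pred = rf_xgb_svm.predict(X_test)
print(accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))  # precision, recall, F1 per class
```

The RF-MLP ensemble could be assembled in the same way by pairing RandomForestClassifier with sklearn.neural_network.MLPClassifier.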
For the augmented dataset, the same hyperparameters as those applied to the original dataset were used, ensuring uniform training conditions and fair, comparable evaluations across models. Following training, the models were tested on the designated test set. The evaluation results, presented in
Table 5, provide insights into the classifiers’ performance metrics, including accuracy, precision, recall, and F1-score.
It is evident from
Table 5 that the SVM achieved an accuracy of 98.76%, with precision, recall, and F1-score all standing at 0.99, indicating a highly consistent and reliable performance across the evaluation metrics. The RF and XGB classifiers exhibited identical performance, each attaining an accuracy of 99.17% and scoring 0.99 in precision, recall, and F1-score, suggesting that both handled the augmented dataset equally well. The MLP demonstrated the highest performance among the individual classifiers, with an accuracy of 99.5% and perfect scores of 1.00 in precision, recall, and F1-score, indicating an exceptional ability to classify instances without false positives or negatives. Among the ensemble classifiers, the RF-MLP ensemble achieved an accuracy of 99.3%, with precision, recall, and F1-score of 0.99; this is slightly lower than the MLP alone but still indicates strong predictive capability obtained by leveraging the strengths of both RF and MLP. The RF-XGB-SVM ensemble outperformed all other models, reaching an accuracy of 99.58% with perfect scores of 1.00 in precision, recall, and F1-score. This superior performance highlights the effectiveness of combining multiple classifiers, capitalizing on their individual strengths to deliver highly accurate and reliable predictions. Overall, all classifiers performed exceptionally well on the augmented dataset, with the ensemble methods, particularly RF-XGB-SVM, providing slightly higher accuracy. The confusion matrix of RF-XGB-SVM is shown in
Figure 9.
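A confusion matrix such as the one in Figure 9 can be produced as in the sketch below, which refits a copy of the ensemble on the augmented training split from the earlier sketches; the class labels are assumed to be 0 (non-drowsy) and 1 (drowsy).

```python
from sklearn.base import clone
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
import matplotlib.pyplot as plt

# Refit a copy of the ensemble on the augmented training split, then plot the
# confusion matrix for its held-out test split.
ens_aug = clone(rf_xgb_svm).fit(Xa_train, ya_train)
cm = confusion_matrix(ya_test, ens_aug.predict(Xa_test))
ConfusionMatrixDisplay(cm, display_labels=["non-drowsy", "drowsy"]).plot()
plt.show()
```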
4.4. K-fold Cross Validation
To assess the robustness and reliability of the models, a K-fold cross-validation approach was implemented in this study. The dataset was partitioned into five folds, and the models were iteratively trained and evaluated across these folds.
Table 6 presents the results obtained from the cross-validation process, giving a detailed view of model performance across different subsets of the data.
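A minimal sketch of the five-fold protocol is shown below, assuming the ensemble and feature matrix defined in the earlier sketches; cross_val_score returns the per-fold accuracies from which the means and standard deviations reported in Table 6 would follow.

```python
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Five stratified folds, shuffled with a fixed seed for reproducibility (assumed settings).
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(rf_xgb_svm, X, y, cv=cv, scoring="accuracy")
print(f"mean accuracy: {scores.mean():.3f}, std: {scores.std():.3f}")
```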
The findings presented in
Table 6 indicate that the ensemble models, specifically RF-XGB-SVM, demonstrated superior performance in terms of both accuracy and consistency. RF-XGB-SVM achieved the highest accuracy, 97%, with a remarkably low standard deviation of 0.018, indicating robust and dependable behavior across the folds. RF-MLP also exhibited notable results, with an accuracy of 95% and a standard deviation of 0.03, indicating effective generalization. Among the individual classifiers, RF proved the most effective, attaining an accuracy of 94% with a moderate standard deviation of 0.02. SVM and XGBoost exhibited similar performance, each reaching approximately 91% accuracy; with a standard deviation of 0.04 compared to SVM's 0.05, XGBoost showed marginally less variability. The MLP recorded a standard deviation of 0.04 and the lowest accuracy at 73%.
The k-fold cross-validation results on the augmented dataset, presented in
Table 7, provide a comprehensive evaluation of the classifiers' performance in terms of accuracy and variability. The accuracy is reported along with the standard deviation (Std), which indicates the consistency of each model across the folds. The SVM and RF classifiers both achieved an average accuracy of 0.98 with a standard deviation of 0.01, reflecting robust and reliable performance with minimal variation across folds. XGB exhibited a slightly lower average accuracy of 0.97 with the same standard deviation of 0.01; while still strong, its accuracy was marginally below that of SVM and RF. The MLP classifier outperformed the other individual classifiers, achieving an average accuracy of 0.99 with a standard deviation of 0.01; this high accuracy, coupled with low variability, underscores MLP's effectiveness and stability on the augmented dataset. The RF-MLP ensemble, which combines the strengths of Random Forest and the Multi-Layer Perceptron, achieved an average accuracy of 0.98 with a standard deviation of 0.01, indicating that it is as reliable as the individual RF and SVM classifiers but does not surpass the MLP alone. The RF-XGB-SVM ensemble demonstrated the highest performance among all models, with an average accuracy of 0.99 and a standard deviation of 0.01, suggesting that combining Random Forest, XGBoost, and SVM yields a model that is both highly accurate and consistently reliable across subsets of the dataset. Overall, the k-fold cross-validation results affirm the high performance and robustness of the classifiers, with ensemble methods, particularly the RF-XGB-SVM ensemble, providing the best accuracy and consistency.
4.5. Computational Time Complexity
Table 8 summarizes the computational time of the classifiers, measured in seconds. SVM demonstrates the lowest time at 1.53 s, followed by RF (2.47 s) and XGB (2.81 s). MLP requires 3.63 s, while the RF-MLP ensemble increases to 3.72 s. The RF-XGB-SVM ensemble requires the most time at 4.15 s. This highlights a trade-off between computational efficiency and model complexity: simpler models offer faster processing, while more complex ensembles deliver higher accuracy at the expense of increased computational time.
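The reported times could be measured as in the sketch below; the absolute values depend on hardware and hyperparameters, and the model and data variables come from the earlier sketches.

```python
import time
from sklearn.base import clone

def timed_fit(model, X_tr, y_tr):
    """Fit a fresh copy of the model and return the wall-clock training time in seconds."""
    start = time.perf_counter()
    clone(model).fit(X_tr, y_tr)
    return time.perf_counter() - start

print(f"RF-XGB-SVM ensemble: {timed_fit(rf_xgb_svm, X_train, y_train):.2f} s")
```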
The computational time complexity of the classifiers on the augmented dataset, as shown in
Table 9, provides insight into the efficiency of each model in terms of training time, measured in seconds. The SVM required 4.19 s for training, a relatively fast processing time. Similarly, the RF classifier took 4.22 s, comparable to SVM and reflecting its efficiency in handling the dataset with multiple decision trees. XGB demonstrated the shortest training time among all classifiers, completing in 3.93 s, which reflects its optimized gradient boosting implementation, known for its speed. The MLP required the longest training time at 5.1 s; this can be attributed to the neural network's iterative training process, which involves numerous parameters and layers to optimize. Among the ensemble classifiers, the RF-MLP ensemble took 4.3 s to train; the slight increase over the individual RF model reflects the added complexity of integrating the MLP component, yet it remains efficient. The RF-XGB-SVM ensemble had a computational time of 4.7 s. While this is higher than the individual classifiers, it remains reasonable given that it combines three models, and the increase is justified by the significant gains in accuracy and robustness. Overall, these results illustrate a trade-off between training time and model performance: while MLP and the ensemble methods take longer to train, their superior accuracy and reliability often justify the additional computational cost, whereas XGB stands out for its quick training, making it an efficient choice when computational resources or time are limited.
4.6. Comparison with Existing Studies
In comparison to a prior study conducted by Siddiqui et al. [
12], which used the same dataset as employed here, the proposed method exhibits a clear improvement in accuracy, as shown in
Table 10. The study [
12] achieved an accuracy of 87.5%, while our proposed methodology achieved a significantly higher accuracy of 99.58%. This substantial improvement underscores the efficacy of the approach introduced in this manuscript. The enhanced accuracy suggests that the employed classifiers, such as RF-XGB-SVM, have effectively leveraged the features within the dataset, surpassing the performance achieved in the earlier study.
4.7. Discussion
The results demonstrate that the RF-XGB-SVM ensemble model outperforms all other classifiers, consistently exhibiting superior performance across multiple evaluation criteria, including accuracy, precision, recall, and F1-score. This efficacy can be attributed to the complementary strengths of RF, XGB, and SVM. RF, with its collection of decision trees, effectively captures complex relationships in the data; XGB, a robust gradient boosting technique, sequentially strengthens weak learners; and SVM seeks optimal separating hyperplanes for classification. Integrating these classifiers produces a model that draws on a range of learning techniques and detects diverse patterns across the feature space. The ensemble approach offers a reliable solution for drowsiness detection by reducing overfitting and allowing errors of individual classifiers to be corrected by the others. Despite its excellent classification performance, the RF-XGB-SVM model incurs a higher computational time than the individual classifiers.
The accuracy comparison of all the classifiers on both datasets is shown in
Figure 10. The analysis revealed that while individual models like SVM, RF, XGBoost, and MLP performed exceptionally well, achieving high accuracy rates (up to 99.5% for MLP), ensemble methods provided the best results. The RF-XGB-SVM ensemble achieved the highest accuracy of 99.58%, coupled with perfect precision, recall, and F1-score, demonstrating the advantage of combining diverse classifiers. K-fold cross-validation confirmed the robustness and consistency of all models, with low standard deviations indicating reliable performance across different folds.
The findings highlight the effectiveness of ensemble approaches in achieving high performance while balancing computational efficiency. The primary aim of this study is to achieve high accuracy in detecting driver drowsiness, which is crucial for enhancing road safety. This focus, however, leads to higher computational complexity. The benefits of improved detection accuracy justify the additional computational cost. To make the method more practical for deployment in various real-world scenarios, efforts are being made to explore optimizations that improve real-time performance.
5. Conclusions
Drowsiness while driving poses a significant risk, resulting in decreased cognitive performance and an increased likelihood of accidents. Drowsiness-related vehicle crashes have serious consequences, including trauma, economic costs, injuries, and even fatalities. This study demonstrates the effectiveness of using UWB radar and advanced ensemble models for real-time driver drowsiness detection, classifying drivers into drowsy and non-drowsy states using data from ultra-wideband radar. The five-minute recordings were divided into one-minute chunks and converted to grayscale images, and a two-dimensional Convolutional Neural Network was used to extract spatial features from these images. Using these features, various machine learning classifiers were trained and tested. Notably, the ensemble classifier RF-XGB-SVM, which combines Random Forest, XGBoost, and Support Vector Machine, attained an accuracy of 96.6%. Its k-fold cross-validation score was 97%, with a standard deviation of 0.018, indicating stable and consistent performance. Utilizing Generative Adversarial Networks for dataset augmentation further improved the accuracies of all models, with the RF-XGB-SVM model surpassing the others at 99.58%. The proposed method significantly improves detection accuracy, highlighting its potential to enhance road safety by reducing fatigue-related accidents. Future research could investigate the integration of other sensor modalities for improved detection, as well as the deployment of the system in real-world driving scenarios for comprehensive validation.