According to the National Highway Traffic Safety Administration (NHTSA), road fatalities increased by 5.6% from calendar year 2015 to 2016, when 37,461 lives were lost on U.S. roads. A major contributing factor in fatal roadway accidents is distracted driving. Driving requires the driver's continuous focus. Research studies have shown that multitasking degrades driver attention, resulting in slower reaction time, decreased situational awareness, impaired judgment, and narrowed visual scanning [1]. Distraction occurs when drivers divert their attention from the task of driving to a secondary activity such as having a phone conversation, texting, or using the infotainment system [2]. The most common distracting secondary task during driving is the use of a personal cell phone for calling or texting. It is crucial to detect driver distraction and notify the driver at an early stage in order to minimize the risk of road accidents. Many research investigations have been conducted to develop reliable feedback systems that alert drivers to distraction. Many previous works employed techniques based on eyelid closure and movement tracking [3], lane tracking [4], and image processing of periodically captured video images of the driver [5] to identify driver inattention. Although these methods achieved good performance, they suffer from issues such as privacy risks and delayed detection and response, since distraction is detected only once its effect becomes visually noticeable. These limitations can be overcome by continuously monitoring physiological signals such as the electroencephalogram (EEG) rather than using cameras. EEG-based systems generate state-of-the-art results and are comprehensive and reliable [6]. However, the complexity of the setup for collecting and analyzing the data is one of the major limitations of EEG, making such systems expensive and intrusive to implement [7, 8]. Galvanic skin response (GSR), on the other hand, is a minimally intrusive modality that can be sensed on the wrist and fingers and recorded easily [9, 10]. GSR, also known as skin conductance (SC), is one of the most sensitive markers of emotional arousal [11]. GSR measures the body's unconscious response to different stimuli through skin conductance. Emotional stimulation triggers changes in skin conductance in the hand and foot regions [11, 12]. More intense arousal produces higher skin conductance. Skin conductance is controlled autonomously by sympathetic activity, which drives human behavior and cognitive and emotional states on a subconscious level.
Several studies have used synchronously recorded GSR signals to examine the impact of changes in cognitive state. In [12], the authors used GSR as an index of cognitive load to evaluate users' stress due to workload while performing reading and arithmetic tasks. They explored temporal and spectral features and concluded that spectral features were more promising than temporal features for measuring cognitive workload. In [13], a novel method for analyzing skin conductance (SC) using the short-time Fourier transform (STFT) was employed to estimate mental workload with temporal bandwidth high enough to be useful for augmented cognition applications. Graphical analysis of the STFT showed a notable increase in the power spectrum across a range of frequencies directly following fault events. GSR was used in [14] for emotion recognition by extracting time-domain and wavelet-based features using various window lengths. A random forest classifier characterized valence and arousal satisfactorily. In [15], a system for human emotion recognition that automatically selects GSR features was proposed. Thirty features were extracted, and covariance-based feature selection was applied to obtain an optimized feature set that better characterizes human emotions. A support vector machine (SVM) was used for emotion recognition, achieving an accuracy of more than 66.67%. These works provide sufficient support for considering GSR a reliable measure for identifying and characterizing mental workload. However, very few investigations have used GSR to detect cognitive workload or distraction during naturalistic driving. The authors of [16] used physiological signals such as the electrocardiogram, galvanic skin response, and respiration to develop a novel system for stress detection during naturalistic driving. Features were extracted mainly from the time, spectral, and wavelet domains and were generated for 10-s intervals of data. Stress was detected using kernel-based classifiers. The results of that study support the use of physiological measures in in-vehicle intelligent systems that assist drivers through early detection of stress. In that study, the raw GSR signal was used in combination with other physiological measures to detect stress, and only time-domain features of the raw GSR were explored. In our previous work [10], we considered raw GSR signals for a preliminary analysis of driver distraction during a naturalistic driving experiment. We focused solely on two scenarios: (i) normal driving (non-distracted state) and (ii) driving while having an engaging phone conversation (distracted state). We then extracted standard statistical measures and used a binary SVM for distraction detection. Our aim was to analyze the discriminative power of the raw GSR space between the normal and distracted driver states. We evaluated the detection model on six subjects and achieved an average detection accuracy of 91%.
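The windowed feature extraction that recurs in the studies above (for example, features generated for 10-s intervals, or temporal versus spectral measures of GSR) can be illustrated with a minimal Python sketch. It is not the implementation of any cited work: the 4 Hz sampling rate, the synthetic signal, the 0.5 Hz band limit, the choice of four features, and the helper names `segment` and `window_features` are all assumptions made for illustration.

```python
import numpy as np


def segment(signal, fs, win_s=10.0):
    """Split a 1-D signal into non-overlapping windows of win_s seconds."""
    n = int(fs * win_s)              # samples per window
    n_win = len(signal) // n         # drop any incomplete trailing window
    return signal[: n_win * n].reshape(n_win, n)


def window_features(win, fs):
    """Return [mean, std, linear slope, low-band spectral power] for one window."""
    t = np.arange(len(win)) / fs
    slope = np.polyfit(t, win, 1)[0]                     # temporal trend
    spectrum = np.abs(np.fft.rfft(win - win.mean())) ** 2
    freqs = np.fft.rfftfreq(len(win), d=1.0 / fs)
    # Assumed band of interest (0, 0.5] Hz; the limit is illustrative only.
    band_power = spectrum[(freqs > 0) & (freqs <= 0.5)].sum()
    return np.array([win.mean(), win.std(), slope, band_power])


fs = 4.0                                  # assumed sampling rate in Hz
rng = np.random.default_rng(0)
t = np.arange(0, 60, 1 / fs)              # 60 s of synthetic data
gsr = 2.0 + 0.01 * t + 0.05 * rng.standard_normal(t.size)

windows = segment(gsr, fs)
features = np.vstack([window_features(w, fs) for w in windows])
print(features.shape)  # (6, 4): six 10-s windows, four features each
```

Each row of `features` could then serve as one input vector to a classifier such as a binary SVM.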
Our aim in this paper is to design a system that identifies the impact of the secondary tasks of calling and texting on drivers using a continuous measure of the phasic GSR signal during a naturalistic driving experiment. In our experiments, 10 driver subjects wore a wristband GSR sensor during real driving. We investigated three scenarios: (1) normal driving, with attention focused on the primary task of driving; (2) phone-distracted driving, while having an engaging phone conversation; and (3) text-distracted driving, while writing and sending texts. We hypothesize that calling is a cognitive distraction, whereas texting represents cognitive and visual distraction at the same time. We aim to evaluate GSR for identifying distraction on the edge using short-term segments of GSR. The collected GSR data were decomposed into phasic and tonic components using continuous decomposition analysis (CDA) [17]. We then conducted a high-resolution spectro-temporal analysis of the decomposed signals, and the continuous phasic component of GSR, which contained the most discriminative information, was selected for subsequent analysis. We then extracted several spectral and temporal measures that characterize the phasic GSR signal in relation to the distracted scenarios. We employed linear and kernel-based support vector machines (SVM) with 10-fold cross-validation (10-CV) to generate identification results. In this evaluation, phasic GSR showed promise as a reliable indicator of driver distraction, achieving an overall average accuracy of 94.81% in identifying distraction elements under naturalistic driving conditions. Because the input feature space is constructed manually, its redundancy and computational complexity might reduce the accuracy and lengthen the response time of distraction identification in the generalization phase. Therefore, we employed support vector machine recursive feature elimination (SVM-RFE) [18] to remove the redundancies for more efficient processing on the edge. We used SVM-RFE to (1) rank the discriminative features for the subject population and (2) create a reduced feature subset with the highest distraction identification accuracy. Our experimental results using SVM-RFE demonstrated a marginal decrease in accuracy while reducing the computational complexity and redundancy of the input space, supporting early notification of the distraction state to the driver.
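The classification and feature-ranking steps outlined above (SVM with 10-fold cross-validation, followed by SVM-RFE) can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the feature matrix is synthetic, and the number of features (12), the subset size (3), the linear kernel, and `C=1.0` are placeholder choices. Only a linear SVM is shown because RFE ranks features by the model's weight vector.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a phasic-GSR feature matrix (assumption):
# three classes (normal, call, text), 12 features, only the first 3 informative.
rng = np.random.default_rng(0)
n_per_class, n_features = 60, 12
y = np.repeat([0, 1, 2], n_per_class)
X = rng.standard_normal((y.size, n_features))
X[:, :3] += 3.0 * y[:, None]

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# Baseline: linear SVM on the full feature set, 10-fold cross-validation.
full_model = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))
full_acc = cross_val_score(full_model, X, y, cv=cv).mean()

# SVM-RFE inside the pipeline, so feature selection is refit in every fold
# and the held-out fold never influences which features are kept.
reduced_model = make_pipeline(
    StandardScaler(),
    RFE(SVC(kernel="linear", C=1.0), n_features_to_select=3, step=1),
    SVC(kernel="linear", C=1.0),
)
reduced_acc = cross_val_score(reduced_model, X, y, cv=cv).mean()

# Feature ranking on all data (rank 1 = retained in the final subset).
selector = RFE(SVC(kernel="linear", C=1.0), n_features_to_select=3, step=1)
selector.fit(StandardScaler().fit_transform(X), y)
ranking = selector.ranking_

print(f"full-set accuracy:    {full_acc:.3f}")
print(f"reduced-set accuracy: {reduced_acc:.3f}")
print("feature ranking:", ranking)
```

Comparing `full_acc` with `reduced_acc` mirrors the trade-off described above between identification accuracy and the size of the input space.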