1. Introduction
Stress is associated with an increased risk of numerous diseases [1,2,3,4,5,6,7] and mental disorders [8,9]. Besides these adverse effects on individuals, it also imposes a large financial burden on society via absenteeism, healthcare costs, and productivity loss [10,11]. Personalized just-in-time interventions may be able to prevent or alleviate some of these burdens [12]. To do this, either the negative impact of stress or a decreased resilience to cope with stress should be detected early, preferably via unobtrusive monitoring. For instance, unobtrusive detection of the negative impact of stress (e.g., on sleep or physiological systems) in an early state may help increase awareness that current circumstances may be causing wear and tear on bodily systems (allostatic load) that may be contributing to health-related problems if sustained over time [13]. Similarly, recognition of the potential depletion of resources that are needed for resiliently coping with challenges could be used to trigger feedback to take it easy that day and avoid overly challenging circumstances where possible. Recent developments in wearable sensor technology introduce promising opportunities for this type of unobtrusive monitoring [14,15].
When the first modern wearables came to market around 2009 (e.g., the Fitbit Classic), these devices initially became popular as pedometers or activity trackers but were already able to estimate sleep duration via accelerometry as well [16]. Since then, consumer wearable-based sleep tracking has improved to a point where it is considered proficient for measuring the Total Sleep Time (TST), while the detection of sleep stages needs further work [17]. Sleep deprivation is known to have a reciprocal relationship with stress, meaning that it can both cause and be caused by stress [18]. Longitudinal studies with repeated daily measures confirm this bidirectional association [19,20,21] but tend to rely on subjective TST measures (e.g., measured via questionnaires) and need verification using objective sleep measurements [22]. Wearable-based research can therefore contribute to this body of knowledge and explore the potential of wearables to unobtrusively monitor for signs of the negative impact of stress or decreased resilience.
Besides behavioral outcomes such as physical activity and sleep, around 2015 (e.g., the Fitbit Charge HR) consumer wearables started measuring heart rate after photoplethysmography (PPG) sensors were included [23]. Today, PPG sensors are also used to track physiological outcomes such as heart rate, blood oxygen saturation, blood pressure, and respiration [24]. Perhaps the most important PPG-based innovation in the context of stress and resilience is the measurement of Heart Rate Variability (HRV), which can now be accurately measured using wearables or even camera-based smartphone apps in a resting state or during sleep [25]. HRV is a measure of the variation in the time between heartbeats and is a proxy for autonomic nervous system functioning [26]. HRV acutely declines during stress [27] and afterward can remain suppressed during subsequent sleep [28,29]. Conversely, individuals with a low resting HRV are more likely to interpret seemingly mild stimuli as significant stressors [30,31,32] and have suboptimal emotion regulation [33,34]. Although these findings are based on population studies that investigated between-subject differences, the reciprocal nature of these findings illustrates that an initial decline in resting HRV could potentially cascade into subsequent days and thus have downstream effects.
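To make "variation in the time between heartbeats" concrete, the sketch below computes RMSSD (root mean square of successive differences), one widely used time-domain HRV metric. This section does not state which HRV metric the wearable reports, so the choice of RMSSD, the function name `rmssd`, and the example inter-beat intervals are illustrative assumptions rather than the study's method.

```python
import math


def rmssd(ibi_ms):
    """Root mean square of successive differences between inter-beat intervals (ms)."""
    if len(ibi_ms) < 2:
        raise ValueError("need at least two inter-beat intervals")
    # Differences between each interval and the one that follows it.
    diffs = [b - a for a, b in zip(ibi_ms, ibi_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical inter-beat intervals in milliseconds (not study data).
steady = [800, 802, 799, 801, 800]      # nearly constant rhythm
variable = [800, 850, 780, 860, 790]    # larger beat-to-beat changes

low_hrv = rmssd(steady)
high_hrv = rmssd(variable)
```

In this toy example the more variable series yields the higher RMSSD, which is the sense in which a "low resting HRV" reflects small beat-to-beat changes.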
A recent paper introduced a conceptual model that describes a potential underlying mechanism for such a cascading effect of an initial decline in resting HRV [14]. The model suggests that resting HRV buffers against the impact of demands on stress by making potentially stressful situations seem less stressful [30,31,32], as well as against the impact of stress on mental exhaustion via more optimal emotion regulation [33,34]. Since this model also proposes that the need for recovery (e.g., increased mental exhaustion and/or decreased vigor) negatively influences resting HRV [28,29], a potential negative feedback loop is formed. This aligns with the conservation of resources theory, which states that since resources are needed to cope with demands, an initial loss of resources may result in a loss spiral [35]. Finally, the model hypothesizes that sleep is both negatively impacted by stress [18,19,20,21] and buffers against the negative impact of an increased need for recovery on resting HRV due to its restorative properties [36,37]. A study was then performed to test these hypotheses by utilizing wearables to measure TST and resting HRV, as well as an Ecological Momentary Assessment (EMA) smartphone app to measure subjective demands, stress, and mental exhaustion [38]. The study confirmed that resting HRV is both negatively impacted by mental exhaustion and buffers against the negative associations between demands and stress, as well as stress and exhaustion. Day-to-day changes in resting HRV may therefore be both indicative of the negative impact of stress and predictive of stress-resilience, potentially even on a multi-day level. Further exploration of these potential multi-day bidirectional associations will improve our understanding of the degree to which day-to-day changes in wearable-measured resting HRV can be interpreted as potentially stress-related and the degree to which they should be expected to reflect a state of lowered resilience.
To summarize: wearable-measured sleep and resting HRV have both been bidirectionally associated with subjective stress-related outcomes, but within-subject research investigating the potential patterns in multi-day associations in a real-world context is lacking. Increased insight into the degree to which these relationships are consistently observed in individuals may help improve models for the early recognition of the negative impact of stress and of lowered resilience. Such insights could contribute to the development of automated resilience interventions that may help to prevent stress-related problems. These interventions are especially relevant for individuals working in safety-critical professions, such as police officers [39]. Therefore, this study explores whether wearable-measured TST and resting HRV in police officers (1) can be predicted by stress-related EMA outcomes (demands, stress, mental exhaustion and vigor) in the preceding days, and (2) predict stress-related EMA outcomes in the subsequent days.
4. Discussion
This study aimed to explore to what degree wearable-measured sleep and resting HRV in police officers (1) can be predicted by stress-related EMA outcomes in the preceding days, and (2) predict stress-related EMA outcomes in the subsequent days. A time series analysis of eight participants showed associations of modest strength in both directions and indicated that TST and resting HRV were more consistent predictors of the next day’s perceived demands, stress, mental exhaustion, and vigor than the other way around. Demands were a negative predictor of TST in one participant and of resting HRV in another. Mental exhaustion predicted both resting HRV and TST in the same participant. TST in particular seemed a strong predictor of stress-related EMA outcomes. TST negatively predicted demands in two participants, stress in three participants, and mental exhaustion in five participants, and positively predicted vigor in five participants. Resting HRV negatively predicted demands in two participants, and both stress and mental exhaustion in one participant.
This study led to three key findings that will first be reflected upon, followed by a discussion of the strengths and limitations of the study, and finally a summary of the main conclusions and recommendations for future research.
4.1. Associations between TST, HRV and EMA Outcomes Are Not Consistently Observed
Although TST was a negative predictor of mental exhaustion and a positive predictor of vigor in the majority (62.5%) of the participants, no association between a wearable- and an EMA-based item was consistently observed in all participants. No convincing explanations for the prevalence of these associations were identified after inspection of differences in the participant characteristics (Table 1).
The number of participants in this study (n = 8) was too low to meaningfully assess to what extent between-subject differences in participant characteristics could predict the prevalence of these associations. Future studies with a larger sample size are recommended to explore if the occurrence or strength of these associations may be explained by participant characteristics, for instance via multilevel VAR [70]. If these differences can be explained in future studies, they may be used to further personalize wearable-based models for stress-resilience.
It is also possible that the strength of these associations does not (only) depend on differences between individuals, but (also) on differences within individuals or in their environment. However, it may be difficult to determine beforehand what these influencing factors may be. A first step could be to explore whether the strength of these relationships changes over time, for example via time-varying VAR models [71]. Detecting such changes over time is particularly feasible in datasets with a larger number of observations and/or more granular data. If these associations do change over time, they may actually be relevant for all participants, but only under certain circumstances. Depending on their outcomes, such studies could provide new insights into the internal or external factors that determine when these associations are observed.
4.2. The Impact of Changes in HRV Appears to Be More Abiding than That of Changes in TST
The IRF visualizations in Figure 1 demonstrated that a demand-induced decline of resting HRV appears to have a longer recovery time (5–6 days) than a demand-induced decline of TST (2–3 days). Similarly, the impact of a change in resting HRV on stress-related EMA outcomes (Figure 3) appears to also be more long-lasting (1–3 days) than that of a change in TST (1 day) (Figure 2). This was attributed to the significant autoregression component that was observed in resting HRV, but not in TST. The strong autoregression component in the resting HRV model means that resting HRV values are relatively likely to be similar to those of the previous day(s). Therefore, a demand-induced decline in resting HRV (analysis 1) may take several days to recover from. Similarly, the impact of a decline in resting HRV on demands, stress, mental exhaustion, and vigor is likely to spill over into subsequent days, as it means that resting HRV is likely to remain suppressed for another few days.
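The link between a strong autoregression component and a slow recovery can be illustrated with a univariate AR(1) process, which is a deliberate simplification of the VAR models used in the study. In this sketch, a one-unit shock shrinks by a factor `phi` each day, so a larger coefficient means a longer-lasting effect. The function names and the coefficients 0.6 and 0.1 are made up for illustration and are not estimates from the study.

```python
def impulse_response(phi, horizon):
    """Response of an AR(1) series y_t = phi * y_{t-1} + e_t to a unit shock at day 0."""
    return [phi ** h for h in range(horizon + 1)]


def days_until_below(phi, threshold=0.05):
    """First day on which the remaining effect of a unit shock drops below threshold."""
    if not 0 <= phi < 1:
        raise ValueError("phi must be in [0, 1) for a decaying response")
    day, effect = 0, 1.0
    while effect >= threshold:
        day += 1
        effect *= phi  # each day retains a fraction phi of the previous day's effect
    return day


# Hypothetical coefficients: strong vs. weak day-to-day persistence.
strong_persistence_days = days_until_below(0.6)
weak_persistence_days = days_until_below(0.1)
```

With these made-up values, the shock takes 6 days to fall below 5% of its initial size when `phi` is 0.6, and 2 days when `phi` is 0.1. In a full VAR model the response also depends on the cross-lagged coefficients, which this sketch ignores.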
This observation may be explained by the fundamentally different nature of the concepts resting HRV and TST. Resting HRV is a quantification of a physiological state that is continuously striving to maintain stability (homeostasis) despite disruptive challenges (allostasis) [13]. The recovery from a stressor that has a physiological impact (allostatic load) could take longer depending on the intensity and frequency of the stressor, as well as the quality and quantity of the subsequent recovery [36,37]. As such, recovery from a large decline in resting HRV can logically be expected to take some time as well. TST, on the other hand, is a quantification of the recovery process itself. Stress can negatively influence TST on the following night [18,19,20,21] and can therefore also impact TST on subsequent nights in the case of a recurring or sustained stressor. However, when this is not the case, it is also possible that the individual compensates for the previous sleep loss via recovery sleep [72], which would mean that TST on a subsequent night is no longer suppressed but actually increased. From this perspective, TST values can be expected to be more volatile than resting HRV values and thus have a weaker autoregression component. However, it is possible that changes in TST do have a longer-lasting impact on relevant underlying (psycho)physiological states such as vigor, which was observed to consistently have a significant autoregression component (Table 4).
The seemingly more abiding impact of a change in resting HRV on the resting HRV of the subsequent days may also be influenced by the development of a negative feedback loop. A previous study showed that evening mental exhaustion negatively impacted subsequent resting HRV and that resting HRV itself buffered against the positive association between demands and stress, as well as between stress and mental exhaustion [38]. This aligns with the Conservation of Resources Theory, which describes that an initial loss of resources could lead to a negative feedback loop. This means that fewer resources are available to handle upcoming challenges, which leads to lower resilience [35]. However, in the current study, no bidirectional association between a stress-related EMA item and resting HRV was observed within a single participant. Future studies with a larger sample are needed to increase insight into the multi-day impact of stress-related changes in resting HRV.
4.3. TST and HRV Are More Consistent Predictors of Stress-Related Outcomes than Vice Versa
These findings indicate that wearable-measured TST and HRV seemed better predictors of stress-related EMA outcomes than the other way around. EMA-based predictions of TST and resting HRV were only observed in two participants, who had relatively large samples of observations (N = 385 and N = 283) compared to the median (N = 144). Additionally, these relationships were not consistently observed in both participants. These differences cannot merely be explained by statistical power. Nevertheless, these models explained a modest amount of variance in TST (9%) and resting HRV (22%) in some participants. It is possible that these relationships are relatively small in nature and can only be observed in larger samples.
The finding that TST predicts stress-related outcomes more consistently than it is predicted by them aligns with prior research [22]. For instance, a lower TST has consistently been shown to predict increased stress [19,20,21,73]. Conversely, in the same studies, the opposite direction (stress-related outcomes predicting TST) is regularly associated with smaller effect sizes [19,21], but in another study, TST was not associated with stress-related outcomes [73].
Similar scientific findings on the combination of both the predictive power and predictability of resting HRV in the context of stress-related outcomes are limited. However, the current findings do align with prior research, which has shown that stress-related outcomes negatively affect resting HRV [27,28,38] and that a relatively lower resting HRV than an individual’s normal resting HRV can negatively impact stress-related outcomes on the following day [38,74].
One of the implications of this finding is that a decrease in wearable-measured TST or resting HRV does not necessarily point toward the occurrence of stress-related outcomes. Although the observed decrease in TST or resting HRV might have been caused by preceding high demands or stress, this outcome may have been confounded by other factors. In situations where sudden extreme demands or stress occur, this might in some cases directly cause a decreased TST or resting HRV. However, in these circumstances, the wearable user is likely already aware of the impact of such events and is therefore less likely to need objective feedback to confirm this short-term effect.
Based on these findings, wearable-measured TST and resting HRV are not necessarily usable as a direct indication of the negative impact of stress but hold more promise to function as potential predictors to estimate one’s resilience. For instance, these insights could be implemented in resilience interventions in the form of a readiness score that gives the user feedback on his or her expected readiness to handle mental and physical challenges that day [75]. Ideally, these factors will be expanded upon in future research (e.g., by also assessing behavioral outcomes such as smartphone usage, geolocation, or patterns in communication) that also explores different modeling approaches (e.g., machine learning) in order to improve the performance of these models.
4.4. Strengths and Limitations of the Current Study
This study applied a novel research design and recruited a number of motivated participants, which resulted in a series (n = 8) of sizable datasets (N = 125–386) with mostly (80.7–96.8%) complete observations. By utilizing a consumer-available wearable that is validated for both TST and resting HRV measurements to collect observational data in a real-life environment, the generalizability of the findings to practical settings is relatively good. However, three limitations of the current study should also be considered during the interpretation of the presented results.
First, the multiple n-of-1 study design with a small number of participants (n) but a large number of observations per participant (N) was optimized as a first exploration of the potential existence of the hypothesized multi-day associations at a within-subject level based on high-quality data but limits the generalizability of the current findings to a broader target population. Therefore, future research with a larger number of participants is needed to increase confidence that the found associations are indeed relevant for larger groups of people. Future research is also needed to better understand why the identified relationships are prevalent in some cases, but not in others. For instance, it is possible that studies with a larger number of observations per participant can unveil to what extent associations with a smaller strength can be observed in other participants, and to what extent the strength of these associations may change over time (e.g., via time-varying VAR).
Second, the inclusion of healthy participants and the collection of data during the COVID-19 lockdown might have affected the participants’ perceptions of demanding and stressful situations. Their daily practice may not have been very demanding, which may have resulted in relatively low variance in the data. This aligns with the findings of a study on 2567 European police officers, which reported decreased strain during the pandemic [76]. The analyzed participants all scored relatively well on the mental well-being questionnaires (Table 1). Another article that was based on data from this same study population showed that some participants reported moderately elevated stress and somatization throughout the study period, but that there were no clinically relevant signs of anxiety and depression [41]. Future studies with a sample that experiences greater mental strain need to verify the current findings under more challenging conditions.
Finally, some of the statistical assumptions of the created VAR models were technically violated. Most notably, none of the VAR models had normally distributed residuals, which was likely the result of sometimes skewed or bimodally distributed EMA items. Since simulation studies have shown that this assumption is particularly relevant when relatively small samples are assessed but not problematic when a sample of at least 100 observations is analyzed [70], this was not considered to be a problem for the interpretation of the results. The VAR model of participant 5 was also found to have autocorrelated residuals, which could not be resolved (e.g., by adding additional lags). Although this does not necessarily limit the interpretability of the model and related findings, it does show that the model is incomplete, and at least one unobserved but relevant factor was not included in the present study.
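As an illustration of what "autocorrelated residuals" means in practice, the sketch below computes the Ljung-Box Q statistic, one common screening statistic for residual autocorrelation. This section does not say which test the authors applied, so the choice of Ljung-Box, the function names, and the synthetic residual series are assumptions. For residuals of a fitted model the degrees of freedom are usually adjusted for the number of estimated parameters, which this simplified sketch ignores.

```python
def autocorr(x, lag):
    """Sample autocorrelation of the series x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom


def ljung_box_q(x, max_lag):
    """Ljung-Box Q statistic summed over lags 1..max_lag."""
    n = len(x)
    weighted = sum(autocorr(x, k) ** 2 / (n - k) for k in range(1, max_lag + 1))
    return n * (n + 2) * weighted


# Synthetic residuals (not study data): runs of ten equal values in a row
# make neighboring days resemble each other, i.e. strong lag-1 autocorrelation.
blocky_residuals = ([1.0] * 10 + [-1.0] * 10) * 5  # 100 values

CHI2_95_DF1 = 3.841  # 95% critical value of a chi-square distribution, 1 degree of freedom
is_flagged = ljung_box_q(blocky_residuals, max_lag=1) > CHI2_95_DF1
```

A flagged series suggests that structure remains in the residuals that the model did not capture, which is consistent with the interpretation above that a relevant factor may be missing.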
5. Conclusions
This multiple n-of-1 study showed that in relatively healthy police officers, demands were occasionally observed to be a negative predictor of wearable-measured TST and resting HRV. TST and resting HRV were more regularly observed to be negative predictors of demands, stress, or mental exhaustion, whereas TST also positively predicted vigor in several participants. The presented results illustrate that caution is needed when interpreting changes in TST and resting HRV to be potentially stress-related and that TST and resting HRV are more likely to be useful as predictors of stress-resilience (e.g., expressed as a readiness score).
However, since the identified associations were not consistently observed amongst participants, further research is necessary to better understand the underlying mechanism. For instance, future studies with a larger sample of participants, which is also needed to improve the generalizability of the current findings, could consider assessing if these between-subject differences could be explained by participant characteristics (e.g., via multilevel VAR). Another direction could be to explore if the strength of these associations changes over time in samples with a larger number of observations or more granular data (e.g., via time-varying VAR). Finally, future studies should explore if predictive models with a higher explained variance can be achieved by including additional data sources (e.g., smartphone usage, geolocation, or patterns in communication) or utilizing more inductive methods (e.g., machine learning approaches).