1. Introduction
Sleep is a dynamic process and one of the most fundamental physical requirements for human survival [
1,
2]. Electroencephalography (EEG) is the method most commonly used for its examination [
3]. Four sleep stages are commonly detected according to the more recent guidelines published by the American Academy of Sleep Medicine (AASM) [
4]: three non-rapid eye movement stages (NREM1, NREM2, and NREM3) and rapid eye movement (REM) sleep [
5,
6]. Deep sleep (NREM3) plays an important role in memory consolidation. NREM3 is characterized by slow wave activity (SWA) with frequencies up to 4 Hz. Specifically, so-called slow oscillations (SOs) have a significant impact on memory [
5,
7,
8,
9,
10]. SOs are synchronized EEG waves with a frequency from 0.5 Hz to 1.0 Hz [
9]. During SOs, a neocortical-hippocampal dialogue occurs, which allows for memory replay and redistribution into the long-term neocortical memory stores [
11,
12,
13,
14]. They predominate in deep sleep [
15,
16].
To enhance memory consolidation, a number of studies have been conducted to explore methods to improve SWA during sleep. Attempts to increase memory consolidation by stimulating EEG signals have used electrical, olfactory, and acoustic stimulation [
17,
18]. Auditory stimulation synchronized to the EEG signal modulates SOs and improves memory consolidation [
19,
20]. For the right effect, it is important to stimulate the SO waves in their rising phase (upward going SO slope, going towards the up state) [
21]. A number of studies have examined memory consolidation using synchronized auditory stimulation. For example, phase-controlled stimulation with two stimuli was used in the studies of Ngo et al. and Besedovsky et al. [
22,
23,
24]. The first step was the detection of an SO negative peak, followed by the first auditory stimulation with an individually set time delay; the second stimulation was delayed by 1.075 s with respect to the first one. Results from Ngo et al. [
22] were compared with those of a Thalamocortical Neural Mass Model in the study of Costa et al. [
25]. The study [
26] compared the precision of stimulation with [
22] and the authors’ implementation of the phase-locked loop (PLL) algorithm. A method based on the PLL was used for EEG signal stimulation in the studies of Papalambros et al. and Ong et al. [
27,
28]. Various methods are being tested for stimulation efficacy when used on different populations, such as the elderly, insomniacs, and those with psychiatric and cognitive disorders. Indeed, the optimal timing has been recently examined to reach appropriate modulation of the SOs [
29]. The literature includes a comparison of different stimulation methods, mostly on healthy young volunteers. The open-loop method was used in the study of Weigenand et al. and involved three pulses with a 1.075 s interstimulus interval (ISI), followed by a 5–9 s pause before the next three pulses [
30]. This study [
30] tested the acoustic stimulation on healthy young adult subjects. Single-pulse auditory stimuli, adjusted and targeted by an unsupervised algorithm to be phase-locked to the negative peak of slow waves, were used in the study of Leminen et al. on healthy young adults [
31]. Debellemaniere et al. used linear regression fitting of a sine wave to a filtered signal to stimulate SOs in their study on young adults [
32]. Twenty healthy young subjects were tested using the PLL method with an approximately 1 s ISI followed by a 5–6 s pause in the study of Grimaldi et al. [
33]. The open-loop method with 12 pulses and a 1 s ISI followed by a 15 s pause was used in the study of Simor et al., which was performed on healthy young adults [
34]. Closed-loop acoustic stimulation during sleep was used in the study of Fattinger et al. on children with epilepsy [
35].
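The two-stimulus, phase-controlled protocol described above (negative-peak detection, a delayed first stimulus, and a second stimulus 1.075 s later) can be sketched as follows. This is a minimal illustration under stated assumptions and not the code of any cited study: the detection threshold of -80 µV, the 0.5 s individual delay, the 2.5 s pause, and both function names are hypothetical; only the 1.075 s interval comes from the text.

```python
def find_negative_peaks(signal, threshold=-80.0):
    """Return indices of local minima below `threshold` (assumed value, in µV)."""
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_min = signal[i] < signal[i - 1] and signal[i] <= signal[i + 1]
        if signal[i] < threshold and is_local_min:
            peaks.append(i)
    return peaks


def schedule_stimuli(peak_times, delay=0.5, isi=1.075, pause=2.5):
    """Schedule two stimuli per detected negative peak (all times in seconds).

    delay: assumed individual delay from the negative peak to the first stimulus
    isi:   interval between the first and second stimulus (1.075 s in the text)
    pause: assumed detection-free pause after the second stimulus
    """
    schedule = []
    next_allowed = float("-inf")
    for t in peak_times:
        if t < next_allowed:
            continue  # this detection falls into the pause after an earlier pair
        first = t + delay
        second = first + isi
        schedule.append((first, second))
        next_allowed = second + pause
    return schedule


# Peaks detected at 1.0 s, 2.0 s and 6.0 s; the one at 2.0 s is skipped
# because it falls inside the pause that follows the first stimulus pair.
pairs = schedule_stimuli([1.0, 2.0, 6.0])
```

In a real-time setting the peak detector would run on a causally filtered signal, so the effective delay would also have to absorb the filter latency.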
Our study investigates chronic insomnia patients. Insomnia is a sleep disorder in which individuals complain of difficulties in falling asleep, maintaining sleep, or waking too early, with symptoms occurring regularly for at least four weeks [
36], and it is a common problem in elderly people [
37]. It is the most common sleep disorder; in the adult population, 30–48% have reported at least one symptom related to insomnia at some stage of their lives [
38]. At the same time, insomnia has a significant impact on the quality of life. It is a significant risk factor for cardiovascular disease, hypertension, and type 2 diabetes and may lead to lower productivity at work or a higher risk of workplace accidents [
36,
38]. There are still many unknowns in the pathophysiology of insomnia due to its broad definition and clinical heterogeneity [
38]. It is generally accepted that the pathophysiology of insomnia could be characterized by a lack of SWA. It was found by Merica et al. that the spectral power in chronic insomnia patients is lower for delta and theta band frequencies [
39]. In the same study, it was found that beta band power spectral density was higher in chronic insomnia patients during the REM sleep phase [
39]. A similar power spectral density was found in elderly subjects in a study by Carrier et al. [
40]. The authors of that study found that, with age, there was a decrease in the power spectral density in the SWA and in the theta and sigma bands during sleep [
40]. In contrast, in the beta band, power spectral density during sleep increased with age [
40].
Generally, there are two acoustic stimulation methods (fixed-step and PLL-based) applied across studies that have not been quantitatively compared yet. A comprehensive explanation of both methods and their rigorous comparison is essential for the further research of acoustic stimulation. Most of the studies report results in healthy young subjects [
41]. However, some studies have presented results in elderly or middle-aged subjects. Results were presented using only one implementation of the PLL method. For example, the study of Papalambros et al. presented results in elderly subjects [
27], in patients with amnestic mild cognitive impairment [
42] and in middle-aged adults [
19,
43]. One study, specifically that of Wunderlin et al. [
44], which summarized the results of 11 experiments, shows that the effectiveness of current implementations is not yet high enough for the widespread use of SWA stimulation to be considered. Furthermore, the target group for expanding use would be predominantly older individuals in whom it is difficult to physiologically detect continuous long-term deep sleep [
45]. Therefore, it is essential to search for sensitive methods for both the detection and analysis of SWA. For this reason, we focused our research on the following points: quantitatively comparing two types of methods (fixed-step and two implementations of the PLL method) used for stimulating SWA and offering solutions for further testing.
An open question of whether induced or modulated SWA is similar to naturally occurring SOs was identified to be of high importance for clinical translation of the stimulation methods in a recent systematic review [
46]. The effect of stimulation was monitored, and advanced metrics were proposed in this study on chronic insomnia patients to help elucidate the exact mechanism of stimulation.
4. Discussion
Research on the real-time stimulation of SWA is already under way, and several independent scientific teams have described their implementations and initial results. The aim of our paper was to provide an objective, quantitative comparison of the two most commonly used approaches to stimulation. Our aim was also to extend the family of acoustic stimulation effect metrics through a sensitive and well-established concept. This will help in understanding the mechanisms underlying the stimulation efficacy and basic principles, which are not yet known or clearly defined in the literature.
The first major issue of SWA stimulation is the objectification of subject measurements. Because stimulation requires a real-time response, FIR filters with an ideally steep transition band cannot be used, and it is generally difficult to find a metric that can reliably detect slow oscillations across different subjects [
46]. There is also an issue with adjusting the sound level. In this study, the threshold was set individually so that a subject could hear it well but not be disturbed by it. For objectification, it would be appropriate to determine a threshold according to the “standardized” procedure. This may not be a large issue in younger subjects. However, in older adults, the difference in hearing quality is very high. An algorithm that would systematically determine the optimal sound level for each individual based on testing before the measurement itself is essential for future clinical studies.
SWA detection can be a problem in the case of chronic insomnia patients or, for example, the elderly, due to sleep variability [
60]. Insomnia is also a common problem in elderly people [
61]. The NREM3 phase is not homogeneous, and fluctuating sleep occurs more frequently in these cases. In our laboratory, the proband’s sleep was continuously monitored by a laboratory technician. For this reason, we switched on the detection only after the technician had visually identified stable deep sleep, a procedure that we recommend for future studies.
There is a trend in the literature to replace the fixed-step method [
20] with PLL methods so as to effectively stimulate the following SWA (not only the first wave after detection). There have been many implementation types of PLL, such as [
26,
54,
62]. Two types of PLL were implemented and analyzed in this study. Specifically, PLL implementation with the integral part and PLL-XOR implementation were tested. These two types are commonly used as a digital PLL implementation.
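The principle of an XOR phase detector, on which a PLL-XOR implementation relies, can be illustrated with a short sketch. It is not the implementation tested in this study: the function names, the sign-based binarization, and the test signals are assumptions made for illustration. Both signals are reduced to their sign, and the fraction of samples in which the signs differ grows linearly with the absolute phase difference (0 for in-phase, 1 for antiphase signals).

```python
import math


def make_sine(freq, phase, fs, duration):
    """Sampled sine wave; a stand-in for the EEG and the PLL output."""
    n = int(fs * duration)
    return [math.sin(2.0 * math.pi * freq * k / fs + phase) for k in range(n)]


def xor_phase_detector(reference, feedback):
    """Fraction of samples whose signs differ (XOR of the binarized signals)."""
    if len(reference) != len(feedback):
        raise ValueError("signals must have the same length")
    mismatches = sum((r >= 0) != (f >= 0) for r, f in zip(reference, feedback))
    return mismatches / len(reference)


def xor_output_to_phase(xor_output):
    """Absolute phase difference in radians implied by the XOR output."""
    return math.pi * xor_output


fs = 1000.0
reference = make_sine(1.0, 0.0, fs, 10.0)
lagging = make_sine(1.0, -math.pi / 2.0, fs, 10.0)  # quarter-cycle lag
phase_estimate = xor_output_to_phase(xor_phase_detector(reference, lagging))
```

Note that the XOR output does not carry the sign of the phase difference, and it ignores amplitude entirely, which is one reason why such a detector can lock onto components other than the intended one.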
The study [
26] also implemented the PLL method with an integral part for SWA stimulation. The authors described their implementation process in detail, but we were unable to replicate some parts. For example, the cut-off frequency of the low-pass filter was set to 0.03 Hz, a value that a standard filter could not realize in real-time processing in our case. Our IIR filters with such a cut-off frequency were unstable, and the FIR filter could not be used for real-time processing due to its slow response. For these reasons, we implemented the PLL method with an integral part based on the Scher implementation [
54], and the parameter set was tuned in this implementation. This implementation principally corresponded to the implementation in the study [
26]. Both implementations applied a low-pass filter after the phase detector and then used an integral form to convert the filtered signal to the current phase of the PLL signal. However, the PLL method in our study also used the proportional form in the calculation. Despite small differences between our method and a previously reported method [
26], both algorithms are similar enough for the purpose of comparison.
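The loop structure described above (a phase detector, a low-pass filter, and proportional plus integral terms that set the PLL frequency) can be sketched as a generic software PLL. This is a textbook-style illustration, not the implementation used in this study or in the cited works: the multiplier phase detector, the one-pole filter, the function name `run_pll`, and all gains are assumptions chosen so that the loop locks onto a clean 1 Hz test sine.

```python
import math


def run_pll(signal, fs, f0, kp, ki, fc):
    """Track `signal` with a proportional-integral PLL.

    f0: free-running frequency (Hz); kp, ki: loop gains; fc: low-pass cut-off (Hz).
    Returns the PLL phase (rad) and instantaneous frequency (Hz) per sample.
    """
    dt = 1.0 / fs
    alpha = dt / (dt + 1.0 / (2.0 * math.pi * fc))  # one-pole low-pass coefficient
    phase, integral, filtered = 0.0, 0.0, 0.0
    phases, freqs = [], []
    for x in signal:
        error = x * math.cos(phase)             # multiplier phase detector
        filtered += alpha * (error - filtered)  # removes the double-frequency term
        integral += ki * filtered * dt          # integral part
        freq = f0 + kp * filtered + integral    # proportional part added here
        phases.append(phase)
        freqs.append(freq)
        phase += 2.0 * math.pi * freq * dt      # advance the oscillator
    return phases, freqs


# Assumed test case: unit-amplitude 1 Hz sine, PLL started at 0.95 Hz.
fs, f_in, n = 100.0, 1.0, 6000
test_signal = [math.sin(2.0 * math.pi * f_in * k / fs) for k in range(n)]
phases, freqs = run_pll(test_signal, fs, f0=0.95,
                        kp=1.0 / math.pi, ki=0.25 / math.pi, fc=0.5)
mean_locked_freq = sum(freqs[-2000:]) / 2000.0
```

The effective loop gain of this detector scales with the input amplitude, so gains tuned on one recording need not transfer to another; this is consistent with the parameter sensitivity reported in this section.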
Generally, we observed a very high sensitivity of the PLL behavior to its parameters, which ultimately convinced us to give priority to the fixed-step method. However, we approached the PLL parameter optimization in three different ways. The first approach, called the phase-based method, resulted in PLLs oscillating at very low frequencies, lower than 0.5 Hz, which is well below the typical SWA band between 0.5 and 4.0 Hz. It is hypothesized that the PLL fitted very slow drifts of the EEG data, which could not be attenuated by stable filters. It is not possible to apply an FIR filter for real-time stimulation due to its very high order, and the IIR filter is not stable in this case. Although the slow drifts below 0.5 Hz were reduced, it was not possible to completely eliminate them.
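The remark about the FIR filter order can be made concrete with a rough calculation. A linear-phase FIR filter with N taps delays the signal by (N - 1) / (2 fs) seconds, and a common rule of thumb (attributed to fred harris) estimates N as (fs / Δf) · (A / 22), where Δf is the transition width in Hz and A the stop-band attenuation in dB. The sampling rate of 250 Hz, the 0.25 Hz transition width, and the 60 dB attenuation used here are assumptions for illustration, not values from this study.

```python
import math


def estimate_fir_taps(fs, transition_hz, attenuation_db):
    """Rule-of-thumb tap count for a linear-phase FIR filter."""
    return math.ceil(fs / transition_hz * attenuation_db / 22.0)


def fir_group_delay_s(num_taps, fs):
    """Group delay of a linear-phase FIR filter, in seconds."""
    return (num_taps - 1) / (2.0 * fs)


# Assumed example: separate drifts below 0.5 Hz from SWA at fs = 250 Hz.
taps = estimate_fir_taps(fs=250.0, transition_hz=0.25, attenuation_db=60.0)
delay = fir_group_delay_s(taps, fs=250.0)
slowest_so_period = 1.0 / 0.5  # 2 s, period of a 0.5 Hz oscillation
```

Under these assumptions the filter needs several thousand taps and its delay exceeds the period of the slowest oscillation of interest, so the targeted wave would be over before the filtered sample became available.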
The second approach, called the time-phase-based method, was an extended form of the previous phase-based method. In that way, we eliminated the ambiguity of the phase-based method, which led to the fitting of the slow drifts. In this case, the PLL signal oscillated with a higher frequency, and the main peak in power spectral density was approximately 5 Hz. The results showed that the stimulation was performed in the rising phase of the real signal. However, the problem lay in the incorrectly high PLL frequency, causing the second stimulation to be performed in the same rising phase of the real EEG records as the first stimulation. Thus, spurious PLL fitting could be observed if the PLL frequency became very high, compared to the frequency of interest. Here, the PLL output signal frequency was approximately 5 Hz, and the frequency of the SWA was approximately 0.5 Hz.
The fixed-time-based method optimized the PLL parameters based on a prior specification of the delay between detection and stimulation. This approach was applied to avoid the influence of noise in EEG recordings, which causes noise in the phase estimates required by the phase-based and time-phase-based methods. Even this noise-free criterion resulted in difficulties in terms of over-fitting the PLL parameters. Very small changes in the a priori fixed-time delay produced significant changes in PLL behavior, which was quantified by the mean frequency of the PLL output in our case. In this case, the PLL signal had the highest frequency across all three tested criteria, and the mean value of the phase in which the stimulation occurred did not correspond to the desired value to which the PLL was to be adjusted.
Generally, we state that the PLL method showed very complex behavior, which is not necessarily captured by the optimization metrics used in previous studies. An over-fitted PLL can result in a narrow polar histogram, while its output signal is far from optimally fitting the original EEG data. Thus, a spuriously working PLL can be obtained.
For example, it is essential to ensure that the interval in which the stimulation at the rising phase of the PLL output takes place is very narrow. Afterwards, for fast PLL oscillation, the stimulation was skipped; see
Section 2.4. For this reason, there could be a small number of stimulation events, and the PLL could therefore be wrongly fit because of the incorrectly distributed weights between subjects. To eliminate this phenomenon, we extended the stimulation interval in the rising phase of the PLL. The number of stimulations was thus increased, and the PLL signal had spectral characteristics corresponding to SWA; see
Table 6. However, the resulting stimulations in the EEG were then scattered throughout the wave, including the falling phase (downward negative-going wave, going towards the down state); see
Figure 8. This indicates the poor synchronization of the PLL and EEG signals.
Acoustic stimulation during slow-wave sleep can have a positive effect on memory consolidation. In recent years, many studies [
17] have been published that describe the methods and the effect of stimulation in the context of memory change. However, the effect of stimulation on the electrical activity of the brain as such has not yet been clearly described. Likewise, no quantitative comparison of the two methods commonly used for acoustic stimulation has been performed. The statistical perspective on acoustic stimulation offered in our study should help others to use and develop acoustic stimulation further.
Both stimulation methods were applied on the same dataset. No brain response was elicited because the data were artificially streamed. Thus, only the first stimulation was evaluated. Overall, the fixed-step method stimulated more frequently compared to the PLL-XOR implementation method. This was due to the stimulation interval, which was too short for some fast oscillations. The fixed-time pause, which assumes the slowest frequency of 0.5 Hz, is shorter for a number of cases than the pause of the PLL method.
The fixed-step method has less variance, which could indicate greater homogeneity of the stimulation position. Shifting from the mean value of the stimuli to the rising phase would not lead to so many cases of stimulation at the falling phase. PLL-XOR has a higher value of kurtosis, which suggests that there are more extreme values in the phase distribution than in the fixed-step method. However, this difference is not significant. Skewness values are low for both methods, which indicates a relatively symmetrical distribution. The fixed-step method shows positive skewness values, while those of the PLL-XOR method are negative. We consider negative values to be advantageous here because they mean that outlying values are concentrated in the left part of the distribution; i.e., stimulation occurs earlier. This means that PLL-XOR should again have the advantage that stimulation in the falling edge will not occur as often (in the case of good phasing).
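The descriptive statistics discussed above (variance, skewness, kurtosis) can be computed from a list of stimulation phases as sketched here. The function name and the sample values are hypothetical, and the sketch treats the phases as ordinary numbers, which is only reasonable when they are concentrated in a range that does not wrap around the circle. It uses population moments and reports excess kurtosis (0 for a normal distribution); whether the study used these exact conventions is not stated.

```python
def distribution_moments(values):
    """Mean, variance, skewness and excess kurtosis from population moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "variance": m2,
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,
    }


# Hypothetical phase samples: a tail towards late phases gives positive
# skewness, a tail towards early phases gives negative skewness.
late_tail = distribution_moments([0, 0, 0, 0, 10])
early_tail = distribution_moments([0, 10, 10, 10, 10])
```

A negative skewness therefore corresponds to outlying stimulations that occur earlier than the bulk of the distribution, which is the case regarded as advantageous in the text.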
The mean value of the PLL-XOR stimulation position and that of the fixed-step method are similar (approx. 250), but the PLL-XOR method has a greater variance and a high amount of stimulation in the falling phase. Our study and comparison show that PLL cannot be easily adapted for universal use by different studied populations and individuals. When we looked at the optimal PLL parameters that were tuned for each record separately, they varied across individuals. Therefore, if we try to find unique common parameters for all individuals, we encounter the inability of PLL to adapt to our requirements. The fixed-step method does not have many cases where the falling phase of the SWA has been stimulated. The fixed-step method seems to be a better variant due to its robustness and good stimulation position results.
In this study, a combination of the ITPC and amplitude analysis was proposed to study the effects of acoustic stimulation by the fixed-step approach during sleep. In previous papers [
22,
23,
26], the averaged signal across stimulation or sham trials was mainly utilized to demonstrate the effect of the stimulation. We showed that the ITPC is more sensitive to the effects compared to the commonly used averaged signal. The ITPC is amplitude-independent, which allows us to address the phases of the SWA specifically. This is in contrast with the commonly used averaged signal, which contains information about both amplitude and phase. For the first time, it has been shown that the phase synchronization of the SWA is increased by acoustic stimulation to a greater extent than the amplitude. Importantly, rather than the prolongation of SWA due to stimulation, the ITPC measures specific qualities of the SWA during the deep sleep period. More technically, the ITPC measures how consistent the temporal features of the SWA are across all detections. This finding can lead to a proper understanding of the actual effect of the acoustic stimulation and can give rise to more theories explaining the effects.
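The amplitude independence of the ITPC follows from its usual definition: at a given time and frequency, each trial contributes only a unit vector with that trial's phase, and the ITPC is the length of the mean of these vectors (1 for identical phases, near 0 for phases spread around the circle). The sketch below assumes this common definition; the function names and the toy coefficients are hypothetical, and in practice the complex coefficients would come from a time-frequency transform of each detection-locked epoch.

```python
import cmath
import math


def itpc(phases):
    """Length of the mean unit phase vector across trials (0..1)."""
    return abs(sum(cmath.exp(1j * p) for p in phases) / len(phases))


def itpc_from_coefficients(coefficients):
    """ITPC from complex time-frequency coefficients; amplitudes are discarded."""
    return itpc([cmath.phase(z) for z in coefficients])


# Same phases, very different amplitudes: the ITPC is unchanged.
trial_phases = [0.1, 0.4, -0.2]
unit = [cmath.rect(1.0, p) for p in trial_phases]
scaled = [cmath.rect(a, p) for a, p in zip([5.0, 0.5, 20.0], trial_phases)]
same_itpc = abs(itpc_from_coefficients(unit) - itpc_from_coefficients(scaled)) < 1e-12
```

An averaged signal computed from the same trials would be dominated by the high-amplitude trial, which is the contrast with the ITPC drawn in the text.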
Further, the ITPC is the first step towards a rigorous interpretation of cross frequency coupling (CFC); see [
56]. The CFC is becoming a broadly observed phenomenon in EEGs during sleep, and phase amplitude coupling (PAC) is the most common case. However, for PAC to be rigorously interpreted, the ITPC has to be known to eliminate potentially spurious couplings due to a stimulus presentation. Generally, the contribution of an evoked response to the observed changes in the SWA due to an actual sound stimulus is still an open question. We believe that computing the ITPC can contribute to a better distinction of these two mixed phenomena and will allow us to use and interpret advanced methods rigorously.
We have found that the time–frequency representations of the ITPC and signal power are not as similar as one would expect. The ITPC and signal power showed rather complementary results. The ITPC was increased in a specific frequency band over a broad time period, whereas the signal power changed in a more time-specific manner, with changes distributed over a broad frequency band. Again, the ITPC was increased mostly in the SWA band and for a longer duration than the signal power changes.
The combination of the ITPC and power time–frequency representations is a general way of analyzing the effects of acoustic stimulation, since the ITPC is amplitude-independent and power is phase-independent. This approach is suitable for distinguishing between evoked and induced changes. For example, our results showed that the signal power in the spindle band changed due to the acoustic stimulation, while the phase synchronization did not. Thus, the spindle activation is not likely to occur due to the evoked response to the acoustic stimulus. The specific latency of the observed changes simultaneously brings further information that can be compared with known evoked phenomena in EEGs. Utilizing the proposed approach and integrating the obtained information can shed light on the distinction between evoked and induced changes due to acoustic stimuli and can support rigorous theories explaining the treatment effects of this promising method.
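How power and ITPC can dissociate is easiest to see on toy data. The sketch below is an illustration with made-up complex coefficients, not a reanalysis of the study data, and the function and variable names are hypothetical. An "induced-like" condition raises the amplitude while the phases stay spread across trials, so power increases and the ITPC stays near zero; an "evoked-like" condition aligns the phases at unchanged amplitude, so the ITPC increases and power does not.

```python
import cmath
import math


def mean_power(coefficients):
    """Trial-averaged power; independent of the phases."""
    return sum(abs(z) ** 2 for z in coefficients) / len(coefficients)


def itpc(coefficients):
    """Length of the mean unit phase vector; independent of the amplitudes."""
    return abs(sum(z / abs(z) for z in coefficients)) / len(coefficients)


spread_phases = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]
sham = [cmath.rect(1.0, p) for p in spread_phases]
induced_like = [cmath.rect(2.0, p) for p in spread_phases]  # amplitude up, phases spread
evoked_like = [cmath.rect(1.0, 0.2) for _ in spread_phases]  # phases aligned, amplitude same

summary = {
    name: (mean_power(c), itpc(c))
    for name, c in [("sham", sham), ("induced", induced_like), ("evoked", evoked_like)]
}
```

Read against the spindle-band result in the text, a power change without an ITPC change resembles the "induced-like" row rather than the "evoked-like" one.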