1. Introduction
Estimating the frequency, the phase and the amplitude of a signal is a standard classical problem of signal processing. In particular, phase estimation has been widely applied in synchronization in communication [
1], power analysis [
2], speech enhancement [
3], GPS navigation [
4] and much more. Until now, phase estimators based on direct Discrete Fourier Transform (DFT) are the mainstream [
5].
In [
5], Liguori pointed out that, the existing DFT-based phase estimators heavily rely on the result of frequency estimate. To illustrate this dependency, let us study the sampled version (the sampling rate is
) of a complex exponential signal
exp
as
where
,
,
are amplitude, frequency and phase respectively and
t is the sampling interval 1/
. Accordingly, the frequency resolution of the
N-point DFT
is
f =
/
N = 1/
Nt. Assume the peak DFT bin is at
k =
,
∈
(
refers to the set of positive integer numbers). Then, it can be deduced that the ideal phase value of
is
The variable
in (
2) is the fractional frequency offset
with the value
, which is closely related to two sampling cases (coherent sampling or noncoherent sampling). For the coherent sampling case, i.e.,
,
and thus the sequence {
,
N − 1} exactly covers
signal periods, it can be easily inferred from (
2) that the phase
can be estimated by directly taking the phase at the peak bin. Nevertheless, in case of noncoherent sampling, i.e.,
,
and thus the sequence does not exactly contain integer times of signal periods, it can be inferred from (
2) that the phase estimate
is related to both the peak index
and the frequency offset
.
Up to now, a lot of frequency estimators have been proposed to estimate the fractional offset
. For example, Offelli proposed an energy-based approach [
6] in which the frequency offset of some component can be derived from the energy summation of the windowed DFT spectral bins around the peak bin. Other approaches based on interpolated FFT were reported in [
2,
7,
8,
9], in which
is calculated via interpolation between several successive high-amplitude DFT bins centered with peak bin. In recent years, a lot of high-accuracy interpolation-based estimators such as Provencher estimator [
10], Jacobsen estimator [
11], Candan estimator [
12], and phase difference-based estimator [
13] were proposed. However, in [
5,
14,
15], Liguori emphasized that “the bias of frequency estimate will introduce the uncertainty propagation to the phase acquisition”. In other words, the error of frequency offset
in (
2) will inevitably give rise to the error of phase estimate
accordingly.
Besides, spectral leakage is also a non-ignorable factor of degrading DFT-based phase estimators. In essence, for the non-coherent sampling case, performance degradation of phase estimation actually results from DFT’s inherent spectral leakage effect [
16]. Therefore, to enhance the accuracy of phase estimation, some approach capable of suppressing spectral leakage is expected to be developed.
This paper proposes a novel approach of correcting DFT spectra which can alleviate phase estimate’s dependence on frequency estimate (or rather, the dependence on frequency offset ). This improvement actually attributes to the property that the proposed fully-traversed DFT spectrum has a much slighter spectral leakage compared to original uncorrected DFT spectrum in non-coherent sampling case. To verify these advantages, we also apply this corrected DFT-based phase estimator into a phase-coded steady-state visual evoked potential-based brain-computer interface (SSVEP-based BCI).
Recently, for the purpose of increasing realizable targets, phase information is popularly integrated into frequency-coding in SSVEP-BCI, such as joint/mixed frequency and phase coding [
17,
18,
19]. In [
20], Lee introduced one scheme of phase coding in SSVEP-BCI system, in which 8 LEDs flickering at 31.25 Hz with the phase interval
were used to represent 8 cursor functions. However, the phase decoding in [
20] was realized through detecting the maximum amplitude peak of the averaged SSVEP waveform, i.e., it was realized in time domain instead of frequency domain. In [
21], Lee segmented one stimulus into reference epoch and phase-shift epoch, in which the phase differences between these two epochs were used to distinguish different targets. Shortly after that, in [
22], Lee designed six SDFS (stepping delay flickering sequences) with stimulus frequency of 32 Hz, among which 6 different phase-tagged objects are represented by 6 distinct delay segments such that they can be detected by the normalized power of averaged responses. However, SSVEP-BCIs in [
21,
22] actually employ special stimulus sequences to facilitate phase decoding, which may be not suitable for the common situations of phase measurement. In [
23], to increase the number of visual stimuli, Gao and Jia proposed a frequency and phase mixed coding scheme using multiple frequencies. Then, Gao proposed the phase constrained canonical correlation analysis (p-CCA) to better distinguish frequency-tagged stimuli from multichannel SSVEPs [
24], in which the phase information was estimated from the FFT result of the SSVEP records over the occipital cortex. Following this, Gao et.al introduced phase coding in CCA to discriminate six
-interval targets [
25]. However, the stimuli frequencies in [
23,
24,
25] were actually deliberately chosen such that each stimuli frequency exactly equals the integer times of FFT resolution and thus the frequency offset is also exactly zero. Therefore, in SSVEP-BCIs, novel phase estimator capable of detecting the phases of stimuli with any frequency offset is expected to be developed.
This paper will demonstrate that the proposed phase corrected DFT-based estimator is in accordance with the above demand of SSVEP-BCIs. Experimental results show that, the proposed phase estimator is superior to the conventional estimator in SSVEP phase extraction.
The rest of this paper is organized as follows.
Section 2 introduces the derivation of the proposed fully-traversed DFT.
Section 3 elaborates the properties of the proposed fully-traversed DFT in the noiseless case.
Section 4 gives accuracy analysis of this phase estimator in noisy case.
Section 5 demonstrates the proposed DFT-based phase estimator’s superiority to conventional phase estimators in the phase-coded SSVEP-BCI with auto-calibration. Finally, we conclude with a summary of results in
Section 6.
3. Property of Phase Estimation in the Noiseless Circumstance
3.1. Single-Tone Case
Consider a single-tone signal
= exp(j
,
. Since
is obtained by averaging
, substituting
into
in (
12) and (
18) yields
Obviously, (
19) can be rewritten as
where the superscript ‘∗’ represents complex conjugate operation. Then, combining (
20) with (
4) yields
Taking the phase part of (
21), we can obtain the phase spectrum
as
Equation (
22) shows that, for the signal
= exp
,
, the phase values at all
N spectral bins (including the peak bin
) uniformly equal the ideal phase
(also refers to the instantaneous phase of the center sample
). In other words, the synthesized spectrum
directly provides the accurate phase estimate at any spectral bin, thus removing the conventional DFT-based phase estimator’s dependency on the frequency offset
.
Furthermore, since
is obtained by linearly averaging the
N sub DFT spectra as (
18) shows, the proposed fully-traversed DFT spectrum is of linearity, also. Thus, for a single-tone signal with amplitude
, the following holds
among which the peak spectral bin is
3.2. Multi-Tone Case
Consider a signal containing
Q (
) components expressed as
Since the conventional DFT is a linear transform, combining (
5) and (
25), its multi-tone spectrum
is composed of
Q single-tone spectra, i.e.,
Similarly, since the proposed phase corrected DFT spectrum is also of linearity, combining (
5) with (
23), its multi-tone spectrum
equals the summation of
Q single-tone spectra as
Furthermore, to measure the phase of the
i-th tone, the peak bin at
should be focused. Therefore, the conventional DFT spectrum
can be written as
Similarly, the proposed fully-traversed DFT spectrum
can be expressed as
In (
29), the first term is the expected
i-th single-tone spectrum which directly provides the phase estimate, and the second item represents the interference of other tones. Obviously, for either the conventional DFT spectrum or the fully-traversed DFT spectrum, the accuracy of the
i-th tone’s phase estimation depends on the intensity of interference. Particularly, this accuracy depends on the relative magnitude ratio between
, and
. From (
28) and (
29), we have
Please note that
,
and thus
. Since
is a monotonously descending even function, we have
Combining (
31) with (
32), we have
Equation (
33) shows that, for the fully-traversed DFT spectrum, the relative magnitude ratio between any other tone and the tone of interest is smaller than that of the conventional DFT. In other words, compared to the conventional DFT spectrum, the fully-traversed DFT magnitude spectrum does better in suppressing spectral leakage and interferences, thus yielding a higher accuracy of phase estimation.
3.3. Simplified Dataflow of Phase Estimation
From the aforementioned 4-step procedure listed in Section II, one can find that the fully-traversed DFT experiences N times of DFT operation. Therefore, to reduce computation complexity, this procedure needs to be simplified.
According to the linearity of DFT, the average of
N sub-spectra is equivalent to the DFT result of the averaged data. Therefore, if we average the
N sub-sequences
in (
15) and (
16), then one new sequence {
N − 1} can be constructed as
Accordingly, the DFT result
Y(
k) of
y(
n) is
Clearly, Equation (
35) only involves one time of DFT operation, thus greatly reducing computation complexity. Accordingly, its simplified dataflow is illustrated in
Figure 1 (take
as an example), from which a low-complexity procedure of multi-tone phase estimation can be summarized as follows.
Firstly, weight the input ()-length data sequence + − 1 with one ( − 1)-length triangular window − − ;
Secondly, (N − 1) weighted data pairs (in each pair, the two data are spaced with N samples) are individually summed up to generate (N − 1) data − 1), except the center sample due to ;
Thirdly, implement normalized DFT on the data sequence [ − 1)] to provide the final spectral result = [ − 1)].
Lastly, collect all the peak indices of . For each index , directly taking the phase value of provides its phase estimates .
5. Applying Fully-Traversed DFT in Phase-Coded SSVEP-BCI
Recently, BCIs have become a very hot topic in neural engineering. A BCI detects an user’s ongoing brain activities and translates them into meaningful messages, which helps patients with severe motor disabilities to express their messages to external world [
30]. In particular, BCI based steady-state visual evoked potential (SSVEP) has received much attention in bioengineering research due to its satisfactory performance [
31]. To increase the number of recognizable targets, the phase-coded SSVEP-BCIs use phase information to encode subject’s visual intention. In this system, the phase-tagged visual stimuli are characterized with flashing at one frequency but different phases, resulting in that subjects’ SSVEPs also differ in phase features. Therefore, through extracting the phase information of SSVEP potentials, a computer is able to distinguish which flicker the subject desires to select.
5.1. Experiment Paradigm
In general, SSVEP is always elicited after some latency time
L (or labeled as lag phase ‘
’), which actually corresponds to a phase difference between flicker’s phase ‘
’ and SSVEP’s phase ‘
’. Hence, the relationship between the phase difference ‘
’ and latency
L is described by
where
,
q denote the stimulus frequency and the integer cycles, respectively. Under normal condition,
L is stable in a short period of time but differs in inter-subject such that
cannot be calculated in advance [
21,
23,
24]. From (
54), it can be found that the flicker’s phase is usually not equal to SSVEP’s phase (or
) due to
, which means that we cannot directly use
to identify which flicker the subject desires to select if
is unknown. As a result, an additional measure of phase calibration is necessary for phase-coded SSVEP-BCI to calibrate this error
in the detection algorithm.
Hence, we build up a BCI system which uses a half-field phase-tagged stimulus to evoke the SSVEP with two different frequency components
,
at the same time. This system does not adopt any phase calibration since it is able to identify the flicker by introducing the phase difference instead of the phase under the assumption
(this frequency distinction makes a subject more sensitive to flickering stimuli than to those with the same frequency), i.e.,
If , then . Therefore, the difference between and approximately equals the difference between and . Hence, the lag phase difference can be removed.
In our phase-coded SSVEP-BCI system, a visual stimulator (ViewSonic, 22 inch, 120 Hz refresh rate,
screen resolution) presenting two phase-tagged flickers (with the size 12 cm × 8 cm each) was used to evoke subjects’ SSVEPs (
Figure 3 and
Table 1).
It should be noted that, in our SSVEP-BCI illustrated in
Figure 3, the selected flickering frequencies
and
should be as close as possible (this helps to remove the possible jump change of multiple of
for the lag phase difference between
and
, which was solved in [
17] by means of a exhaustive search procedure based on least-square fitting). Otherwise, their lag phase difference
would not be removed. However, due to the fact that all the stimulus frequencies in our SSVEP-BCI are acquired by integer dividing a fixed LCD display refresh frequency
= 120 Hz (see [
32]), we can only obtain a limited number of flikering frequencies (they are 120/7 Hz, 120/8 Hz, 120/9 Hz, 120/10 Hz, 120/11 Hz) falling at the visual sensitive region (10 Hz, 20 Hz). Therefore, among these candidate flickering frequencies, the frequency pair with the minimum interval is (
) = (120/11 Hz, 120/10 Hz) = (10.9 Hz, 12 Hz). Obviously, in this case, the lag phase difference between
and
will not be removed as
and
are not close enough. Hence, this phase difference should be taken into account in order to achieve an accurate result. In practice, it can be roughly estimated according to their empirical results of the apparent latency of SSVEPs [
24]. In our experiment,
cam be roughly estimated as
.
Different from the well-known CCA (Canonical Correlation Analysis) method, which also uses phase information to enhance the classification accuracy of SSVEPs, our proposed scheme consumes lower hardware cost. As [
17] pointed out, the CCA method needs multiple stimulus frequencies (6 frequencies were adopted) to remove the possible jump change of multiple of
for the lag phase difference. In contrast, our proposed scheme only employs 2 frequencies, thereby lowering the hard cost.
Three subjects (S1∼S3, two males and one female) were seated on a comfortable chair before the visual stimulator in an illuminated room. The subjects’ EEG signals were recorded by a g.USBamp EEG amplifier from 13 electrodes (PO, PO, PO, POZ, PO, PO, PO, P1, PZ, P2, O1, Oz, and O). Specify the sampling rate samples/s. This experiment consisted of 5 runs containing 10 trials each. Each trial lasted for 8 s. Subjects were instructed to focus on one of flickers according to the following paradigm: From 0 to 2 s a cue appeared indicating which flashing flicker was required to focus on; From 2 s to 8 s the subjects gazed at the specified flicker; Then the next trial started. The order of gazed-flickers was ‘1212121212’ in each run. Thus this dataset had 25 trials for each flicker. The whole experiment lasted about 30 minutes. During this experiment, the subjects’ EEG signals were recorded for offline analysis later.
5.2. Procedure of SSVEP Phase Extraction
We collected a total amount of 50 trials for each subject. Basically, the procedure of SSVEP phase extraction would contain the following steps:
- step 1.
Apply both conventional DFT and phase corrected DFT to extract the phase values () and , respectively;
- step 2.
Substitute
and the estimate
=
into (
56) to estimate the difference
=
;
- step 3.
Use the estimate to identify the gazed-flicker. If is close to 0 (or 180) deg, then the gazed-flicker is judged as flicker 1 (or 2).
The judgement involved in step 3 can actually be extended to distinguish
C targets (
) by finding the maximum among
C decision variables
as
where
refers to the ideal phase difference
, i.e., the ideal clustering center of pattern recognition. For the above 2-category recognition problem, we have
,
. From (
57), it can be found if the detected phase difference
, then
. In other words, if
is close to 1, the gazed-flicker is judged as flicker 1 and vice versa. Furthermore, (
57) also allows to assume more clustering centers. As will be elaborated, introducing more assumed clustering centers helps to evaluate the performance of phase estimator.
5.3. Result of Offline Analysis
The classification rates of this 2-class experiment are listed in
Table 2.
It can be found that the proposed phase estimator can achieve a higher classification rate than conventional DFT-based estimator does, which is entirely over 10%.
Table 3 lists not only these two estimators’ detected phase values (
(12),
(10.9)) but also 4 columns of phase decision values
, calculated by (
57). Here decision variable
corresponds to the
kth assumed clustering center (4 assumed phase clustering centers are
, respectively) while the subject actually gazes at the flicker
j. Therefore, for any flicker index
,
(marked with shadow in
Table 3) is close to 1 and tends to be the maximum among
, if the accuracy of selected phase estimator is high enough. The data for all trials involved in
Table 3 are recorded within 4 s.
In
Table 3, generally speaking, the detected phases
by the proposed phase estimator are closer to the ideal phase ‘
’ than that of conventional DFT estimator. This result can be explained as follows: Since the sampling rate
samples/s and the window length equals 4s, it follows that
samples are recorded and thus DFT frequency resolution
Hz. Hence, the stimulus frequency
Hz can also be written as
, i.e., the frequency offset
equals nonzero value
, indicating that severe spectral leakage causing large phase measurement error will arise in the DFT spectrum. In contrast to this, for our proposed phase estimator, due to the property of ‘direct phase extraction’, the phase measurement error caused by frequency offset is much smaller than that of the conventional DFT case, as
Table 3 lists.
Different from the case of , both estimators’ detected phases are uniformly close to the ideal phase (or ) (Even both mean and standard deviation are similar). This is because Hz , i.e., the frequency offset , resulting in that spectral leakage will not appear in both conventional DFT spectra and corrected DFT spectra.
Hence, since SSVEP phase extraction in this experiment is based on the phase difference between
and
rather than either of them, the conventional DFT phase estimator is more likely to cause large errors than the proposed phase estimator. From
Table 3, one can notice that, for the conventional DFT phase estimator, the phase differences between
and
are far away from 36 deg (see the results in 3-rd, 7-th and 11-th row, respectively). Accordingly, the expected maximum decision values
or
detected by conventional DFT estimator are entirely smaller than that detected by the proposed estimator, as
Table 3 lists. This reflects the fact that the conventional DFT is not so good as the proposed fully-traversed DFT in extracting the phase information of SSVEP.
To further evaluate the performance of the fully-traversed DFT in this experiment, as previously mentioned, we assume that there were 4 flickers with stimulus phases
on screen. Hence, the gazed-flicker was identified by finding the maximum among
,
,
,
. Moreover, the average classification rate for these 4 assumed flickers are also listed in
Table 4. In this simulated 4-category experiment, compared to the 2-category case in
Table 2, the classification rate of the fully-traversed DFT decreases but still over 80%. In contrast, the classification rate of conventional DFT is only about 37%, as
Table 4 lists. The underlying reason lies in that conventional DFT cannot accurately extract the phase information at
Hz due to spectral leakage, while the proposed phase corrected DFT still works well for its property of ‘direct phase extraction’.
In summary, fully-traversed DFT is well suitable for the phase-coded SSVEP-BCI, especially in our proposed duel-frequency stimulus-based system which uses phase difference to recognize the gazed flick. Due to the fact that fully-traversed DFT does well in extracting the phase information at any frequency offset, the proposed phase estimator can achieve a higher classification rate. More importantly, as listed in
Table 3, we find that the corresponding phase’s standard deviation of the proposed estimator is smaller than that of conventional DFT estimator, although the frequency resolution of fully-traversed DFT is less than DFT. The reason is actually mentioned in
Section 2, i.e., the mechanism that fully-traversed DFT is the average of
N DFT sub-spectra (see formula (
18)) also helps to reduce the averaged noise’s power.
Based on our preliminary results, we believe that fully-traversed DFT can be applied in the online system. Although the simulated 4-category experiment results show that the corresponding average classification rate is only 80%, the classification rate is surely to be further enhanced once the estimate of the lag phase difference is more accurate. In this experiment, we only roughly estimate it as 36 deg. In fact, we can use another two very close frequencies (for example, we can replace LCD stimuli with LED stimuli) such that the lag phase difference will be close to zero and thus we do not need to estimate it. Moreover, it was also clearly found that the phase variance of fully-traversed DFT estimator does not get worse than conventional DFT under our experiment condition.