1. Introduction
In the rapidly advancing fields of neuroscience and artificial intelligence, generating realistic biophysical data has emerged as a critical focus. Electroencephalography (EEG) data are particularly significant, as they provide valuable insights into the intricate workings of the human brain. The generation of synthetic EEG data holds immense potential for applications in brain–computer interfaces (BCIs), robotic movement control, neuroscience research, and cognitive assessment tools. Similarly, electrocardiogram (ECG) waves, which record the electrical signals generated by the heart’s rhythmic contractions, are essential for understanding cardiovascular health. They play a vital role in medical diagnostics, monitoring heart conditions, and guiding treatments. Beyond traditional healthcare, ECG signals have found applications in human–computer interfaces (HCIs), enabling the control of devices such as wearable health monitors and driving innovative solutions across diverse fields [1,2]. The ability to generate synthetic ECG data enables researchers to simulate and examine cardiac behaviors without relying on large, real-world datasets. This can expedite the development of healthcare technologies and improve machine learning models for more accurate predictions in cardiac-related applications.
According to Salehi et al. [3], the development of generative adversarial networks (GANs) has provided a powerful tool for generating synthetic data that closely resemble real data. Deep learning methods have shown significant potential to enhance decoding performance, but their effectiveness often hinges on the availability of large datasets. The generation of synthetic data addresses several critical challenges in this domain. Traditional EEG data acquisition is labor-intensive and requires extensive subject participation and careful calibration to ensure high-quality data, as noted in [4]. Nik et al. [5] emphasize that, while large datasets are crucial for training machine learning models, collecting high-quality EEG data is particularly challenging due to its subject- and session-dependent nature, which necessitates precise calibrations. Similarly, Galván et al. [6] highlight that the accurate identification of motor imagery (MI) patterns in EEG signals is constrained by data-related limitations, hindering the practical implementation of such systems. Furthermore, Chen et al. [7] point out that, while BCIs offer non-invasive communication methods, their efficiency is heavily dependent on individual training data, often acquired during lengthy calibration sessions.
Another significant challenge is that access to medical data is heavily restricted because of its sensitive nature, which limits its availability for research and clinical training purposes. As noted by Chaurasia et al. [8], the unique brainwave patterns of individuals can be used for authentication, making EEG data a form of biometric information and, therefore, highly sensitive. Zhang et al. [9] explore EEG-based biometric cryptosystems for authentication, further underscoring this sensitivity. Standard de-identification techniques, aimed at facilitating data sharing, are often insufficient to fully protect the privacy of individuals in the dataset, as mentioned by Delaney et al. [10]. For ECG data, class imbalance is a frequent issue because abnormal cases are relatively rare [11]. Additionally, the use of real patient ECGs is heavily regulated due to privacy concerns. This creates a constant demand for additional ECG data, particularly for training machine learning models for automatic diagnosis, which perform better with balanced datasets. The dependency on human subjects also raises ethical concerns related to privacy and consent.
Synthetic EEG data provide a way to circumvent many of these issues, enabling innovative research and practical applications. Furthermore, synthetic data can reduce commercial risks in product development for neurotechnology and BCI applications by offering a reliable and scalable source of training data. They address challenges such as insufficient or unreliable data, extended timelines for real-world data collection, and the difficulties of obtaining ethical approval for human participants. By minimizing the need for extensive data collection and easing regulatory compliance through the use of anonymized data, synthetic data also significantly reduce costs.
The main contribution of this research is the development of a novel approach to generating synthetic EEG and ECG data using WGAN-GP combined with our CNN architecture to improve data classification. The synthetic EEG data closely resemble real data for concentration and relaxation states, while the synthetic ECG data correspond to normal and abnormal states. This method increases the dataset size, enhancing the generalization and performance of machine learning models. The application of WGAN-GP for generating synthetic biophysical data represents a significant advancement in medical data generation and analysis, enabling improved diagnostics and research. By leveraging deep learning techniques, the model produces high-fidelity synthetic waves, which augment real datasets—a crucial step for training machine learning models, as larger and more diverse datasets lead to better performance and generalization. Our approach has demonstrated superior accuracy, with the generated synthetic EEG data being effectively classifiable into concentrated and relaxed states. Additionally, the inclusion of synthetic ECG data has been shown to improve classification accuracy, further highlighting the potential of this approach in biomedical and other research fields that depend on biophysical data-driven applications.
The remainder of this paper is structured as follows. Section 2 explores background research, reviewing the relevant literature. In Section 3, we introduce our proposed work and outline its development. Section 4 showcases our experimental results and insights. Finally, Section 5 presents the conclusions drawn from our findings and outlines avenues for future research.
2. Related Work
Introduced by Goodfellow et al. [12] in 2014, GANs have become a significant generative technique. Fahimi et al. [13] highlight the rising interest in using GANs for EEG data generation due to their success in mimicking the temporal, spectral, and spatial features of authentic EEG signals. Since their inception, various GAN types have been developed to overcome initial limitations. Conditional GANs (cGANs) were introduced by Mirza and Osindero in 2014 [14]. Deep convolutional GANs (DCGANs) were early GAN adaptations that improved training by using deep convolutional neural networks (CNNs) for the discriminator and generator [15]. Arjovsky et al. [16] introduced the Wasserstein GAN (WGAN). Gulrajani et al. [17] then presented the WGAN with gradient penalty (WGAN-GP) to address limitations of the WGAN. WGAN-GP can provide more stable training, mitigating issues such as mode collapse and leading to better overall performance of the GAN.
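For reference, the critic objective with gradient penalty, as formulated by Gulrajani et al. [17], can be written as below. The notation is taken from that formulation and is not defined elsewhere in this section: D denotes the critic, P_r and P_g the real and generated data distributions, and λ the penalty coefficient; the value of λ used in this work is not stated here.

```latex
L = \mathbb{E}_{\tilde{x} \sim \mathbb{P}_g}\left[ D(\tilde{x}) \right]
  - \mathbb{E}_{x \sim \mathbb{P}_r}\left[ D(x) \right]
  + \lambda \, \mathbb{E}_{\hat{x} \sim \mathbb{P}_{\hat{x}}}
    \left[ \left( \lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1 \right)^2 \right],
\qquad
\hat{x} = \epsilon x + (1 - \epsilon)\tilde{x}, \quad \epsilon \sim U[0, 1]
```

The first two terms form the Wasserstein critic loss, while the third term penalizes deviations of the critic's gradient norm from 1 at points interpolated between real and generated samples, replacing the weight clipping used in the original WGAN.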
Despite advancements in GAN architectures, several challenges persist, particularly in the context of EEG data generation. Habashi et al. [18] reviewed the application of GANs across various EEG domains, including motor imagery, P300, rapid serial visual presentation (RSVP), emotion recognition, and epilepsy. However, limited attention has been given to generating EEG signals related to specific mental states, such as concentration and relaxation. The review also highlighted the challenges in creating diverse synthetic data that adequately capture the complexity of EEG signals.
Cheng et al. [19] introduced the “SleepEGAN” model, which demonstrated the potential of GANs in generating minority class samples for sleep-stage classification. However, its architecture was specifically designed for sleep studies, limiting its generalizability to other EEG applications. This underscores a critical gap in developing GAN models capable of generalizing across different mental states and EEG applications.
Shin et al. [20] explored GAN-based data augmentation and anonymization in medical imaging. While their approach showed promise in the medical imaging domain, its application to EEG data generation remains underexplored. Hazra and Byun’s “SynSigGAN” model [21] generates four types of synthetic biomedical signals, including EEG data related to epilepsy diagnosis and seizure classification. However, their results indicated that the synthetic data were overly similar to the real dataset, limiting their diversity and utility for training machine learning models.
Salazar et al. [22] introduced generative adversarial network synthesis for oversampling (GANSO), a novel method designed to improve classifier training with extremely limited data. GANSO integrates vector Markov random fields (vMRF) with GANs to synthesize data while preserving the structural properties of the original dataset. Its generative block uses the graph Fourier transform to maintain graph connectivity, while the discriminative block ensures that the synthetic data are indistinguishable from the original. GANSO demonstrates strong potential for biomedical applications, particularly in scenarios with limited training samples. Unlike GANSO, WGAN-GP operates in the feature space of time-series data, such as ECG and EEG, without explicitly modeling graph dependencies. In this work, we focus on feature-space GANs for time-series data, with plans for future research to incorporate graph-based models. This will enable a direct comparison between feature-space and graph-based methods, such as GANSO, to evaluate their respective efficiencies.
Zhao et al. [23] demonstrated that WGAN-GP could generate precise and varied EEG signals with improved spectral performance, aiding dataset expansion for traditionally hard-to-collect data. However, their work focused solely on P300 brain waves, with a limited analysis of real-world applicability and signal effectiveness across diverse contexts. Additionally, insufficient data preprocessing resulted in the need for 2000 epochs to achieve satisfactory results, highlighting both computational inefficiency and the necessity for more robust preprocessing techniques to reduce training time and enhance signal quality.
The limitations identified in previous studies highlight the need for further research into generating EEG data specific to the mental states of concentration and relaxation. While existing models have shown some success in generating EEG signals, they are often tailored to specific applications (e.g., sleep stages and epilepsy) and fail to produce data that are sufficiently diverse and representative of the complex nature of EEG signals. Moreover, the reliance on specific architectures, such as those used in “SleepEGAN,” restricts the versatility of these models.
Brain waves are classified based on their frequency, amplitude, shape, and other characteristics, as noted in [24], and these frequencies vary with different mental states, as illustrated in Table 1.
Evidence from the literature suggests that high levels of beta and gamma activity indicate a state of concentration, while considerable alpha activity indicates a state of relaxation. In some cases, the use of real EEG data is considered a privacy breach. Schiliro et al. [25] mention that brain data can reveal private mental states, necessitating cognitive privacy protections against unauthorized access and collection. In [26], the authors mention that privacy-preserving methods like homomorphic encryption (HE) are limited by high computational demands, noise buildup, and restricted applicability. These limitations further indicate the necessity of generating synthetic data to address privacy concerns effectively.
As with EEG generation, several studies address synthetic ECG generation. In [27], the authors discuss various approaches to the synthetic generation of ECG using GANs, variational autoencoder–decoders (VAEs), and large language models (LLMs), as well as the limitations of each. The study in [28] focuses primarily on privacy aspects and does not extensively cover the performance or accuracy of the generated synthetic ECGs. In [11], the authors focus on normal cardiac cycles and do not address the generation of abnormal ECG patterns, which are crucial for diagnostic purposes. Apart from GAN-based models, some research has been conducted using GPT models for biophysical data generation. While models like ChatEMG [29] and the GPT-2-based model [30] generate unlimited EMG signal sequences, the proposed WGAN-GP model generates high-quality synthetic data specific to each state, thus achieving higher accuracy in generating classifiable EEG signals and surpassing the fixed sequence limitation. The EEG signals generated in [30] using GPT-2 are compared with our approach in subsequent sections.
This study sought to address these gaps by proposing a novel approach using WGAN-GP for the generation of EEG and ECG signals. The proposed model incorporates efficient data preprocessing techniques, including a level-5 discrete wavelet transform (DWT) with the Daubechies-2 (Db2) wavelet, to capture both high- and low-frequency components of EEG signals. This approach not only reduces the training time but also enhances the diversity and utility of the synthetic data in training machine learning models.
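The level-5 Db2 decomposition mentioned above can be illustrated with a small pure-Python sketch. It is not the implementation used in this study: the periodic boundary handling, the function names `dwt_step` and `wavedec_db2`, and the 256-sample test signal are all assumptions made for illustration, and in practice a dedicated wavelet library would typically be used.

```python
import math

_S3 = math.sqrt(3.0)
_NORM = 4.0 * math.sqrt(2.0)
# Daubechies-2 (four-tap) low-pass filter and its quadrature-mirror high-pass filter.
DB2_LO = [(1 + _S3) / _NORM, (3 + _S3) / _NORM, (3 - _S3) / _NORM, (1 - _S3) / _NORM]
DB2_HI = [DB2_LO[3], -DB2_LO[2], DB2_LO[1], -DB2_LO[0]]


def dwt_step(x):
    """One decomposition level with periodic (wrap-around) boundary handling."""
    n = len(x)
    if n < 2 or n % 2:
        raise ValueError("signal length must be even")
    approx, detail = [], []
    for i in range(n // 2):
        # Four consecutive samples starting at 2*i, wrapping at the end.
        window = [x[(2 * i + k) % n] for k in range(4)]
        approx.append(sum(h * v for h, v in zip(DB2_LO, window)))
        detail.append(sum(g * v for g, v in zip(DB2_HI, window)))
    return approx, detail


def wavedec_db2(x, level=5):
    """Return [A_level, D_level, ..., D_1] for a length divisible by 2**level."""
    if len(x) == 0 or len(x) % (2 ** level):
        raise ValueError("signal length must be a positive multiple of 2**level")
    details = []
    approx = list(x)
    for _ in range(level):
        approx, detail = dwt_step(approx)
        details.append(detail)
    return [approx] + details[::-1]


# Hypothetical test signal (assumption): a slow and a fast oscillation, 256 samples.
signal = [math.sin(2 * math.pi * 4 * t / 256) + 0.5 * math.sin(2 * math.pi * 60 * t / 256)
          for t in range(256)]
coeffs = wavedec_db2(signal, level=5)
for name, band in zip(["A5", "D5", "D4", "D3", "D2", "D1"], coeffs):
    # Print the band name, its number of coefficients, and its energy.
    print(name, len(band), round(sum(v * v for v in band), 3))
```

The approximation band A5 carries the lowest-frequency content and the detail bands D5 to D1 carry progressively higher frequencies. Because the periodized Db2 filters form an orthogonal transform, the band energies sum to the energy of the input signal.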
Moreover, this research examines the impact of using real, synthetic, and combined datasets to determine the optimal proportion of synthetic data required to enhance classification accuracy in real-world BCI applications. For EEG signals, we adopted the approach outlined by Manoharan and Faria [31], which achieved notable success in classifying EEG data into mental states with an accuracy of 92% using a CNN classifier. For ECG signals, classification is performed using support vector machine (SVM) and random forest (RF) classifiers due to their effectiveness and robustness. SVMs are particularly adept at handling high-dimensional data, and they can deliver accurate results even with limited sample sizes. For instance, one study demonstrated that SVM-based arrhythmic beat classification effectively identified heart-related abnormalities in ECG signals [32]. Similarly, RF classifiers, known for their ensemble learning approach, provide strong generalization capabilities and resilience against overfitting, making them highly suitable for physiological data analysis. The study in [33] compared the performance of RF and SVM for ECG quality assessment, finding that both classifiers, when combined with nonlinear features, effectively assessed ECG quality. These studies underscore the reliability of SVM and RF classifiers in ECG analysis within complex biomedical datasets.
4. Results
4.1. EEG Results
To enhance user-friendliness in synthetic EEG generation, an interface was developed using ‘tkinter’, enabling users to configure input and output paths without modifying the Python code, as shown in Figure 5. The interface allows users to merge real and synthetic data in varying proportions, and it includes tabs for Synthetic EEG Generation, DWT plotting, and CNN classification. Outputs from GAN and WGAN-GP models are saved as CSV files containing synthetic data for channels TP9, AF7, AF8, and TP10. These files match the original data in both data points and timestamps, ensuring seamless integration with real data for classification using the CNN classifier from [31], with results recorded per subject. The current interface is exclusively designed for synthetic EEG wave generation, while synthetic ECG generation is currently handled via a Python script. Future work aims to integrate synthetic ECG generation into the interface.
Figure 6 and Figure 7 collectively illustrate the effectiveness of the WGAN-GP model in replicating real EEG patterns in both the temporal and frequency domains, supporting the model’s ability to generate realistic synthetic EEG data. Figure 6 shows the EEG waveforms from the TP9 channel for Subject A in both the concentration and relaxation states. The synthetic data, which extend seamlessly from the real EEG signals, demonstrate continuity in amplitude and oscillatory patterns, indicating that the model can produce consistent, biologically plausible waveforms. This continuity suggests that the synthetic data may be appended to real EEG data without introducing detectable discontinuities, which is essential for applications requiring prolonged EEG sequences.
Figure 7 further supports the validity of the synthetic data by presenting the Power Spectral Density (PSD) of the TP9 channel for the same subject and states. In the concentration state, the PSD of the synthetic data captures prominent gamma wave activity, reflecting the heightened cognitive processing typically associated with concentration. This match in gamma power indicates that the synthetic EEG successfully mirrors the spectral characteristics of the real concentration-state EEG. Similarly, in the relaxation state, the synthetic data show minimal gamma and beta activity, consistent with the reduced cognitive and wakeful engagement expected in a relaxed state. This spectral match reinforces the evidence that the model accurately distinguishes between cognitive states, not only in waveform morphology but also in underlying spectral features.
Together, Figure 6 and Figure 7 illustrate that the WGAN-GP model produces synthetic EEG data that closely align with the real data across both waveform and spectral dimensions, substantiating the model’s potential as a tool for generating realistic, state-specific EEG patterns. This high degree of fidelity in both the temporal and spectral domains demonstrates that the synthetic output of the model effectively simulates real EEG data, making it a valuable resource for research and applications requiring synthetic EEG signals.
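The kind of spectral comparison described for the PSD plots can be sketched as a band-power check. The example below is illustrative only: the band edges, the 256 Hz sampling rate, and the two sine-wave test signals are assumptions rather than values taken from this study, and the periodogram is a plain direct DFT rather than the estimator used to produce the figures.

```python
import cmath
import math

# Assumed band edges in Hz (common conventions, not taken from Table 1).
BANDS = {"alpha": (8.0, 13.0), "beta": (13.0, 30.0), "gamma": (30.0, 45.0)}


def periodogram(signal, fs):
    """Return (freqs, power) for non-negative frequencies via a direct DFT."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [v - mean for v in signal]  # remove the DC offset
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                    for t, v in enumerate(centered))
        freqs.append(k * fs / n)
        # Simple scaling; only relative band powers are compared below.
        power.append(abs(coeff) ** 2 / (fs * n))
    return freqs, power


def band_powers(signal, fs, bands=BANDS):
    """Sum the periodogram over each frequency band [lo, hi)."""
    freqs, power = periodogram(signal, fs)
    return {name: sum(p for f, p in zip(freqs, power) if lo <= f < hi)
            for name, (lo, hi) in bands.items()}


def dominant_band(signal, fs):
    """Name of the band carrying the most power."""
    powers = band_powers(signal, fs)
    return max(powers, key=powers.get)


FS = 256  # hypothetical sampling rate in Hz
# Toy stand-ins (assumption): a 10 Hz tone and a 40 Hz tone, one second each.
relaxed_like = [math.sin(2 * math.pi * 10 * t / FS) for t in range(FS)]
focused_like = [math.sin(2 * math.pi * 40 * t / FS) for t in range(FS)]
print(dominant_band(relaxed_like, FS), dominant_band(focused_like, FS))
```

Applying the same band-power summary to a real segment and its synthetic counterpart gives a compact numerical complement to a visual PSD comparison.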
Table 2 presents the number of data points recorded for each EEG channel (TP9, AF7, AF8, and TP10) across the mental states (concentration and relaxation) for real and corresponding synthetic EEG waves for both GAN and WGAN-GP. The data show consistent values across all channels and subjects for both real and synthetic datasets. This consistency indicates that both models effectively generated synthetic data that matched the structure of the real data, making them suitable for further analysis.
The evaluation criteria for model accuracy in Table 3 are based on comparing classification performance across different models when trained on a real dataset and tested on unseen real, synthetic, and combined datasets. Specifically, the table highlights the accuracy achieved using each classifier type (SVM, RF, and CNN) when applied to the respective datasets. For Bird et al. [30], accuracy values are reported from their baseline models using the same dataset, with the classifiers trained on the real dataset. The reported accuracy reflects the percentage of correct classifications made using each model on each dataset type, indicating each model’s effectiveness in correctly identifying the target classes. The performance of our models (GAN and WGAN-GP) is also presented for a direct comparison, where the same dataset as that of Bird et al. is used. The CNN classifier trained on real data was employed, with WGAN-GP’s CNN classifier demonstrating notably higher accuracy than the Bird et al. [30] baseline models across real, synthetic, and combined data. This comparison serves to assess the generalization capability and robustness of each model, especially WGAN-GP’s effectiveness in generating data that closely align with real data patterns.
The WGAN-GP model achieved the highest accuracy, with 98.42% for synthetic data and 98.45% for combined data, compared with 92% when only real data were used, demonstrating its ability to generate high-quality synthetic EEG data that enhance classification performance. In contrast, the basic GAN model achieved an accuracy of 94.11% for combined data, surpassing the Bird et al. [30] SVM classification accuracy of 93.71% but falling short of their random forest (RF) classification, which reached 96.69% for combined data. While GPT-2 provides a creative approach to generating synthetic data, its lower accuracy compared to WGAN-GP suggests that it struggles to capture the nuances of EEG signals. Our findings highlight the effectiveness of WGAN-GP in addressing data scarcity, improving classification accuracy from 92% to 98.45%. Overall, WGAN-GP emerges as a superior option for applications requiring the precise classification of complex biological signals.
Table 4 and Table 5 report the classification performance with varying percentages of synthetic data, providing important insights into optimizing model accuracy. For the WGAN-GP model, integrating synthetic data significantly improves performance, with average accuracies for concentration and relaxation increasing as the proportion of synthetic data rises from 25% to 50%. At 50%, the model achieves an average accuracy of 98.48%, representing a peak in performance. However, introducing 75% synthetic data results in a slight decline, and accuracy improves again at 100% synthetic data, suggesting diminishing returns beyond the 50% threshold.
In contrast, the GAN model exhibits a decline in accuracy as the proportion of synthetic data increases, particularly when transitioning from 25% to 50%. The optimal performance of the GAN was observed at 25% synthetic data, indicating a notable disparity between the two models. This suggests that the synthetic data generated via the GAN may not be of the same quality as that produced via WGAN-GP.
Overall, our findings indicate that incorporating up to 50% synthetic data enhances classification accuracy for WGAN-GP, establishing it as an effective strategy for improving EEG data analysis.
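A mixing step of this kind can be sketched as follows. The sketch assumes that an "x% synthetic" setting means adding a number of synthetic samples equal to x% of the real sample count; this interpretation, the function name `mix_datasets`, and the toy sample labels are illustrative assumptions and are not taken from the interface described above.

```python
import random


def mix_datasets(real, synthetic, synthetic_fraction, seed=0):
    """Combine all real samples with a random subset of synthetic samples.

    synthetic_fraction is interpreted (assumption) as the number of synthetic
    samples to add, expressed as a fraction of the number of real samples.
    """
    if not 0.0 <= synthetic_fraction <= 1.0:
        raise ValueError("synthetic_fraction must be between 0 and 1")
    n_synth = round(synthetic_fraction * len(real))
    if n_synth > len(synthetic):
        raise ValueError("not enough synthetic samples for the requested fraction")
    rng = random.Random(seed)  # fixed seed so the mix is reproducible
    chosen = rng.sample(synthetic, n_synth)
    mixed = [(s, "real") for s in real] + [(s, "synthetic") for s in chosen]
    rng.shuffle(mixed)
    return mixed


# Toy placeholders standing in for EEG segments.
real_segments = [f"real_{i}" for i in range(8)]
synthetic_segments = [f"synth_{i}" for i in range(8)]
for fraction in (0.25, 0.50, 0.75, 1.00):
    mixed = mix_datasets(real_segments, synthetic_segments, fraction)
    n_synth = sum(1 for _, origin in mixed if origin == "synthetic")
    print(fraction, len(mixed), n_synth)
```

Keeping the origin label alongside each sample makes it straightforward to report accuracy separately for the real and synthetic portions of a combined set.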
4.2. ECG Results
Figure 8 and Figure 9 display the normal and abnormal ECG samples of both real and synthetic data generated via the WGAN-GP model, illustrating the similarity between real and generated samples.
Table 6 presents the classification performance of ECG datasets (real, synthetic, and a combination of both) using SVM and RF classifiers. The datasets include 1200 real ECG samples, each containing 140 data points, and 1200 corresponding synthetic samples generated via the WGAN-GP model. The combined dataset, comprising both real and synthetic data, totaled 2400 samples.
For the SVM classifier, the real dataset achieved the highest classification accuracy at 98%, while the synthetic dataset alone achieved 95.8%. Combining real and synthetic data resulted in an accuracy of 97%, suggesting that real data alone may provide more informative features for SVM. However, the combined dataset still yielded robust performance.
In contrast, the RF classifier performed best with the synthetic dataset, achieving an accuracy of 98.57%, which outperformed the real dataset’s accuracy of 97%. When the combined dataset was used, RF maintained a high accuracy of 98.40%. This indicates that RF benefits from the diversity provided by synthetic data and can effectively integrate both sources for robust classification.
Overall, the results suggest that, while SVM performs optimally with real data, RF demonstrates superior performance with synthetic data and remains highly effective when real and synthetic datasets are combined. Although the accuracy improvements from incorporating synthetic data for ECG are not as pronounced as those for EEG, the 1–2% increase observed with the RF classifier still highlights the value of synthetic data for enhancing classification performance.
4.3. Statistical Significance of the Results
We ran statistical significance tests to assess the performance differences between models across the datasets (real, synthetic, and real + synthetic). Pairwise comparisons were made using the Wilcoxon signed-rank test, a non-parametric test that is particularly effective for comparing matched data when the assumption of normality cannot be guaranteed. The test assesses whether the differences between paired observations are distributed symmetrically about zero, providing p-values to evaluate statistical significance.
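The mechanics of the test can be sketched in a few lines of pure Python. This is an illustrative implementation, not the code used for Table 7: it drops zero differences, assigns average ranks to ties, and computes an exact two-sided p-value from the sign-flip null distribution, which is practical only for small samples. The paired scores are toy values invented for the example.

```python
def wilcoxon_signed_rank(a, b):
    """Return (statistic, two-sided exact p-value) for paired samples a and b."""
    diffs = [x - y for x, y in zip(a, b) if x != y]  # zero differences are dropped
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    statistic = min(w_plus, w_minus)

    # Exact null distribution: every rank is positive or negative with
    # probability 1/2. Ranks are doubled so that tied ranks stay integers.
    doubled = [int(round(2 * r)) for r in ranks]
    total = sum(doubled)
    counts = [0] * (total + 1)
    counts[0] = 1
    for r in doubled:
        for s in range(total, r - 1, -1):
            counts[s] += counts[s - r]
    tail = sum(counts[: int(round(2 * statistic)) + 1])
    p_value = min(1.0, 2 * tail / 2 ** n)
    return statistic, p_value


# Toy paired scores (illustrative only, not results from this study).
model_a = [41, 44, 39, 47, 45, 50]
model_b = [38, 40, 36, 42, 39, 43]
stat, p = wilcoxon_signed_rank(model_a, model_b)
print(stat, p)
```

With only six pairs, the smallest attainable two-sided p-value is 2/64, which illustrates how the number of paired observations bounds the significance level that the test can report.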
The results of the statistical significance tests are summarized in Table 7 and further visualized in Figure 10 and Figure 11.
1. Real data comparisons:
GPT-2 + SVM vs. GPT-2 + RF: the difference in accuracies (90.84% vs. 88.14%) was not statistically significant (p > 0.05), suggesting similar performance for these two models on real data.
GPT-2 + SVM vs. WGAN-GP + CNN: WGAN-GP + CNN significantly outperformed GPT-2 + SVM (92% vs. 90.84%, p < 0.05).
GPT-2 + RF vs. WGAN-GP + CNN: the WGAN-GP + CNN model significantly surpassed GPT-2 + RF (92% vs. 88.14%, p < 0.01), indicating its superior reliability.
2. Synthetic data comparisons:
GPT-2 + SVM vs. GAN + CNN: GAN + CNN significantly outperformed GPT-2 + SVM (85.78% vs. 66.88%, p < 0.01).
GPT-2 + SVM vs. WGAN-GP + CNN: WGAN-GP + CNN achieved markedly higher accuracy (98.42% vs. 66.88%, p < 0.01).
GAN + CNN vs. WGAN-GP + CNN: WGAN-GP + CNN was significantly better than GAN + CNN (98.42% vs. 85.78%, p < 0.01), reinforcing the robustness of the WGAN-GP model for synthetic data generation.
3. Real + synthetic data comparisons:
GPT-2 + SVM vs. GAN + CNN: the performance difference between GPT-2 + SVM and GAN + CNN was not statistically significant (93.71% vs. 94.11%, p > 0.05).
GPT-2 + SVM vs. WGAN-GP + CNN: WGAN-GP + CNN significantly outperformed GPT-2 + SVM (98.45% vs. 93.71%, p < 0.01).
GAN + CNN vs. WGAN-GP + CNN: WGAN-GP + CNN significantly outperformed GAN + CNN (98.45% vs. 94.11%, p < 0.01).
The bar chart shown in Figure 10 provides a visual comparison of model accuracies across the datasets, with statistical significance denoted as * (p < 0.05) or ** (p < 0.01). It highlights the consistently superior performance of WGAN-GP + CNN, particularly for synthetic and real + synthetic data.
The heatmap in Figure 11 illustrates the pairwise p-values from the Wilcoxon signed-rank test, where darker shades represent lower p-values, indicating stronger statistical significance. The clear contrast between WGAN-GP + CNN and other models reinforces its performance advantages.
These results demonstrate the effectiveness of WGAN-GP + CNN, particularly for scenarios involving synthetic and augmented datasets. Its superior accuracy and statistical significance highlight its potential for applications requiring robust and reliable classification models.
5. Conclusions and Future Work
This research has introduced a novel approach using the WGAN-GP model to generate synthetic EEG waveforms corresponding to concentrated and relaxed mental states. These waveforms can be effectively utilized in BCI applications across various domains requiring human–machine interaction. The synthetic data generated via WGAN-GP after the EEG signals are pre-processed significantly enhance machine learning model training by increasing the dataset size, leading to improved generalization and performance. Specifically, the classification accuracy when only the real dataset was used reached 92%, but this increased to 98.45% when combined with synthetic data generated via WGAN-GP. This performance surpasses state-of-the-art models, which have reported SVM accuracy at 93.71% and RF accuracy at 96.69% on the same datasets. Meanwhile, the original GAN model achieved an accuracy of 94.11% with a mix of real and synthetic data, underscoring the superior quality of synthetic data generated via WGAN-GP to augment real datasets. Notably, adding 25% synthetic data generated via WGAN-GP was sufficient to improve accuracy, while 50% proved optimal across mental states, achieving the highest classification accuracy with the proposed CNN classifier architecture. These findings highlight the WGAN-GP model’s capability to generate high-quality synthetic EEG data and illustrate how combining real and synthetic data enhances overall classification accuracy. Additionally, a user interface was developed to improve the usability of the generator model, enabling the synthesis of EEG waveforms corresponding to both concentration and relaxation states, in addition to data visualization and classification.
For ECG classification, the SVM model performed best with real ECG data, achieving an accuracy of 98%, while synthetic data alone reached 95.8%. In contrast, the RF classifier excelled with synthetic data, achieving an accuracy of 98.75%. When real and synthetic data were combined, the RF model maintained a high accuracy of 98.40%, an increase from 97% when real data alone were used. This demonstrates the robustness and quality of the synthetic data generated via the WGAN-GP model. Additionally, statistical significance was determined using the Wilcoxon signed-rank test, further emphasizing the potential of WGAN-GP for applications requiring a robust and reliable classification model.
Future Work
Future work for the WGAN-GP model will include exploring its applicability to additional biological signals, such as EMG, in order to further evaluate its generalizability and effectiveness. Moreover, we plan to enhance the interface in order to support direct ECG signal generation, as this process is currently performed via a Python script. Expanding the model in these directions could provide valuable datasets for advancing and refining machine learning algorithms. These advancements can significantly enhance human–technology interaction in applications such as assistive technologies, mental health monitoring, and cognitive load assessment, ultimately improving engagement and performance in interactive tasks.