Deep-Learning-Based Classification of Digitally Modulated Signals Using Capsule Networks and Cyclic Cumulants
Abstract
1. Introduction
- The first is likelihood-based methods [3,4], in which the likelihood function of the received signal is calculated under multiple hypotheses that correspond to the various signals that are expected to be received, and the classification decision is made based on the maximum of this function. We note that likelihood-based approaches are sensitive to variations in the signal parameters, which must be estimated, and estimation errors can lead to significant performance degradation [5].
- The second is feature-based methods that use CSP techniques [6,7,8] in which CC features [9,10] are extracted from the received signal, and classification is accomplished by comparing the values of these features with prescribed values corresponding to the signals that are expected to be received [11]. As noted in [12,13], the performance of CC-based approaches to modulation classification is affected by the presence of multipath fading channels, and robust CC-based classifiers for multipath channels were discussed in [14,15,16].
2. CC Features for Digitally Modulated Signals and Baseline Classification Model
2.1. CC Feature Extraction
- Use the BOI detector [37] to evaluate the signal bandwidth and obtain a low-resolution estimate of the center frequency.
- Frequency shift the BOI to the baseband using the low-resolution CFO estimate.
- Downsample/upsample the data as necessary such that the signal bandwidth is maximized, but keep the fractional bandwidth of the result strictly less than 1.
- Apply the SSCA to the data provided by Step 3 to detect the second-order CFs.
- Use the non-conjugate second-order CFs (if these are present) to obtain a high-resolution estimate of the symbol rate.
- If no non-conjugate CFs are present, the symbol rate may be estimated from any conjugate CFs present, which can also be used to provide a high-resolution estimate of the CFO.
- Determine the basic pattern of the second-order CFs present in the BOI.
- If no conjugate CFs are found in Step 4, the data from Step 3 are raised to the fourth power and Fourier transformed, and further CSP is applied to determine the CF pattern, to estimate the symbol rate if it was not provided by Step 5, and to obtain a high-resolution estimate of the CFO.
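The symbol-rate and CFO estimation steps above can be illustrated with a simplified sketch. This is not the paper's implementation: it uses a Hann pulse as a stand-in for SRRC shaping and plain FFT peak-picking, but it shows how a spectral line of |x|² reveals the symbol rate and how the quadrupled-carrier line of x⁴ reveals the CFO for a QPSK-like signal:

```python
import numpy as np

rng = np.random.default_rng(0)
sps = 8                                   # samples per symbol -> normalized symbol rate 1/8
nsym = 2048
syms = (2 * rng.integers(0, 2, nsym) - 1) + 1j * (2 * rng.integers(0, 2, nsym) - 1)  # QPSK

# Upsample and pulse-shape (Hann pulse as a stand-in for SRRC shaping)
x = np.zeros(nsym * sps, dtype=complex)
x[::sps] = syms
x = np.convolve(x, np.hanning(2 * sps), mode="same")

# Impose a known CFO to be recovered blindly
cfo_true = 0.03
t = np.arange(x.size)
x *= np.exp(2j * np.pi * cfo_true * t)

# Symbol-rate estimate: the envelope |x|^2 exhibits a spectral line at the symbol rate
freqs = np.fft.fftfreq(x.size)
env_spec = np.abs(np.fft.fft(np.abs(x) ** 2))
env_spec[np.abs(freqs) < 0.01] = 0.0      # suppress the DC line
fsym_est = abs(freqs[np.argmax(env_spec)])

# CFO estimate: for QPSK, x^4 removes the modulation and leaves a line near 4*CFO
quad_spec = np.abs(np.fft.fft(x ** 4))
cfo_est = freqs[np.argmax(quad_spec)] / 4.0
```

With these parameters, the envelope line sits at 1/8 and the quadrupled-carrier line at 4 × 0.03, so both estimates recover the true values to within an FFT bin.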
- The delay vector;
- The orders of the CC features were limited to a prescribed set, and the number of conjugation choices was constrained by the order n;
- For each pair, the CFs at which the CCs are non-zero are related to the CFO and the symbol rate by Equation (7), where the set of non-negative integers k is restricted to a maximum value of five.
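Since Equation (7) is not reproduced in this excerpt, the following sketch only illustrates the assumed harmonic structure: for an order-n cyclic cumulant with m conjugations, candidate CFs are commonly taken to lie at (n − 2m) times the CFO plus integer multiples of the symbol rate, with k capped at five as stated above.

```python
def candidate_cfs(cfo, fsym, n, m, kmax=5):
    """Enumerate candidate cycle frequencies for an (n, m) cyclic cumulant.
    The form alpha = (n - 2m) * cfo + k * fsym is an assumption standing in
    for Equation (7), which is not shown in this excerpt."""
    return [(n - 2 * m) * cfo + k * fsym for k in range(kmax + 1)]

# Example: order n = 4 with m = 1 conjugation, CFO of 0.01, symbol rate of 0.125
cfs = candidate_cfs(cfo=0.01, fsym=0.125, n=4, m=1)
```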
2.2. The Cyclic Cumulant Estimate
2.3. Baseline Classification Model
3. Cyclic Cumulants and Capsule Networks for Digital Modulation Classification
- Warping: This involves using the order n of the CC estimates to obtain “warped” versions. We note that CSP-based blind modulation classification also employs warped CC estimates.
- Scaling: The warped CC estimates were subsequently scaled to a signal power of unity, using a blind estimate of the signal power. This provided consistent values for the capsule network to train on and prevented varying signal powers from causing erroneous classification results due to neuron saturation—a common issue with input data that do not go through some normalization process.
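The warping and scaling steps can be sketched as follows. The exact warping exponent used in the paper is not shown in this excerpt; the 2/n exponent below is a common choice in CSP-based classification that brings CC magnitudes of all orders to units of signal power, which is what makes the subsequent unit-power normalization consistent across orders:

```python
import numpy as np

def warp_and_scale(cc_est, n, power_est):
    """Warp an nth-order CC estimate magnitude as |C|^(2/n) (assumed
    exponent), then normalize by a blind signal-power estimate so the
    feature corresponds to a unit-power signal."""
    return np.abs(cc_est) ** (2.0 / n) / power_est

# A 6th-order CC magnitude of 8 from a signal with power estimate 2
w = warp_and_scale(8.0, 6, 2.0)
```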
- Feature extraction layer: This first layer of the network performs a general feature mapping of the input signal, and its parameters are similar to those used in other DL-based approaches to classification of digitally modulated signals [21,24,39]. This layer includes a convolutional layer followed by a batch normalization layer and an activation function.
- Primary capsules: This layer consists of eight primary capsules, equal to the number of digital modulation classes of interest. These capsules operate in parallel, using the output of the feature extraction layer as their input, and each primary capsule includes two convolutional layers with a customized filter, stride, and activation function, followed by a fully connected layer.
- Fully connected layer: This layer consists of a vector of neurons with weights connecting to the previous layer. Each neuron in the last layer of the primary capsules is fully connected to each neuron in this layer. These neurons are expected to discover characteristics specific to the capsule’s class. To make the output of the network compatible with a SoftMax classification layer, each neuron within this layer is fully connected to a single output neuron, and the output neurons for all primary capsules are combined depthwise to produce an eight-dimensional vector n, which is passed to the classification layer. The value of each element of n is representative of the likelihood that its corresponding modulation type is present in the received digitally modulated signal.
- Classification layer: In this layer, the vector n is passed to the SoftMax layer, which maps each element of n to a value:
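The displayed SoftMax equation is not reproduced in this excerpt; the standard mapping it refers to can be sketched as:

```python
import numpy as np

def softmax(n_vec):
    """Map the eight-element capsule output vector n to values that are
    non-negative and sum to one, interpretable as class likelihoods."""
    e = np.exp(n_vec - np.max(n_vec))   # shift by the max for numerical stability
    return e / e.sum()

# Example: the first element dominates, so the first class is selected
probs = softmax(np.array([2.0, 0.5, 0.1, -1.0, 0.0, 0.3, -0.5, 1.0]))
```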
4. CAP Training and Performance Evaluation
4.1. The Training/Testing Datasets
- For the CAP that uses the I/Q signal data for training and testing, the CFO shift in the testing dataset relative to the training dataset resulted in significant degradation of the classification performance of the CAP and indicated that it was unable to generalize its training to new datasets that contain similar types of signals, but with differences in some of their digital modulation characteristics. This aspect was also reported in [23], and similar results have been reported for CNNs and RESNETs in [24].
- As will be seen in Section 5, for the CAP that uses the CC features for training and testing, the CFO shift in the testing dataset relative to the training dataset resulted in similar classification performance, indicating that the CAP trained using CC features was resilient to variations of the CFO from the training dataset.
4.2. CAP Training
- In the first training instance, dataset CSPB.ML.2018 was used, splitting the available signals into portions for training, validation, and testing. The corresponding objective and loss functions for the trained CAP are shown in Figure 3, and we note that the probability of correct classification for the test results was obtained using the test portion of the signals in CSPB.ML.2018. The CAP trained on CSPB.ML.2018 was then tested on dataset CSPB.ML.2022 to assess the generalization abilities of the trained CAP in classifying all signals available in CSPB.ML.2022.
- In the second training instance, the CAP was reset and trained anew using the signals in dataset CSPB.ML.2022, with a similar split of the signals into training, validation, and testing portions. The corresponding objective and loss functions for the trained CAP were similar to the ones in Figure 3 and are omitted for brevity. The probability of correct classification for the test results was obtained using the test portion of the signals in CSPB.ML.2022. The CAP trained on CSPB.ML.2022 was then tested on dataset CSPB.ML.2018 to assess the generalization abilities of the re-trained CAP when classifying all signals available in CSPB.ML.2018.
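Both training instances use the same split mechanics, which can be sketched as follows. The 70%/5% training/validation fractions and the dataset size below are illustrative assumptions; only the 25% test portion is suggested by the text:

```python
import numpy as np

def split_indices(num_signals, train_frac, val_frac, seed=0):
    """Shuffle signal indices and split them into training, validation,
    and testing portions (fractions here are illustrative, not the
    paper's exact split)."""
    idx = np.random.default_rng(seed).permutation(num_signals)
    n_tr = int(train_frac * num_signals)
    n_va = int(val_frac * num_signals)
    return idx[:n_tr], idx[n_tr:n_tr + n_va], idx[n_tr + n_va:]

tr, va, te = split_indices(100_000, 0.70, 0.05)
```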
4.3. Assessing Generalization Abilities
- If the CAP was trained on a large portion of CSPB.ML.2018 and its performance when classifying a remaining subset of CSPB.ML.2018 was high, but its performance when classifying CSPB.ML.2022 was low, then the CAP’s ability to generalize was low, and its performance was vulnerable to shifts in the signal parameter distributions.
- By contrast, if the classification performance of the CAP on both the remaining subset of CSPB.ML.2018 and on all of CSPB.ML.2022 was high, then its generalization ability was high, and the CAP was resilient to shifts in signal parameter distributions.
5. Numerical Results and Performance Analysis
5.1. Baseline Model Performance
5.2. CC-Trained CAP Performance
5.3. I/Q-Trained CAP Performance
5.4. Confusion Matrix Results
- The confusion matrix for the baseline classification model in Figure 6 showed that, for 5 out of the 8 digital modulation schemes of interest (BPSK, QPSK, 8PSK, DPSK, and MSK), classification accuracy exceeded 95%, while for the remaining 3 schemes, which were all QAM-based, the classification accuracy was 72.5% for 16QAM, 55.9% for 256QAM, and 41.7% for 64QAM. We note the “unknown” classification label, which appears in the confusion matrix of the baseline classifier because this classifier was not trained, but rather made its classification decision based on the proximity of the estimated CCs to a modulation’s theoretical CC values, as outlined in Section 2.3 and discussed in [10]. Thus, when the baseline classifier was not able to match a signal with a known pattern, it declared the signal “unknown” instead of confusing it with a different type of signal as DL-based classifiers do.
- For the CC-trained CAPs, Figure 7 shows the confusion matrix corresponding to the generalization experiment, in which the capsule network was trained on the CSPB.ML.2022 dataset and then tested on all signals in the CSPB.ML.2018 dataset. The CAP showed almost perfect accuracy (exceeding 99%) for the BPSK, QPSK, 8PSK, DPSK, and MSK modulation schemes, along with significant improvement over the baseline model for the remaining QAM schemes, for which the classification accuracy increased to 97.5% for 16QAM, 74% for 256QAM, and 62.3% for 64QAM, an improvement of about 20% or more over the baseline model’s classification performance.
- In contrast, the confusion matrix for the I/Q-trained CAP in Figure 8, corresponding to the generalization experiment in which the CAP was trained on the CSPB.ML.2022 dataset and then tested on all signals in the CSPB.ML.2018 dataset, showed very poor classification accuracy, despite the CAP having excellent accuracy when classifying the 25% test portion of the signals in the dataset on which it was trained [23].
5.5. Computational Aspects
- Blind exhaustive spectral correlation and coherence analysis for N complex values and a given number of strips in the strip spectral correlation analyzer (SSCA) algorithm [36].
- The cost of estimating the quadrupled carrier, and therefore the carrier offset, from the FFT of the fourth-power signal if the CF pattern was not determined to be BPSK-like or SQPSK-like after SSCA analysis.
- The cost of the cyclic moments in (11), determined by the cost of creating the needed delay products and the DFTs for each combination of lag product and needed cycle frequency. The number of required lag-product vectors P is maximum for BPSK-like signals and minimum for 8PSK-like signals; assuming K CFs across all orders n and lag products, the cost of this step follows from the K DFTs over the lag-product vectors.
- The cost of combining the CTMFs after their computation, which was negligible compared with the previously sketched costs.
- Total cost (blind processing): the sum of the costs above.
- In the case of the baseline classifier, the subsequent processing involved a comparison of the estimated CC features to theoretical features to identify the closest, in terms of a distance metric, theoretical CF pattern, as outlined in Section 2.3.
- In the case of the CAP classifier, the classifier was presented with the CC feature at the input, and the classification decision corresponded to the CAP output. We note that the one-time computational cost for training the CAP should also be included in the cost in this case.
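The lag-product/DFT step in the cost breakdown above can be made concrete with a minimal cyclic temporal moment estimate. This is an illustrative sketch, not the paper's estimator in (11): circular shifts stand in for ideal delays, and the cumulant-combination step is omitted.

```python
import numpy as np

def ctmf(x, delays, conj_flags, alpha):
    """Estimate a cyclic temporal moment: form the delay product
    prod_j x^(*)(t + tau_j), with optional conjugation of each factor,
    and project it onto the cycle frequency alpha via a DFT-style sum."""
    N = x.size
    prod = np.ones(N, dtype=complex)
    for tau, c in zip(delays, conj_flags):
        v = np.roll(x, -tau)                 # circular shift approximates x(t + tau)
        prod *= np.conj(v) if c else v
    t = np.arange(N)
    return np.mean(prod * np.exp(-2j * np.pi * alpha * t))

# Sanity check: a pure tone at f0 has a second-order non-conjugate
# moment of magnitude one at cycle frequency alpha = 2 * f0
f0, N = 0.1, 1000
tone = np.exp(2j * np.pi * f0 * np.arange(N))
m2 = ctmf(tone, delays=(0, 0), conj_flags=(False, False), alpha=2 * f0)
```

The per-CF cost is one length-N product chain plus one length-N projection, which is what makes the K × P lag-product/DFT count the dominant term in the sketch above.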
6. Discussion
- Investigating the performance and generalization for further datasets with more signal types and randomized multipath channels;
- Determining the performance of the developed capsule network as a function of the input I/Q vector length: can it reach a probability of correct classification near one? In tandem, can the baseline signal-processing method provide a probability of correct classification near one for larger input vector lengths?
- What is the fundamental reason that DL neural networks (including capsule networks) fail to generalize with I/Q input data?
- Why do I/Q-trained DL networks not learn CC features? Can they be modified to do so by modifying the form of the feedback error and/or modifying the network layers and structure? Can the I/Q-input neural network be forced to learn CCs by imposing dimensionality or variability constraints on the latent embedding?
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Dobre, O.; Abdi, A.; Bar-Ness, Y.; Su, W. Blind Modulation Classification: A Concept Whose Time Has Come. In Proceedings of the 2005 IEEE/Sarnoff Symposium on Advances in Wired and Wireless Communication, Princeton, NJ, USA, 18–19 April 2005; pp. 223–228.
2. Dobre, O.A. Signal Identification for Emerging Intelligent Radios: Classical Problems and New Challenges. IEEE Instrum. Meas. Mag. 2015, 18, 11–18.
3. Hameed, F.; Dobre, O.A.; Popescu, D.C. On the Likelihood-Based Approach to Modulation Classification. IEEE Trans. Wirel. Commun. 2009, 8, 5884–5892.
4. Xu, J.L.; Su, W.; Zhou, M. Likelihood-Ratio Approaches to Automatic Modulation Classification. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 2011, 41, 3072–3108.
5. Panagiotou, P.; Anastasopoulos, A.; Polydoros, A. Likelihood Ratio Tests for Modulation Classification. In Proceedings of the 2000 IEEE Military Communications Conference (MILCOM), Los Angeles, CA, USA, 22–25 October 2000; Volume 2, pp. 670–674.
6. Gardner, W.; Spooner, C. The Cumulant Theory of Cyclostationary Time-Series, Part I: Foundation. IEEE Trans. Signal Process. 1994, 42, 3387–3408.
7. Spooner, C.; Gardner, W. The Cumulant Theory of Cyclostationary Time-Series, Part II: Development and Applications. IEEE Trans. Signal Process. 1994, 42, 3409–3429.
8. Dandawate, A.; Giannakis, G. Statistical Tests for Presence of Cyclostationarity. IEEE Trans. Signal Process. 1994, 42, 2355–2369.
9. Spooner, C.M. Classification of Co-channel Communication Signals Using Cyclic Cumulants. In Proceedings of the Twenty-Ninth Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, USA, 30 October–1 November 1995; Volume 1, pp. 531–536.
10. Spooner, C.M. On the Utility of Sixth-Order Cyclic Cumulants for RF Signal Classification. In Proceedings of the Thirty-Fifth Asilomar Conference on Signals, Systems and Computers, Monterey, CA, USA, 4–7 November 2001; Volume 1, pp. 890–897.
11. Spooner, C.M.; Brown, W.A.; Yeung, G.K. Automatic Radio-Frequency Environment Analysis. In Proceedings of the Thirty-Fourth Annual Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, CA, USA, 29 October–1 November 2000; Volume 2, pp. 1181–1186.
12. Swami, A.; Sadler, B.M. Hierarchical Digital Modulation Classification Using Cumulants. IEEE Trans. Commun. 2000, 48, 416–429.
13. Dobre, O.A.; Abdi, A.; Bar-Ness, Y.; Su, W. Cyclostationarity-Based Modulation Classification of Linear Digital Modulations in Flat Fading Channels. Wirel. Pers. Commun. 2010, 54, 699–720.
14. Wu, H.C.; Saquib, M.; Yun, Z. Novel Automatic Modulation Classification Using Cumulant Features for Communications via Multipath Channels. IEEE Trans. Wirel. Commun. 2008, 7, 3098–3105.
15. Yan, X.; Feng, G.; Wu, H.C.; Xiang, W.; Wang, Q. Innovative Robust Modulation Classification Using Graph-Based Cyclic-Spectrum Analysis. IEEE Commun. Lett. 2017, 21, 16–19.
16. Yan, X.; Liu, G.; Wu, H.C.; Feng, G. New Automatic Modulation Classifier Using Cyclic-Spectrum Graphs with Optimal Training Features. IEEE Commun. Lett. 2018, 22, 1204–1207.
17. Kulin, M.; Kazaz, T.; Moerman, I.; De Poorter, E. End-to-End Learning from Spectrum Data: A Deep Learning Approach for Wireless Signal Identification in Spectrum Monitoring Applications. IEEE Access 2018, 6, 18484–18501.
18. Rajendran, S.; Meert, W.; Giustiniano, D.; Lenders, V.; Pollin, S. Deep Learning Models for Wireless Signal Classification with Distributed Low-Cost Spectrum Sensors. IEEE Trans. Cogn. Commun. Netw. 2018, 4, 433–445.
19. Zhang, D.; Ding, W.; Liu, C.; Wang, H.; Zhang, B. Modulated Autocorrelation Convolution Networks for Automatic Modulation Classification Based on Small Sample Set. IEEE Access 2020, 8, 27097–27105.
20. Bu, K.; He, Y.; Jing, X.; Han, J. Adversarial Transfer Learning for Deep Learning Based Automatic Modulation Classification. IEEE Signal Process. Lett. 2020, 27, 880–884.
21. O’Shea, T.J.; Roy, T.; Clancy, T.C. Over-the-Air Deep Learning Based Radio Signal Classification. IEEE J. Sel. Top. Signal Process. 2018, 12, 168–179.
22. O’Shea, T.; Hoydis, J. An Introduction to Deep Learning for the Physical Layer. IEEE Trans. Cogn. Commun. Netw. 2017, 3, 563–575.
23. Latshaw, J.A.; Popescu, D.C.; Snoap, J.A.; Spooner, C.M. Using Capsule Networks to Classify Digitally Modulated Signals with Raw I/Q Data. In Proceedings of the 14th IEEE International Communications Conference (COMM), Bucharest, Romania, 16–18 June 2022.
24. Snoap, J.A.; Popescu, D.C.; Spooner, C.M. On Deep Learning Classification of Digitally Modulated Signals Using Raw I/Q Data. In Proceedings of the 19th IEEE Annual Consumer Communications & Networking Conference (CCNC), Las Vegas, NV, USA, 8–11 January 2022; pp. 441–444.
25. Zhao, K.; Hu, J.; Shao, H.; Hu, J. Federated Multi-Source Domain Adversarial Adaptation Framework for Machinery Fault Diagnosis with Data Privacy. Reliab. Eng. Syst. Saf. 2023, 236, 109246.
26. Zhao, K.; Feng, J.; Shao, H. A Novel Conditional Weighting Transfer Wasserstein Auto-Encoder for Rolling Bearing Fault Diagnosis with Multi-Source Domains. Knowl.-Based Syst. 2023, 262, 110203.
27. Jin, B.; Cruz, L.; Gonçalves, N. Deep Facial Diagnosis: Deep Transfer Learning from Face Recognition to Facial Diagnosis. IEEE Access 2020, 8, 123649–123661.
28. Cheng, L.; Yin, F.; Theodoridis, S.; Chatzis, S.; Chang, T.H. Rethinking Bayesian Learning for Data Analysis: The Art of Prior and Inference in Sparsity-Aware Modeling. IEEE Signal Process. Mag. 2022, 39, 18–52.
29. Li, L.; Huang, J.; Cheng, Q.; Meng, H.; Han, Z. Automatic Modulation Recognition: A Few-Shot Learning Method Based on the Capsule Network. IEEE Wirel. Commun. Lett. 2021, 10, 474–477.
30. Sang, Y.; Li, L. Application of Novel Architectures for Modulation Recognition. In Proceedings of the 2018 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS), Chengdu, China, 26–30 October 2018; pp. 159–162.
31. Hazza, A.; Shoaib, M.; Saleh, A.; Fahd, A. Robustness of Digitally Modulated Signal Features Against Variation in HF Noise Model. EURASIP J. Wirel. Commun. Netw. 2011, 2011, 24.
32. Snoap, J.A.; Latshaw, J.A.; Popescu, D.C.; Spooner, C.M. Robust Classification of Digitally Modulated Signals Using Capsule Networks and Cyclic Cumulant Features. In Proceedings of the 2022 IEEE Military Communications Conference (MILCOM), Rockville, MD, USA, 28 November–2 December 2022; pp. 298–303.
33. Tekbıyık, K.; Akbunar, Ö.; Ekti, A.R.; Görçin, A.; Kurt, G.K.; Qaraqe, K.A. Spectrum Sensing and Signal Identification with Deep Learning Based on Spectral Correlation Function. IEEE Trans. Veh. Technol. 2021, 70, 10514–10527.
34. Spooner, C.; Snoap, J.; Latshaw, J.; Popescu, D. Synthetic Digitally Modulated Signal Datasets for Automatic Modulation Classification. 2022. Available online: https://cyclostationary.blog/data-sets/ (accessed on 1 May 2023).
35. Gardner, W.A.; Spooner, C.M. Detection and Source Location of Weak Cyclostationary Signals: Simplifications of the Maximum-Likelihood Receiver. IEEE Trans. Commun. 1993, 41, 905–916.
36. Brown, W.A.; Loomis, H.H. Digital Implementations of Spectral Correlation Analyzers. IEEE Trans. Signal Process. 1993, 41, 703–720.
37. Spooner, C.M. Multi-Resolution White-Space Detection for Cognitive Radio. In Proceedings of the 2007 IEEE Military Communications Conference (MILCOM), Orlando, FL, USA, 29–31 October 2007; pp. 1–9.
38. Sabour, S.; Frosst, N.; Hinton, G.E. Dynamic Routing between Capsules. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS’17), Long Beach, CA, USA, 4–9 December 2017; pp. 3859–3869.
39. Zhou, S.; Yin, Z.; Wu, Z.; Chen, Y.; Zhao, N.; Yang, Z. A Robust Modulation Classification Method Using Convolutional Neural Networks. EURASIP J. Adv. Signal Process. 2019, 2019, 15.
40. Luce, R.D. Luce’s Choice Axiom. Scholarpedia 2008, 3, 8077.
41. Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. In Proceedings of the 3rd International Conference on Learning Representations (ICLR 2015), San Diego, CA, USA, 7–9 May 2015.
| Layer | (# of Filters)[Filt Size] | Stride | Activations |
|---|---|---|---|
| Input | | | |
| Conv | (56)[] | [] | |
| Batch Norm | | | |
| Tanh | | | |
| Conv-1-(i) | (56)[] | [] | |
| Batch Norm-1-(i) | | | |
| Tanh-1-(i) | | | |
| Conv-2-(i) | (72)[] | [] | |
| Batch Norm-2-(i) | | | |
| Tanh-2-(i) | | | |
| FC-(i) | 7 | | |
| Batch Norm-3-(i) | | | |
| ReLu-1-(i) | | | |
| Point FC-(i) | 1 | | |
| Depth Concat (i = 1:8) | 8 | | |
| SoftMax | | | |
| Parameter | CSPB.ML.2018 | CSPB.ML.2022 |
|---|---|---|
| Sampling Frequency | 1 Hz | 1 Hz |
| Carrier Frequency Offset (CFO) | uniformly distributed | uniformly distributed |
| Symbol Period Range | | |
| SRRC Pulse-Shaping Roll-Off Factor Range | | |
| In-Band SNR Range (dB) | | |
| In-Band SNR Center of Mass | 9 dB | 12 dB |
| Classification Model | Results for Dataset CSPB.ML.2018 | Results for Dataset CSPB.ML.2022 |
|---|---|---|
| Baseline Model | | |
| CSPB.ML.2018 I/Q-trained CAP | | |
| CSPB.ML.2022 I/Q-trained CAP | | |
| CSPB.ML.2018 CC-trained CAP | | |
| CSPB.ML.2022 CC-trained CAP | | |
Snoap, J.A.; Popescu, D.C.; Latshaw, J.A.; Spooner, C.M. Deep-Learning-Based Classification of Digitally Modulated Signals Using Capsule Networks and Cyclic Cumulants. Sensors 2023, 23, 5735. https://doi.org/10.3390/s23125735