1. Introduction
Specific emitter identification (SEI) is a technology that uses signal-processing methods to extract, from the received signal, the essential information that differs between transmitters, thereby distinguishing individual communication transmitters [1]. Since the beginning of the 21st century, the development of sensor networks and mobile communication technologies has carried the information society from the electronic age into the Internet of Things era and has given rise to countless communication radiation-source devices. In the civil field, the security of wireless communication systems is receiving rapidly growing attention. By relying on physical-layer characteristics, SEI can determine whether a received signal comes from an authorized transmitter without depending on a user key, which reduces the security incidents caused by key theft; it has therefore been studied extensively. In the military field, using received radio signals to determine which radio station a signal comes from helps to seize the information initiative on the battlefield and supports communication countermeasures.
The key to SEI is finding fingerprint features that capture the differences between specific transmitters. Fingerprint features should be universal, unique, stable, and measurable within a given target set. Universality means that every individual radiation source possesses the fingerprint features. Uniqueness means that the fingerprint features of different individual radiation sources differ. Stability means that the extracted fingerprint features have a certain robustness to noise and interference, and measurability means that the identification of specific individual radiation sources can be realized in engineering practice. The fingerprint features of a radiation source are mainly divided into transient and steady-state features. A feature extracted from the signal while the emitter is starting up is transient, and a feature extracted from the signal while the emitter is stably transmitting information is steady-state. Lin Y. analyzed both transient and steady-state features, using the power spectral density method and the fractional Fourier transform method to analyze transient signals, and the line integral bispectrum method to analyze steady-state signals. The results showed that, at a high signal-to-noise ratio (SNR), the SEI method using transient signals had a higher recognition rate than the SEI method using steady-state signals. However, transient features are susceptible to noise, and the recognition rate drops rapidly when the SNR is low [2]. Because the transient signal is free of interference from the transmitted message, it is rich in transmitter characteristics, and the radio-frequency fingerprint vector extracted from it characterizes the transmitter device well. However, its stability is poor, it is susceptible to noise interference, and it is difficult to obtain sufficient transient signals in non-cooperative scenarios, which limits its engineering applications. In the steady-state signal, the presence of modulation information overwhelms some features, so the recognition rate of steady-state methods is lower; when the SNR is large enough, however, the features that are not overwhelmed by the message still represent individual transmitters very well. Steady-state signals are also easy to acquire in sufficient numbers. Therefore, steady-state features have always been the main object of research.
For steady-state features, statistical features in the time domain are usually the first to be extracted from a received signal. Reising, Williams, and others extracted statistical parameters (such as the mean, variance, and kurtosis) of the instantaneous amplitude, instantaneous phase, and instantaneous frequency for the transient and pilot parts of different types of signals (such as GSM, WiMax, and ZigBee signals) [3,4,5,6]. Their experiments verified that statistical time-domain parameters can identify individual radiation sources well. S. Deng proposed an algorithm that builds linear skewness and linear kurtosis on the basis of skewness and kurtosis. For limited samples, linear-moment estimation is more robust and accurate than other estimation methods, and even better than maximum-likelihood estimation. Linear skewness and linear kurtosis are therefore insensitive to outliers, so non-Gaussianity can be measured with high precision [7]. Research on radiation-source identification methods based on statistical time-domain features is at a relatively early stage, and these methods are susceptible to noise, so they are insufficient for analyzing non-Gaussian and nonstationary signals: subtle features are easily overwhelmed by noise, or the extracted features reflect only the noise. Statistical feature extraction based on high-order moments and high-order spectra provides richer amplitude and phase information and strongly suppresses Gaussian noise, but its computational cost is large, so integrated bispectral features are often used to characterize the differences between individual radiation sources. A one-dimensional integrated spectrum is obtained by integrating the two-dimensional bispectrum along a chosen path. According to the integration path, the result is the radial integrated bispectrum (RIB), axial integrated bispectrum (AIB), square integrated bispectrum (SIB), or circular integrated bispectrum (CIB) [1,8,9,10,11].
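The time-domain statistical features described above can be illustrated with a minimal sketch. It is not the implementation used in the cited works: the function names, the choice of four statistics per sequence, and the sampling-rate argument `fs` are assumptions made for illustration. The sketch takes a complex baseband signal, derives its instantaneous amplitude, phase, and frequency, and stacks the mean, variance, skewness, and kurtosis of each into a 12-element feature vector.

```python
import numpy as np

def _moments(v):
    """Return mean, variance, skewness, and kurtosis of a real sequence."""
    mu = v.mean()
    var = v.var()
    std = np.sqrt(var)
    if std == 0:  # constant sequence: skewness and kurtosis are undefined
        return np.array([mu, 0.0, 0.0, 0.0])
    z = (v - mu) / std
    return np.array([mu, var, np.mean(z ** 3), np.mean(z ** 4)])

def time_domain_features(x, fs):
    """Statistical features of a complex baseband signal x sampled at fs (Hz).

    Output layout (an assumption of this sketch):
    [amplitude stats (4), phase stats (4), frequency stats (4)].
    """
    x = np.asarray(x, dtype=complex)
    amp = np.abs(x)                            # instantaneous amplitude
    phase = np.unwrap(np.angle(x))             # instantaneous phase (rad)
    freq = np.diff(phase) * fs / (2 * np.pi)   # instantaneous frequency (Hz)
    return np.concatenate([_moments(amp), _moments(phase), _moments(freq)])
```

Because all of these statistics are computed directly on the noisy samples, the sketch also makes the noise sensitivity mentioned above easy to observe by adding noise to `x` and recomputing the vector.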
Weak effects such as parasitic modulation and parasitic components generated by the nonlinearity of radiation-source transmitter equipment are difficult to analyze accurately with time- and frequency-domain methods. Researchers therefore also applied transform-domain feature extraction to the individual identification of communication radiation sources. This type of method applies various signal transforms and then observes, or computes statistics of, the signal parameters in the transformed domain to distinguish between individual radiation sources. Methods such as wavelet analysis, time-frequency analysis, empirical-mode decomposition (EMD), and intrinsic time-scale decomposition (ITD) were applied to analyze the subtle differences between individual radiation sources, achieving a good individual recognition effect [1,12]. He B. decomposed and preprocessed a received signal using EMD, ITD, or variational-mode decomposition (VMD), and then extracted the skewness and kurtosis of the decomposed signal to form a feature vector that characterized the individual signal. The ITD-based method obtained the highest recognition rate in the SEI problem, followed by EMD, and lastly VMD [13]. Jie proposed an SEI method based on a 3D Hilbert energy spectrum and multiscale segmentation. The time-frequency energy spectrum was derived via the Hilbert–Huang transform and can be regarded as a complicated curved surface in three-dimensional space, namely, the 3D Hilbert energy spectrum. Then, using fractal theory, four features were extracted under multiscale segmentation to compose the feature vector. Lastly, 13 individual communication emitters were identified using a support vector machine (SVM). When SNR > 26 dB, this method exceeded a 90% correct recognition rate on three simulation datasets [14].
The third type of radiation-source feature-extraction algorithm is based on the nonlinearity of the transmitter hardware. Various analog components inside the communication radiation-source transmitter, such as the digital-to-analog converter (DAC), power amplifier (PA), modulators, and filters, exhibit a certain nonlinearity during operation [15,16,17]. Addressing the nonlinear error introduced by the DAC [18], Polak used a random Brownian bridge process to model the nonlinear behavior of the transmitter's DAC [19]. Zhang and Liu used a memoryless polynomial model to describe the power-amplifier nonlinearity of different individual radiation sources [20,21,22], which simplified the solution of the nonlinear power-amplifier model but insufficiently described the memory effect of the power amplifier. To characterize the nonlinear behavior of a power amplifier with a weak memory effect [23], Liu presented a radio-frequency front-end nonlinearity estimator that performed SEI based on knowledge of a training sequence. The algorithm provided robust identification by first using the different degrees of nonlinearity associated with the symbol amplitudes for initial estimation, and then iteratively estimating the channel coefficients and the distorted transmit symbols to overcome intersymbol interference [24]. Jian C. observed that low-amplitude symbols are less affected by the PA, so the channel impulse response can be estimated from the low-amplitude symbols and the PA nonlinearity can then be extracted from the larger-amplitude symbols, with the influence of noise reduced iteratively. Consequently, this method is only applicable to signals with different symbol amplitudes. Simulation results showed that this method could achieve a recognition rate of 90% at an SNR of 18 dB [25].
The essential reason for the difference between radiation-source signals is that the parameters of the devices that constitute a radiation source vary within a certain tolerance. These subtle parameter differences lead to differences in the amplitude, frequency, skewness, kurtosis, and bispectrum of the signal. Such surface-level signal features can distinguish individual radiation sources, but they cannot characterize the differences in the internal device parameters of the radiation source. Therefore, in this paper, the distortion of the internal components of the radiation source and the nonlinearity of the power amplifier are modeled mathematically, and the parameters characterizing this distortion and nonlinearity are extracted from the received RF signal as the feature vector used to distinguish different radiation sources. First, the received signal is demodulated to obtain the transmitted message bit sequence, and basic parameters (such as carrier frequency, baud rate, modulation mode, and phase) are obtained by analyzing the signal. Using these parameters and the message bit sequence, the transmitted signal is then reconstructed with signal-processing tools. Because this reconstructed signal has not passed through the transmitter hardware, it is a 'pure signal' free of nonlinear distortion, and comparing it with the actual received signal reveals what nonlinear distortion the message bits experienced in the transmitter. The entire process is shown in Figure 1. The nonlinear parameters extracted in this way characterize the transmitter and do not change with the transmitted message signal. By processing any signal, the characteristic parameters of the hardware differences of the transmitter can be obtained, and the signal sent from the transmitter can be individually identified without being affected by the carrier frequency, modulation mode, or modulation rate of the test signal.
The rest of this article is arranged as follows:
Section 2 first gives the nonlinear model of the power amplifier, establishes the modulator distortion models of pulse amplitude modulation (PAM) and quadrature phase-shift keying (QPSK) signals, and presents the feature-extraction formulas under these two modulation modes. It also briefly introduces the traditional integral bispectrum feature-extraction algorithm.
Section 3 verifies the recognition performance of this method via simulation, and presents the recognition performance for PAM and QPSK signals under variable carrier frequency and modulation rate.
Section 4 presents the results and discussion of the experiment.
2. Modulator Distortion and Nonlinear Power Amplifier (PA) Model Establishment
2.1. Nonlinear PA Model
The PA is an important part of every wireless transmitter. In the preceding-stage circuit of the transmitter, the power of the RF signal generated by the modulation oscillator circuit is very small, and the signal must pass through the power amplifier to obtain sufficient RF power before it can be fed to the antenna and radiated. When the power amplifier works in the linear-amplification region, its efficiency is relatively low; when it enters the saturation region, it produces serious nonlinear distortion, which broadens the signal spectrum, interferes with adjacent channels, and produces intermodulation interference [1,23].
The mathematical models of power amplifiers are mainly divided into models with memory and memoryless models. When the input signal is narrow-band, a memoryless model is often used to describe the nonlinear behavior of the power amplifier, for example, the polynomial, Saleh, and Rapp models. When the input signal is wide-band, the nonlinear behavior of the power amplifier is often described by a model with memory, such as a Volterra series model, a polynomial model with memory, a delayed neural network model, a Hammerstein model, or a Wiener model.
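As an illustration of the memoryless models named above, the following sketch implements the textbook forms of the Rapp AM/AM curve and the Saleh AM/AM and AM/PM curves for a complex baseband input. The function names and all default parameter values are illustrative assumptions, not values taken from this paper.

```python
import numpy as np

def rapp_amam(r, a_sat=1.0, p=2.0, gain=1.0):
    """Rapp AM/AM curve: linear for small r, saturating smoothly at a_sat.

    r is the input envelope; p controls the sharpness of the knee.
    """
    r = np.asarray(r, dtype=float)
    g = gain * r
    return g / (1.0 + (g / a_sat) ** (2 * p)) ** (1.0 / (2 * p))

def saleh_pa(x, alpha_a=2.1587, beta_a=1.1517, alpha_p=4.0033, beta_p=9.1040):
    """Saleh model applied to a complex baseband signal x.

    AM/AM: alpha_a * r / (1 + beta_a * r^2)
    AM/PM: alpha_p * r^2 / (1 + beta_p * r^2)   (radians)
    The default coefficients are illustrative only.
    """
    x = np.asarray(x, dtype=complex)
    r = np.abs(x)
    amam = alpha_a * r / (1.0 + beta_a * r ** 2)
    ampm = alpha_p * r ** 2 / (1.0 + beta_p * r ** 2)
    return amam * np.exp(1j * (np.angle(x) + ampm))
```

Both functions depend only on the instantaneous input envelope, which is exactly what makes them memoryless; a model with memory also depends on earlier samples.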
In practical applications, even if the input to the power amplifier is a narrow-band signal, there is generally a weak memory effect. Therefore, we use the Hammerstein model, which has linear memory, to describe the nonlinear distortion of the power amplifier. The Hammerstein model is shown in Figure 2. By introducing an intermediate function, the input–output relationship of the model can be written more concisely. The content shown in Figure 2 can be expressed by the following formula:
where
M is the order of the nonlinear model,
is the model coefficient, D is the order of the memory model, and the corresponding model coefficient is
.
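A Hammerstein structure of this kind can be sketched in a few lines, assuming the common form in which a static polynomial of order M is followed by a linear autoregressive (feedback) stage of order D. The names `b` (polynomial coefficients), `a` (memory coefficients), and the sign convention of the feedback term are assumptions of this sketch and need not match the notation of the formula in this section.

```python
import numpy as np

def hammerstein(x, b, a):
    """Hammerstein model: static polynomial followed by linear memory.

    Assumed form (notation is hypothetical):
        v(n) = sum_{i=1..M} b[i-1] * x(n)**i        # memoryless nonlinearity
        y(n) = v(n) + sum_{d=1..D} a[d-1] * y(n-d)  # linear memory (AR) stage
    """
    x = np.asarray(x, dtype=float)
    v = np.zeros_like(x)
    for i, b_i in enumerate(b, start=1):
        v += b_i * x ** i
    y = np.zeros_like(x)
    for n in range(len(x)):
        # Feedback from earlier outputs; samples before n = 0 are taken as zero.
        past = sum(a_d * y[n - d] for d, a_d in enumerate(a, start=1) if n - d >= 0)
        y[n] = v[n] + past
    return y
```

With an empty `a` the model reduces to the memoryless polynomial model, which shows how the memory stage extends the simpler description.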
2.2. PAM Modulation-Distortion Model
In digital PAM, the signal waveform can be expressed as:
where
is the amplitude of the signal,
is the shaped pulse, and
is the carrier frequency. In digital amplitude modulation, the distortion of the transmitter includes the following. (a) The deviation of the
amplitude. When there are multiple pulse amplitudes, different radiation sources produce nominally the same amplitude with subtle differences, owing to the tolerances of the digital-to-analog converter (DAC) and other devices. (b) The nonlinearity of the power amplifier. The radio-frequency signal is amplified by the PA, which produces nonlinear distortion. The amplitude offset produced by
is represented, and we used the linear memory Hammerstein model to describe the nonlinearity of the power amplifier:
where
is the equivalent baseband complex envelope signal.
There is a frequency multiplication component in
, and the actual signal, except for the center frequency passband, is filtered out by the filter; therefore,
. Formula (5) shows that
i must be odd for a term to be retained. Therefore, the equivalent low-pass form of the above formula can be written as:
Combined with the memory-term formula, the equivalent output of the power amplifier of the Hammerstein model is:
The above formula can be written as:
where
is only related to
and
, and
is only related to model parameters, not related to power amplifier input:
At the receiving end,
in Formula (8) is only related to the current input
of the power amplifier and output
of the power amplifier at earlier times, where
can be observed, and
can be obtained by demodulating and reconstructing the signal. Therefore,
is observable. Assuming an additive white Gaussian noise (AWGN) channel, Formula (8) can be written as:
where
is zero-mean Gaussian white noise. For observation time steps
N, the above formula is written in matrix form:
where
Formula (10) can be solved by the least-squares method; then,
Considering the impact of amplitude deviation,
, where
is a rectangular pulse,
is a constant, and
can be considered as a constant
. The final effect is that there is a certain deviation between the amplitude of the actual signal and the amplitude of the reconstructed signal.
We traverse A when A obtains optimal solution , , where represents the variance of solution feature . After searching and solving , Formula (11) is substituted to obtain an individual characteristic vector of the radiation source. The final feature vector is .
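The least-squares step of this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the regressor layout (past outputs followed by odd-order terms of the form x|x|^(2k)), the function names, and the noiseless synthetic test signal are all hypothetical, and the search over the amplitude deviation A is not included.

```python
import numpy as np

def simulate_pa(x, poly, ar):
    """Baseband Hammerstein PA used only to generate test data (assumed form)."""
    x = np.asarray(x, dtype=complex)
    v = np.zeros_like(x)
    for k, b in enumerate(poly):
        v += b * x * np.abs(x) ** (2 * k)   # odd orders 1, 3, 5, ...
    y = np.zeros_like(x)
    for n in range(len(x)):
        y[n] = v[n] + sum(a * y[n - d] for d, a in enumerate(ar, start=1) if n - d >= 0)
    return y

def estimate_hammerstein(x, y, n_poly, n_ar):
    """Least-squares estimate of the model parameters.

    x: reconstructed (distortion-free) PA input, y: observed PA output.
    Solves y = H @ theta, where each row of H holds the n_ar previous
    outputs followed by the n_poly odd-order terms of the current input.
    """
    x = np.asarray(x, dtype=complex)
    y = np.asarray(y, dtype=complex)
    rows = np.arange(n_ar, len(y))
    cols = [y[rows - d] for d in range(1, n_ar + 1)]
    cols += [x[rows] * np.abs(x[rows]) ** (2 * k) for k in range(n_poly)]
    H = np.column_stack(cols)
    theta, *_ = np.linalg.lstsq(H, y[rows], rcond=None)
    return theta[:n_ar], theta[n_ar:]
```

In this sketch the concatenation of the two returned arrays plays the role of the individual feature vector; with additive noise on `y`, the estimates would scatter around the true values rather than match them exactly.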
2.3. Quadrature Modulation Distortion Model
The flow chart of quadrature-phase modulation is shown in Figure 3. The fingerprint characteristics of a radiation source usually come from modules composed of analog devices, so the analog circuit modules after the DAC module shown in the figure may generate the radiation-source fingerprint. The error factors of the in-phase/quadrature (I/Q) modulator fall into the following four parts.
(a) Gain imbalance. This is mainly caused by the difference in the gain characteristics of the components on the two I/Q channels, represented as GI and GQ in Figure 4a; it causes the constellation point to move within the quadrilateral enclosed by 'abcd', as shown in Figure 4a.
(b) DC offset. This is mainly caused by the DC offset generated by the amplifiers on the I/Q paths and the carrier leakage generated by the mixer; as shown in Figure 4, the DC offset causes the origin of the phase coordinates to drift and introduces phase errors.
(c) Delay mismatch. This is mainly due to the difference in the delay characteristics of the analog components (amplifiers, mixers) on the I/Q channels.
(d) Quadrature error. When, owing to component errors, the two nominally orthogonal carriers generated by the local oscillator have a phase difference other than 90°, a quadrature error results.
These errors distort the demodulated constellation diagram, and the error characteristics of different individual transmitters are often different.
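The effect of these error factors on a constellation can be illustrated with a minimal baseband sketch. It covers gain imbalance, DC offset, and quadrature error only; delay mismatch is omitted because it requires a sampled waveform rather than symbol points. The function name, parameter names, and the baseband-equivalent form used here are assumptions for illustration and are not the notation of the formulas in this section.

```python
import numpy as np

def iq_impair(symbols, gain_i=1.0, gain_q=1.0, dc_i=0.0, dc_q=0.0, quad_err=0.0):
    """Apply I/Q modulator errors to complex baseband symbols.

    Assumed passband model:
        s(t) = I'(t) cos(wt) - Q'(t) sin(wt + quad_err)
    with I' = gain_i * I + dc_i and Q' = gain_q * Q + dc_q, whose
    baseband equivalent is I' + 1j * Q' * exp(1j * quad_err).
    quad_err is in radians; zero means perfectly orthogonal carriers.
    """
    s = np.asarray(symbols, dtype=complex)
    i_branch = gain_i * s.real + dc_i   # gain and DC offset on the I path
    q_branch = gain_q * s.imag + dc_q   # gain and DC offset on the Q path
    return i_branch + 1j * q_branch * np.exp(1j * quad_err)
```

Plotting the output for ideal QPSK points with small, different parameter sets per transmitter gives distorted constellations that differ from one transmitter to another.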
For the I/Q quadrature modulator, the gain imbalance is denoted by
, the DC offset of the I and Q channels is denoted by
and
, respectively. The delay mismatch is
, and the quadrature error is denoted as
. Then, I channel signal
and Q channel signal
can be expressed as:
where
is the modulation coefficient,
is the symbol sequence generated by the symbol encoder, and
is the instantaneous pulse. The signal obtained by quadrature modulation is:
Substituting Formulas (13) and (14) into Formula (15) produces:
where
is the residual frequency offset,
; let
be the DC offset, and
Combined with the Hammerstein model of the power amplifier, the effect of bias is not considered,
, the equivalent discrete form of the output signal of the power amplifier is:
where
D and
M are the AR order and polynomial order in the Hammerstein model,
;
are the AR coefficient and polynomial coefficient, respectively, and,
and
are defined as follows:
Formula (18) can also be written as:
where
For observation period
N, the above formula can be written in a concise matrix form:
where
Considering the impact of DC bias, . Bias B is a real number, and the final effect is that a certain deviation exists between the amplitude of the signal that did not pass through the power amplifier, and the amplitude of the reconstructed signal. We traverse B when B obtains optimal solution , where represents the variance of solution feature . After searching and solving , Formula (20) is substituted to obtain an individual characteristic vector of the radiation source. The final feature vector is .
2.4. Integral Bispectrum
High-order spectral analysis is widely used in signal processing. In theory, a high-order spectrum completely suppresses any Gaussian noise and any symmetrically distributed non-Gaussian noise, retains the amplitude and phase information of the signal, and is independent of time. Therefore, high-order spectral analysis is currently a mainstream feature-extraction method [11,26,27]. The third-order spectrum, also known as the bispectrum, is the simplest high-order spectrum. Although it is relatively simple to process, it can describe the nonlinear characteristics of the signal. For the digital zero-IF I/Q signal
of the communication radiation source, its bispectrum can be defined as:
where
is the third-order cumulant of
.
Compared with the power spectrum, the bispectrum additionally provides phase information, which is why it is widely used. Integrating the bispectrum is an effective way to reduce the dimensionality of the two-dimensional bispectral features. According to the integration path, the result is the RIB, AIB, SIB, or CIB. The integration path of each integral bispectrum is illustrated in Figure 5b.
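A minimal sketch of a bispectrum estimate and of one integration path is given below. It uses the direct (FFT-based) estimator averaged over non-overlapping segments rather than the cumulant-based definition, and it shows only the axial path, taken here as a sum along one frequency axis. The function names, the segment length `nfft`, and the conjugation convention for complex signals are assumptions of this sketch.

```python
import numpy as np

def bispectrum(x, nfft=64):
    """Direct bispectrum estimate B(k1, k2) = E[X(k1) X(k2) conj(X(k1 + k2))].

    The signal is split into non-overlapping segments of length nfft, each
    segment has its mean removed, and the triple product is averaged.
    Frequency indices wrap modulo nfft.
    """
    x = np.asarray(x, dtype=complex)
    n_seg = len(x) // nfft
    k = np.arange(nfft)
    idx_sum = (k[:, None] + k[None, :]) % nfft   # index of k1 + k2
    B = np.zeros((nfft, nfft), dtype=complex)
    for s in range(n_seg):
        seg = x[s * nfft:(s + 1) * nfft]
        X = np.fft.fft(seg - seg.mean())
        B += X[:, None] * X[None, :] * np.conj(X[idx_sum])
    return B / max(n_seg, 1)

def axial_integrated_bispectrum(B):
    """Reduce the 2D bispectrum to 1D by summing along the second axis."""
    return B.sum(axis=1)
```

The other integration paths differ only in which bins of `B` are grouped together before summation, so the same two-dimensional estimate can be reused for them.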
4. Discussion
By demodulating the digitally modulated signal, the initial information sequence was obtained; then, by reconstructing the message signal, a signal free of modulation distortion and power-amplifier nonlinearity was obtained. By comparing the reconstructed signal with the actual received RF signal through the feature-extraction algorithm, a parametric representation of the difference between the two could be obtained. The parameters that characterize this difference form the feature vector that we need, which represents the difference between individual radiation sources. The proposed method addresses the cause of the differences between radiation-source signals rather than their outward appearance. Through the extraction and identification of distortion parameters, the effective identification of individual radiation sources is achieved.
The traditional SEI approach is to train the classifier on training-set signals and then test it on signals with the same modulation parameters as the training set. When the test- and training-set signals come from the same transmitter but have different modulation parameters, some existing methods are no longer robust. The proposed method extracts the distortion parameters of the inherent hardware of the transmitter, and these distortion parameters do not change with the carrier frequency or the rate of the transmitted signal. Therefore, it can identify transmitters well under variable modulation parameters.
The features extracted by the proposed method represent the characteristics of the transmitter hardware. They do not change with the modulation parameters (such as modulation mode, modulation rate, and carrier frequency) of the signal sent by the transmitter. The intelligent identification of a specific transmitter can then be realized.