1. Introduction
The observation of spectral lines can be used to accurately study the physical properties of molecular clouds in the formation area of massive stars, e.g., the temperature, density, radiation field, and astrochemical processes [
1,
2,
3,
4,
5]. Therefore, the observation of spectral lines is critical in radio astronomy. During radio-astronomical observations, digital backends are generally required to have a wide bandwidth and high-frequency resolution [
6,
7,
8,
9]. To trace smaller-scale structures and more detailed dynamic characteristics, spectral line observations with high-frequency resolutions are conducted to extract information from the target celestial bodies. A frequency resolution of several hertz or 1 Hz is required in numerous high-precision spectral line observations. For example, during the Zeeman effect observation, if a neutral HI is observed, the frequency shift generated by each micro-Gauss magnetic field is 2.8 Hz. This requires a digital backend with a frequency resolution that should be higher than 2.8 Hz [
10]. In addition, when observing the spectral lines, especially in the high-frequency band, the broadband receivers can simultaneously receive more information to improve the observation efficiency.
The requirements of the wideband and high-frequency resolution make it difficult to obtain the digital backends for radio-astronomical observations. At present, most signal processing techniques in the digital backends, such as spectral analysis, are implemented with a field programmable gate array (FPGA). Alternatively, according to the task requirements, signal pre-processing is first achieved in the FPGA, and further processing is performed by a computer with graphics processing units [
11,
12,
13]. The maximum number of points of the available fast Fourier transform (FFT) intellectual property (IP) cores in the FPGAs used for the spectral analysis is 65536. If a resolution of 1 Hz is required for a signal with a 1 MHz bandwidth, a 106-point plain FFT is required, which necessitates a specialized design with a significant workload. The operating frequency of an actual FPGA is generally several hundred MHz. This indicates that a signal with an ultra-wide bandwidth greater than several hundred MHz cannot be directly processed by a simple FFT implemented on FPGA. Therefore, the direct application of a large FFT IP core in an FPGA to the ultra-large-scale FFT processing of wideband signals cannot be achieved to obtain an ultra-high-frequency resolution. Two methods are mainly used in radio astronomy to achieve this objective. A potential method involves the use of two-stage FFTs [
14]. A commonly used method involves the combination of a polyphase filter bank (PFB) (sometimes called a WOLA filter bank) and an FFT or cascaded PFB structure [
15,
16,
17,
18,
19,
20]. For example, the combination regime is used as a spectrum analyser for the search for extra-terrestrial intelligence (SETI). It has a frequency resolution of 91 kHz when the processing bandwidth is 1.5 GHz, and a maximum frequency of 1 Hz when the processing bandwidth is 150 MHz [
7,
19,
21]. In comparison with the two-stage FFT method, the aforementioned combined structure improves the frequency response characteristics for the first-stage channelised processing and the computational efficiency. This paper proposes a modified combination of the PFB + FFT to obtain improved flexibility with a more simplified design, and to conserve hardware resources in the FPGA. In addition, this paper proposes a IWOLA + WOLA algorithm to demonstrate an improved performance.
In this paper, a spectral analysis algorithm with an ultra-high-frequency resolution is first proposed for broadband real-valued signals. This algorithm uses an improved WOLA (IWOLA) [
22] channelised filter bank to channelise the broadband signals preliminarily, and decimates the output data from the IWOLA filter bank appropriately. The IWOLA algorithm applies an equivalent Hilbert transform to the normal WOLA filter bank by shifting the centre frequency of every sub-band by half of the frequency bin, such that the IWOLA filter bank provides K/2 independently output complex sub-bands for real-valued signals instead of the general K/2 + 1 sub-bands, thus reducing the subsequent processing units by one set. After the decimation, FFT processing for the decimated data is carried out to perform spectral analysis with an ultra-high-frequency resolution. Furthermore, by selecting only a portion of the output data from the FFT, the interference between the adjacent channels can be eliminated within the entire Nyquist frequency bandwidth. To obtain a higher channelised performance, an optimized algorithm is finally proposed, which uses the IWOLA filter bank and the second-stage WOLA filter bank, instead of the second-stage FFT.
The remainder of this paper is organised as follows. In
Section 2, the underlying principle and architecture of the proposed IWOLA + FFT algorithm are presented. In
Section 3, the characteristics of the IWOLA filter bank and IWOLA + FFT algorithms are detailed, including the channelisation characteristics, spectra performance, and elimination of the overlapping interference between the adjacent channels. To improve performance of the IWOLA + FFT algorithm, the IWOLA + WOLA algorithm is proposed, and its performance analysis results are presented in
Section 4. The simulation of the proposed algorithms and analysis results are presented in
Section 5, followed by the conclusions in
Section 6.
2. Basic Algorithm Principle
The WOLA algorithm was originally used in the short-time Fourier transform (STFT) [
23], and the WOLA filter bank has been widely used in many fields such as sub-band coding, audio processing, and radio astronomy [
24,
25,
26,
27]. The output of the WOLA filter bank involves real- and complex-valued signals, and the number of the independent output sub-channels is
K/2 + 1 (where
K is the number of sub-bands); thus, the processing complexity and consumption of hardware resources is increased. To simplify the hardware implementation of the algorithm, the IWOLA algorithm [
22] was proposed, which shifts the centre frequencies of every sub-band of the WOLA filter bank by π/
K along the positive frequency axis. In particular, the expression of the IWOLA filter bank is as follows:
where
where
l is an arbitrary integer that ranges from 0–
D-1;
k denotes the channel number;
D denotes the decimation factor;
; and
w is a window function with coefficients that can be obtained by the design command fir1 in MATLAB for a finite impulse response (FIR) low-pass filter. The cut-off frequency of the FIR low-pass filter is
fs/2/
K.
This process of shifting frequency is equivalent to a Hilbert transform applied to the WOLA filter bank, thus causing all independent output sub-channels to exhibit identical features. The identities of the sub-bands of the IWOLA filter bank provide simplification for the hardware implementation of the processing. Furthermore, for a real-valued input signal, the output data from the K/2 + 1 sub-channels of the WOLA filter bank should be stored and processed further, whereas the output data from the K/2 sub-channels of the IWOLA filter bank should be stored and processed further. This reduces the amount of data processing and conserves off-chip memory resources, especially for a large second-stage FFT.
The overall framework of the spectral analysis algorithm based on the combined IWOLA + FFT structure is shown in
Figure 1.
The output data of Sub-bands 0–
K/2–1 of the IWOLA filter bank are transposed and obtained in-turn as outputs. If
K =
D, a maximally decimated filter bank (MDFB) [
28] is produced, which has a relatively simple design and consumes less hardware resources. However, it leads to an aliasing effect (overlapping interference), as shown in
Figure 2a. To avoid the aliasing effect,
D ≤
K/2 is required, which results in a non-maximally decimated filter bank (NMDFB), as shown in
Figure 2b. Generally, for a moderately large value of
K,
D is set as
K/2. However, when
K is significantly large, e.g.,
K = 2
20,
D = 2
19. This leads to a low-pass filter prototype
w with an order of magnitude of 2
39, which consumes a significantly large amount of hardware resources. In this case, assuming that
D <
K/2 (e.g.,
K = 2
20 and
D = 2
18 or smaller), the maximum order number of the low-pass filter prototype is 2
38, which conserves half of the hardware resources, especially the multipliers and adders, at the expense of a slight performance reduction (slight decrease in the stopband attenuation). Under the condition of
D <
K/2, the data rate of each sub-band of the IWOLA filter bank is larger than the bandwidth value of each sub-band. Thus, if an FFT is implemented directly after the IWOLA filter bank, a large amount of computing resources are wasted. Therefore, before the FFT process, the data of each sub-band of the IWOLA filter bank should be decimated by a factor
D1, as shown in
Figure 1. The decimation factor
D1 can be expressed as follows:
To make the decimating factor D1 a positive integer, D < K/2 and K is divisible by D. After the decimation by D1, the data rate of each sub-band decreases to a value twice its bandwidth, which is critical for solving the problem of interference from the neighbouring sub-channels. This leads to the maximal conservation of the computing resources, given that two moderate FFTs are used instead of an ultra-large FFT. The overlapping interferences between the adjacent sub-bands can be eliminated by appropriately selecting part of the output data from the N1-point FFT.
After the application of
N1-point FFT to the decimated data of Sub-bands 0–
K/2, the resulting frequency resolution
df for each sub-band can be expressed as follows:
To eliminate the interference between the adjacent sub-bands, a part of the FFT bins of each sub-band is selected for the bin combination. The method of bin selection and the combination is shown in
Table 1. The number of selected FFT bins
N2 for Sub-bands 0–
K/2–1 can be expressed as follows:
It is necessary for N2 to be an integer, which can be readily obtained. Sub-bands 0–K/2–1 are all complex signals, which represent the entire frequency range of real-valued signals.
As shown in
Table 1, the bins from 0–
N2/2–1 and
N1–
N2/2 to
N1–1 are selected separately from the
N1 FFT bins for each of the sub-bands from 0–
K/2–1. The selected bins
N1-
N2/2 to
N1-1 from each of the sub-bands from 0–
K/2–1, respectively, are reversed and re-ordered. The selected bins in Sub-bands 0–
K/2–1 are combined into one group along the order of the channel numbering. This represents the frequency-domain data for the real-valued signals within
fs/2, and the signal power can then be calculated.
Based on the analysis presented above, the algorithm splits an ultra-large-scale FFT into a two-stage process, in which two smaller-scale FFTs are required. The N-point FFT requires a magnitude order of O(Nlog2N) operations. Therefore, if spectral processing with an ultra-high-frequency resolution (N is extremely large) is implemented in the FPGA, the algorithm reduces the operations to the magnitude order of O(Klog2K) + O(N1log2N1) (K and N1 << N). This conserves hardware resources and compensates for the lack of a commercially available enormous point FFT IP core.
As mentioned before, the WOLA filter bank is equivalent to the PFB. They are both efficient implementation of the channelization processing. Therefore, their computational and hardware complexity is equivalent. In comparison with the expressions of the IWOLA filter bank and the WOLA filter bank [
22], the expression of the IWOLA filter bank involves two terms more: (−1)
l and
e-jπp/K. It is obvious that the first term does not increase the computational and hardware complexity; the second term only necessitates one complex multiplication operation and two block RAMs (BRAMs) in the FPGA. The capacity of the two extral BRAMs is
K × 16 bit, given that each data of the second term is expressed with 16 bits in the FPGA. The function of the data decimation is to select or discard data, which does not increase the computational and hardware complexity.
5. Algorithm Simulation Analysis
To verify the accuracy and performance of the proposed broadband spectral analysis algorithms with a high-frequency resolution, the following simulation analyses were carried out in MATLAB (the version: R2017a). The following parameters were set according to Equations (4)–(6): K = 1024, D = 8, N1 = 2048, K1 = 2048, D2 = 8; and the sampling frequency of the real-valued signal fs = 800 MHz, D1 = 64, df = 763 Hz, and N2 = 1024. To analyse the algorithm performances, three types of methods were used to calculate the spectrum for the simulated signals. The first involved the direct performance of an ultra-large FFT (direct FFT) to obtain a spectral resolution of 763 Hz. The remaining two methods involved the use of the IWOLA + FFT and IWOLA + WOLA algorithms to achieve the same spectral resolution.
First, a real sinusoidal signal with a frequency of 40.00 MHz was inputted to the three algorithms, to evaluate their gains and SNRs, as shown in
Figure 17. As can be observed from the figure, the IWOLA + FFT structure exhibited a gain increase of approximately 3 dB when compared with the direct FFT, and a nearly identical SNR. The IWOLA + WOLA structure exhibited a gain decrease of approximately 5 dB when compared with the IWOLA + FFT structure. However, this may not be significant, given that the SNR of the IWOLA + WOLA structure was 6 dB higher than that of the IWOLA + FFT structure. Thus, the gain of the IWOLA + WOLA can be any value, which can then be multiplied by a constant to meet the design requirements only if its SNR is equal or superior to that of the IWOLA + FFT algorithm. Furthermore, to further verify the accuracy of the proposed algorithms, we simultaneously inputted four signals into the three algorithms. The input signal consisted of two real sinusoidal signals with frequencies of 40.00 MHz and 40.000763 MHz, respectively, and two chirp signals, in addition to Gaussian white noise. The starting frequencies of the two chirp signals were 80 MHz and 160 MHz, respectively. Moreover, their bandwidths were 0.78125 MHz, which is equivalent to the subchannel bandwidth of the IWOLA filter bank, and 80 MHz, respectively. The spectral data obtained by the three methods are shown in
Figure 18, where (a) presents a spectrum obtained by the direct FFT algorithm, (b) presents a spectrum obtained by the IWOLA + FFT algorithm, and (c) presents a spectrum obtained by the IWOLA + WOLA algorithm. There were two real sinusoidal signals at A, and one chirp signal at B, which are detailed further in this manuscript. As can be observed from the figure, the proposed algorithms demonstrated flat gain characteristics. The bandwidth of the second chirp signal occupies multiple sub-bands, but its gain between sub-bands are flat and identical.
There were two real sinusoidal signals at A, with frequencies of 40 MHz and 40.000763 MHz. The magnified views for Point A in
Figure 18 are shown in
Figure 19. When comparing
Figure 19a–c, the resultant spectra obtained by the direct FFT and IWOLA + FFT algorithms for the signal at Point A were the same, whereas the IWOLA + WOLA algorithm demonstrated a superior spectrum. The two proposed algorithms demonstrated the same frequency resolution as the direct FFT algorithm, namely, 763 Hz. The magnified views for Point B in
Figure 18 are shown in
Figure 20. The chirp signal frequency started at 80 MHz with a bandwidth of 0.78125 MHz, which was equal to each sub-band width of the IWOLA filter bank.
In addition, a coherent dispersed pulse was generated in MATLAB as a simulated pulsar signal, with a dispersion measure (DM) of 26.776 pc cm
−3. The frequency ranged from 1200–1600 MHz, and its sampling frequency and observation time were 800 MHz and 35 ms, respectively. The simulated pulsar signal with a white Gaussian noise (SNR = 0 dB) is shown in
Figure 21. By setting the decimation factor
D = 8, the number of channels was set as
K = 32, and the bin number
K1 and the decimation factor
D2 of the second-stage WOLA filter bank were 2048 and 8, respectively. The simulated pulsar signal was inputted to the proposed IWOLA + WOLA algorithm. This yielded the spectrum shown in
Figure 22 with a frequency resolution of approximately 24.4 kHz. As can be seen from the figure that Before dedispersion, there was dispersion smearing at the observed frequencies within an observation duration of 35 ms. The pulsar was dispersed within a range of 1200–1600 MHz. After a direct dedispersion algorithm, the dispersed signal was aligned to the arrival time, which corresponded to 1200 MHz. The dedispersed signal is displayed in
Figure 23. It was subjected to a succeeding overlap-adding processing, which yielded the dedispersed time series of the simulated pulsar, as shown in
Figure 24. As can be observed from the figure, the SNR of the pulsar signal output from the IWOLA + WOLA algorithm was 24.07dB.
The simulation results revealed that the proposed IWOLA + WOLA algorithm demonstrated a superior accuracy and performance to the direct FFT and the IWOLA + FFT algorithms. The proposed IWOLA + FFT and IWOLA + WOLA algorithms solved the overlapping interference problem of the adjacent subchannels caused by the first-stage channelisation that is realised by an IWOLA filter bank. By selecting the appropriate K, D, D2, and K1 values, in addition to the operating frequency for hardware such as FPGAs, a broadband spectral analysis system can be designed with a satisfactory performance that consumes less resources.