Most of the existing GNSS-R receiver architectures just measure the correlation peak or the cut in the delay axis of the DDMs once the Doppler frequency has been compensated, which is called a ‘waveform’. This section describes the architecture of an FPGA-based real-time GPS reflectometer that computes the full two-dimensional DDM every 1 ms and performs the coherent and incoherent averaging. The core of the system is a full-custom designed DDM generator that interacts with other in-house and off-the-shelf VHDL cores.
The main dataflow path of the system is that of the reflected GPS signal (
Fig. 1). The PAU-RAD [
11] section of the instruments sends the digitized and quantized I/Q raw data corresponding to the particular beams composed (1 or 4) to a buffer located in the reflectometer FPGA. This buffer plays a key role in the system, since it allocates the two I/Q 5745-sample strips for the up to 4 selected beams, to allow them to be processed at a higher clock rate using hardware reuse techniques. The buffer is based in two RAM-like registers that change their respective inputs and output connections. One is being written while the other is being read to generate the DDM. When the input register is full, it is swapped with the other one. From here the data is sent in blocks of 1 ms (5745 samples at the sampling frequency of the system corresponding to the period of the C/A code) to the DDM generator. Then, once the new DDM is ready, it is sent out of the system to a terminal computer, where it will be processed and stored. The transfer of up to four complex-valued DDMs every 1 ms is performed using the USB 2.0 protocol. The resulting throughput is 2 (real and imaginary parts) times 32 bits per point times 4 beams times 16 x 16 points = 8 Kbytes every 1 ms, that is 7.8 MB/s. In fact, in addition to the DDM data, the time tag of the raw data and the delay and Doppler center coordinates used for its computation are transmitted as well, since not all of the 32 bytes are used to code the DDM point values. Whenever the first of the 4 DDMs has been successfully received by the host computer, an interrupt is issued in the reflectometer system by the USB FPGA controller and the parameters for the generation of the next DDM are transferred to the DDM generator, which starts the computation using the data of the corresponding beam. After 5745 clock cycles (1 ms of data) the DDM generator activates the ‘data_ready’ signal. The microprocessor program asks the USB controller to read the DDM values from the output interface of the DDM generator, and to send these values to the host PC, closing the cycle of processing and data transfer. The integration time can be freely configured, since every 1 ms a new DDM is obtained. It is up to the receiving program to accumulate as many of them as configured, either coherently (in amplitude and phase) or incoherently (in absolute values).
The direct signal from the satellite is also processed to obtain an accurate estimate of the signal delay. To do so an upwards-looking GPS antenna receives the signal and feeds it into an A/D converter. A second buffer stores these data until they are sent to a ‘delay offset’ block that estimates the delay offset.
A third dataflow path is that of the data packets generated by a commercial GPS receiver. They enter the FPGA using an RS-232 interface using the GPS-standard TSIP protocol, and provide the system with parameters of interest such as the elevation, azimuth, power level, and Doppler of the visible satellites, simplifying to a large extent the FPGA design.
2.1. Hardware Setup
The reflectometer system has been implemented using the AVNet VIRTEX-4 LX 60 development board that includes the FPGA core, SDRAM RAM (32 MB), Flash ROM (8 MB), an UART serial interface, an USB 2.0 serial interface, an Ethernet connection, an OLED display and several other connectors. Inside the FPGA a microprocessor, data bus and RAM system is synthetized. Several peripherals are attached to the bus [
10]. The most relevant is the DDM generator, which performs the Doppler compensation and the integration (
Eqn. 3). The core of this generator is shown in
Fig. 2. It is a synchronous block that receives the I/Q samples in a serial fashion, as well as the Doppler values and C/A code samples for all the DDM points. The outcome are the four
partial products (
Eqn. 4) that are conveniently added or subtracted (depending on the sign of the Doppler shift) and accumulated during 5745 clock cycles (1 ms of data) to obtain the real and imaginary parts of each DDM coordinate. In
Fig. 2 it is seen that the I, Q and C/A code inputs are delayed by means of shift registers so that they are synchronized with the demodulating tones generated in the block
signs_and_amplitudes (
Fig. 3a). The values fed to this component determine the Doppler frequencies that are generated (they are the multiplying factors of the base frequency of the block), and thus the sign (1 bit) and amplitude (7 bits) of each frequency are obtained. Then, the block
signs_combinations generates, for each DDM point, the sign of the four
partial products that compose it (
CA.I.SIN, CA.I.COS, CA.Q.SIN and
CA.Q.COS). The core of this component is the
oscillator block (
Fig. 3b), that every clock cycle computes one sine and one cosine sample, at a frequency set by the input value
α. To do so every clock cycle the value of the variable
full_phase is increased in
α units (
Fig. 3b). The 9 most significant bits (
trunc_phase) out of 18 are then fed into the
oscillator_sinc block to generate the
sin and
cos outputs. This allows having a smaller frequency step without having to use a large number of bits to quantify the phase that is converted into sine/cosine amplitude. The
oscillator_sinc component is located at the bottom of the hierarchy level. To efficiently generate the sine and cosine values, it has been taken into account that the output signal shall be quantized with a finite number of levels. The first Most Significant Bit (MSB) tells whether the sine value is positive or negative, whereas the second MSB has information on the slope sign (
Fig. 4). The remaining 7 bits indicate the position within the first wave quarter. Instead of dividing the amplitude in equal-length intervals, the phase domain has been divided in such a way, so that the 7-bit phase of the wave quarter have a linear correspondence with the 7 bits that codify the amplitude. Therefore, if no further correction were made a triangular wave would be obtained. A correction for each level (from 0 to 127) that minimizes the error when comparing it with an ideal quantized sine has been computed, and it is then added to it. At the same time, the cosine is obtained by negating the 7 LSB's of the truncated phase, thus obtaining the complementary level:
The cosine sign (the MSB of
cos) is computed as the exclusive OR of the sine's two MSB:
The sine and cosine waveforms are generated with 8-bit, since the preliminary simulations indicated that using fewer bits resulted in a severe DDM deformation.
The simultaneous generation of several different-delay C/A codes is achieved with the block
CA_generator (
Figs. 2 and
5). As explained in [
5], these codes are the product of two Maximum Length Sequences (MLS)
G1 and
G2 obtained using their respective linear feedback shift registers (LFSR). The satellite unique C/A code is determined by the delay of
G2 with respect to the non-delayed
G1. Therefore, to obtain a C/A code delayed n chips, it is necessary to generate a
G1 sequence delayed n chips and also a
G2 sequence delayed n chips plus the offset that determines the satellite ID. To obtain these delayed sequences the 10-bit
M1 and
M2 masks determine the values of the
G1 and
G2 registers that are used to obtain the delayed outputs. These masks are pre-calculated using state-transition matrices, and are stored for all the possible 1022 chip-shift values [
12]:
where
Mi(
n) is the mask corresponding to the i
th sequence (
i =1, 2) with a delay of
n chips,
p⃗i is a row vector containing the coefficients of the polynomial representing the
Gi LFSR (0010000001 for G
1 and 0110010111 for G
2), ID is the square 9-element identity matrix, 0⃗ is a 9-element zero's column vector and
M0 is the mask for a delay offset of 0 chips ( 0000000001). Thus, the
M1 and
M2 masks contain not only the C/A code delay offset value, but also the satellite ID to be generated. On the other hand, to allow for a code resolution smaller than one chip the block
resampler was conceived (
Fig. 5). Since the system's sampling frequency is 5.745 MHz and the period of the C/A code is 1 ms for 1023 chips, the relation ‘samples per chip’ is 5745/1023, which is equivalent to the irreducible fractional form of 1915/341. Therefore, a modulo-1915 counter increases in 341 units every clock cycle. When the counter “rolls-up” a pulse to drive the
G1 and
G2 LFSRs is issued (
pulse_CA signal in
Fig. 5). The initial value of this counter is related to the non-integer desired code delay. For instance, a non-integer chip value of 0.4 chips would correspond to an initial value of 0.4 x 1915 = 766 units.
Another capital core of the designed embedded system is the buffer for the reflected signal. Both the direct signal buffer and the reflected signal buffer share the same design (
Fig. 1). It receives the multiplexed GPS data from PAU-RAD and demultiplexes it so that a DDM for a single beam can be processed at a time. This peripheral is composed of two storage units that alternatively switch their input and output ports. Whenever one is receiving and storing the I/Q data the other is being unloaded (one of the four beams at a time) to compute a new DDM. The clock frequency to receive the data bits is four times the sampling frequency of the data to allow the simultaneous storage of the data of four different beams. On the other hand, the unloading of the raw data and the generation of the DDM are performed at the clock frequency of the reflectometer (100 MHz), which is significantly higher than that of the incoming data (23 MHz). Thus again, hardware reuse is possible and up to four satellites can be processed in 1 ms. The switching between the storage units takes place whenever the one being written reaches its capacity limit (i.e, 5745 samples for each of the four beams), and an interrupt is issued by the buffer. This interrupt has top priority, since no data loss is allowed, and it triggers the generation of the DDMs associated to the data of the 1 ms under consideration. To achieve this, it is necessary to translate the delay and Doppler values of the signals received from different satellites to a set of parameters (the masks M1 and M2 of the LFSR's that determine the satellite ID and the C/A code offset, and the α values that determine the step of the frequency synthesizers that compensate the Doppler shifts), and send them to the DDM generator before the computation of a DDM.
In order to have the DDM maximum in the center of the window of delay and Doppler values it is necessary to have a good estimate of the delay and Doppler frequency of the received signal. At low and moderate altitudes the Doppler maximum value for both direct and reflected signals is roughly the same, and the difference lies only in the magnitude of the Doppler spread. Taking this into account, and also considering that the temporal derivative of these Doppler shifts (∼ 1 Hz/s [
5]) is much smaller than the 1 s update rate of the parameters provided by the GPS receiver, its value can be used straightforwardly as the Doppler center value. Unfortunately the situation is quite the opposite when it comes to the delay value: it is necessary to estimate the delay difference between the direct and reflected signals from the transmitter-receiver geometry. For example, in the simplest case of a low altitude receiver at constant height
h the excess delay is:
being
θ the zenith angle and
c the light speed. This value must be added to the direct signal delay to obtain the reflected signal delay. However, the temporal derivative of the delay cannot be neglected over the 1 s update interval. Therefore, should the delay value obtained through the serial interface be used, the maximum of the DDM will move along the delay domain at a speed depending on the satellite's position until eventually getting out of view. This drift is very inconvenient, since integration is needed to improve the low SNR of the reflected signal. If the peak moves, the integration will only further degrade the waveform. Therefore it is of capital importance to have readily available an estimate for the delay updated the more frequently the better. To do so a whole new block is required to perform the circular correlation by means of Fast Fourier Transform operations to find the maximum of the correlation of the direct signal with clean CA code replicas for four different satellites simultaneously. The equation applied is [
5]:
where
x is the baseband complex signal
I +
j·
Q. Before computing such a correlation it is necessary to compensate the respective Doppler offsets by using the data provided by the uplooking GPS receiver and to generate for each of the four satellites a local replica with zero delay offset of their C/A codes. This block has been implemented with standard Fast Fourier Transform (FFT) and complex multiplier cores, ensuring its performance and preventing an excessive use of the FPGA resources (
Fig. 6). Since the number of samples (5745 samples for 1 ms of sampled data) is not a power of 2, as needed to perform an FFT, it is necessary to fill the data values until reaching 8192 = 2ˆ13 samples. This results in two correlation maxima instead of only one. What is more, the amplitude of these maxima is position dependent. Fortunately these peaks are always spaced by the same amount of samples (8192-5745 = 2447 samples, as shown in
Fig. 7a). Therefore, if the resulting value lays above a certain threshold, it would be necessary to subtracts from it those 2447 samples to obtain the actual offset value. If the value lays beneath the threshold, then it is already the value being searched (
Fig. 7b).
2.2. Software Setup
The coordinated operation of all the peripherals that compose the reflectometer is ensured by the program running on the MicroBlaze soft processor. This code is written in ANSI C, and controls the GPS incoming data through the UART, the selection of the satellites, the generation of the DDM parameters and the signaling of the beginning of a new DDM computation. This program also transfers the DDM generation parameters to the DDM generator (the masks
M1 and
M2 of the LFSR's that determine the satellite ID and the C/A code offset, and the α values that determine the step of the frequency synthesizers that compensate the Doppler shifts). To do so these inputs are mapped to a RAM-like register at the DDM generator side. Then the read/write operations are performed by simply addressing to the DDM generator's address space. Another task of the code is the decoding of the data packets received from the commercial GPS receiver through the RS-232 link. These packets contain information on the available satellites and their respective delay, Doppler, power, elevation and azimuth values. From these values both the parameters to be fed to the DDM generator and the most suitable array beam for a given satellite are determined. Since it is another PAU system (PAU-RAD [
11]) that is in charge for the beamforming, the beam selection is transmitted to it through a dedicated communication channel. To do so the elevation-azimuth space is divided according to a grid and a beam number is assigned to each region.
The program defines several constants such as the DDM resolution (step in both the delay and Doppler domains), the sampling frequency of the incoming data and a set of conversion factors (i.e., to convert from chips to samples). At the same time, the system variables and structures are also defined. They include the packet structures used to interact with the commercial GPS receiver, the masks of the LFSRs that generate the CA code with a certain delay, or a vector of the selected satellites to work with. The system functions can be grouped in three sets:
The first one includes those that send and receive data from the GPS receiver,
A second set selects which of the available satellites are to be used, taking into account their power and position, and finally
Another set of critical functions computes the DDM parameters and transfers them to the DDM generator.
The system has two sources of interrupts (
Fig. 8). The buffer for the reflected signal (
Fig. 1) issues an interrupt every 1 ms (i.e., whenever new data is available). This Interrupt Service Routine (ISR) plays an important role in the overall operation of the reflectometer, since the buffer interrupt is the synchronism reference of the system. It comprises the following tasks:
First, the UART interrupts are disabled,
Then, for each of the four simultaneous beams/satellites, the buffer is notified which beam has to dump next,
The corresponding DDM parameters are set into the DDM generator,
The generation of the DDM is triggered,
The system waits until the DDM_ready signal goes high,
Then the DDM data sending through the USB controller is enabled,
The ISR waits for the USB controller to assert that the sending process is finished, and finally
Before returning to the main function the UART interrupts are again enabled.
The RS-232 serial port controller issues an interrupt every time new data arrives from the GPS receiver:
First of all, its associated ISR disables the UART interrupts,
Then it reads from the UART buffer all the available data bytes and parses them to decode the TSIP report packets,
Also, depending on the value of a counter, new Elevation & Azimuth packets are requested, and finally
Before returning to the main program the interrupts for the UART are enabled.
The main program initializes the system and waits for interrupts:
First, the UART interrupts are enabled in the RS-232 controller,
Then a configuration command TSIP packet is sent to the GPS receiver, so that it outputs the raw data packet every 1 second,
The satellite table is initialized and its associated DDM parameters are retrieved,
The interrupts in the microprocessor and in the interrupt controller are enabled,
The code waits for interrupts and, after an UART ISR has been executed, gets the new DDM parameters for the four simultaneous beams.