1. Introduction
With the exploration of marine resources and the growth of human underwater activities, the demand for efficient underwater communication has increased significantly. At present, the most widely applied means of underwater wireless communication is acoustic technology, which still suffers from a low data rate, high signal attenuation, high latency and severe multipath effects. Radio frequency (RF) communication, which is common in daily life, is poorly suited to underwater use because RF waves are strongly attenuated in seawater. Another option is optical fiber, which supports long-range, high-speed underwater communication. However, it is not flexible enough for increasingly diverse underwater applications because it requires a physical cable connection between transmitters and receivers. Therefore, underwater visible light communication (UVLC) has attracted wide attention and is considered an alternative or complement to the underwater communication technologies mentioned above [1,2,3].
Visible light communication (VLC) is an emerging wireless communication technology with several advantages, such as large transmission capacity, high security, low cost, immunity to electromagnetic interference and license-free operation [4,5,6]. Because seawater has a relatively low attenuation window in the blue-green portion of the electromagnetic spectrum, VLC has great potential to provide a high data rate and low latency within a short range [1,2]. Two kinds of VLC transmitters are commonly used: those based on laser diodes (LDs) and those based on light-emitting diodes (LEDs). LDs enable longer-range transmission in point-to-point scenarios due to their high power density but require strict alignment between transmitters and receivers. Compared with LDs, LEDs have a larger divergence angle, which allows for more short-range applications, including point-to-point and point-to-multipoint scenarios [7]. In this paper, we focus on LED-based UVLC systems, which have become a promising candidate for underwater communication.
Despite the significant advantages of LED-based UVLC systems mentioned above, the bottlenecks they face cannot be ignored. The limited bandwidth of LEDs, various noise sources, and linear and nonlinear distortions largely restrict the communication performance of UVLC systems [6]. The modulation bandwidth of commercially available LEDs is usually several megahertz (MHz) [8], which has become one of the main factors limiting the development of high-speed VLC systems. Due to the bandwidth limitation of LEDs and other optoelectronic devices, the received signals suffer from serious linear distortion, which interferes with correct demodulation and decoding. When the bandwidth of the transmitted signal exceeds the effective bandwidth of the VLC system, the bandwidth limitation behaves as a strong low-pass filter that compresses the signal spectrum, resulting in serious inter-symbol interference (ISI). Furthermore, nonlinear effects, which result from the nonlinear optoelectronic characteristics of the devices, distortions caused by amplifiers and so on, severely degrade communication performance. When the transmitting power is very high and the channel is complex, nonlinearity becomes the main challenge restricting system performance, especially for underwater applications, which usually require considerable transmitting power.
To alleviate these bottlenecks, advanced modulation formats have been proposed, including carrier-less amplitude and phase (CAP) modulation [9], adaptive bit-loading orthogonal frequency division multiplexing (OFDM) [10] and Nyquist single-carrier (N-SC) modulation [11]. To further improve communication performance, geometric shaping (GS) [12] was proposed to increase the Euclidean distance between constellation points and decrease the peak-to-average power ratio (PAPR) by optimizing the constellation distribution. For example, amplitude and phase-shift keying (APSK), which is one kind of GS constellation, has been adopted by the second-generation digital video broadcasting specification for satellite (DVB-S2) and approved by the Consultative Committee for Space Data Systems (CCSDS) [13,14]. In addition, post-equalization algorithms based on advanced digital signal processing (DSP) have been proposed to mitigate linear and nonlinear distortions. At present, linear time- and frequency-domain equalization techniques based on least mean square (LMS) [15], recursive least squares (RLS) [16], decision-directed least mean square (DD-LMS), zero-forcing [17] and other algorithms have been utilized in VLC systems, effectively eliminating linear distortion. For nonlinear distortions, a series of nonlinear equalizers have also been proposed, such as Volterra series-based and polynomial-based algorithms [18].
Recently, artificial intelligence algorithms, especially neural networks (NNs), have emerged as effective techniques for dealing with nonlinear problems. Owing to their universal approximation capability, NNs are commonly used as post-equalizers in VLC systems. In [19], a Gaussian kernel-aided deep neural network (GK-DNN) equalizer was utilized to compensate for the high nonlinear distortion of underwater PAM8 VLC channels. In [20], Lu et al. proposed a memory-controlled deep LSTM neural network post-equalizer for PAM-based VLC systems. In [21], a nonlinear resilient learning post-equalizer named TFDNet was proposed. It exploits time-frequency image analysis, which considers the time and frequency domains simultaneously and is effective in tackling nonlinear distortions in UVLC systems. Beyond equalizers, DNNs were recently used as waveform-to-symbol converters, which can replace conventional demodulation, post-equalization and down-sampling at the receiving end. In [22], a sparse data-to-symbol neural network (SDSNN) receiver was proposed for UVLC based on nonorthogonal multi-band CAP to mitigate ISI and inter-channel interference (ICI). In [23], a neural-network-based waveform-to-symbol converter (NNWSC) directly converts the received multiband CAP waveform into quadrature amplitude modulation (QAM) symbols to simultaneously handle ISI and ICI in a fiber-mmWave system. In [24], a DNN-based waveform-to-symbol decoder with three hidden layers was utilized in UVLC systems and achieved better communication performance than a traditional receiver. Moreover, neural networks are becoming increasingly popular in channel estimation [25] and end-to-end learning [26,27].
In this paper, we first construct and optimize a traditional DNN-based waveform-to-symbol converter [24] to replace conventional demodulation, down-sampling and post-equalization at the receiving end in 64QAM and 64APSK UVLC systems based on CAP modulation. It serves as a benchmark that increases the signal Vpp dynamic range by 104% (from 250 mV to 511 mV) for 64APSK and by 181% (from 180 mV to 506 mV) for 64QAM compared with a traditional CAP receiver, with the 7% FEC limit of $3.8 \times 10^{-3}$ as the BER threshold. However, this comes at the cost of high complexity: 26,210 trainable parameters. To achieve a better tradeoff between communication performance and computational complexity, which is represented by the total number of trainable parameters, we then propose a cascaded receiver consisting of a DNN-based waveform-to-symbol converter and a modified NN-based DD-LMS. With fewer taps and nodes than the traditional converter, the front-stage converter can still mitigate the majority of ISI and signal distortions. The modified NN-based DD-LMS is then cascaded to improve communication performance by reducing the phase offset and making the received constellation points more concentrated and closer to the standard constellation points. Compared with the traditional converter, the cascaded receiver achieves 89.6% of the signal Vpp dynamic range with 14.1% of the complexity in the 64APSK UVLC system. Moreover, its ratio of signal Vpp dynamic range to total trainable parameters is $1.24 \times 10^{-1}$ mV, while that of the traditional converter is $1.95 \times 10^{-2}$ mV. It is experimentally validated that the cascaded receiver using 64APSK is an effective method to enhance the performance of UVLC systems based on CAP modulation.
4. Experimental Results
In this section, we first construct and optimize a traditional DNN-based waveform-to-symbol converter (DNN converter) in 64APSK and 64QAM UVLC systems. Based on it, a cascaded receiver consisting of a DNN converter with fewer taps or nodes and an NN-based DD-LMS algorithm is experimentally analyzed in detail.
Figure 3 shows the relationship between the number of taps of the traditional DNN-based waveform-to-symbol converter (DNN converter) and BER performance under different bias currents and signal Vpp, with the 7% FEC limit of $3.8 \times 10^{-3}$ as the BER threshold. The number of taps is the number of symbols spanned by the input waveform that the DNN converter processes at one time, so it determines how many symbols of inter-symbol interference the DNN converter can model. When there are fewer than 19 taps, the BER under different bias currents and signal Vpp decreases rapidly as the number of taps increases, because the input window is not yet long enough to capture the ISI. In this system, 19 taps allow the traditional DNN converter to achieve the best BER performance, as shown in Figure 3. Because of the fourfold upsampling, 19 taps mean that the input waveform length of the DNN converter is 76 samples. However, as the number of taps continues to increase, the BER becomes worse. More taps lead to a more complex DNN structure, which requires a larger data set to train. With limited training data, overly complex networks cannot be trained well [19]. Therefore, there is a tradeoff between the number of taps and the amount of training data, and 19 taps is the best choice in our experiment. The constellation points obtained with 19 taps under different bias currents and signal Vpp are displayed in Figure 3i–iv.
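The relationship between taps and input length can be illustrated with a short sketch. It assumes one window per symbol, centred on that symbol, with zero padding at the edges of the record; the function name `make_windows` and the padding choice are illustrative assumptions, not details given in the paper.

```python
def make_windows(samples, taps=19, sps=4):
    """Slice an upsampled waveform into one input vector per symbol.

    taps: number of symbols covered by each window (19 in the text).
    sps:  samples per symbol after upsampling (4 in the text).
    """
    width = taps * sps          # 19 symbols x 4 samples = 76 inputs
    half = (taps // 2) * sps    # context samples on each side of the centre symbol
    # Zero-pad so that the first and last symbols also get a full window
    # (an assumption; the paper does not say how edges are handled).
    padded = [0.0] * half + list(samples) + [0.0] * half
    n_symbols = len(samples) // sps
    return [padded[k * sps : k * sps + width] for k in range(n_symbols)]


waveform = [float(n) for n in range(40)]   # 10 symbols at 4 samples/symbol
windows = make_windows(waveform)
print(len(windows), len(windows[0]))       # 10 windows, each 76 samples long
```

Each window would then be fed to the converter, which outputs the symbol at the centre of the window.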
Next, it is necessary to find the optimal structure of the traditional DNN converter. Figure 4a–c shows the relationship between BER performance and the number of hidden layers, the number of nodes in each hidden layer, and the activation function of the traditional DNN converter under different signal Vpp, respectively. The bias current is fixed at 150 mA, and the signal Vpp varies from 250 mV to 850 mV. In Figure 4a, the number of nodes in each hidden layer is set to 32. The DNN converters with 1 to 5 hidden layers all outperform traditional CAP demodulation, especially in the high signal Vpp region. Thus, DNN-based waveform-to-symbol converters are experimentally demonstrated to be an effective scheme for improving the communication performance of UVLC systems. When the number of hidden layers increases from 1 to 3, the corresponding BER decreases. This is because the DNN converter requires sufficient complexity to fit the nonlinearity and accurately convert the signal waveforms into the corresponding symbols. As the number of hidden layers increases from 1 to 3, the DNN converter becomes more complex and fits the relationship between input and output more fully. However, as the number of hidden layers continues to increase from 3 to 5, the BER performance does not improve further and instead deteriorates somewhat, because three hidden layers are already sufficient to fit the nonlinear relationship between input and output. Additional hidden layers consume more computing resources and may lead to overfitting. Therefore, the number of hidden layers is chosen as 3. In Figure 4b, the DNN converter has three hidden layers with $n_1$, $n_2$ and $n_3$ nodes, respectively, so it is referred to as DNN ($n_1$, $n_2$, $n_3$). The BER performance of DNN (32, 32, 16), DNN (96, 64, 32), DNN (96, 96, 96) and DNN (128, 128, 128) is shown and compared with traditional CAP demodulation. DNN (96, 96, 96) achieves better performance than DNN (32, 32, 16) and DNN (96, 64, 32) due to its greater complexity. However, a DNN with an overly complex structure easily overfits, as verified by DNN (128, 128, 128). In Figure 4c, the effects of different activation functions (Sigmoid, ReLU and Tanh) and of no activation function (None) are investigated. The DNN converter with the Tanh activation function has the best performance, followed by ReLU and Sigmoid. The DNN converter with no activation function performs worst, at a level similar to traditional CAP demodulation. Therefore, the DNN-based waveform-to-symbol converter we optimize has three hidden layers with 96, 96 and 96 nodes, and Tanh is employed as the activation function.
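A minimal sketch of the forward pass of such a converter is given below. It assumes a fully connected network with 76 inputs, three Tanh hidden layers of 96 nodes and two linear outputs (the I and Q components of the symbol); the two-output assumption is consistent with the 26,210 trainable parameters reported for this structure. The weights are random placeholders rather than trained values, and the helper names are illustrative.

```python
import math
import random


def init_mlp(sizes, seed=0):
    """Create random placeholder weights; each layer is (weight rows, biases)."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, x, act=math.tanh):
    """Apply the hidden layers with activation `act` and a linear output layer."""
    for idx, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        if idx < len(layers) - 1:
            x = [act(v) for v in x]
    return x


# DNN (96, 96, 96) with 19 taps: 76 input samples -> (I, Q) estimate.
layers = init_mlp([76, 96, 96, 96, 2])
i_hat, q_hat = forward(layers, [0.1] * 76)
```

Passing the identity as `act` turns the whole stack into a single linear map of the input, which is one way to see why the "None" configuration behaves much like a conventional linear receiver.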
Figure 5 presents the BER performance for different numbers of taps versus the training epoch, with the 7% FEC limit of $3.8 \times 10^{-3}$ as the BER threshold. The bias current is 150 mA, and the signal Vpp is set to 550 mV. Within the first 200 epochs, the BER of the DNN converter decreases rapidly to below the threshold. As the epoch continues to increase, the BER drops slowly and gradually converges. When training is finished, the BER of the DNN converter with 19 taps is lower than that with 11 or 27 taps, which is consistent with the results in Figure 3. Figure 5i–iv displays the constellation points obtained by the DNN converter with 19 taps at epochs 2, 50, 100 and 500, respectively. As the epoch increases, the constellation points become clearer and more distinguishable.
Then the effects of the DNN converter on BER in 64QAM and 64APSK UVLC systems under different bias currents and signal Vpp are investigated and compared with traditional CAP demodulation. The results are presented in Figure 6, where the dynamic range of signal Vpp and bias current is outlined by a black line marking the FEC threshold. The bias current varies from 70 mA to 170 mA, and the signal Vpp varies from 250 mV to 750 mV. Comparing Figure 6a with Figure 6c, it can be seen that 64APSK systems work mainly in the higher-current region, where signals suffer from severe nonlinearity. This means that 64APSK is more resistant to nonlinearity and can obtain a higher SNR under the strong attenuation found in UVLC systems. The DNN converter significantly extends the dynamic range in both 64APSK and 64QAM systems, especially in the high signal Vpp region, because of its powerful ability to handle nonlinearity. However, in the low signal Vpp region, the BER performance of the DNN converter is similar to that of traditional CAP demodulation in both systems. This is expected: when the signal Vpp is relatively low, noise dominates the deterioration of communication performance, and the neural network can do little to remove it.
To analyze the effects of the DNN converter more specifically, the BER versus signal Vpp for 64APSK and 64QAM UVLC systems under a bias current of 150 mA is shown in Figure 7 and compared with traditional CAP demodulation. For traditional CAP demodulation (black curves in Figure 7a,b), the BER decreases as the signal Vpp increases up to 450 mV, owing to the improved SNR. Above 450 mV, the BER shows the opposite tendency because of nonlinear distortions. As a result, the signal Vpp dynamic range is 250 mV for 64APSK and 180 mV for 64QAM at the 7% FEC limit of $3.8 \times 10^{-3}$. The communication performance is significantly improved by using the DNN converter (red curves in Figure 7a,b), as the signal Vpp dynamic range is enlarged by 104% (from 250 mV to 511 mV) for 64APSK in Figure 7a and by 181% (from 180 mV to 506 mV) for 64QAM in
Figure 7b. Meanwhile, the Q factor at a signal Vpp of 950 mV increases by 3.12 dB for 64APSK and 2.96 dB for 64QAM compared with traditional CAP demodulation. It is experimentally verified that the DNN converter has obvious advantages over traditional CAP demodulation in the nonlinear region and is thus more suitable for UVLC systems.
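The dynamic-range figures quoted above can be reproduced from a measured BER curve with a sketch like the following. It assumes that the dynamic range is the width of the single contiguous Vpp interval in which the BER stays below the FEC threshold, with the two edges found by linear interpolation of log10(BER); the paper does not state how the edges were determined, so this is only one plausible reading. All BER values must be positive.

```python
import math

FEC_THRESHOLD = 3.8e-3  # 7% FEC limit used throughout the text


def dynamic_range(vpp, ber, threshold=FEC_THRESHOLD):
    """Width of the Vpp interval where BER <= threshold (same unit as vpp)."""
    below = [b <= threshold for b in ber]
    if not any(below):
        return 0.0
    first = below.index(True)
    last = len(below) - 1 - below[::-1].index(True)
    log_t = math.log10(threshold)

    def crossing(i, j):
        # Interpolate log10(BER) linearly between measurement points i and j.
        li, lj = math.log10(ber[i]), math.log10(ber[j])
        return vpp[i] + (log_t - li) * (vpp[j] - vpp[i]) / (lj - li)

    left = vpp[0] if first == 0 else crossing(first - 1, first)
    right = vpp[-1] if last == len(vpp) - 1 else crossing(last, last + 1)
    return right - left


def enlargement_percent(before_mv, after_mv):
    """Relative growth of the dynamic range, in percent."""
    return 100.0 * (after_mv - before_mv) / before_mv


print(round(enlargement_percent(250, 511)))  # 104, the 64APSK figure
print(round(enlargement_percent(180, 506)))  # 181, the 64QAM figure
```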
We also measured the Q factor for 64APSK and 64QAM UVLC systems at different bit rates under the optimal bias current and signal Vpp, with the 7% FEC limit of $3.8 \times 10^{-3}$ as the BER threshold. The performance of the DNN converter is compared with that of traditional CAP demodulation in Figure 8. As expected, the DNN converter achieves greater communication capacity. Specifically, the highest bit rate is 3.23 Gb/s in the 64APSK system and 3.09 Gb/s in the 64QAM system when the DNN converter is used, which is 125 Mb/s and 50 Mb/s faster than traditional CAP demodulation, respectively. 64APSK with the DNN converter is thus experimentally shown to be a promising scheme for future high-speed UVLC systems.
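For readers who want to relate Q-factor values to the BER threshold, the sketch below uses the common Gaussian-noise conversion BER = 0.5 · erfc(Q/√2). Whether the Q factor in this work is derived from the BER in exactly this way is an assumption; the text does not give the definition.

```python
import math
from statistics import NormalDist


def q_factor_db(ber):
    """Q factor in dB for a given BER, assuming BER = 0.5 * erfc(Q / sqrt(2)).

    Under that model BER equals the standard normal CDF evaluated at -Q,
    so Q is minus the inverse CDF of the BER. Valid for 0 < ber < 0.5.
    """
    q_linear = -NormalDist().inv_cdf(ber)
    return 20.0 * math.log10(q_linear)


# Q factor corresponding to the 7% FEC threshold: roughly 8.5 dB
print(round(q_factor_db(3.8e-3), 2))
```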
At this point, the optimized traditional DNN-based waveform-to-symbol converter has been obtained. However, it needs high complexity to mitigate ISI and both linear and nonlinear distortions on its own, which is undesirable for the implementation of UVLC systems. To achieve a better tradeoff between communication performance and computational complexity, we propose a cascaded receiver consisting of a DNN-based waveform-to-symbol converter and an NN-based DD-LMS algorithm. The front-stage DNN converter is obtained by appropriately reducing the taps or the hidden-layer nodes of the DNN converter optimized above. It can still mitigate the majority of the ISI and distortions in the received signals. A modified NN-based DD-LMS is then used to further improve the quality of the complex symbols (constellation points) produced by the front-stage DNN converter. In this way, the cascaded receiver can achieve performance similar to that of the optimized traditional DNN converter but with much lower complexity.
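As background for the second stage, the following is a minimal sketch of a typical decision-directed LMS equalizer operating on complex symbols. It illustrates the conventional algorithm only, not the modified NN-based DD-LMS proposed here, whose internal structure is not reproduced in this sketch; the tap count, step size and QPSK test signal are illustrative assumptions.

```python
import cmath


def nearest(symbol, constellation):
    """Hard decision: the standard constellation point closest to `symbol`."""
    return min(constellation, key=lambda c: abs(symbol - c))


def dd_lms(received, constellation, n_taps=5, mu=1e-3):
    """Typical DD-LMS: a complex FIR filter adapted with its own decisions."""
    w = [0j] * n_taps
    w[n_taps // 2] = 1 + 0j                 # centre-spike initialisation
    pad = n_taps // 2
    x = [0j] * pad + list(received) + [0j] * pad
    out = []
    for n in range(len(received)):
        window = x[n : n + n_taps]
        y = sum(wi * xi for wi, xi in zip(w, window))
        d = nearest(y, constellation)       # decision replaces a training symbol
        e = d - y                           # decision-directed error
        w = [wi + mu * e * xi.conjugate() for wi, xi in zip(w, window)]
        out.append(y)
    return out, w


# Toy check: QPSK symbols with a constant 0.2 rad phase rotation.
qpsk = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
tx = [qpsk[n % 4] for n in range(400)]
rx = [s * cmath.exp(0.2j) for s in tx]
equalized, taps = dd_lms(rx, qpsk, n_taps=1, mu=0.05)
# The single tap converges towards exp(-0.2j), undoing the rotation.
```

The update only works while most decisions are correct, which is why such an equalizer is placed after a stage that has already removed the bulk of the distortion.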
To investigate the effects of the modified NN-based DD-LMS on the constellation points, we first consider an important metric, the phase offset, defined as the average phase offset of each constellation point class relative to the corresponding standard constellation point. The results for the 64APSK and 64QAM systems are shown in Figure 9a,b, respectively. The bias current is fixed at 150 mA, and the signal Vpp varies from 250 mV to 950 mV. Starting from the traditional DNN converter optimized above, the front-stage DNN converter is obtained by reducing the taps from 19 to 11 and the nodes of each of the three hidden layers from 96 to 32. As can be seen from Figure 9, the phase offsets after the front-stage DNN converter are still high due to the remaining ISI and distortions. The modified NN-based DD-LMS significantly reduces these phase offsets, which indicates an improvement in communication performance. Figure 9i,ii shows the movement vector of the center of each constellation point class after NN-based DD-LMS for the 64APSK and 64QAM systems, respectively, when the signal Vpp is 550 mV.
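One way to compute such a phase-offset metric is sketched below. It assumes that the transmitted class label of every received point is known, that the offset of a class is the absolute phase difference between the class center and its standard point, and that the metric is the mean over classes; the exact definition used for Figure 9 may differ in detail.

```python
import cmath
from collections import defaultdict


def mean_phase_offset(received, labels, standard):
    """Mean absolute phase offset (radians) of class centers vs. standard points.

    received: complex received constellation points
    labels:   index of the transmitted class for each received point
    standard: standard (ideal) constellation points, indexed by class
    """
    groups = defaultdict(list)
    for point, k in zip(received, labels):
        groups[k].append(point)
    offsets = []
    for k, points in groups.items():
        center = sum(points) / len(points)
        # The phase of center * conj(reference) is the wrapped phase difference.
        offsets.append(abs(cmath.phase(center * standard[k].conjugate())))
    return sum(offsets) / len(offsets)


standard = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
labels = [0, 1, 2, 3, 0, 1, 2, 3]
received = [standard[k] * cmath.exp(0.1j) for k in labels]
offset = mean_phase_offset(received, labels, standard)  # about 0.1 rad
```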
In addition to phase offset, we measured the average distance between the centers of the received constellation point classes and the standard constellation points. Moreover, a new metric named “Average Cluster Sum of Square” (ACSS) is proposed to represent how concentrated the received constellation points are. The smaller the ACSS is, the more concentrated the received constellation points are. ACSS can be expressed as

$$\mathrm{ACSS} = \frac{1}{K} \sum_{k=1}^{K} \sum_{i=1}^{N_k} \left\| x_i^{(k)} - c_k \right\|^2,$$

where $x_i^{(k)}$ denotes the $i$-th point of the $k$-th constellation point class, $c_k$ denotes the center of the $k$-th constellation point class, $N_k$ is the number of points in that class and $K$ is the number of classes. The results of DNN (32, 32, 32), DNN (64, 64, 64) and DNN (96, 96, 96) with 11 taps are shown in
Figure 10. The average distances and ACSS under different signal Vpp for 64APSK UVLC systems are shown in Figure 10a,b, respectively. After NN-based DD-LMS is applied, both the average distance and the ACSS become smaller at almost every signal Vpp, which indicates that symbol decisions are more accurate after the modified DD-LMS. The details of the 64 constellation point classes for the 64APSK system are shown in Figure 10i,ii, where blue and brown columns represent the results of the front-stage DNN converter with or without NN-based DD-LMS, respectively. The experimental results for the 64QAM system are similar and are provided in Figure 10c,d.
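The two cluster metrics can be sketched as follows. The ACSS formula used here, the within-class sum of squared distances to the class center averaged over the classes, is an assumed reading of the metric's name and verbal definition; class labels are again assumed to be known from the transmitted symbols.

```python
from collections import defaultdict


def cluster_metrics(received, labels, standard):
    """Return (average center-to-standard distance, ACSS).

    ACSS is taken as (1/K) * sum over classes of the sum of squared distances
    between each received point and the center of its class (an assumption).
    """
    groups = defaultdict(list)
    for point, k in zip(received, labels):
        groups[k].append(point)
    total_distance = 0.0
    total_css = 0.0
    for k, points in groups.items():
        center = sum(points) / len(points)
        total_distance += abs(center - standard[k])
        total_css += sum(abs(p - center) ** 2 for p in points)
    n_classes = len(groups)
    return total_distance / n_classes, total_css / n_classes


# Two toy classes: class 0 is spread around its standard point,
# class 1 is a single point shifted away from its standard point.
standard = [2 + 0j, 0j]
received = [1 + 0j, 3 + 0j, 1j]
labels = [0, 0, 1]
avg_distance, acss = cluster_metrics(received, labels, standard)
```

A smaller average distance means the class centers sit closer to the ideal points, while a smaller ACSS means the points within each class are more tightly clustered.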
In summary, NN-based DD-LMS improves communication performance by reducing the phase offset and making the constellation points more concentrated and closer to the standard constellation points. The communication performance of the proposed cascaded receiver is presented in Figure 11a–d, with the 7% FEC limit of $3.8 \times 10^{-3}$ as the BER threshold, and compared with traditional CAP demodulation and typical DD-LMS. Figure 11a shows that in 64APSK UVLC systems, the BER with a DNN converter is much lower than that with traditional CAP demodulation in the high-power region. The red line represents the traditional DNN converter we optimized, which achieves the best communication performance. When the taps are reduced from 19 to 11, the communication performance becomes worse due to the remaining ISI and signal distortions, and the dynamic range falls to 475 mV. After typical DD-LMS or NN-based DD-LMS is applied, the BER performance is enhanced. Furthermore, the modified NN-based DD-LMS achieves better performance than typical DD-LMS owing to the use of the Adam optimizer. In Figure 11b, cascaded receivers consisting of DNN converters of different complexity and NN-based DD-LMS are investigated in 64APSK systems. The traditional DNN converter we optimized has the largest dynamic range of 511 mV. When the taps are reduced from 19 to 11, the dynamic range of the DNN converter with NN-based DD-LMS is 497 mV. As the complexity is reduced further, the BER performance becomes slightly worse. The dynamic range of signal Vpp is 481 mV for DNN (64, 64, 64) with NN-based DD-LMS and 458 mV for DNN (32, 32, 32) with NN-based DD-LMS. The trend is similar for 64QAM in Figure 11c,d.
The BER performance and the number of trainable parameters, which represents the complexity, of different receivers for 64APSK and 64QAM systems are compared in Table 1 and Table 2, respectively. Ratio* represents the ratio between the signal Vpp dynamic range (mV) and the total number of trainable parameters. In Table 1, the traditional DNN converter we optimized, namely DNN (96, 96, 96) with 19 taps, has both the maximum signal Vpp dynamic range of 511 mV and the maximum number of trainable parameters, 26,210, in 64APSK systems. To make a better tradeoff between communication performance and computational complexity, a cascaded receiver consisting of a DNN converter and NN-based DD-LMS is utilized. The complexity of a DNN converter varies greatly with the number of nodes, but that of NN-based DD-LMS is fixed at 84 parameters. As the number of nodes in each hidden layer is reduced from 96 to 32, the dynamic range of the cascaded receiver decreases slightly from 497 mV to 458 mV (from 97.3% to 89.6% of the maximum), whereas the complexity decreases greatly from 23,222 to 3,702 parameters (from 88.6% to 14.1% of the maximum). The ratio of signal Vpp dynamic range to complexity increases from $2.14 \times 10^{-2}$ mV to $1.24 \times 10^{-1}$ mV, while that of the traditional DNN converter is $1.95 \times 10^{-2}$ mV. Notably, DNN (32, 32, 32) with NN-based DD-LMS and 11 taps achieves 89.6% of the signal Vpp dynamic range with 14.1% of the complexity, which experimentally verifies the effectiveness of the proposed algorithm. The conclusions are similar for the 64QAM system in Table 2. However, the tradeoff between communication performance and computational complexity is worse than for 64APSK. Specifically, as the number of nodes is reduced from 96 to 32, the signal Vpp dynamic range decreases from 451 mV to 382 mV (from 89.1% to 75.5%), and the complexity decreases from 23,222 to 3,702 parameters (from 88.6% to 14.1%). The ratio of signal Vpp dynamic range to complexity increases from $1.94 \times 10^{-2}$ mV to $1.03 \times 10^{-1}$ mV, while that of the traditional converter is $1.93 \times 10^{-2}$ mV. Therefore, the proposed algorithm is experimentally validated to be an effective receiver that offers a better tradeoff between communication performance and computational complexity than the traditional DNN converter, especially in UVLC systems using the 64APSK modulation format.
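The complexity figures above can be cross-checked with a short calculation. The sketch assumes fully connected layers with biases, an input of taps × 4 samples (fourfold upsampling) and two output nodes for the I and Q components; these assumptions are not spelled out in this section, but they reproduce the reported parameter counts exactly. The 84 parameters of the NN-based DD-LMS are taken from the text as given.

```python
def dense_params(layer_sizes):
    """Weights plus biases of a fully connected network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def converter_params(taps, hidden, sps=4, outputs=2):
    """Trainable parameters of a DNN converter (assumed layout, see lead-in)."""
    return dense_params([taps * sps, *hidden, outputs])


DD_LMS_PARAMS = 84  # complexity of the NN-based DD-LMS, as stated in the text

traditional = converter_params(19, (96, 96, 96))                  # 26210
cascaded_96 = converter_params(11, (96, 96, 96)) + DD_LMS_PARAMS  # 23222
cascaded_32 = converter_params(11, (32, 32, 32)) + DD_LMS_PARAMS  # 3702

# Ratio of 64APSK dynamic range (mV) to trainable parameters.
print(round(511 / traditional, 4))          # 0.0195
print(round(497 / cascaded_96, 4))          # 0.0214
print(round(458 / cascaded_32, 4))          # 0.1237
print(round(100 * cascaded_32 / traditional, 1))  # 14.1 (% of maximum complexity)
```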