1. Introduction
Vortex beams with spiral phase structures have been used extensively in information transmission, radar imaging, and rotational target detection since Allen et al. first investigated the Laguerre-Gaussian vortex beam and its orbital angular momentum (OAM) in 1992 [1]. Theoretically, there are infinitely many OAM eigenstates, and different eigenstates are orthogonal to each other, which is quite important for improving communication capacity and imaging resolution in remote sensing. Multiplexing different OAM beams can effectively avoid crosstalk between different modes in a channel, providing a new communication dimension that is no longer limited to amplitude, phase, frequency, and polarization, thereby greatly improving the communication capacity [2,3]. In free-space OAM communication systems, the receiver needs to demodulate the OAM beam to recover the information sequence. Traditional OAM demodulation techniques, such as spatial light modulators, the diffraction method, the cylindrical lens method, plane wave interferometry, and spherical wave interferometry, are based on optical hardware and have been researched extensively [4,5,6,7]. The OAM beam is pre-processed by optical hardware to obtain optical pattern features that can be distinguished by the naked eye [8]. However, on account of the high cost and limited processing capability of optical hardware, high-performance transmission cannot be guaranteed with a cost-effective vortex optical communication system.
In recent years, with the rapid increase in computing power, OAM mode recognition based on deep learning has attracted growing attention. Krenn et al. proposed a self-organizing map (SOM)-based OAM mode recognition scheme, which verified the feasibility of machine learning in vortex optical communication systems for the first time [9]. They then built a long-distance vortex optical communication system over the sea between the Canary Islands, achieving a recognition accuracy of 91.67%, which verified the possibility of using vortex beams for long-distance information transmission [10]. Deep neural network (DNN)-based recognition of different OAM modes was proposed in [11]. Convolutional neural networks (CNNs) were applied to OAM mode recognition by Doster et al. [12]; that method achieved a mode recognition accuracy of up to 99%, far superior to the traditional methods. In addition, the CNN-based mode identification method is robust to turbulence intensity, data size, sensor noise, and pixel count. Such research paved the way for the application of CNNs in OAM mode detection, and many subsequent achievements in OAM mode recognition have been based on CNNs [13,14].
In 2017, Zhang et al. compared the performance of a K-nearest-neighbor classifier, a naive Bayes classifier, a back-propagation artificial neural network, and a CNN as OAM mode classifiers under different turbulence conditions, observing that the CNN yielded the best results [15]. The same authors improved the original LeNet-5 network and proposed a decoder scheme that could simultaneously recognize the OAM mode and the turbulence intensity [16]. OAM mode recognition combined with turbo channel coding was proposed in [17]; this approach effectively improved the recognition accuracy and the reliability of communication transmission.
Similarly, many scholars have worked on niche applications of OAM mode detection. In 2018, Zhao et al. applied a CNN to learn received OAM light intensity maps under different tilt angles by adding a view-pooling layer, and used a hybrid data collection technique to improve the performance [18]. Misaligned hyperfine OAM mode recognition was carried out in [19]. Machine-learning-based recognition of fractional optical vortex modes in an atmospheric environment was studied by Cao et al. [20]. For cases where labeled data samples are insufficient, an OAM mode recognition method based on conditional generative adversarial networks (CGANs) was proposed to improve the recognition accuracy [21]. A diffractive deep neural network (D2NN) was utilized for OAM mode recognition in [22], eliminating the need for a CCD camera to capture images and pass them to a computer, thereby making the communication rate independent of the hardware and neural network computation rate.
The above research was dedicated to training on and identifying the OAM light distributions captured by CCD cameras; however, other researchers have applied transformations to the vortex beam before training to highlight the characteristics of different modes. In 2018, the Radon transform was introduced to preprocess the light intensity distribution map of an OAM beam to obtain more clearly distinguishable features [23]. A mode recognition technique based on coherent light interference at the receiver side, which yields more obvious recognition features, was reported in [24]. A support vector machine (SVM)-based single-mode recognition method was proposed in [25], using the relationship between the degree of atmospheric turbulence distortion suffered by the OAM beam and the topological charge number as a hand-crafted feature. A joint scheme combining the Gerchberg–Saxton (GS) algorithm and a CNN (GS-CNN) to achieve efficient recognition of multiplexed LG beams was proposed in [26]. A technique to measure the OAM of light based on the petal interference patterns of modulated vortex beams and an unmodulated incident Gaussian beam reflected by a spatial light modulator was reported in [27].
In light of the aforementioned studies, most research has focused on OAM mode detection by neural networks, CNNs, or CNN-based combination methods, on preprocessing transforms before network training, and on the special cases of misaligned or tilted OAM mode detection. However, in practical applications, many different multi-mode superpositions of OAM beams correspond to quite similar light intensity maps, such as OAM = {−2, 3, −5} and OAM = {1, −2, 3, −5}, or OAM = {4, −4} and OAM = {2, −6}. Additionally, for a single-mode vortex beam with a large topological charge, the number of fringes in the interferogram is large. Because the area of the image sensor at the receiving end is fixed, the fringes become difficult to resolve when their number is large, which further affects the accuracy of OAM mode recognition.
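Why such pairs look alike can be seen from the azimuthal intensity profile: for an equal-amplitude superposition of two vortex modes, the intensity on a ring around the beam axis is proportional to 2 + 2cos((l1 − l2)φ), so the number of petals equals |l1 − l2|, and {4, −4} and {2, −6} both produce eight petals. A minimal sketch (ignoring the radial LG envelope and turbulence; equal amplitudes are an assumption for illustration):

```python
import cmath
import math

def ring_intensity(modes, n=720):
    """Intensity of an equal-amplitude superposition of vortex modes
    exp(i*l*phi), sampled on a ring around the beam axis."""
    return [abs(sum(cmath.exp(1j * l * 2 * math.pi * k / n) for l in modes)) ** 2
            for k in range(n)]

def count_petals(modes, n=720):
    """Count strict local maxima of the azimuthal intensity profile."""
    I = ring_intensity(modes, n)
    return sum(1 for k in range(n)
               if I[k] > I[(k - 1) % n] and I[k] > I[(k + 1) % n])

# {4, -4} and {2, -6} share the same mode difference |l1 - l2| = 8,
# so their ring profiles have the same petal count
print(count_petals([4, -4]), count_petals([2, -6]), count_petals([1, -2]))
# prints: 8 8 3
```

The identical petal counts are exactly the low-level ambiguity that motivates a fine-grained classifier.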
To solve these problems, an OAM mode recognition technique based on an attention pyramid convolutional neural network (AP-CNN) is proposed in this paper. Fine-grained image classification [28] is introduced to make full use of the low-level detailed information in the similar intensity maps of superimposed vortex beams and in the dense fringes of the plane wave interferogram of a single-mode vortex beam with a large topological charge. A dual-path structure with a top-down feature path and a bottom-up attention path, combined with an attention pyramid, is adopted to improve the OAM mode recognition accuracy and reduce the bit error rate (BER) for indistinguishable intensity distributions.
The remainder of this paper is arranged as follows. The OAM mode recognition framework based on the AP-CNN is described in Section 2. In Section 3, numerical results and discussions under different transmission conditions are presented to compare the recognition accuracy and bit error rate. Section 4 is devoted to the conclusion.
3. Results and Discussion
3.1. Simulation Data Set Construction
In order to verify the performance of the AP-CNN in OAM mode detection for similar distributions of multi-mode vortex beams, four pairs of OAM modes, namely, {1, −2} and {1, −2, −5}, {1, −2, 3, −5} and {−2, 3, −5}, {4, −4} and {2, −6}, and {6, −6} and {9, −3}, are selected, as shown in Figure 5. The light intensity distributions within each of the four columns are similar to each other. In addition, in order to test the detection performance for a single-mode vortex beam with a large topological charge interfering with a plane wave, a plane wave interferogram dataset of single-mode vortex beams is constructed by choosing eight types of samples with large topological charges, as shown in Figure 6.
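The difficulty with large topological charges on a fixed-size sensor can be made concrete with a Nyquist-style check: a vortex interfering coaxially with a plane wave produces |l| azimuthal fringes, and each fringe must span at least about two pixels of the ring to be resolvable. The ring radius and the example charges below are illustrative assumptions:

```python
import math

def fringe_spacing_pixels(l, ring_radius_px):
    """Arc length (in pixels) between adjacent bright fringes of a
    vortex / coaxial-plane-wave interferogram on a sensor ring."""
    return 2 * math.pi * ring_radius_px / abs(l)

def resolvable(l, ring_radius_px, samples_per_fringe=2):
    """Nyquist-style check: the |l| azimuthal fringes are resolvable
    only if each fringe spans at least samples_per_fringe pixels."""
    return fringe_spacing_pixels(l, ring_radius_px) >= samples_per_fringe

# on a 128 x 128 intensity map the usable ring radius is at most ~64 px
print(resolvable(20, 64), resolvable(250, 64))
# prints: True False
```

This is why dense fringes from large charges strain a fixed receiving aperture and degrade recognition accuracy.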
The wavelength of the OAM communication system is 0.6328 μm and the beam waist radius is 0.3 m. For the comparison experiments under different turbulence conditions, the transmission distance is fixed and six different values of the atmospheric refractive index structure constant C_n^2 are selected. For the comparison experiments at different transmission distances, C_n^2 is fixed and six different transmission distances are chosen: 500 m, 1000 m, 1500 m, 2000 m, 2500 m, and 3000 m. When simulating the atmospheric turbulence channel, the transmission path is divided into segments and the power spectrum inversion method is used to generate ten phase screens at certain intervals. For each transmission condition, 2000 light intensity maps are generated for each OAM mode, so a total of 16,000 light intensity distribution maps are included in the hybrid dataset. This is then divided into a training set and a test set at a ratio of 8:2 (12,800 images in the training set and 3200 images in the test set).
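A single phase screen produced by the power spectrum inversion method can be sketched as below: complex white noise is filtered with the Kolmogorov phase spectrum and inverse Fourier transformed. The grid size, pixel pitch, C_n^2, and slab thickness are illustrative placeholders, and the overall normalization follows the standard FFT recipe only approximately:

```python
import numpy as np

def kolmogorov_phase_screen(n=256, dx=0.01, cn2=1e-14, dz=200.0,
                            wavelength=0.6328e-6, seed=0):
    """One turbulence phase screen via the power spectrum inversion
    (FFT) method. All numeric parameters here are illustrative."""
    rng = np.random.default_rng(seed)
    f = np.fft.fftfreq(n, d=dx)                  # spatial frequency (1/m)
    fx, fy = np.meshgrid(f, f)
    kappa = 2 * np.pi * np.hypot(fx, fy)         # angular spatial frequency
    kappa[0, 0] = 1.0                            # avoid division by zero at DC
    k = 2 * np.pi / wavelength                   # optical wavenumber
    # Kolmogorov phase power spectral density for one dz-thick slab
    psd = 2 * np.pi * k**2 * dz * 0.033 * cn2 * kappa**(-11.0 / 3.0)
    psd[0, 0] = 0.0                              # remove the piston term
    noise = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    df = f[1] - f[0]
    screen = np.fft.ifft2(noise * np.sqrt(psd) * 2 * np.pi * df * n * n)
    return screen.real                           # phase in radians

screen = kolmogorov_phase_screen()
```

Ten such screens spaced along the path, each applied between split-step propagations, emulate the extended turbulent channel.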
3.2. Analysis of Multi-Mode OAM Mode Recognition Based on AP-CNN
In order to compare the performance of the AP-CNN with that of ResNet18, the OAM recognition accuracy, confusion matrix, and BER of both models are numerically analyzed in this section. Variations in OAM mode recognition accuracy under different turbulence intensities are shown in Figure 7 and Figure 8 for transmission distances of 2000 m and 3000 m, respectively.
As shown, the detection accuracy increases gradually with the training epoch. The comparisons also show that the detection accuracy rises significantly more slowly in strong turbulence than in weak turbulence: when the turbulence is strong, training becomes more difficult because the images suffer more severe distortion. Taking a transmission distance of 2000 m as an example, as shown in Figure 7, after 100 training epochs in medium turbulence, ResNet18 can still achieve an OAM mode detection accuracy of more than 96.9%, whereas under the two strong-turbulence conditions, the detection accuracy decreases to 91.7% and 87.3%, respectively. The detection accuracy is further reduced from 87.3% to 83.2% when the transmission distance is increased to 3000 m under strong turbulence. In other words, the greater the turbulence intensity and the longer the transmission distance, the lower the detection accuracy.
A comparison of Figure 7 and Figure 8 reveals that the accuracy of OAM mode recognition based on the AP-CNN is superior to that of ResNet18. When the turbulence is weak, the optimization effect of the AP-CNN is minimal, whereas when the turbulence is strong, it is more obvious. For example, when the transmission distance is 2000 m under weak turbulence, as shown in Figure 8, the accuracy of OAM mode recognition based on the AP-CNN can reach up to 98.1% after 100 training epochs, only a 0.6% improvement over ResNet18; under these circumstances, the optimization effect is not obvious. Meanwhile, under strong turbulence, the recognition accuracy based on ResNet18 is only 92.1%, whereas the accuracy based on the AP-CNN is significantly improved, with the best model reaching 94.4%, an improvement of about 2.3%.
When the transmission distance is large, the improvement in recognition accuracy offered by the AP-CNN is limited in a highly turbulent environment. For example, under strong turbulence, the accuracy of OAM mode recognition based on the AP-CNN can reach up to 90.5% after 100 training epochs when the transmission distance is 2000 m, about a 3.1% improvement over ResNet18, as shown in Figure 8. However, when the transmission distance extends to 3000 m, the best AP-CNN model reaches only 84.2%, i.e., 1.2% higher than ResNet18. The reason is that in strong turbulence over a long distance, the OAM beam is severely affected by turbulent distortion during transmission: the light intensity distribution captured by the CCD camera at the receiving end is seriously distorted, and the image features used for recognition are compromised. Therefore, even though the AP-CNN introduces an attention mechanism to mine low-level image details, it cannot significantly improve the accuracy of OAM mode recognition.
The confusion matrices of superimposed vortex beams using ResNet18 and the AP-CNN, for a transmission distance of 2000 m and a fixed atmospheric refractive index structure constant, are given in Figure 9. The results show that {1, −2, 3, −5} has a 16% probability of being incorrectly identified as {−2, 3, −5}, {1, −2} has a 10% probability of being incorrectly identified as {1, −2, −5}, and {2, −6} has an 11% probability of being incorrectly identified as {4, −4}. In contrast, with the AP-CNN, the accuracy of OAM mode detection improves by up to 7%, and the corresponding misidentification rates are reduced by up to 3%, which confirms the necessity of designing datasets of similar OAM superposition modes. The accuracy improvement and the reduction in misidentifications may vary with the transmission conditions.
Assuming a given signal-to-noise ratio at the receiver side of the OAM communication system, the system BER can be calculated from the recognition accuracy. When the transmission distance is 2000 m, the demodulation performance of the CNN demodulator, the ResNet18 demodulator, the ResNet18 demodulator with a specified mapping relationship, and the AP-CNN demodulator under six different turbulence intensities and transmission distances, expressed as the BER at the receiver end, is shown in Figure 10.
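One common way to convert a mode (symbol) error probability into a BER, assuming the eight modes carry log2(8) = 3 bits each and a misidentified mode is equally likely to be any other mode, is the M-ary orthogonal-signaling approximation BER = Ps · (M/2)/(M − 1). This mapping is a textbook approximation, not necessarily the exact one used in the paper:

```python
def ber_from_mode_error(p_sym, n_modes=8):
    """Approximate BER from the symbol (OAM mode) error probability,
    assuming errors are uniform over the other n_modes - 1 modes."""
    return p_sym * (n_modes / 2) / (n_modes - 1)

# e.g. a 96.9% recognition accuracy (mode error probability 0.031)
ber = ber_from_mode_error(1 - 0.969)
```

Under this approximation, BER is always somewhat lower than the mode error probability, since a symbol error corrupts only part of its bits on average.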
It can be seen from Figure 10 that, for a given turbulence intensity and transmission distance, ResNet18 can mine more OAM intensity map information than the CNN because it has more convolution layers, and the BER of the OAM communication system is therefore lower when using the ResNet18 demodulator than when using the CNN demodulator; the difference is more obvious in a strong turbulence environment. The CNN structure used here consists of three convolution network layers and two fully connected layers. Each convolution network layer is composed of a convolutional layer, a batch normalization layer, and a max-pooling layer. The layers are connected by rectified linear units (ReLU), and each layer uses dropout with a probability of 0.3. The convolutional layers of the first, second, and third convolution network layers contain 16 kernels of size 5 × 5, 32 kernels of size 3 × 3, and 64 kernels of size 3 × 3, respectively. The max-pooling size in all three convolution network layers is 2 × 2 with a stride of 2.
Both the ResNet18 demodulator combined with the specified mapping and the AP-CNN demodulator are optimized versions of the ResNet18 demodulator, and both have a lower BER than the plain ResNet18 demodulator. Under weaker turbulence conditions, the BER using the AP-CNN demodulator is lower than that using the ResNet18 demodulator combined with the specified mapping. However, when the turbulence is quite strong, the BER using the AP-CNN demodulator is higher than that of the ResNet18 demodulator with the specified mapping relationship. The reason for this is that the two optimization schemes (specifying the mapping relationship and introducing the attention pyramid) do not work in the same direction, as shown in Figure 11.
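The convolution-block description above can be sanity-checked with a short shape trace. The 'same' padding and the 128 × 128 input size (the size stated for the simulations in the conclusion) are assumptions, since the text does not specify the padding:

```python
def trace_shapes(hw=128, blocks=((16, 5), (32, 3), (64, 3))):
    """Trace (channels, H, W) through the three convolution blocks:
    each block is a 'same'-padded convolution (spatial shape preserved,
    so the kernel size does not change H x W) followed by 2 x 2 max
    pooling with stride 2, which halves H and W."""
    shapes = []
    for out_ch, kernel in blocks:
        hw //= 2                      # only the pooling changes H x W
        shapes.append((out_ch, hw, hw))
    flattened = shapes[-1][0] * hw * hw
    return shapes, flattened

shapes, flattened = trace_shapes()
print(shapes, flattened)
# prints: [(16, 64, 64), (32, 32, 32), (64, 16, 16)] 16384
```

The flattened size is what the first fully connected layer would consume under these assumptions.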
3.3. Analysis of Single-Mode OAM Mode Detection Based on AP-CNN
The detection accuracies of the OAM mode with the ResNet18 network and the AP-CNN after 100 training epochs under different turbulence intensities and transmission distances are shown in Table 2. It can be concluded that the stronger the turbulence, the lower the accuracy of OAM detection. In addition, the data in Table 3 show that, whether the transmission distance is 2000 m or 3000 m, the AP-CNN improves the accuracy compared with ResNet18 under all turbulence intensities. In strong turbulence, the detection accuracy of the AP-CNN is 85.2%, a 1.3% improvement over ResNet18, whereas in medium turbulence the detection accuracy of the AP-CNN shows a 3.4% improvement over ResNet18 at a transmission distance of 3000 m. When the transmission distance is reduced to 2000 m, the detection accuracy of the AP-CNN shows 5.5% and 4.3% improvements over ResNet18 at the two corresponding turbulence intensities, respectively. It should be emphasized that when the transmission distance is long and the turbulence is strong, the light intensity distribution captured at the receiving end is badly damaged (because the vortex beam is greatly affected by turbulent distortion during transmission), and the AP-CNN detection rate cannot be significantly improved.
A comparison of the demodulation performance of the CNN, ResNet18, ResNet18 combined with plane wave interference, and the AP-CNN combined with plane wave interference at a 2000 m transmission distance under six different turbulence intensities is shown in Figure 12a. The performance of the communication link at different transmission distances for a fixed atmospheric refractive index structure constant is compared in Figure 12b.
It can be seen from Figure 12 that, regardless of the demodulation scheme adopted, when the transmission distance is fixed, the stronger the turbulence, the lower the OAM mode detection accuracy at the receiving end; when the turbulence intensity is constant, the longer the transmission distance, the lower the detection rate. The detection accuracy of the CNN demodulator, with its three convolutional layers, simple structure, and lack of plane wave interference, is low. ResNet18 has a more complex network structure, including 17 convolution layers, and a stronger ability to mine light intensity information; however, it directly identifies the light intensity distribution of single-mode vortex light without interference, so although its detection accuracy is improved compared with that of the CNN, it is still not satisfactory. After plane wave interference, the interferogram collected at the receiving end has more distinguishable characteristics, and the OAM mode detection accuracy of the ResNet18 demodulator is significantly improved compared with the no-interference case. The AP-CNN adds a dual-path structure combined with an attention pyramid to the ResNet18 backbone, so its OAM mode detection accuracy is further improved compared with that of ResNet18. This reveals that the dual path with an attention pyramid is beneficial for single-mode detection accuracy.
3.4. Discussion
In this work, an AP-CNN OAM mode detection method was described. By adding a dual path network with an attention pyramid to the backbone ResNet18, the AP-CNN network was constructed. The effects of turbulence intensity and transmission distance on the improvement of OAM mode detection accuracy were numerically analyzed.
We first studied the performance of the AP-CNN on multi-mode OAM detection with similar light intensity distributions. Comparisons of the AP-CNN with ResNet18 under different degrees of atmospheric turbulence and transmission distances verified the improvement in recognition accuracy due to the dual-path structure with an attention pyramid. The results reveal that the accuracy improvement offered by the AP-CNN increased to some extent with increasing turbulence intensity.
The demodulation performance of the CNN with three convolution layers and two fully connected layers, ResNet18, ResNet18 with the specified mapping, and the AP-CNN on a multi-mode vortex beam was then studied. Under all but the strongest turbulence conditions, the BER using the AP-CNN demodulator was lower than that of the other three methods.
In addition, OAM mode detection in the single-mode case with a large topological charge was simulated under medium and strong turbulence intensities at 2000 m and 3000 m transmission distances. A comparison between 'ResNet18' and 'ResNet18 + coherent' verified the effect of plane wave interference. Single-mode recognition by the AP-CNN showed an accuracy improvement of up to 5.5% compared with ResNet18 at a 2000 m transmission distance, indicating that the former detection method has strong detail extraction and learning capabilities for the dense interference fringes of single-mode vortex beams with large topological charges.
4. Conclusions
In this paper, an OAM mode recognition technique based on AP-CNN is proposed. Utilizing ResNet18 as the backbone of the AP-CNN, a dual-path algorithm structure, including a top-down feature path and a bottom-up attention path, is added. Based on the dual-path algorithm structure combined with the attention pyramid, low-level detailed information of the similar light intensity map is fully utilized. In our simulated experiments, the size of the light intensity distribution map of the OAM beam was set to 128 × 128 and was input into the ResNet18 network for training. Then, the output feature maps of the third, fourth, and fifth layers of ResNet18 were selected to build a pyramidal hierarchy. After supervised training with a large OAM communication dataset with different turbulence conditions, the recognition accuracy and the BER were numerically determined. The simulation results showed that the AP-CNN achieved greatly improved OAM mode detection accuracy and demodulation performance compared with the ResNet18 network. When the turbulence was weak, the optimization effect of AP-CNN was not obvious, i.e., a 0.6% improvement, while when the turbulence was strong, the optimization was clear, with an improvement of about 2.3%. The OAM detection accuracy of the AP-CNN was up to 5.5% higher than that of ResNet18 at 2 km with strong turbulence. This technique has significant applications in communication, target detection, and radar imaging. Due to the limitations of experimental conditions, our research was only based on a simulated intensity distribution dataset, and light intensity information was collected without phase information. The training and analysis of the real turbulence OAM communication data under different conditions, as well as the phase information, will be the focus of future work by our team.