Review

Path Integral Approach to Nondispersive Optical Fiber Communication Channel

by Aleksei V. Reznichenko 1,2,*,† and Ivan S. Terekhov 1,2,*,†
1 Theoretical Department, Budker Institute of Nuclear Physics of Siberian Branch Russian Academy of Sciences, 630090 Novosibirsk, Russia
2 Department of Physics, Novosibirsk State University, 630090 Novosibirsk, Russia
* Authors to whom correspondence should be addressed.
† These authors contributed equally to this work.
Entropy 2020, 22(6), 607; https://doi.org/10.3390/e22060607
Submission received: 29 April 2020 / Revised: 22 May 2020 / Accepted: 27 May 2020 / Published: 29 May 2020
(This article belongs to the Special Issue Information Theory of Optical Fiber)

Abstract: In the present paper we summarize the methods and results of our calculations of theoretical informational quantities for the nondispersive optical fiber channel. We consider two models: the per-sample model and the model in which the input signal depends on time. For these models we present an approach that allows the mutual information to be calculated exactly in the nonlinearity parameter, but in the limit of a large signal-to-noise power ratio. Using this approach for the per-sample model, we find a lower bound on the channel capacity in the intermediate power range.

1. Introduction

For a linear transmission system, Shannon [1] obtained the famous result for the channel capacity, i.e., the maximal amount of information that can be transmitted through a channel with additive noise:
C = \log_2\left(1 + P/P_{\mathrm{noise}}\right), \qquad (1)
where P is the input signal power, and P_{\mathrm{noise}} is the noise power. Over the past 25 years, the power and frequency bandwidth of the signals transmitted through optical fiber channels have grown considerably. As a result, the Kerr nonlinearity must be taken into account when considering modern optical fiber channels. The Kerr nonlinearity leads to distortion of the signal and to a nonlinear interaction of the signal with the noise in the information channel. Therefore, for such channels, Shannon's result (1) should be modified to take the nonlinearity effects into account.
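Formula (1) can be illustrated numerically. The sketch below (an illustration added here, not part of the original paper) estimates the mutual information of a complex additive-Gaussian-noise channel by Monte Carlo sampling and compares it with \log_2(1 + P/P_{\mathrm{noise}}); the capacity-achieving Gaussian input is assumed.

```python
import math
import random

random.seed(1)
P, Pnoise = 4.0, 1.0          # signal and noise powers (illustrative units)

def cgauss(var):
    """Sample a circularly symmetric complex Gaussian with variance var."""
    s = math.sqrt(var / 2.0)
    return complex(random.gauss(0.0, s), random.gauss(0.0, s))

def log_pdf(z, var):
    """Log-density of a circular complex Gaussian evaluated at z."""
    return -abs(z) ** 2 / var - math.log(math.pi * var)

# Monte-Carlo estimate of I = E[log p(Y|X) - log p_out(Y)], converted to bits
n = 200_000
acc = 0.0
for _ in range(n):
    x = cgauss(P)             # Gaussian input (capacity-achieving for (1))
    y = x + cgauss(Pnoise)    # additive-noise channel
    acc += log_pdf(y - x, Pnoise) - log_pdf(y, P + Pnoise)
I_bits = acc / n / math.log(2.0)

print(I_bits, math.log2(1.0 + P / Pnoise))  # the two numbers are close
```

The Monte-Carlo estimate converges to the Shannon capacity (1) as the number of samples grows.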
It is worth noting that nonlinear effects in a channel depend both on the particular realization of the communication channel and on the physical fiber parameters. When designing real transmission systems, e.g., wavelength division multiplexing systems, the following nonlinear effects should be taken into account: self-phase modulation, cross-phase modulation, and so on [2]. These effects are due to the optical Kerr effect, i.e., the change of the refractive index of the fiber material in response to the applied electric field. Concerning other fiber parameters, the second dispersion coefficient is the fundamental one; it varies from non-zero values (a typical value is about \beta_2 = -2 \times 10^{-23}\,\mathrm{s}^2/\mathrm{km}) to almost zero. Around the world, the vast majority of fiber networks are fiber-optic communication channels with non-zero dispersion. However, some fiber channels are designed to operate in the zero-dispersion region of wavelengths in the transmission windows: in particular, the second transmission window (1310 nm for the single-mode optical fiber) has almost zero dispersion. In the presence of dispersion, the Kerr nonlinearity manifests itself as a phase shift of the signal over distances depending on the dispersion value. For large dispersion, these distances are much less than the usual propagation distance. For the nondispersive channel, this phase shift becomes a global effect extended over the whole propagation distance. The third essential component of a communication channel is signal attenuation. The fiber loss is commonly compensated by equally spaced amplifiers. These amplifiers introduce stochastic noise into the signal during the propagation process [3]. Therefore, relevant models describing the signal propagation in a communication channel should deal with the following phenomena: Kerr nonlinearity, dispersion, and noise effects.
The simplest channel model taking into account all these effects includes the nonlinear Schrödinger equation (NLSE) with additive Gaussian noise [3,4,5].
The model describing the NLSE with zero noise is solvable in the formalism of the inverse scattering problem [6,7], nowadays often referred to as the nonlinear Fourier transform method. However, the consideration of the NLSE with additive noise and non-zero dispersion turns out to be a very difficult problem since, in addition to the effects associated with the nonlinearity, there are effects related to the dispersion and to the nonlinear interaction of the signal with the noise. There are papers where an attempt is made to take into account both the effects of nonlinearity and dispersion for a noisy channel, see, e.g., Refs. [8,9,10,11,12,13] and references therein. However, the problem of the channel capacity calculation, i.e., the problem of finding the maximal information transmission rate over a given bandwidth, is far from being solved.
In the present paper, we concentrate our attention on nondispersive fiber-optic communication channels. Of course, the nondispersive model is in a certain sense a toy model, but it takes into account the effects associated with the Kerr nonlinearity and the noise in the channel. As a consequence, the model includes the effects of the nonlinear interaction of the signal with the noise. Therefore, the model can be used for rough estimations of the capacity of fiber-optic communication channels with non-zero dispersion. Moreover, the methods developed for the study of nondispersive fiber-optic communication channels can help in the investigation of realistic nonlinear communication channels with non-zero dispersion.
Considering the fiber-optic channel, it is important to clearly define the model of the channel, i.e., it is necessary to define the model of signal propagation, the models of the transmitter and receiver, and the model of signal processing (i.e., the recovery of the information from the received signal). The model of propagation should include the attenuation of the signal during propagation through the optical fiber and the model of the amplifiers which compensate the attenuation and add noise to the channel. The models of the input signal, receiver, and signal processing reveal themselves in the appearance in the channel model of such parameters as its frequency bandwidths [14]. In one type of model, the propagation of the signal obeys the NLSE with zero dispersion and with an additional term describing the attenuation compensated by periodically located amplifiers [15,16]. Another type of model is considered in Refs. [4,17,18,19,20,21,22], where it is assumed that the amplifiers are continuously distributed in such a way that there is no attenuation, but the amplifiers induce stochastic noise in the channel. Below we consider this type of model.
In Refs. [4,17,18,19,20,21,22,23] the propagation of the signal is described by the NLSE with zero dispersion; therefore, the signal at different time moments evolves independently. So, formally, one can consider the signal at a fixed time moment, and the signal ceases to depend on time. The model where the signal does not depend on time is referred to as the per-sample model. For the per-sample model the conditional probability density function (PDF) was found in Ref. [4]. In Ref. [17] the authors also found the conditional PDF and showed that at large signal power the channel capacity obeys the condition
C = \frac{1}{2} \log_2\left(P/P_{\mathrm{noise}}\right) + O(1). \qquad (2)
In Ref. [18] the conditional probability density function was also found, and we attempted to find the lower bound of the per-sample channel capacity for not-too-large powers of the signal. In Refs. [19,20] a new method of calculation of the conditional probability density function was introduced. The method is based on the calculation of the path integral using an approach similar to the semi-classical approximation in quantum mechanics. The new method allows us to calculate statistical properties of the signal such as the entropy of the output signal, the conditional entropy, and the mutual information. All calculations are performed exactly in the nonlinearity parameter, but for a large signal-to-noise power ratio (SNR). Using this method we find the probability density function of the input signal that maximizes the mutual information obtained in the leading order in the parameter 1/SNR in the intermediate power range. In the recent paper [22] an important result was obtained: the authors found the upper bound of the capacity for the per-sample model for arbitrary signal powers. The results of Refs. [19,20,22] are consistent.
It is obvious that the per-sample model does not describe the spectral broadening that plays an important role in the capacity limitation. In Ref. [14] it was demonstrated that there is a limit of the input signal power P beyond which the bandwidth of the propagating signal (which grows with increasing P due to the signal-noise mixing) exceeds the fixed bandwidth of the distributed optical amplification, and therefore the per-sample model is no longer relevant. The next reason for the per-sample model limitation is that the per-sample receiver has infinite bandwidth, while for a real communication system the receiver bandwidth is limited. Moreover, in Ref. [14] Kramer demonstrated how to achieve infinite capacity for any power P in the per-sample model of Refs. [4,17,18] if one assumes the noise bandwidth to be limited and sends the input signal energy in the noise-free part of the spectrum. In Ref. [21], we applied the methods developed for the per-sample model in Refs. [19,20] to describe nondispersive communication channels with a nontrivial time dependence of the input signal, with a realistic model of the receiver, and with a noise amplifier that has a large but finite spectral bandwidth. We referred to this model as the extended model. In Ref. [21], for the extended model, the conditional PDF and the optimal input signal distribution, which maximizes the mutual information calculated in the leading order in the parameter 1/SNR, were presented. These results were obtained both analytically and numerically. We confirmed the conclusions of Ref. [14]: the informational characteristics of the extended model are significantly different from those of the per-sample model.
In the present review we summarize the main results and the methods developed in our works [11,19,20,21] for the per-sample and extended nondispersive models. For the per-sample model we find the lower bound of the channel capacity and compare our results with those obtained earlier. Attention is mainly given to the path integral approach, since it can be applied in future considerations of nonlinear channels with dispersion. The paper is organized as follows. In the next Section, we describe the per-sample and extended channel models. In Section 3 we consider the quantities which should be calculated to find the capacities of the per-sample and extended models. Then we consider in detail the conditional probability density function, entropies, mutual information, and the lower bound of the capacity for the per-sample model. After that, we proceed to the consideration of the same quantities for the extended model. In conclusion, we discuss our results and possible applications of our methods to nonlinear channels with non-zero dispersion.

2. Channel Models

2.1. Per-Sample Model

Following the papers [4,17], we assume that in the per-sample model the equation of signal propagation has the form:
\partial_z \psi(z) - i \gamma |\psi(z)|^2 \psi(z) = \eta(z), \qquad (3)
where \gamma is the Kerr nonlinearity coefficient, \psi(z) is the signal function which obeys the boundary conditions \psi(0) = X, \psi(L) = Y, L is the signal propagation distance, X is the input signal and Y is the output signal. One can see that Equation (3) does not contain terms which lead to attenuation of the signal during propagation, nor terms that compensate for this attenuation. This means that in the model we have distributed amplifiers which completely compensate the attenuation of the signal as it propagates in the optical fiber. The only trace of the amplifiers in Equation (3) is the noise function \eta(z). The function \eta(z) has the following properties: it has zero mean, \langle \eta(z) \rangle_\eta = 0, and the correlation function \langle \eta(z) \bar\eta(z') \rangle_\eta = Q\, \delta(z - z'). Here and below the bar means complex conjugation and \delta(x) is the Dirac delta-function. The coefficient Q is the noise power per unit length, so QL is the noise power in the channel. The brackets \langle \cdots \rangle_\eta mean averaging over the noise realizations in the channel.

2.2. Extended Model

In the model (3) the bandwidths of the input signal, amplifiers and receiver cannot be taken into account since the functions ψ and η do not depend on time. In order to include these bandwidths we extend the previous model. In the extended model the signal ψ ( z , t ) does depend on time. The propagation is described by the stochastic NLSE with zero dispersion:
\partial_z \psi(z,t) - i \gamma |\psi(z,t)|^2 \psi(z,t) = \eta(z,t), \qquad (4)
where the coefficient γ is also the Kerr nonlinearity coefficient. The function ψ ( z , t ) obeys the following conditions:
\psi(0,t) = X(t), \qquad \psi(L,t) = Y(t). \qquad (5)
The noise function \eta(z,t) has zero mean, \langle \eta(z,t) \rangle_\eta = 0. We also assume that the noise has a finite bandwidth, so the correlator for the function
\eta(z,\omega) = \int dt\, e^{i \omega t}\, \eta(z,t) \qquad (6)
reads
\langle \eta(z,\omega)\, \bar\eta(z',\omega') \rangle_\eta = 2\pi Q\, \delta(\omega - \omega')\, \theta\!\left(W'/2 - |\omega|\right) \delta(z - z'). \qquad (7)
Here, the parameter Q denotes the noise power per unit length and per unit frequency; \theta(x) is the Heaviside theta-function. The theta-function \theta(W'/2 - |\omega|) indicates that the noise is non-zero only within the interval [-W'/2, W'/2], i.e., the bandwidth of the noise is equal to W'. Performing the Fourier transform of Equation (7) we arrive at the correlator in the time domain:
\langle \eta(z,t)\, \bar\eta(z',t') \rangle_\eta = \frac{Q}{\pi (t - t')}\, \sin\!\left[\frac{W'(t - t')}{2}\right] \delta(z - z'). \qquad (8)
It is easy to check that if the time difference t - t' = 2\pi n / W', with n an integer, then the correlator (8) equals zero. Therefore, the noise at the times t and t' is uncorrelated, and we can solve Equation (4) for the different times t_j = j\Delta independently. Here j is an integer and \Delta = 2\pi/W' is the grid spacing in the time domain. Therefore, instead of the continuous-time model (4) we can consider the set of discrete models:
\partial_z \psi(z,t_j) - i \gamma |\psi(z,t_j)|^2 \psi(z,t_j) = \eta(z,t_j) \qquad (9)
for the set of the time moments t j . So we obtain the set of independent time channels, and instead of the continuous input and output conditions (5) we obtain the set of the discrete ones:
\psi(0,t_j) = X(t_j), \qquad \psi(L,t_j) = Y(t_j). \qquad (10)
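The decorrelation that justifies this time discretization is easy to check directly: the time part of the correlator (8) vanishes at the grid points and tends to a finite peak as t' approaches t. A minimal sketch (all values illustrative; the noise bandwidth is called Wn here):

```python
import math

Q, Wn = 1.0, 8.0            # noise power density and noise bandwidth (illustrative)

def corr(dt):
    """Time part of the correlator (8): Q*sin(Wn*dt/2)/(pi*dt), with its dt -> 0 limit."""
    if dt == 0.0:
        return Q * Wn / (2.0 * math.pi)
    return Q * math.sin(Wn * dt / 2.0) / (math.pi * dt)

delta = 2.0 * math.pi / Wn   # grid spacing in the time domain
samples = [corr(n * delta) for n in range(1, 6)]
print(samples)               # all (numerically) zero: grid samples are uncorrelated
print(corr(0.0))             # the peak value Q*Wn/(2*pi)
```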
To include the bandwidth of the input signal to the model we represent the initial signal X ( t ) in the form:
X(t) = \sum_{k=-N}^{N} C_k\, f(t - k T_0), \qquad (11)
where C_k are complex random coefficients which carry the information. These coefficients have the probability density function P_X[\{C\}], where \{C\} = \{C_{-N}, \ldots, C_N\}. We restrict our consideration to envelopes f(t) which obey the following properties: the function f(t) is real and normalized by the condition \int \frac{dt}{T_0} f^2(t) = 1; for integers k and m we have
\int \frac{dt}{T_0}\, f(t - k T_0)\, f(t - m T_0) \approx 0, \qquad k \neq m. \qquad (12)
The last property means that the overlap of the functions f(t - k T_0) and f(t - m T_0) is negligible, i.e., we assume that the effects of the overlap are small. The smallness of these effects will be discussed below. The function f(t) has almost finite support [-T_0/2, T_0/2], and the input signal X(t) is defined on the interval of duration T = (2N+1) T_0. The finiteness of the support [-T_0/2, T_0/2] means that the frequency support of the function f(t) (and, as a consequence, of the function X(t)) is infinite. However, we imply that
\int_{|\omega| \le W/2} |X(\omega)|^2\, d\omega \gg \int_{|\omega| > W/2} |X(\omega)|^2\, d\omega, \qquad (13)
where X(\omega) is the Fourier transform of X(t). The relation (13) means that T_0 W \gg 1. So we can say that the bandwidth of the input signal is W.
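As a concrete illustration (a hypothetical choice, not prescribed by the paper), the simplest admissible envelope is a rectangular pulse of width T_0: it satisfies the normalization condition exactly and the non-overlapping condition (12) with zero overlap:

```python
T0 = 1.0

def f(t):
    """Hypothetical rectangular envelope: f(t) = 1 on [-T0/2, T0/2), else 0."""
    return 1.0 if -T0 / 2.0 <= t < T0 / 2.0 else 0.0

# check the normalization and the orthogonality (12) on a fine time grid
dt = 1e-3
grid = [-5.0 + i * dt for i in range(10_000)]
norm = sum(f(t) ** 2 for t in grid) * dt / T0                    # should be ~1
overlap = sum(f(t) * f(t - 2.0 * T0) for t in grid) * dt / T0    # should be 0
print(norm, overlap)
```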
The bandwidth broadening of the signal propagating through the optical fiber is associated with the nonlinearity and the noise \eta. To estimate the broadening connected with the nonlinearity we can find the solution \Phi(z,t) of Equation (9) with zero noise:
\Phi(z,t) = X(t)\, e^{i \gamma z |X(t)|^2}. \qquad (14)
This solution obeys the input condition \Phi(0,t) = X(t). Since we know the solution \Phi(L,t), we can find its bandwidth \tilde W. Strictly speaking, the bandwidth \tilde W is formally infinite, but most of the signal power is localized in a finite frequency region that can be specified as
\tilde W^2 = \frac{\int dt\, |\partial_t \Phi(L,t)|^2}{\int dt\, |\Phi(L,t)|^2}. \qquad (15)
Below we assume the following hierarchy
W \ll \tilde W \ll W', \qquad (16)
where \tilde W is the bandwidth of the function \Phi(L,t).
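The nonlinear broadening can be illustrated numerically. The sketch below (illustrative values; it assumes that Eq. (15) defines the root-mean-square bandwidth, i.e., the ratio of \int dt |\partial_t \Phi|^2 to \int dt |\Phi|^2) computes \tilde W for a Gaussian input pulse with and without the nonlinear phase:

```python
import cmath
import math

def rms_bandwidth(gammaL, P0=4.0, tau=1.0, dt=1e-3, T=16.0):
    """RMS bandwidth of Phi(L,t) = X(t)*exp(i*gammaL*|X(t)|^2) for a
    Gaussian input X(t) = sqrt(P0)*exp(-t^2/(2*tau^2))."""
    n = int(T / dt)
    ts = [-T / 2.0 + i * dt for i in range(n + 1)]
    X = [math.sqrt(P0) * math.exp(-t * t / (2.0 * tau * tau)) for t in ts]
    Phi = [x * cmath.exp(1j * gammaL * x * x) for x in X]
    # central finite differences for the time derivative of Phi
    num = sum(abs((Phi[i + 1] - Phi[i - 1]) / (2.0 * dt)) ** 2
              for i in range(1, n)) * dt
    den = sum(abs(p) ** 2 for p in Phi) * dt
    return math.sqrt(num / den)

W_lin = rms_bandwidth(0.0)   # no nonlinearity: the input bandwidth
W_nl = rms_bandwidth(4.0)    # gamma*L*P0 = 16: a strongly broadened spectrum
print(W_lin, W_nl)           # W_nl is several times larger than W_lin
```

For the linear case the result reproduces the analytic RMS bandwidth of a Gaussian pulse, 1/(\sqrt{2}\,\tau); the nonlinear phase makes the bandwidth grow with \gamma L |X|^2.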
To include the receiver bandwidth in the model we introduce the procedure of output signal detection. In our model the receiver extracts the information from the output signal, i.e., it recovers the coefficients \{C\}. We consider the following detection model. The receiver measures the output signal \psi(L,t_j) at the discrete time moments t_j for j = -M, \ldots, M-1. Here M = T/(2\Delta), so the total number of time samples is 2M. Since T \gg T_0, we have M \gg N. This property of the receiver means that its time resolution coincides with the time discretization \Delta. From the hierarchy (16) it follows that \Delta \ll 1/\tilde W; therefore, the receiver completely recovers the output signal in the noiseless case. Then the receiver removes the nonlinear phase
\tilde X(t_j) = \psi(L,t_j)\, e^{-i \gamma L |\psi(L,t_j)|^2}, \qquad (17)
and obtains the recovered input signal \tilde X(t). In fact, the procedure (17) means that we use the backward propagation procedure for the channel with zero dispersion. Finally, from the function \tilde X(t) the receiver recovers the coefficients \tilde C_k by projecting \tilde X(t) on the basis functions f(t - k T_0):
\tilde C_k = \frac{1}{T_0} \int dt\, f(t - k T_0)\, \tilde X(t) \approx \frac{\Delta}{T_0} \sum_{j=-M}^{M-1} f(t_j - k T_0)\, \tilde X(t_j). \qquad (18)
So the extended model contains the bandwidth W of the input signal, the bandwidth W' of the amplifier noise, and the bandwidth of the receiver. In our case, the bandwidth of the receiver coincides with the bandwidth of the noise, because the discretization chosen in the information-extraction procedure (18) coincides with the initial channel discretization.
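In the noiseless limit the whole chain (11), (14), (17), (18) can be traced end to end: the receiver removes the nonlinear phase exactly and the projection returns the transmitted coefficients. A sketch with a hypothetical rectangular envelope and made-up symbols C_k (all values illustrative):

```python
import cmath

T0, gammaL = 1.0, 2.0
coeffs = {-1: 1 + 2j, 0: -0.5 + 1j, 1: 2 - 1j}   # made-up information symbols C_k

def f(t):
    """Hypothetical rectangular envelope of width T0."""
    return 1.0 if -T0 / 2.0 <= t < T0 / 2.0 else 0.0

def X(t):
    """Input signal, Eq. (11)."""
    return sum(c * f(t - k * T0) for k, c in coeffs.items())

delta = T0 / 32.0                                # receiver time grid spacing
ts = [-1.5 + j * delta for j in range(96)]

# noiseless propagation, Eq. (14): Phi(L,t) = X(t)*exp(i*gammaL*|X(t)|^2)
out = [X(t) * cmath.exp(1j * gammaL * abs(X(t)) ** 2) for t in ts]
# receiver, Eq. (17): remove the nonlinear phase
Xrec = [y * cmath.exp(-1j * gammaL * abs(y) ** 2) for y in out]
# projection, Eq. (18): recover the coefficients
Crec = {k: sum(f(t - k * T0) * x for t, x in zip(ts, Xrec)) * delta / T0
        for k in coeffs}
print(Crec)   # matches coeffs up to rounding error
```

With noise switched off the recovery is exact; the rest of the paper quantifies how the noise degrades it.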

3. Channel Capacity and Its Bound

It is known [1] that the statistical properties of memoryless channels, such as the conditional entropy H[Y|X] and the output signal entropy H[Y], can be expressed through the conditional probability density function as
H[Y|X] = -\int \mathcal{D}X\, \mathcal{D}Y\, P_X[X]\, P[Y|X] \log P[Y|X], \qquad (19)
H[Y] = -\int \mathcal{D}Y\, P_{out}[Y] \log P_{out}[Y], \qquad (20)
where P [ Y | X ] is the conditional probability density function (i.e., the probability density to receive the output signal Y for transmitted signal X), P o u t [ Y ] is the probability density function of the output signal. The distribution P o u t [ Y ] has the following form:
P_{out}[Y] = \int \mathcal{D}X\, P_X[X]\, P[Y|X]. \qquad (21)
The measures D X and D Y are defined in such a way that
\int \mathcal{D}X\, P_X[X] = 1, \qquad (22)
\int \mathcal{D}Y\, P[Y|X] = 1. \qquad (23)
The mutual information of a memoryless channel is defined through the entropy H [ Y ] of the output signal and the conditional entropy H [ Y | X ] as
I_{P_X[X]} = H[Y] - H[Y|X]. \qquad (24)
The channel capacity C is defined as the maximum of the functional I P X [ X ] with respect to the input signal distribution P X [ X ] :
C = \max_{P_X[X]} I_{P_X[X]}. \qquad (25)
The maximum value of I P X [ X ] should be calculated for the fixed average signal power. Note that since in Equations (19) and (20) we use the logarithm to the base e, here and below we measure the mutual information and the capacity in units nat / symbol . For the case of the per-sample channel, the signal power reads
P = \int \mathcal{D}X\, P_X[X]\, |X|^2. \qquad (26)
For the case of the extended model the signal depends on time, therefore, we have
P = \int \mathcal{D}X(t)\, P_X[X] \int \frac{dt}{T}\, |X(t)|^2. \qquad (27)
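For a channel with finite input and output alphabets the functional integrals in Equations (19)–(25) become finite sums. The sketch below (a generic textbook example, not taken from the paper) computes I = H[Y] - H[Y|X] for a binary symmetric channel with a uniform input, which is its capacity-achieving distribution:

```python
import math

def mutual_information(p_x, p_y_given_x):
    """I = H[Y] - H[Y|X] (in nats) for a discrete memoryless channel."""
    n_in, n_out = len(p_x), len(p_y_given_x[0])
    # output distribution: the discrete analogue of Eq. (21)
    p_out = [sum(p_x[i] * p_y_given_x[i][j] for i in range(n_in))
             for j in range(n_out)]
    h_y = -sum(p * math.log(p) for p in p_out if p > 0.0)             # Eq. (20)
    h_y_x = -sum(p_x[i] * p_y_given_x[i][j] * math.log(p_y_given_x[i][j])
                 for i in range(n_in) for j in range(n_out)
                 if p_y_given_x[i][j] > 0.0)                          # Eq. (19)
    return h_y - h_y_x                                                # Eq. (24)

eps = 0.1                                   # crossover probability
bsc = [[1.0 - eps, eps], [eps, 1.0 - eps]]
I = mutual_information([0.5, 0.5], bsc)
print(I / math.log(2.0))                    # 1 - h2(0.1) ≈ 0.531 bits
```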
So, to find the channel capacity we should know the conditional PDF for both models. Let us start with the calculation of the conditional PDF for the per-sample model.

3.1. Per-Sample Model

3.1.1. Conditional Probability Density Function

The conditional PDF can be written in the form of the path integral over all realizations of the signal ψ ( z ) in the channel [17,18]:
P[Y|X] = \int_{\psi(0)=X}^{\psi(L)=Y} \mathcal{D}\psi\; e^{-S[\psi]/Q}, \qquad (28)
where the effective action S[\psi] reads as the integral over z of the squared left-hand side of Equation (3): S[\psi] = \int_0^L dz\, \left| \partial_z \psi - i \gamma |\psi|^2 \psi \right|^2. To calculate the path integral, we use the retarded discretization scheme, which reflects the physics of the propagation process. The retarded discretization assumes that the derivative means (\partial_z \psi)(z_n) = (\psi(z_n) - \psi(z_{n-1}))/\Delta z, where z_n = n \Delta z, \Delta z = L/N_z (z_0 = 0, z_{N_z} = L), and any integral \int_0^L dz\, f(z) means \Delta z \sum_{n=0}^{N_z} f(z_n). The general approach for the derivation of the representation (28) and the argumentation for the retarded scheme can be found in Ref. [24].
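The retarded discretization can be written out explicitly. The sketch below (illustrative parameter values) evaluates the discretized action S[\psi] on a grid and checks that the zero-noise solution of Equation (3), a pure nonlinear phase rotation, makes the action vanish up to discretization error, while other paths give a finite action:

```python
import cmath

def action(psi, dz, gamma):
    """Retarded discretization of the action: the integral over z of
    |d_z psi - i*gamma*|psi|^2*psi|^2, with the retarded derivative."""
    s = 0.0
    for n in range(1, len(psi)):
        d = (psi[n] - psi[n - 1]) / dz                   # retarded derivative
        s += abs(d - 1j * gamma * abs(psi[n]) ** 2 * psi[n]) ** 2
    return s * dz

L, gamma, X = 1.0, 1.0, 1.5                              # illustrative values
Nz = 2000
dz = L / Nz
# zero-noise solution of Eq. (3): psi(z) = X * exp(i*gamma*|X|^2*z)
psi0 = [X * cmath.exp(1j * gamma * abs(X) ** 2 * n * dz) for n in range(Nz + 1)]
print(action(psi0, dz, gamma))            # ~0, vanishes as the grid is refined
print(action([complex(X)] * (Nz + 1), dz, gamma))  # a constant path: finite action
```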
For the first time, the conditional PDF P[Y|X] for the per-sample model was obtained in the form of an infinite series in Ref. [4]:
P[Y|X] = \frac{1}{\pi Q} \sum_{m=-\infty}^{+\infty} e^{i m (\phi(Y) - \phi(X) - \mu)} \exp\left\{ -\frac{\rho^2 + \rho'^2}{Q}\, k_m \coth(k_m L) \right\} \frac{k_m}{\sinh(k_m L)}\, I_{|m|}\!\left( \frac{2 k_m \rho \rho'}{Q \sinh(k_m L)} \right), \qquad (29)
where the input signal X = \rho e^{i \phi(X)}, the output signal Y = \rho' e^{i \phi(Y)}, \mu = \gamma L |X|^2 is a dimensionless nonlinear parameter, k_m = \sqrt{Q m \gamma}\, e^{i \pi/4}, and I_{|m|}(z) is the modified Bessel function of order |m|. The representation (29) was rederived using the path-integral representation (28) in Refs. [17,18]. In Ref. [18] the authors applied two different methods: a recursive derivation based on the discretization of the nondispersive NLSE and the properties of Markov chains (the Chapman–Kolmogorov equation); and a derivation of the conditional PDF via the stochastic approach and Ito calculus [25].
Using the result (29), the lower bound of the capacity for the large signal power was found in Ref. [17]:
C \geq \frac{1}{2} \log \mathrm{SNR} + O(1). \qquad (30)
Here the signal-to-noise ratio has the form
\mathrm{SNR} = \frac{P}{QL}. \qquad (31)
Recall that the noise power for the per-sample model is QL. To obtain the result (30) the authors demonstrated that at large signal power the phase of the output signal occupies the entire phase interval [0, 2\pi] due to the interaction of the signal with the noise. As a result, the phase does not carry information, see also [18]. Therefore, to find the lower bound for large signal power it is enough to retain only the term with m = 0 in Equation (29).
In the so-called intermediate range of the power P:
QL \ll P \ll \left( Q \gamma^2 L^3 \right)^{-1}, \qquad (32)
it is necessary to take into account the terms with m \neq 0 in Equation (29). In Ref. [18], using Equation (29), an attempt was made to find the lower bound of the capacity in the intermediate power range. However, owing to the inconvenience of Equation (29) for analytical calculations, this attempt was unsuccessful.
To calculate the mutual information in the intermediate power range we have to find the conditional probability density function (29) in a more convenient form. In Ref. [19] we developed a method of calculating the conditional probability density function in the form of an expansion in the parameter 1/SNR. Using this method we found the conditional PDF in a form convenient for further calculations. The expansion in 1/SNR, i.e., in the small parameter Q, is similar to the semi-classical approximation (the expansion in the small Planck constant \hbar) in quantum mechanics.
Using the “semi-classical” method, see Ref. [26], we perform the change of integration variables (a simple shift with Jacobian equal to unity) in Equation (28):
\psi(z) = \Psi_{cl}(z) + \tilde\psi(z), \qquad (33)
and represent the conditional PDF (28) in the form:
P[Y|X] = \Lambda\, e^{-S[\Psi_{cl}(z)]/Q}, \qquad (34)
where the normalization factor Λ has the form
\Lambda = \int_{\tilde\psi(0)=0}^{\tilde\psi(L)=0} \mathcal{D}\tilde\psi\, \exp\left\{ -\frac{S[\Psi_{cl}(z) + \tilde\psi(z)] - S[\Psi_{cl}(z)]}{Q} \right\}, \qquad (35)
and the function \Psi_{cl}(z) is the “classical” solution of the equation \delta S[\Psi_{cl}] = 0 (here \delta S is the variation of the action S[\psi]). The measure \mathcal{D}\tilde\psi for the new variables \tilde\psi in Equation (35) is defined as
\mathcal{D}\tilde\psi = \left( \frac{1}{\Delta z\, \pi Q} \right)^{N_z} \prod_{i=1}^{N_z - 1} d\,\mathrm{Re}\,\tilde\psi_i\; d\,\mathrm{Im}\,\tilde\psi_i, \qquad (36)
where \tilde\psi_i = \tilde\psi(z_i) and \Delta z = L/N_z is the grid spacing.
The Euler–Lagrange equation δ S [ Ψ c l ] = 0 has the following explicit form
\frac{d^2 \Psi_{cl}}{dz^2} - 4 i \gamma |\Psi_{cl}|^2 \frac{d \Psi_{cl}}{dz} - 3 \gamma^2 |\Psi_{cl}|^4 \Psi_{cl} = 0. \qquad (37)
The boundary conditions for the function Ψ c l ( z ) are as follows
\Psi_{cl}(0) = X, \qquad \Psi_{cl}(L) = Y. \qquad (38)
Equation (37) is of central importance, and we present two different approaches to its solution.
One can find the analytical solution of Equation (37) in the polar coordinate system [19]: \Psi_{cl}(z) = \rho(\zeta) e^{i \theta(\zeta)}, \zeta = z/L. The solution depends on the following real integration constants: E, \tilde\mu, \zeta_0 and \theta_0. These constants should be determined from the two boundary conditions (38). There are two different types of the solution: the first and the second type correspond to the cases E \geq 0 and E \leq 0, respectively. In the first case we have the solution:
\rho^2(\zeta) = \frac{1}{2 L \gamma} \left( \tilde\mu + \sqrt{\tilde\mu^2 - k^2} \cos[2 k (\zeta - \zeta_0)] \right), \qquad (39)
\theta(\zeta) = \frac{\tilde\mu}{2} (\zeta - \zeta_0) + \frac{\sqrt{\tilde\mu^2 - k^2}\, \sin[2 k (\zeta - \zeta_0)]}{4 k} + \arctan\!\left( \frac{(\tilde\mu - \sqrt{\tilde\mu^2 - k^2}) \tan[k (\zeta - \zeta_0)]}{k} \right) + \theta_0, \qquad (40)
where k = \sqrt{2E}. The solution (39), (40) is obtained under the conditions \tilde\mu \geq k \geq 0. The integration constants \tilde\mu, k and \zeta_0 must be found from the boundary conditions:
|X|^2 = \rho^2(0) = \frac{\tilde\mu + \sqrt{\tilde\mu^2 - k^2} \cos[2 k \zeta_0]}{2 L \gamma}, \qquad (41)
|Y|^2 = \rho^2(1) = \frac{\tilde\mu + \sqrt{\tilde\mu^2 - k^2} \cos[2 k (1 - \zeta_0)]}{2 L \gamma}, \qquad (42)
\phi(X) = \theta(0) = -\frac{\tilde\mu}{2} \zeta_0 - \frac{\sqrt{\tilde\mu^2 - k^2}\, \sin[2 k \zeta_0]}{4 k} - \arctan\!\left( \frac{(\tilde\mu - \sqrt{\tilde\mu^2 - k^2}) \tan[k \zeta_0]}{k} \right) + \theta_0, \qquad (43)
\phi(Y) = \theta(1) = \frac{\tilde\mu}{2} (1 - \zeta_0) + \frac{\sqrt{\tilde\mu^2 - k^2}\, \sin[2 k (1 - \zeta_0)]}{4 k} + \arctan\!\left( \frac{(\tilde\mu - \sqrt{\tilde\mu^2 - k^2}) \tan[k (1 - \zeta_0)]}{k} \right) + \theta_0. \qquad (44)
If the integration constants are found then we can express the action in the form:
S[\Psi_{cl}(z; E, \tilde\mu, \zeta_0, \theta_0)] = \frac{k^2}{2 \gamma L} \left( \tilde\mu - \sqrt{\tilde\mu^2 - k^2}\; \frac{\sin[2 k (1 - \zeta_0)] + \sin[2 k \zeta_0]}{2 k} \right). \qquad (45)
In the second case (E \leq 0) the solution has the form
\rho^2(\zeta) = \frac{\tilde\mu + \sqrt{\tilde\mu^2 + k^2} \cosh[2 k (\zeta - \zeta_0)]}{2 L \gamma}, \qquad (46)
\theta(\zeta) = \frac{\tilde\mu}{2} (\zeta - \zeta_0) + \frac{\sqrt{\tilde\mu^2 + k^2}\, \sinh[2 k (\zeta - \zeta_0)]}{4 k} - \arctan\!\left( \frac{(\tilde\mu + \sqrt{\tilde\mu^2 + k^2}) \tanh[k (\zeta - \zeta_0)]}{k} \right) + \theta_0, \qquad (47)
where k = \sqrt{-2E}. The integration parameters \tilde\mu, k, \zeta_0, and \theta_0 are determined by the same procedure as in the first case. The action has the form
S[\Psi_{cl}(z; E, \tilde\mu, \zeta_0, \theta_0)] = \frac{k^2}{2 \gamma L} \left( -\tilde\mu + \sqrt{\tilde\mu^2 + k^2}\; \frac{\sinh[2 k (1 - \zeta_0)] + \sinh[2 k \zeta_0]}{2 k} \right). \qquad (48)
One can see that to obtain the solution \rho(\zeta), \theta(\zeta) it is necessary to solve the system of nonlinear Equations (41)–(44) for the integration constants. Of course, the system (41)–(44) can be solved numerically, but for the analytical calculation of the mutual information it is necessary to develop a method which allows us to find the solution \Psi_{cl}(z) and the action S[\Psi_{cl}(z)] analytically as functionals of the input X and output Y signals.
In Ref. [19] we proposed a method based on the expansion of the semi-classical solution in the vicinity of the solution of Equation (3) with zero noise. This makes sense when the noise power is much less than the signal power.
To demonstrate the approach, we find the solution of (37) in the leading order in the parameter 1/SNR by linearizing Equation (37) in the vicinity of the solution \Psi_0(z). The function \Psi_0(z) is the solution of Equation (3) with zero noise. It obeys the input boundary condition \Psi_0(0) = X = \rho e^{i \phi(X)}, \rho = |X|. Note that we do not assume smallness of the nonlinearity. The function \Psi_0(z) reads
\Psi_0(z) = \rho \exp\left\{ i \mu \frac{z}{L} + i \phi(X) \right\}, \qquad (49)
where \mu = \gamma L \rho^2 is the dimensionless nonlinear parameter. Let us represent the “classical” solution \Psi_{cl}(z) in the form
\Psi_{cl}(z) = \Psi_0(z) + \delta\Psi(z). \qquad (50)
Here
\delta\Psi(z) = \varkappa(z) \exp\left\{ i \mu \frac{z}{L} + i \phi(X) \right\}, \qquad (51)
where the function \varkappa(z) is assumed to be much smaller than \rho: |\varkappa(z)| \ll \rho, i.e., \Psi_{cl}(z) is close to \Psi_0(z). Note that the function \varkappa(z) depends on the output boundary conditions; therefore, in the general case, the ratio |\varkappa(z)|/\rho can be of order unity. The boundary conditions for the function \varkappa(z) are as follows:
\varkappa(0) = 0, \qquad \varkappa(L) = Y e^{-i \phi(X) - i \mu} - \rho \equiv x_0 + i y_0, \qquad (52)
where x_0 = \mathrm{Re}\{\varkappa(L)\} and y_0 = \mathrm{Im}\{\varkappa(L)\}. It is important that the configurations of \varkappa(z) at which \Psi_{cl}(z) significantly deviates from \Psi_0(z) are statistically irrelevant.
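Before expanding the action, one can verify directly that \Psi_0(z) of Eq. (49) satisfies the Euler–Lagrange Equation (37); the short sketch below (illustrative parameter values) does this with finite differences:

```python
import cmath

gamma, L, rho, phiX = 1.0, 1.0, 1.3, 0.4     # illustrative parameters
mu = gamma * L * rho ** 2                    # the dimensionless nonlinear parameter

def Psi0(z):
    """Zero-noise solution, Eq. (49)."""
    return rho * cmath.exp(1j * (mu * z / L + phiX))

h, z = 1e-4, 0.37                            # step and an arbitrary interior point
d1 = (Psi0(z + h) - Psi0(z - h)) / (2.0 * h)               # dPsi0/dz
d2 = (Psi0(z + h) - 2.0 * Psi0(z) + Psi0(z - h)) / h ** 2  # d^2Psi0/dz^2
p = Psi0(z)
residual = d2 - 4j * gamma * abs(p) ** 2 * d1 - 3.0 * gamma ** 2 * abs(p) ** 4 * p
print(abs(residual))   # ~0: Psi0 solves Eq. (37)
```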
One can check that, since the action achieves its absolute minimum (S[\Psi_0(z)] = 0) on the solution \Psi_0(z), the expansion of the action S[\Psi_0(z) + \delta\Psi(z)] starts from the terms quadratic in the small function \varkappa(z):
S[\Psi_0(z) + \delta\Psi(z)] \propto \varkappa^2(z).
Therefore, the exponent e^{-S[\Psi_{cl}(z)]/Q} and, as a result, the conditional PDF P[Y|X] decrease exponentially when |\varkappa(z)| \gg \sqrt{QL}.
The next step in the evaluation of the conditional probability P[Y|X] is the calculation of the path integral (35). To calculate the integral (35) in the leading order in the parameter 1/SNR one should retain only the terms quadratic in the function \tilde\psi in the integrand. Any extra powers of \tilde\psi or \varkappa lead to suppression by powers of the small parameter \sqrt{QL}, since for small Q the dominant contribution to the path integral comes from the region where \tilde\psi \sim \sqrt{QL}. Since we calculate the integral (35) in the leading order in the parameter Q, the function \Psi_{cl}(z) can be replaced by the function \Psi_0(z) in the exponent \exp\{-(S[\Psi_{cl}(z) + \tilde\psi(z)] - S[\Psi_{cl}(z)])/Q\}.
To find the next-to-leading order corrections in the parameter 1/SNR to the conditional PDF P[Y|X] one should keep both \varkappa(z) in \Psi_{cl}(z) and higher powers of \tilde\psi in the exponent in Equation (35). Details of the path-integral calculation in the leading and next-to-leading orders in 1/SNR can be found in Ref. [19]. Here we present only the result obtained in the leading order in the parameter 1/SNR:
P[Y|X] = \frac{1}{\pi Q L \sqrt{1 + \mu^2/3}} \exp\left\{ -\frac{(1 + 4\mu^2/3)\, x_0^2 - 2\mu\, x_0 y_0 + y_0^2}{Q L (1 + \mu^2/3)} \right\}, \qquad (53)
where x_0 = \mathrm{Re}\{ Y e^{-i \phi(X) - i \mu} - \rho \} and y_0 = \mathrm{Im}\{ Y e^{-i \phi(X) - i \mu} - \rho \}. One can see that the expression (53) is much simpler than the exact result (29). Note that the distribution (53) has the following property:
\lim_{Q \to 0} P[Y|X] = \delta\left( Y - \Psi_0(L) \right). \qquad (54)
The limit (54) is the deterministic limit of P [ Y | X ] in the absence of noise. Also Equation (53) has the correct limit for small γ :
\lim_{\gamma \to 0} P[Y|X] = \frac{e^{-|Y - X|^2/(QL)}}{\pi Q L}, \qquad (55)
where the right-hand-side is the conditional PDF for the linear nondispersive channel with the additive noise.
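A quick consistency check of the leading-order result (53) (an illustration with made-up parameter values): since it is Gaussian in (x_0, y_0), direct numerical integration over the output plane confirms that it is normalized to unity for any \mu.

```python
import math

QL, mu = 0.1, 1.0                # illustrative noise power and nonlinear parameter

def pdf(x0, y0):
    """Leading-order conditional PDF (53) as a function of x0 and y0."""
    q = (1.0 + 4.0 * mu ** 2 / 3.0) * x0 ** 2 - 2.0 * mu * x0 * y0 + y0 ** 2
    norm = math.pi * QL * math.sqrt(1.0 + mu ** 2 / 3.0)
    return math.exp(-q / (QL * (1.0 + mu ** 2 / 3.0))) / norm

# integrate over the output plane on a uniform grid
step, half = 0.01, 2.0
n = int(2.0 * half / step)
total = sum(pdf(-half + i * step, -half + j * step)
            for i in range(n) for j in range(n)) * step * step
print(total)   # ~1: the PDF (53) is correctly normalized
```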
Let us compare the result (53), obtained in the leading order in 1/SNR, with the exact result (29). In Figure 1 we plot the PDF P[Y|X] as a function of |Y| for X = 2\,\mathrm{mW}^{1/2}, \arg(Y) = \mu, \mu = \gamma L |X|^2 = 4 (so we choose \gamma L = 1\,\mathrm{mW}^{-1}), and for two values of the parameter QL: QL = 1/2\,\mathrm{mW} and QL = 1/25\,\mathrm{mW} (corresponding to \mathrm{SNR} = 8 and \mathrm{SNR} = 100, respectively).
One can see the good agreement between the exact result (29) and the approximation (53) even for SNR = 8. For SNR = 100 the approximation almost coincides with the exact result. In the case when the SNR 10 2 10 4 , which corresponds to optical fiber channels, the difference between the approximation and the exact result is of the order of 1 / SNR . To decrease this difference we should calculate the corrections of the order of 1 / SNR and 1 / SNR , see the Section 3.1.3.
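The covariance structure implied by the approximation (53) can also be probed by a direct Monte Carlo propagation of the per-sample stochastic dynamics (an exact nonlinear phase step plus additive white noise). The sketch below is ours, not from the paper, and all parameter values are illustrative; it checks the leading-order predictions encoded in (53): Var x₀ = QL/2, Var y₀ = (QL/2)(1 + 4μ²/3), and Cov(x₀, y₀) = (QL/2)μ.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative parameters: gamma*L = 1 mW^-1, |X|^2 = 1 mW, QL = 0.005 mW (SNR = 200)
gamma, L, QL = 1.0, 1.0, 0.005
X = 1.0 + 0.0j
mu = gamma * L * abs(X) ** 2          # nonlinear phase mu = gamma*L*|X|^2

n_samples, n_steps = 100_000, 200
dz = L / n_steps
psi = np.full(n_samples, X, dtype=complex)
for _ in range(n_steps):
    psi = psi * np.exp(1j * gamma * dz * np.abs(psi) ** 2)   # exact nonlinear step
    g = rng.normal(size=(2, n_samples)) * np.sqrt(QL * dz / (2 * L))
    psi = psi + g[0] + 1j * g[1]      # additive noise, total power QL over the span

# Quadratures entering Eq. (53)
w = psi * np.exp(-1j * (np.angle(X) + mu))
x0, y0 = w.real - abs(X), w.imag
cov = np.mean(x0 * y0) - x0.mean() * y0.mean()

print(np.var(x0) / (QL / 2))                        # ~1
print(np.var(y0) / (QL / 2 * (1 + 4 * mu**2 / 3)))  # ~1: nonlinear phase-noise growth
print(cov / (QL / 2 * mu))                          # ~1
```

The enhanced variance of the phase quadrature y₀ is the Gordon–Mollenauer nonlinear phase noise; the corrections to the three ratios are of the order of 1/SNR.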

3.1.2. Probability Density Function P o u t [ Y ]

Let us consider the integral (21) which defines the probability density function of the output signal:
P_{out}[Y] = \int \mathcal{D}X\; P[Y|X]\, P_X[X], (56)
where the distribution P X [ X ] is a smooth function. Since the input signal power is P, we can expect that the function P X [ X ] changes on the scale | X | ∼ \sqrt{P}, which is assumed to be much greater than \sqrt{QL}. In this case we can use Laplace's method [27] to calculate the integral (56) up to the terms proportional to the noise power Q L ; for the details of the calculation see Appendix C in Ref. [19]. The idea of the calculation of the integral (56) is based on the fact that the function P [ Y | X ] is much narrower than the function P X [ X ] (the function P [ Y | X ] acts almost as a Dirac delta-function on the scale of P X [ X ] ). Therefore, the calculation of the integral is simple and the result has the form [19]:
P_{out}[Y] = \int \mathcal{D}X\; P[Y|X]\, P_X[X] \approx P_X\!\left[Y e^{-i\gamma |Y|^{2} L}\right]. (57)
When obtaining the result (57) it is not required to pass to the limit Q → 0 ; only the relation P ≫ Q L between the scales P and Q L is used.
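The mechanism behind (57) can be illustrated by a one-dimensional toy model (ours; the scales are illustrative): convolving a smooth density of scale √P with a narrow Gaussian kernel of scale √(QL) reproduces the density up to corrections of order QL, cf. the operator relation (62) below.

```python
import numpy as np
from scipy.integrate import quad

P_avg, QL = 1.0, 1e-3                 # broad scale P >> narrow scale QL

def p_broad(x):                       # smooth "input" density on the scale sqrt(P)
    return np.exp(-x**2 / P_avg) / np.sqrt(np.pi * P_avg)

def kernel(y, x):                     # narrow "conditional PDF" on the scale sqrt(QL)
    return np.exp(-(y - x)**2 / QL) / np.sqrt(np.pi * QL)

y = 0.4
p_out, _ = quad(lambda x: kernel(y, x) * p_broad(x), -10, 10)
# Laplace's method: p_out = p_broad(y) + (QL/4) p_broad''(y) + O(QL^2)
print(p_out, p_broad(y))
```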
For the class of distributions P X [ X ] depending only on the absolute value | X | we have P o u t [ Y ] = P X [ | Y | ] , and for such distributions we can calculate the corrections to (57) to any order in the parameter Q L .
Let us restrict our consideration in the remainder of this sub-subsection to the case of distributions P X [ X ] depending only on | X | . We can use the conditional PDF P [ Y | X ] (29) found in Ref. [17]. In this case the function P o u t [ Y ] depends only on | Y | = ρ :
P_{out}[\rho] = \frac{2\, e^{-\rho^{2}/(QL)}}{Q L} \int_{0}^{\infty} d\rho'\,\rho'\, e^{-\rho'^{2}/(QL)}\, I_{0}\!\left(\frac{2\rho\rho'}{Q L}\right) P_X[\rho']. (58)
Using this formula one can obtain a simple relation for P o u t [ ρ ] within the perturbation theory in the parameter Q L . Performing the zero-order Hankel transformation [27]:
\hat{P}[k] = \int_{0}^{\infty} d\rho\,\rho\, J_{0}(k\rho)\, P_X[\rho] (59)
for both sides of Equation (58), and using the standard integral with the Bessel function J ν ( x ) and the modified Bessel function I ν ( x ) [28],
\int_{0}^{\infty} dz\, z\, e^{-p z^{2}}\, J_{\nu}(b z)\, I_{\nu}(c z) = \frac{1}{2p}\, J_{\nu}\!\left(\frac{b c}{2p}\right) \exp\left(\frac{c^{2}-b^{2}}{4p}\right), (60)
we arrive at the relation between the Hankel images of the output and input signal PDFs:
\hat{P}_{out}[k] = e^{-k^{2} Q L/4}\, \hat{P}[k]. (61)
Then, we perform the inverse Hankel transformation
P_X[\rho] = \int_{0}^{\infty} dk\, k\, J_{0}(k\rho)\, \hat{P}[k],
and obtain the following important result
P_{out}[\rho] = e^{\frac{QL}{4}\Delta_{\rho}}\, P_X[\rho], (62)
where \Delta_{\rho} = \frac{d^{2}}{d\rho^{2}} + \frac{1}{\rho}\frac{d}{d\rho} is the two-dimensional radial Laplace operator. The relation (62) allows us to find the corrections of order ( Q L )^{n} to P o u t [ ρ ] by expanding the exponent and calculating the action of the differential operator \Delta_{\rho}^{\,n} on the input PDF P X [ ρ ] .
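Relation (61) is easy to verify numerically. The sketch below (ours; P and QL are illustrative) evaluates the integral (58) for a Gaussian input and checks that the zero-order Hankel images of the input and output PDFs differ exactly by the factor e^{−k²QL/4}; the scaled Bessel function i0e(x) = e^{−x} I₀(x) is used to avoid overflow.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import j0, i0e

P_avg, QL = 1.0, 0.05

def p_in(r):                 # Gaussian input (65): exp(-r^2/P) / (pi P)
    return np.exp(-r**2 / P_avg) / (np.pi * P_avg)

def p_out(r):                # Eq. (58), written with i0e for numerical stability
    f = lambda rp: rp * np.exp(-(r - rp)**2 / QL) * i0e(2 * r * rp / QL) * p_in(rp)
    return 2 * quad(f, 0, 8)[0] / QL

def hankel(p, k):            # zero-order Hankel image, Eq. (59)
    return quad(lambda r: r * j0(k * r) * p(r), 0, 8, limit=200)[0]

for k in (0.5, 1.0, 2.0):
    print(k, hankel(p_out, k), np.exp(-k**2 * QL / 4) * hankel(p_in, k))
```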
Let us consider the widely used example of the modified Gaussian distribution with one parameter β :
P_X^{(\beta)}[X] = \frac{\exp\left(-\beta |X|^{2}/(2P)\right)\, |X|^{\beta-2}}{\pi\, \Gamma(\beta/2)\, \left(2P/\beta\right)^{\beta/2}}, (63)
where Γ ( x ) is the Euler gamma function. For β > 0 the distribution P_X^{(\beta)}[X] obeys the conditions 2\pi\int_0^{\infty} d\rho\,\rho\, P_X^{(\beta)}[\rho] = 1 and 2\pi\int_0^{\infty} d\rho\,\rho^{3} P_X^{(\beta)}[\rho] = P ; the latter means that the average power of the input signal is equal to P. The distribution P_X^{(\beta)}[X] for β = 1 is called the half-Gaussian distribution:
P_X^{(1)}[X] = \frac{\exp\left(-|X|^{2}/(2P)\right)}{\pi\,\sqrt{2\pi P}\; |X|}, (64)
and the Gaussian distribution for β = 2 :
P_X^{(2)}[X] = \frac{e^{-|X|^{2}/P}}{\pi P}. (65)
Inserting (63) into Equation (58) we arrive at the standard integral, see [28], and obtain:
P_{out}^{(\beta)}[Y] = {}_{1}F_{1}\!\left(\frac{\beta}{2};\,1;\, \frac{2P\,|Y|^{2}}{Q L\,(2P+\beta Q L)}\right) \frac{\exp\left\{-|Y|^{2}/(QL)\right\}}{\pi Q L}\left(\frac{\beta Q L}{2P+\beta Q L}\right)^{\beta/2}, (66)
where {}_{1}F_{1}(\beta/2;\,1;\,z) is the confluent hypergeometric function. This function reduces to e^{z} for the Gaussian distribution (β = 2), and to e^{z/2} I_{0}(z/2) for the half-Gaussian one (β = 1):
P_{out}^{(2)}[Y] = \frac{e^{-|Y|^{2}/(P+QL)}}{\pi\,(P+QL)}, (67)
P_{out}^{(1)}[Y] = \frac{1}{\pi\sqrt{Q L\,(2P+QL)}}\; I_{0}\!\left(\frac{|Y|^{2}\, P}{Q L\,(2P+QL)}\right) \exp\left(-\frac{|Y|^{2}\,(P+QL)}{Q L\,(2P+QL)}\right). (68)
The result (57) can be obtained from Equation (68) in the case Q L ≪ | Y |^{2} ∼ P . Expanding the right-hand side of Equation (68) in the parameter Q L / P we obtain
P_{out}^{(1)}[Y] \approx P_X^{(1)}[\,|Y|\,] (69)
with the accuracy O ( Q L ) . The result (69) coincides with the general relation (57).
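As an independent cross-check of the closed form (68) (with the prefactors as reconstructed here), one can compare it against a direct numerical evaluation of the integral (58) with the half-Gaussian input (64). A sketch with illustrative P and QL:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import i0e

P_avg, QL = 1.0, 0.1

def p_half(r):               # half-Gaussian input (64) as a function of rho = |X|
    return np.exp(-r**2 / (2 * P_avg)) / (np.pi * np.sqrt(2 * np.pi * P_avg) * r)

def p_out_direct(r):         # Eq. (58); i0e(x) = exp(-x) I0(x) avoids overflow
    f = lambda rp: rp * np.exp(-(r - rp)**2 / QL) * i0e(2 * r * rp / QL) * p_half(rp)
    return 2 * quad(f, 1e-12, 10, points=[r])[0] / QL

def p_out_closed(r):         # Eq. (68) rewritten through i0e
    a = r**2 * P_avg / (QL * (2 * P_avg + QL))
    b = r**2 * (P_avg + QL) / (QL * (2 * P_avg + QL))
    return i0e(a) * np.exp(a - b) / (np.pi * np.sqrt(QL * (2 * P_avg + QL)))

for r in (0.3, 1.0, 2.0):
    print(r, p_out_direct(r), p_out_closed(r))
```

The two evaluations agree to the accuracy of the quadrature, and the closed form integrates to unity, as a PDF must.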

3.1.3. Lower Bound for the Channel Capacity

The estimates for the capacity of the per-sample model in the regime of very large SNR were obtained in Ref. [17]. Specifically, the case P P_{noise}\, \gamma^{2} L^{2} \gg 1 was considered, where P_{noise} = Q L is the noise power. In Ref. [17] the lower bound for the capacity of the per-sample channel was found: using the trial half-Gaussian input signal PDF (64), the authors obtained the following result:
C \geq \frac{\log \mathrm{SNR}}{2} + \frac{1+\gamma_E-\log(4\pi)}{2} + O\!\left(\frac{\log \mathrm{SNR}}{\mathrm{SNR}}\right), (70)
where \gamma_E \approx 0.5772 is the Euler constant. The second term on the right-hand side of Equation (70) was presented as O ( 1 ) in Ref. [17], but it can be found using Equations (23) and (24) of Ref. [17]. Comparing the result (70) with the Shannon result (1), it is worth noting that the pre-logarithmic factor 1 / 2 differs from unity in the Shannon result. The physical meaning of this difference is that the signal phase does not carry information when the power is very large, P P_{noise}\, \gamma^{2} L^{2} \gg 1 , see Ref. [17].
The most interesting power regime for the per-sample model is the so-called intermediate power range defined in (32). In this regime, on the one hand, the parameter SNR is large, P_{noise} \ll P ; on the other hand, the signal-dependent phase does not yet occupy the entire phase interval [ 0 , 2 π ] due to the signal–noise interaction, i.e., the phase still carries information.
Capacity estimates in the intermediate power range (32) were presented in Ref. [18]. For such a power P the authors of that paper also used the half-Gaussian input signal PDF (64) for the estimate of the lower bound for the capacity, see inequality (40) in Ref. [18]. But there were some flaws in the derivation of this inequality in Ref. [18], see the discussion in the Introduction of Ref. [19]. In our approach presented in Ref. [19] we solved a variational problem for the mutual information (24) with the normalization (22) and the power restriction (26): we found the optimal input signal distribution P X [ X ] maximizing the mutual information (24) in the leading and next-to-leading orders in 1 / SNR in the intermediate power range. Let us proceed with this calculation.
To begin with, when the parameter SNR ≫ 1 we can calculate the output signal entropy H [ Y ] by substituting P_X\!\left[Y e^{-i\gamma |Y|^{2} L}\right] for P o u t [ Y ] in Equation (20), due to the relation (57); then, performing the change of the integration variable \phi(Y) \to \phi(Y) + \gamma |Y|^{2} L , we obtain
H[Y] \approx -\int_{0}^{2\pi} d\phi \int_{0}^{\infty} d\rho\,\rho\; P_X\!\left[\rho e^{i\phi}\right] \log P_X\!\left[\rho e^{i\phi}\right]. (71)
Note that the output signal entropy in the form (71) coincides with the input signal entropy H [ X ] with the accuracy O ( Q L ) .
Secondly, to obtain the conditional entropy H [ Y | X ] we substitute the conditional PDF P [ Y | X ] in the form of Equation (53) into Equation (19). Then we change the integration variables from \mathcal{D}Y = d\,\mathrm{Re}\,Y\; d\,\mathrm{Im}\,Y to d x_0\, d y_0 , and perform the integration over x_0 , y_0 . The result has the form:
H[Y|X] = \log(e\pi Q L) + \int_{0}^{2\pi} d\phi \int_{0}^{\infty} d\rho\,\rho\; P_X\!\left[\rho e^{i\phi}\right] \log\sqrt{1+\gamma^{2}L^{2}\rho^{4}/3}. (72)
Thirdly, to find the optimal input signal distribution P X o p t [ X ] we solve the variational problem for the functional J [ P X , λ 1 , λ 2 ] :
J[P_X,\lambda_1,\lambda_2] = H[Y] - H[Y|X] - \lambda_1\left(\int \mathcal{D}X\, P_X[X] - 1\right) - \lambda_2\left(\int \mathcal{D}X\, P_X[X]\,|X|^{2} - P\right), (73)
where λ 1 , 2 are the Lagrange multipliers which correspond to the normalization condition (22) and the condition (26) of fixed average signal power P. The solution P_X^{opt}[X] of the corresponding Euler–Lagrange equation for (73) is referred to as the "optimal" distribution:
P_X^{opt}[X] = N_0\; \frac{e^{-\lambda_0(P)\,|X|^{2}}}{\sqrt{1+\gamma^{2}L^{2}|X|^{4}/3}}, (74)
where coefficients N 0 ( P ) and λ 0 ( P ) are determined from the conditions:
\int \mathcal{D}X\; P_X^{opt}[X] = 2\pi N_0(P) \int_{0}^{\infty} \frac{d\rho\,\rho\; e^{-\lambda_0(P)\rho^{2}}}{\sqrt{1+\gamma^{2}L^{2}\rho^{4}/3}} = 1, (75)
\int \mathcal{D}X\; P_X^{opt}[X]\,|X|^{2} = 2\pi N_0(P) \int_{0}^{\infty} \frac{d\rho\,\rho^{3}\; e^{-\lambda_0(P)\rho^{2}}}{\sqrt{1+\gamma^{2}L^{2}\rho^{4}/3}} = P. (76)
Note that in the leading order in the parameter 1 / SNR the function P_X^{opt}[X] depends only on | X | . In the next-to-leading order in 1 / SNR this property holds true as well [20].
In the parametric form the power dependence of the parameters λ 0 and N 0 reads
\lambda_0(P) = \frac{\gamma L}{\sqrt{3}}\,\alpha, \qquad N_0(P) = \frac{\gamma L}{\sqrt{3}\,\pi\, G(\alpha)}, (77)
where G(\alpha) = \frac{\pi}{2}\left[\mathbf{H}_0(\alpha) - Y_0(\alpha)\right] , and Y_0(\alpha) and \mathbf{H}_0(\alpha) are the Neumann and Struve functions of zero order, respectively. The parameter α ( P ) is the real solution of the equation
-\frac{d}{d\alpha}\log G(\alpha) = \frac{\gamma L P}{\sqrt{3}}. (78)
Note that the optimal input signal distribution P X o p t [ X ] (74) differs from the half-Gaussian distribution (64).
For sufficiently large values of the power P, \log(\gamma P L) \gg 1 , we use the asymptotics of Y_0(\alpha) and \mathbf{H}_0(\alpha) at small α and arrive at the following result for λ 0 ( P ) and N 0 ( P ) :
\lambda_0(P) \approx \frac{1 - \log\log(B\tilde{\gamma})/\log(B\tilde{\gamma})}{P\,\log(B\tilde{\gamma})}, \qquad N_0(P) \approx \frac{\tilde{\gamma}}{\pi P\; \log\!\left(B\tilde{\gamma}/(P\lambda_0(P))\right)}. (79)
Here and below we use the notation
B = 2 e^{-\gamma_E}, \qquad \tilde{\gamma} = \frac{\gamma L P}{\sqrt{3}}, (80)
where γ ˜ is a convenient dimensionless nonlinearity parameter. At small P, such that γ ˜ ≪ 1 , the solutions of Equations (75) and (76) have the form:
\lambda_0(P) \approx \frac{1-2\tilde{\gamma}^{2}}{P}, \qquad N_0(P) \approx \frac{1-\tilde{\gamma}^{2}}{\pi P}. (81)
Note that at γ ˜ → 0 the optimal distribution (74) reduces to the Gaussian distribution (65). It is known that this distribution is optimal for a linear channel [1]. In Ref. [20], using the same method, we found the first correction to P_X^{opt}[X] proportional to Q L .
Fourthly, to calculate the mutual information we substitute the expression (74) for P_X^{opt}[X] into Equations (71) and (72); using the definition (24) we obtain
I\!\left[P_X^{opt}\right] = C_0 = P\lambda_0(P) - \log N_0(P) - \log(\pi e\, Q L). (82)
The last equation gives the mutual information I\!\left[P_X^{opt}\right] with the accuracy O ( Q L ) .
At small γ ˜ we obtain
I\!\left[P_X^{opt}\right] \approx \log \mathrm{SNR} - \tilde{\gamma}^{2}. (83)
This result is the Shannon capacity \log(1+\mathrm{SNR}) of the linear channel (1) at large SNR , together with the first nonlinear correction.
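The parametric solution (77)–(78) can be evaluated with standard special functions. The sketch below is ours (the conventions follow the formulas as reconstructed above, and the parameter values are illustrative): it solves Equation (78) for α using the derivative relations H₀′(x) = 2/π − H₁(x) and Y₀′(x) = −Y₁(x), builds λ₀, N₀, and C₀, and checks the weak-nonlinearity asymptotics (81) and (83).

```python
import numpy as np
from scipy.special import struve, y0, y1
from scipy.optimize import brentq

def G(a):          # G(alpha) = (pi/2) [H0(alpha) - Y0(alpha)]
    return 0.5 * np.pi * (struve(0, a) - y0(a))

def dG(a):         # G'(alpha), using H0'(x) = 2/pi - H1(x) and Y0'(x) = -Y1(x)
    return 1.0 - 0.5 * np.pi * (struve(1, a) - y1(a))

def solve_alpha(g_tilde):      # Eq. (78): -(d/d alpha) log G(alpha) = gamma*L*P/sqrt(3)
    return brentq(lambda a: -dG(a) / G(a) - g_tilde, 1e-2, 1e4)

P, g_tilde = 1.0, 0.01         # weak nonlinearity; power in arbitrary units
gammaL = np.sqrt(3) * g_tilde / P
alpha = solve_alpha(g_tilde)
lam0 = gammaL * alpha / np.sqrt(3)                 # Eq. (77)
N0 = gammaL / (np.sqrt(3) * np.pi * G(alpha))      # Eq. (77)

QL = P / 1e4                                       # SNR = P/(QL) = 10^4
C0 = P * lam0 - np.log(N0) - np.log(np.pi * np.e * QL)   # Eq. (82)
print(lam0 * P, N0 * np.pi * P, C0 - np.log(1e4))
```

For γ̃ = 0.01 the solver reproduces λ₀ ≈ (1 − 2γ̃²)/P, N₀ ≈ (1 − γ̃²)/(πP), and C₀ ≈ log SNR − γ̃², in accordance with (81) and (83).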
In the high power sub-interval (\gamma L)^{-1} \ll P \ll (Q L^{3}\gamma^{2})^{-1} , using Equation (78) one can obtain the following asymptotics for the mutual information in the case of a very large nonlinearity parameter, \log(\tilde{\gamma}) \gg 1 :
I\!\left[P_X^{opt}\right] = \log\log\!\left(\frac{B\gamma L P}{\sqrt{3}}\right) - \log\!\left(\frac{Q L^{2}\gamma\, e}{\sqrt{3}}\right) + \frac{1}{\log\!\left(\frac{B\gamma L P}{\sqrt{3}}\right)}\left[\log\log\!\left(\frac{B\gamma L P}{\sqrt{3}}\right) + 1 - \frac{\log\log\!\left(\frac{B\gamma L P}{\sqrt{3}}\right)}{\log\!\left(\frac{B\gamma L P}{\sqrt{3}}\right)}\right]. (84)
This expression is obtained with the accuracy 1/\log^{2}(\tilde{\gamma}) . One can see that the first term on the right-hand side of Equation (84) grows as \log\log P ; hence the mutual information I\!\left[P_X^{opt}\right] also grows as \log\log P at large enough P.
Note that the results (74), (82), and the asymptotics (79), (81), (83), (84) are obtained in the leading and next-to-leading orders in the parameter 1/\sqrt{\mathrm{SNR}} . Therefore, the results (74), (82) are calculated with the accuracy O(1/\mathrm{SNR}) . However, in the literature the bounds on the capacity rather than the asymptotic estimates are usually of primary interest. Therefore, to find the lower bound of the channel capacity for the per-sample model it is necessary to calculate the corrections of the order of 1 / SNR and determine their signs. Moreover, to find the applicability region of the result (82) we have to know the corrections of the order of 1 / SNR as well. The applicability region is defined by the condition that the corrections of the order of 1 / SNR are much less than the obtained results (74), (82). In Ref. [20] we calculated these corrections using the approach described in Section 3.1. The calculation of these corrections is straightforward but cumbersome; therefore, here we present only the idea of the calculation and demonstrate the results for the corrections to the mutual information. The detailed calculation can be found in Ref. [20].
To calculate the correction to the mutual information we should know the corrections to the conditional probability density function (53). Therefore, we should calculate the corrections both to the action S and to the normalization factor Λ in Equation (34). To find these corrections we have to calculate the function ϰ ( z ) , see Equation (51), in the leading, next-to-leading, and next-to-next-to-leading orders in the parameter 1/\sqrt{\mathrm{SNR}} . Then we should substitute the found corrections for the action S into the path integral (35), and calculate the path integral up to the terms proportional to the parameter Q. After that, we expand the product of the exponent and the normalization factor Λ up to the terms of the order of 1 / SNR . This expression is cumbersome, and therefore we do not present it here; it can be found in Ref. [20]. To calculate the corrections to the mutual information we substitute the obtained result for P [ Y | X ] into Equations (19)–(21) and (24). Then, using the method described above, we perform the maximization of the mutual information calculated with the accuracy 1 / SNR , and obtain the following result:
I^{(1)}\!\left[P_X^{opt}\right] = C_0 + \Delta C, (85)
where C 0 is defined in (82). The correction Δ C has the form:
\Delta C = \frac{1}{\mathrm{SNR}}\left\{\pi N_0 P\left[\frac{214}{375} - \frac{8}{375}\,\frac{\lambda_0 P}{\tilde{\gamma}^{2}}\right] + \lambda_0 P\left[\frac{137}{150} + \frac{8}{375}\,\frac{\lambda_0 P}{\tilde{\gamma}^{2}}\right] - \frac{347}{750}\,(\lambda_0 P)^{2}\right\}. (86)
The quantity Δ C corresponds to the first non-vanishing correction to the mutual information. One can verify that for the small parameter \gamma L^{2} Q \ll 1 the correction (86) is always small with respect to C 0 : indeed, the ratio of the expression in the curly brackets in Equation (86) to γ ˜ is a bounded function for all values of γ ˜ .
We do not have an explicit analytical result for the correction Δ C for an arbitrary parameter γ ˜ , since the quantities λ 0 and N 0 are the solutions of the nonlinear equations; however, the correction Δ C can be calculated numerically for any γ ˜ . For small and large parameters γ ˜ we can calculate the asymptotics of the quantity Δ C analytically.
For the small nonlinearity γ ˜ ≪ 1 we substitute the parameters λ 0 and N 0 in the form (81) into Equation (86) and obtain:
\Delta C \approx \frac{1}{\mathrm{SNR}} - \frac{1}{\mathrm{SNR}}\,\frac{\tilde{\gamma}^{2}}{3}. (87)
Using this result and the asymptotics (83) we obtain the mutual information within our accuracy in the form:
I^{(1)}\!\left[P_X^{opt}\right] \approx \log(1+\mathrm{SNR}) - \tilde{\gamma}^{2} - \frac{1}{\mathrm{SNR}}\,\frac{\tilde{\gamma}^{2}}{3}. (88)
One can see that the first term on the right-hand side is the capacity of the linear channel; the second and the third terms correspond to the nonlinear corrections. The nonlinear corrections in (88) are negative and they decrease the mutual information I^{(1)}\!\left[P_X^{opt}\right] of the channel.
Let us consider the behavior of the correction Δ C at large power P. For the case \log(\gamma L P) \gg 1 (but P \ll (\gamma^{2} Q L^{3})^{-1} , to remain within the intermediate power range) we obtain the simple result:
\Delta C \approx \frac{1}{\mathrm{SNR}}\,\frac{214}{375}\,\pi N_0 P. (89)
Using the asymptotic (79) for the quantity N 0 we arrive at the expression
\Delta C \approx \frac{\gamma L^{2} Q}{\sqrt{3}}\;\frac{214}{375}\;\frac{1}{\log(B\tilde{\gamma})}\left[1 + \frac{\log\log(B\tilde{\gamma})}{\log(B\tilde{\gamma})}\right]^{-1}. (90)
Note that this correction is suppressed parametrically as \gamma L^{2} Q instead of 1/\mathrm{SNR} = Q L/P . One can see that the correction decreases at large γ ˜ as 1/\log\tilde{\gamma} . It is interesting that at large γ ˜ the correction Δ C is positive; therefore, it slightly enhances the mutual information I^{(1)}\!\left[P_X^{opt}\right] . Since the correction Δ C is positive in the region defined as \log(\gamma L P) \gg 1 , P \ll (\gamma^{2} Q L^{3})^{-1} , and the next-to-leading corrections to the mutual information are suppressed parametrically, the quantity C 0 is the lower bound of the per-sample channel capacity:
C \geq C_0. (91)
Since there are no corrections of the order of \gamma^{2} L^{3} Q P at large P, see Equation (90), we expect the next correction containing the power P to be of the order of (\gamma^{2} L^{3} Q P)^{2} , see Ref. [19]. Therefore, the applicability region at large P for the quantity C 0 is determined by the condition (\gamma^{2} L^{3} Q P)^{2} \ll 1 . For a given small parameter \gamma^{2} L^{3} Q P this condition extends the applicability region for the lower bound of the channel capacity C 0 . For the realistic channel parameters presented in Table 1 we have the following intermediate power range:
1.5\times 10^{-4}\;\mathrm{mW} \ll P \ll 0.66\times 10^{4}\;\mathrm{mW}. (92)
One can see that this range is very wide. For the presented parameters we have numerically calculated the lower bound C 0 using Equations (77) and (82). The result of the calculation is presented in Figure 2. Also in Figure 2 we present the comparison of the approximation (85) with the Shannon capacity of a linear channel and with the asymptotic capacity bound (70).
One can see that the Shannon capacity of the linear channel with additive noise (the red dashed-dotted line in Figure 2) is always greater than the lower bound (82) for the nondispersive nonlinear fiber channel (the black solid line in Figure 2) in the considered range of P. But the lower bound (82) is greater than the asymptotic capacity bound (70) in the intermediate power range.
In Ref. [22] our result (82) for the NLSE per-sample model (the authors refer to our model as the memoryless NLS channel (MNC)) was compared with two other models: the regular perturbative channel (RPC) and the logarithmic perturbative channel (LPC). The comparison was illustrated in Figure 1 of Ref. [22], and we reproduce this figure in Figure 3.
The authors claimed that they established a novel upper bound on the capacity of the NLSE per-sample model (the violet curve U_{MNC}(P) in Figure 3). In addition, the authors considered various input signal distributions within the MNC: the half-Gaussian (64), the Gaussian (65), and the modified Gaussian distribution (63) optimized in the parameter β (denoted as "Max-chi" in Figure 3). They used the following channel parameters: \gamma = 1.27\;\mathrm{W^{-1} km^{-1}} , L = 5000\;\mathrm{km} , Q L = 7.36\times 10^{-3}\;\mathrm{mW} . For these parameters the upper limit of the intermediate power range (we choose it as P_{max} = 6\pi^{2}\,(Q\gamma^{2}L^{3})^{-1} , see [18]) is estimated as P_{max} = 0.2\;\mathrm{W} = 23\;\mathrm{dBm} . One can see in Figure 3 that up to this power P_{max} our result (the green solid line) is consistent with the capacity upper bound of the per-sample model (the violet curve U_{MNC} ) and exceeds the mutual information curves for the other input signal distributions in the whole intermediate power range. Of course, for large powers ( P ≳ P_{max} ) our input signal distribution (74) is no longer optimal, and the mutual information (85) underestimates the real capacity. Strictly speaking, the corrections to (85) are small only up to P \approx 5.5\;\mathrm{dBm} ; for P > 5.5\;\mathrm{dBm} we have the transition range from the intermediate power range with the optimal input signal distribution (74) to the large power semirange where the optimal distribution is believed to be the half-Gaussian one, see [17,18]. The large extension of the transition range (up to P_{max} \approx 23\;\mathrm{dBm} ) can be explained by the smallness of the next-to-leading order corrections (86) and by their decrease with increasing power, see Equation (90).
To finalize the per-sample channel consideration, we emphasize the main results. We developed the path integral approach to the calculation of the conditional PDF P [ Y | X ] at large SNR . We demonstrated that for the nonlinear nondispersive channel the lower bound C 0 of the capacity increases only as \log\log P at large signal power P, instead of the \log P behavior that is specific for the channel with zero nonlinearity. To determine the applicability region and the accuracy of the found quantity C 0 , we calculated the first non-zero correction Δ C proportional to the noise power Q L in the intermediate power range Q L \ll P \ll (\gamma^{2} L^{3} Q)^{-1} . We demonstrated that the quantity Δ C is small in the intermediate power range, and that it is a positive decreasing function at large signal power P: (\gamma L)^{-1} \ll P \ll (\gamma^{2} L^{3} Q)^{-1} . This result is in agreement with the recent results of Ref. [22].

3.2. Extended Model: Considerations of the Time Dependent Input Signals

Let us start this section with the consideration of the conditional probability density function P [ Y | X ] . In the case of a nontrivial time dependence of the input X ( t ) and output Y ( t ) signals the conditional PDF reads [11]:
P[Y(t)|X(t)] = \int_{\psi(0,t)=X(t)}^{\psi(L,t)=Y(t)} \mathcal{D}\psi\; e^{-S[\psi]/Q}, (93)
where the effective action S [ ψ ] has the form S[\psi] = \int_{0}^{L} dz \int dt\, \left|\partial_z \psi(z,t) - i\gamma |\psi(z,t)|^{2}\psi(z,t)\right|^{2} ; the integration measure \mathcal{D}\psi(z,t) depends on both the z and t discretization scheme:
\mathcal{D}\psi(z,t) = \left(\frac{\Delta}{\pi Q\, \Delta z}\right)^{2M} \prod_{j=-M}^{M-1}\;\prod_{i=1}^{N_z-1} \frac{\Delta}{\pi Q\, \Delta z}\; d\,\mathrm{Re}\,\psi(z_i,t_j)\; d\,\mathrm{Im}\,\psi(z_i,t_j), (94)
where \Delta = T/(2M) = (2N+1)T_0/(2M) is the time grid spacing and \Delta z = L/N_z is the z-coordinate grid spacing. Of course, the expression (93) contains all the information about the transmitted signal and its interaction with the noise; but, as strange as it sounds, the expression (93) also contains redundant information (i.e., degrees of freedom which cannot be detected). The point is that in a realistic communication channel the receiver has a finite bandwidth, which means that it somehow reduces the bandwidth of the received signal Y ( t ) . After that, there is a procedure of extraction of the information from the signal measured by the receiver. This detection procedure should be implemented in the function P [ Y ( t ) | X ( t ) ] .
To demonstrate the point, let us consider the input signal X ( t ) in the form (11). In such a form, the number of degrees of freedom in the path integral (93) is infinite if the function X ( t ) is continuous, or 2 M if we set the function X ( t ) in the discrete time moments t i , see the text after Equation (17). However, the transmitted information is carried only by the 2 N + 1 coefficients C n ( N ≪ M ), see Equation (11). This means that to obtain the conditional probability density function which describes the transmission of the information carried by the set of coefficients C n we have to integrate over the redundant degrees of freedom. The approach based on the path integral representation (93) is general, and it can be used for any signal model, any receiver, and any projecting procedure (18). For our signal model described in Section 2.2 we can use a simpler method of calculation of the conditional PDF P [ { C ˜ } | { C } ] . Below we describe this method.
In our model the signal propagation for different time moments t j is independent, because the dispersion is zero and the noise is not correlated for different time moments t i ≠ t j . Therefore, the conditional PDF P [ Y ( t ) | X ( t ) ] can be presented in the factorized form:
P[Y(t)|X(t)] = \prod_{j=-M}^{M-1} P_j[Y_j|X_j], (95)
where X_j = X(t_j) , Y_j = Y(t_j) , and P_j[Y_j|X_j] is the per-sample conditional PDF described in Section 3.1.1, with the replacements Y \to Y_j , X \to X_j , Q \to Q/\Delta .
Our goal is to find the PDF P [ { C ˜ } | { C } ] in the leading order in the parameter 1 / SNR . Instead of the calculation of the path integral, we build the PDF which reproduces all possible correlators of \tilde{C}_k : \langle \tilde{C}_{k_1}\rangle , \langle \tilde{C}_{k_1}\tilde{C}_{k_2}\rangle , …, \langle \tilde{C}_{k_1}\cdots \bar{\tilde{C}}_{k_n}\rangle for the fixed input set { C } in the leading order in the parameter Q. These correlators read
\langle \tilde{C}_{k_1}\cdots \bar{\tilde{C}}_{k_n}\rangle = \int \prod_{j=-M}^{M-1} d^{2}Y_j\;\; P[Y(t)|X(t)]\;\; \tilde{C}_{k_1}\cdots \bar{\tilde{C}}_{k_n}, (96)
where d^{2}Y_j = d\,\mathrm{Re}\,Y_j\; d\,\mathrm{Im}\,Y_j , and \tilde{C}_k is defined in Equation (18). After the substitution of Equations (95), (53), and (18) into Equation (96) and performing the integration, we obtain in the leading order in the noise parameter Q:
\langle \tilde{C}_k \rangle = C_k - i\, C_k\, \frac{Q L^{2}\gamma}{2\Delta}\left(1 - \frac{4 i}{3}\,\gamma L\, |C_k|^{2}\, n_4\right), (97)
\langle \tilde{C}_m \tilde{C}_n \rangle - \langle \tilde{C}_m\rangle\langle \tilde{C}_n\rangle = -i\,\delta_{m,n}\, C_m^{2}\, \frac{Q L^{2}\gamma}{T_0}\left(n_4 - \frac{2 i\, n_6}{3}\,\gamma L\, |C_m|^{2}\right), (98)
\langle \tilde{C}_m \bar{\tilde{C}}_n \rangle - \langle \tilde{C}_m\rangle\langle \bar{\tilde{C}}_n\rangle = \delta_{m,n}\, \frac{Q L}{T_0}\left(1 + \frac{2\, n_6}{3}\,\gamma^{2} L^{2} |C_m|^{4}\right), (99)
where \delta_{m,n} is the Kronecker symbol and we have introduced the following notation for the integral of the s-th power of the pulse envelope function f ( t ) :
n_s = \int_{-T_0/2}^{T_0/2} \frac{dt}{T_0}\; f^{s}(t). (100)
Recall that f ( t ) is assumed to be normalized by the condition n_2 = 1 . Note that the shift \langle\tilde{C}_k\rangle - C_k is proportional to Q L/\Delta = Q L W'/(2\pi) , i.e., it is proportional to the total noise power in the whole bandwidth W' = 2\pi/\Delta . The reason is that the bandwidth of the receiver coincides with the bandwidth of the noise. The correlators (98) and (99) are proportional to Q L/T_0 and do not depend on the discretization parameter Δ. This means that these correlators depend only on the bandwidth of the envelope function f ( t ) . So we obtain that in the leading order in the parameter Q the shift of the mean value \langle\tilde{C}_k\rangle due to the signal–noise interaction is proportional to the total noise power in the channel ( W' = 2\pi/\Delta ), whereas the spread around the average value, see (98) and (99), is proportional to the noise power contained in the bandwidth W of the pulse envelope (the bandwidth of the pulse envelope coincides with the bandwidth of the signal X ( t ) ). Note that the higher-order corrections in the parameter Q to the correlators are more complicated and contain the noise bandwidth, see the details in Appendix A of Ref. [21]. The correlators of higher orders in \tilde{C} can be calculated in the leading order in the parameter Q using Equations (97)–(99).
To verify the analytical results (97)–(99), numerical simulations of the pulse propagation through a nonlinear nondispersive optical fiber were performed in Ref. [21], and the numerical results for the correlators (97)–(99) were obtained. To find these correlators, Equation (4) was solved numerically for the fixed input signal X ( t ) and for different realizations of the noise η ( z , t ) . After that, the detection procedure described by Equations (17) and (18) was applied. Finally, the averaging over the noise realizations was performed for the coefficients \tilde{C}_k . Two numerical methods of solution of Equation (4) were used: the split-step Fourier method and the fourth-order Runge–Kutta method. It was shown that the numerical results do not depend on the numerical method, and these results are consistent with the analytical ones for different realizations of the input pulse envelope f ( t ) and of the noise bandwidth W' . Below we present the comparison of the numerical and analytical results obtained in Ref. [21]. The numerical simulation was performed for the channel parameters listed in Table 2.
We choose the duration of one pulse as T_0 = 10^{-10}\;\mathrm{s} . Simulations were performed for different t-meshes, i.e., for different time grid spacings Δ, and for different pulse envelopes. Different grid spacings Δ correspond to different noise bandwidths W' = 2\pi/\Delta for the fixed noise parameter Q. The numerical calculations were performed for the different Δ presented in Table 3.
The different grid spacings determine the different widths of the conjugated ω-meshes in the frequency domain: 1/\Delta_1 = 10.26\;\mathrm{THz} , 1/\Delta_2 = 5.12\;\mathrm{THz} , and 1/\Delta_3 = 2.56\;\mathrm{THz} . In Ref. [21] different envelopes f ( t ) were considered as well. Here we present the results only for the Gaussian envelope:
f(t) = \sqrt{\frac{T_0}{T_1\sqrt{\pi}}}\; \exp\left(-\frac{t^{2}}{2 T_1^{2}}\right), (101)
where T_1 = T_0/10 = 10^{-11}\;\mathrm{s} stands for the characteristic time scale of the function f ( t ) . Such a relation between T_0 and T_1 means that the overlapping between different pulses is negligible. For pulses with the envelope (101) the coefficients n_s defined in Equation (100) are n_4 = \frac{T_0}{T_1\sqrt{2\pi}} \approx 3.989 , n_6 = \frac{(T_0/T_1)^{2}}{\pi\sqrt{3}} \approx 18.38 , and n_8 = \frac{(T_0/T_1)^{3}}{2\pi\sqrt{\pi}} \approx 89.79 . In Figure 4 and Figure 5 the real and imaginary parts of the quantity (C_k - \langle\tilde{C}_k\rangle)/C_k as functions of |C_k|^{2} are depicted for different values of the grid spacing Δ.
One can see good agreement between the analytical and numerical results depicted in Figure 4. There is some difference between the imaginary parts of the analytical and numerical results corresponding to the grid spacing \Delta_1 at large | C_k | , see Figure 5. The reason is that for the analytical results corresponding to \Delta_1 it is necessary to take into account the next corrections in the parameter Q [21]. The numerical and analytical results for the correlators (98) and (99) are also in good agreement; for the details see Ref. [21]. In that paper it was demonstrated that the relative importance of the next-to-leading order corrections for the correlators (98) and (99) is governed by the dimensionless parameter \frac{\gamma Q L^{2}}{\Delta}\,\gamma L P , i.e., it increases linearly with increasing power P.
Now we can proceed to the search for the conditional probability density function. Using the correlators (97)–(99), we build the conditional PDF P [ { C ˜ } | { C } ] which reproduces all the correlators of the coefficients \tilde{C}_m in the leading order in the parameter Q. Thus, the conditional PDF has the form [21]:
P[\{\tilde{C}\}|\{C\}] = \prod_{m=-N}^{N} P_m[\tilde{C}_m|C_m], (102)
where
P_m[\tilde{C}_m|C_m] \approx \frac{T_0}{\pi Q L\,\sqrt{1+\xi^{2}\mu_m^{2}/3}}\;\exp\left\{-T_0\,\frac{(1+4 n_6\,\mu_m^{2}/3)\,x_m^{2} + 2\,x_m y_m\,\mu_m n_4 + y_m^{2}}{Q L\,(1+\xi^{2}\mu_m^{2}/3)}\right\}, (103)
x_m = \mathrm{Re}\left\{e^{-i\phi_m}\left[\tilde{C}_m - C_m + i\, C_m\,\frac{Q L^{2}\gamma}{2\Delta}\left(1 - \frac{4 i}{3}\,\gamma L\,|C_m|^{2}\, n_4\right)\right]\right\}, (104)
y_m = \mathrm{Im}\left\{e^{-i\phi_m}\left[\tilde{C}_m - C_m + i\, C_m\,\frac{Q L^{2}\gamma}{2\Delta}\left(1 - \frac{4 i}{3}\,\gamma L\,|C_m|^{2}\, n_4\right)\right]\right\}, (105)
where \phi_m = \arg C_m , \mu_m = \gamma L |C_m|^{2} , and
\xi^{2} = 4 n_6 - 3 n_4^{2}. (106)
The parameter \xi^{2} obeys the inequality \xi^{2} \geq n_6 > 0 due to the Cauchy–Bunyakovsky–Schwarz inequality. For the Gaussian envelope (101) this parameter is
\xi^{2} = \frac{1}{\pi}\left(\frac{T_0}{T_1}\right)^{2}\left(\frac{4}{\sqrt{3}} - \frac{3}{2}\right) \approx 0.258\left(\frac{T_0}{T_1}\right)^{2}, (107)
and for the chosen parameter T_1 = T_0/10 = 10^{-11}\;\mathrm{s} one obtains ξ ≈ 5.08 .
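The quoted values of the envelope moments and of ξ are easy to reproduce (a sketch of ours, using the Gaussian envelope (101) with T₀/T₁ = 10; the units cancel in n_s):

```python
import numpy as np
from scipy.integrate import quad

T0, T1 = 1.0, 0.1                        # T1 = T0/10

def f(t):                                # Gaussian envelope, Eq. (101)
    return np.sqrt(T0 / (T1 * np.sqrt(np.pi))) * np.exp(-t**2 / (2 * T1**2))

def n(s):                                # envelope moments, Eq. (100)
    return quad(lambda t: f(t)**s / T0, -T0 / 2, T0 / 2)[0]

n2, n4, n6, n8 = n(2), n(4), n(6), n(8)
xi = np.sqrt(4 * n6 - 3 * n4**2)         # Eq. (106)
print(n2, n4, n6, n8, xi)                # n2=1, n4~3.99, n6~18.38, n8~89.79, xi~5.08
```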
Equation (102) means that our channel decomposes into 2 N + 1 independent information channels. Therefore, the function P_m[\tilde{C}_m|C_m] describes the channel corresponding to the m-th time slot. The function P_m[\tilde{C}_m|C_m] obeys the normalization condition
\int d^{2}\tilde{C}_m\; P_m[\tilde{C}_m|C_m] = 1. (108)
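The normalization (108) provides a consistency check of the form (103) as reconstructed here: the determinant of its quadratic form is (1 + 4n₆μ_m²/3) − n₄²μ_m² = 1 + ξ²μ_m²/3 by Equation (106), so the two-dimensional Gaussian integral equals one. A sketch (ours; μ_m, T₀, and QL are illustrative):

```python
import numpy as np

# Envelope moments for the Gaussian pulse (101) with T0/T1 = 10
n4, n6 = 10 / np.sqrt(2 * np.pi), 100 / (np.pi * np.sqrt(3))
xi2 = 4 * n6 - 3 * n4**2                 # Eq. (106)

T0, QL, mu = 1.0, 1e-3, 2.0              # illustrative slot length, noise power, mu_m

# Quadratic form of Eq. (103) in the variables (x_m, y_m)
A = T0 / (QL * (1 + xi2 * mu**2 / 3)) * np.array(
    [[1 + 4 * n6 * mu**2 / 3, n4 * mu],
     [n4 * mu, 1.0]])
prefactor = T0 / (np.pi * QL * np.sqrt(1 + xi2 * mu**2 / 3))

# Gaussian integral: int exp(-v^T A v) d^2 v = pi / sqrt(det A)
total = prefactor * np.pi / np.sqrt(np.linalg.det(A))
print(total)                              # 1.0
```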
Since there are 2 N + 1 independent channels, we can choose the input signal distribution P X [ { C m } ] in the factorized form:
P_X[\{C\}] = \prod_{k=-N}^{N} P_X^{(k)}[C_k], (109)
and we can consider only one channel, say the m-th channel. One can see that the representation (103) of the conditional PDF is close to the representation (53) for the per-sample PDF. Also, the function P_m[\tilde{C}_m|C_m] changes significantly when the variable C_m changes by a value of the order of \sqrt{Q L/T_0} for a fixed value of \tilde{C}_m . Such behavior coincides with that of the function P [ Y | X ] for the per-sample model. Below we imply that
P \gg Q L/\Delta \gg Q L/T_0, (110)
where P is the mean power of the m-th pulse, i.e., the signal power is much greater than the noise power both in the whole bandwidth W' = 2\pi/\Delta and in the input signal bandwidth W. For the extended model we define the signal-to-noise ratio as
\mathrm{SNR} = \frac{P\, T_0}{Q L}; (111)
under the assumption (110) we have SNR ≫ 1 . We also imply that the PDF of the input signal P_X^{(m)}[C_m] is a smooth function that changes on a scale |C_m| \sim \sqrt{P} . Therefore, using a consideration similar to that for the per-sample model, we obtain that the PDF of the output signal,
P_{out}^{(m)}[\tilde{C}_m] = \int d^{2}C_m\; P_m[\tilde{C}_m|C_m]\; P_X^{(m)}[C_m], (112)
in the leading order in the parameter 1 / SNR has the form:
P_{out}^{(m)}[\tilde{C}_m] \approx P_X^{(m)}[\tilde{C}_m]. (113)
As a reminder, to obtain the result (113) we perform the integration in Equation (112) using Laplace's method [27]; see the details in Ref. [19].
Since we know the conditional PDF P_m[\tilde{C}_m|C_m] and the output signal PDF P_{out}^{(m)}[\tilde{C}_m] , the entropies H[\tilde{C}_m] and H[\tilde{C}_m|C_m] , and then the mutual information I\!\left[P_X^{(m)}\right] , can be calculated. The calculation is similar to that performed in Section 3.1.3. After these calculations we solve the variational problem for the mutual information I\!\left[P_X^{(m)}\right] and find the optimal input signal PDF in the form:
P_opt^(m)[C_m] = N_0 e^{−λ|C_m|²} / (1 + ξ²γ²L²|C_m|⁴/3) .
Substituting the result (114) into the mutual information, we obtain:
I_{P_opt^(m)} = log[ P T_0/(π e Q L) ] + P λ − log( P N_0 ) ,
where the parameters N_0 and λ are the solutions (as functions of the power P) of the normalization conditions for the function (114):
∫_0^∞ dρ 2 π N_0 ρ e^{−λρ²} / (1 + ξ²γ²L²ρ⁴/3) = 1 ,   ∫_0^∞ dρ 2 π N_0 ρ³ e^{−λρ²} / (1 + ξ²γ²L²ρ⁴/3) = P ,
where we have changed the integration variable from |C_m| to ρ. Note that the results (115)–(116) are obtained in the leading order in the parameter 1/SNR. Note also that the expression (115) is obtained for a single m-th pulse; therefore, to obtain the mutual information of the channel with the input signal (11), it is necessary to multiply the right-hand side of Equation (115) by the number of independent channels, i.e., by 2 N + 1.
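The conditions (116) have no closed-form solution, but they can be solved numerically: the first equation fixes N_0 once λ is known, and the mean power is monotonically decreasing in λ, so a simple bisection suffices. The sketch below assumes the extended-model parameters of Table 2 together with illustrative values ξ = 1, T_0 = 10^−10 s, and P = 1 mW (these three are our assumptions, not values fixed by the text):

```python
import numpy as np

# Parameters: gamma, L, Q from Table 2; xi, T0, P are illustrative assumptions.
gamma, L, xi = 1.25, 800.0, 1.0      # [(km*W)^-1], [km], dimensionless
Q, T0 = 5.94e-21, 1e-10              # [W/(km*Hz)], [s]
P = 1e-3                             # mean pulse power [W]
a = (xi * gamma * L) ** 2 / 3.0      # coefficient of rho^4 in Eq. (114)

rho = np.linspace(1e-7, 1.0, 20001)  # |C_m| grid; integrand is negligible beyond 1.0
drho = rho[1] - rho[0]

def norm_and_power(lam):
    """The two integrals of Eq. (116) divided by N0: returns (I1, I2)."""
    w = 2 * np.pi * rho * np.exp(-lam * rho**2) / (1 + a * rho**4)
    return w.sum() * drho, (rho**2 * w).sum() * drho

# The mean power I2/I1 decreases monotonically with lambda: bisect for it.
lo, hi = 1.0, 1e5
for _ in range(80):
    lam = 0.5 * (lo + hi)
    I1, I2 = norm_and_power(lam)
    lo, hi = (lam, hi) if I2 / I1 > P else (lo, lam)
N0 = 1.0 / I1                        # first condition of Eq. (116)

# Leading-order mutual information, Eq. (115)
I_mut = np.log(P * T0 / (np.pi * np.e * Q * L)) + P * lam - np.log(P * N0)
```

In the linear limit (a → 0) the solution reduces to the Gaussian values λ = 1/P, N_0 = 1/(π P), for which Equation (115) collapses to log SNR, as it should.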
One can see that the relations in Equation (116) coincide with those in Equation (76) after the replacement γ → ξγ. Therefore, to obtain the results for the mutual information I_{P_opt^(m)} and its asymptotics for the extended model, we can replace the parameter γ with ξγ and the parameter Q with Q/T_0 in Equations (82)–(84). Thus, by rescaling Figure 2 we obtain the mutual information I_{P_opt^(m)} of the extended model; see Figure 6.
The asymptotics has the form:
I_{P_opt^(m)} ≈ log[ P T_0/(Q L) ] − ξ²γ²L²P²/3
for ξγLP ≪ 1, and for the case when the power P obeys the conditions log(ξγLP) ≫ 1 and P ≪ T_0/(Q L³ ξ²γ²) we have the asymptotics
I_{P_opt^(m)} ≈ log log B_{ξγLP/√3} − [ log( Q L² ξγ e/(T_0 √3) ) + 1 ] / log B_{ξγLP/√3} − ( log log B_{ξγLP/√3} + 1 ) log log B_{ξγLP/√3} / log B_{ξγLP/√3} .
Note that the asymptotics (118) is obtained with accuracy of order 1/log²(ξγLP).
The mutual information (115) is calculated for a fixed shape of the pulse envelope f(t). It is worth noting that the pulse shape f(t) enters only through the single parameter ξ; see Equation (106). Therefore, strictly speaking, to find the capacity one should maximize over all pulse shapes. On the one hand, we can consider the expression (115) as an estimate of the capacity of the channel whose model implies a specific given shape of the pulse envelope f(t). On the other hand, using Equation (118), for fixed power P one can increase the value of I_{P_opt^(m)} by raising the parameter ξ², say, by decreasing the signal time width T_1 for the Gaussian pulse envelope; see Equation (107). However, the limitation of the intermediate power range, P ≪ T_0/(Q L³ ξ²γ²), imposes an upper limit on this parameter, ξ²_max ≈ [ γLP (γL²Q/T_0) ]^{−1}, and the maximum value of the mutual information I_{P_opt^(m)} over the pulse envelope parameters is attained for the argument B_{ξ_max γLP/√3} ≈ P T_0/(Q L) = SNR.
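Under our reading of the upper limit ξ²_max = [γLP (γL²Q/T_0)]^{−1}, the combination (ξ_max γLP)² collapses algebraically to P T_0/(Q L) = SNR, which is the saturation claim above. A quick numerical check with the Table 2 parameters (T_0 and P are illustrative assumptions, as before):

```python
# Verify that (xi_max * gamma * L * P)^2 = P*T0/(Q*L) = SNR, given
# xi_max^2 = [gamma*L*P * (gamma*L^2*Q/T0)]^(-1).
gamma, L = 1.25, 800.0      # [(km*W)^-1], [km], from Table 2
Q, T0 = 5.94e-21, 1e-10     # [W/(km*Hz)], [s]; T0 assumed as in Figure 6
P = 1e-3                    # mean pulse power [W], an illustrative value

xi_max_sq = 1.0 / (gamma * L * P * (gamma * L**2 * Q / T0))
snr = P * T0 / (Q * L)
saturation = xi_max_sq * (gamma * L * P) ** 2   # should equal snr identically
```

The identity holds for any parameter values, since the factors γ, L, and P cancel exactly.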
Note that the mutual information I_{P_opt^(m)}, see Equation (115), cannot be considered as a lower bound of the channel capacity, since we do not know the sign of the next-to-leading order corrections in the parameter Q. However, the quantity I_{P_opt^(m)} is an estimate of the capacity with accuracy O(Q). The mutual information I_{P_opt^(m)} grows as log log P, see Equation (118), for sufficiently large average power P: (ξγL)^{−1} ≪ P ≪ T_0/(Q L³ ξ²γ²). We obtained a similar asymptotics for the per-sample model: the time dependence of the pulse leads to the same asymptotic behavior with the modified nonlinearity parameter (γ → ξγ).
It is worth noting that all our calculations were performed under the assumption of negligible overlapping of the pulses; see Equation (12). This means we assume that the corrections due to the overlapping effects are at most of the same order as the next-to-leading order corrections in the parameter Q L. This condition can be satisfied by choosing appropriate pulse parameters T_0 and T_1.

4. Conclusions

In our review, we considered the nondispersive nonlinear optical-fiber communication channel with additive noise. We studied two different models of the channel: the per-sample model and the extended model. For these models we presented the results of calculations of the following information-theoretic quantities: the output signal entropy, the conditional entropy, and the mutual information in the leading order in the parameter 1/SNR. To calculate these quantities, two methods for the calculation of the conditional probability density function P[Y|X] were developed [19,21]. The first method, used for the calculation of P[Y|X] in the per-sample model, is based on the path-integral approach. In this approach, the path integral (28) was treated using the saddle-point method, i.e., the expansion in the parameter 1/SNR; see Refs. [19,20]. This method yielded an expression for the conditional PDF P[Y|X] which is convenient for the analytical calculation of the output signal PDF P_out[Y], the entropies, the mutual information, and the optimal input signal distribution in the leading [19] and next-to-leading [20] orders in the parameter 1/SNR. These calculations allowed us to find the lower bound of the channel capacity for the per-sample model in the intermediate power range. The second method was applied to the investigation of the extended model, which contains such characteristics as the bandwidths of the input signal, the noise, and the receiver, and also takes into account the projection procedure (18). This method is based on the calculation of the correlators of the output signal for a fixed input signal in the leading and next-to-leading orders in the noise parameter Q; see Ref. [21]. Using these correlators, we constructed the conditional PDF P[{C̃}|{C}] which reproduces all of them in the leading order in the parameter 1/SNR.
The knowledge of the function P [ { C ˜ } | { C } ] allowed us to find the informational characteristics of the extended model [21].
We compared the results of our calculations for the per-sample model with the capacity limitations obtained by other authors [4,17,18,22]. We demonstrated that the conditional PDF P[Y|X], even in the leading order in the parameter 1/SNR, reproduces the exact result (29) with high accuracy; see Figure 1. Using the expression (53) in the intermediate power range, we found the lower bound (82) of the capacity, which is consistent with the recent results obtained in Ref. [22].
For the extended model [21] we presented the results of the calculation of the output signal correlators for a fixed input signal and demonstrated that the difference between the average value of the recovered coefficient C̃_k and the input coefficient C_k is proportional to the noise power contained in the total noise bandwidth, see Equation (97), whereas the covariances (98) and (99) are proportional to the noise power contained in the input signal bandwidth. This behavior of the mean value of C̃_k is related to the model of the receiver (i.e., the bandwidth of the receiver coincides with the bandwidth of the noise). The obtained analytical results (97)–(99) were confirmed by direct numerical calculations; see Figures 4 and 5. Therefore, the constructed conditional PDF P[{C̃}|{C}] contains information about the bandwidths of the signal, the noise, and the receiver [21]. This result is in agreement with the assertions made in Ref. [14]. Despite the dependence of the PDF P[{C̃}|{C}] on the noise bandwidth, the mutual information calculated in the leading order in 1/SNR for the extended model depends only on the noise power contained in the input signal bandwidth. Since we have not calculated the corrections in the parameter 1/SNR to the mutual information (115), we can only consider this quantity as a capacity estimate rather than a lower bound of the capacity.
The models considered in the present paper are far from modern communication systems, where the second-order dispersion coefficient is nonzero and the signal detection procedure differs from that considered above. The effects related to nonzero dispersion and to the properties of the receiver can significantly change the results for the mutual information obtained in our consideration. However, the methods described in the present paper may be useful for the analysis of real communication systems. The calculations performed in Refs. [29,30,31,32,33] indicate the possibility of using the presented methods to investigate the capacity of nonlinear fiber-optic channels with nonzero dispersion.

Author Contributions

All authors performed the calculations together. The consideration of the per-sample model was prepared predominantly by I.S.T., whereas the consideration of the extended model was prepared predominantly by A.V.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partly funded by Russian Science Foundation grant number 17-72-30006, and partly by the Ministry of Education and Science of the Russian Federation.

Acknowledgments

The work of I.S.T. was supported by the Russian Science Foundation (Grant No. 17-72-30006). The work of A.V.R. was supported by the Ministry of Education and Science of the Russian Federation.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
PDF: Probability density function
NLSE: Nonlinear Schrödinger equation
SNR: Signal-to-noise power ratio
MNC: Memoryless nonlinear Schrödinger channel
RPC: Regular perturbative channel
LPC: Logarithmic perturbative channel

References

  1. Shannon, C. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 379–423; 623–656.
  2. Ilyas, M.; Mouftah, H.T. The Handbook of Optical Communication Networks; CRC Press LLC: Boca Raton, FL, USA, 2003.
  3. Haus, H.A. Quantum noise in a solitonlike repeater system. J. Opt. Soc. Am. B 1991, 8, 1122–1126.
  4. Mecozzi, A. Limits to long-haul coherent transmission set by the Kerr nonlinearity and noise of the in-line amplifiers. J. Light. Technol. 1994, 12, 1993–2000.
  5. Iannone, E.; Matera, F.; Mecozzi, A.; Settembre, M. Nonlinear Optical Communication Networks; John Wiley & Sons: New York, NY, USA, 1998.
  6. Zakharov, V.E.; Shabat, A.B. Exact theory of two-dimensional self-focusing and one-dimensional self-modulation of waves in nonlinear media. Sov. Phys. JETP 1972, 34, 62–69.
  7. Novikov, S.; Manakov, S.V.; Pitaevskii, L.P.; Zakharov, V.E. Theory of Solitons: The Inverse Scattering Method (Monographs in Contemporary Mathematics); Springer: Rumford, ME, USA, 1984.
  8. Essiambre, R.-J.; Foschini, G.J.; Kramer, G.; Winzer, P.J. Capacity Limits of Information Transport in Fiber-Optic Networks. Phys. Rev. Lett. 2008, 101, 163901.
  9. Essiambre, R.-J.; Kramer, G.; Winzer, P.J.; Foschini, G.J.; Goebel, B. Capacity Limits of Optical Fiber Networks. J. Light. Technol. 2010, 28, 662–701.
  10. Sorokina, M.A.; Turitsyn, S.K. Regeneration limit of classical Shannon capacity. Nat. Commun. 2014, 5, 3861.
  11. Terekhov, I.S.; Vergeles, S.S.; Turitsyn, S.K. Conditional Probability Calculations for the Nonlinear Schrödinger Equation with Additive Noise. Phys. Rev. Lett. 2014, 113, 230602.
  12. Liga, G.; Xu, T.; Alvarado, A.; Killey, R.I.; Bayvel, P. On the performance of multichannel digital backpropagation in high-capacity long-haul optical transmission. Opt. Express 2014, 22, 30053–30062.
  13. Semrau, D.; Xu, T.; Shevchenko, N.; Paskov, M.; Alvarado, A.; Killey, R.; Bayvel, P. Achievable information rates estimates in optically amplified transmission systems using nonlinearity compensation and probabilistic shaping. Opt. Lett. 2017, 42, 121–124.
  14. Kramer, G. Autocorrelation Function for Dispersion-Free Fiber Channels with Distributed Amplification. IEEE Trans. Inf. Theory 2018, 64, 5131–5155.
  15. Tang, J. The Shannon channel capacity of dispersion-free nonlinear optical fiber transmission. J. Light. Technol. 2001, 19, 1104–1109.
  16. Tang, J. The multispan effects of Kerr nonlinearity and amplifier noises on Shannon channel capacity of a dispersion-free nonlinear optical fiber. J. Light. Technol. 2001, 19, 1110–1115.
  17. Turitsyn, K.S.; Derevyanko, S.A.; Yurkevich, I.V.; Turitsyn, S.K. Information Capacity of Optical Fiber Channels with Zero Average Dispersion. Phys. Rev. Lett. 2003, 91, 203901.
  18. Yousefi, M.I.; Kschischang, F.R. On the Per-Sample Capacity of Nondispersive Optical Fibers. IEEE Trans. Inf. Theory 2011, 57, 7522–7541.
  19. Terekhov, I.S.; Reznichenko, A.V.; Kharkov, Y.A.; Turitsyn, S.K. The loglog growth of channel capacity for nondispersive nonlinear optical fiber channel in intermediate power range. Phys. Rev. E 2017, 95, 062133.
  20. Panarin, A.A.; Reznichenko, A.V.; Terekhov, I.S. Next-to-leading order corrections to capacity for nondispersive nonlinear optical fiber channel in intermediate power region. Phys. Rev. E 2017, 95, 012127.
  21. Reznichenko, A.V.; Smirnov, S.V.; Chernykh, A.I.; Terekhov, I.S. The loglog growth of channel capacity for nondispersive nonlinear optical fiber channel in intermediate power range. Extension of the model. Phys. Rev. E 2019, 99, 012133.
  22. Keykhosravi, K.; Durisi, G.; Agrell, E. Accuracy Assessment of Nondispersive Optical Perturbative Models Through Capacity Analysis. Entropy 2019, 21, 760.
  23. Mitra, P.; Stark, J.B. Nonlinear limits to the information capacity of optical fibre communications. Nature 2001, 411, 1027–1030.
  24. Hochberg, D.; Molina-Paris, C.; Perez-Mercader, J.; Visser, M. Effective action for stochastic partial differential equations. Phys. Rev. E 1999, 60, 6343–6360.
  25. Gardiner, C.W. Handbook of Stochastic Methods for Physics, Chemistry and the Natural Sciences; Springer: New York, NY, USA, 1985.
  26. Feynman, R.P.; Hibbs, A.R. Quantum Mechanics and Path Integrals; McGraw-Hill: New York, NY, USA, 1965.
  27. Lavrentiev, M.A.; Shabat, B.V. Method of Complex Function Theory; Nauka: Moscow, Russia, 1987.
  28. Gradshteyn, I.S.; Ryzhik, I.M. Table of Integrals, Series, and Products; Academic Press: Orlando, FL, USA, 2014.
  29. Terekhov, I.S.; Reznichenko, A.V.; Turitsyn, S.K. Calculation of mutual information for nonlinear communication channel at large SNR. Phys. Rev. E 2016, 94, 042203.
  30. Reznichenko, A.V.; Terekhov, I.S. Channel Capacity and Simple Correlators for Nonlinear Communication Channel at Large SNR and Small Dispersion. Proc. IEEE Int. Symp. Inf. Theory 2018, 186–190.
  31. Reznichenko, A.V.; Terekhov, I.S.; Turitsyn, S.K. Calculation of mutual information for nonlinear optical-fiber communication channel at large SNR within path-integral formalism. J. Phys. Conf. Ser. 2017, 826, 012026.
  32. Reznichenko, A.V.; Terekhov, I.S. Channel Capacity Calculation at Large SNR and Small Dispersion within Path-Integral Approach. J. Phys. Conf. Ser. 2018, 999, 012016.
  33. Reznichenko, A.V.; Terekhov, I.S. Investigation of Nonlinear Communication Channel with Small Dispersion via Stochastic Correlator Approach. J. Phys. Conf. Ser. 2019, 1206, 012013.
Figure 1. The dependence of the probability density function P[Y|X] on |Y| for X = 2 mW^{1/2} and the argument arg(Y) = μ = 4. Plot (a) corresponds to the signal-to-noise power ratio SNR = 8; plot (b) corresponds to SNR = 100. The solid and dashed lines correspond to the exact expression (29) and the approximation (53), respectively.
Figure 2. Shannon capacity, the lower bound C_0, and the asymptotic capacity bound (70) for the channel parameters from Table 1. The red dash-dotted line corresponds to the Shannon limit log(1 + SNR), the black solid line corresponds to the lower bound C_0, see Equation (82), and the blue dashed line corresponds to the bound (70).
Figure 3. [This figure is taken from Ref. [22]]. Capacity bounds for the RPC and LPC models of Ref. [22], together with the capacity of the per-sample NLSE model. The channel parameters are as follows: γ = 1.27 W^{−1} km^{−1}, L = 5000 km, Q L = 7.36 × 10^{−3} mW. Our result (82) is presented by the green solid line ("lower bound in [23]"). The intermediate power range extends from P ≈ −20 dBm to P ≈ 6 dBm.
Figure 4. [This figure is taken from Ref. [21]]. The real part of the relative difference of the coefficient C_k and the correlator (97), in units of 10^{−3}, as a function of |C_k|², see [21]. Dash-dotted, dashed, and solid lines correspond to the analytic representation (97) for the time grid spacings Δ_1, Δ_2, Δ_3, respectively. Circles, squares, and diamonds correspond to the numerical results for the time grid spacings Δ_1, Δ_2, Δ_3, respectively.
Figure 5. [This figure is taken from Ref. [21]]. The imaginary part of the relative difference of the coefficient C_k and the correlator (97), in units of 10^{−3}, as a function of |C_k|², see [21]. Dash-dotted, dashed, and solid lines correspond to the analytic representation (97) for the time grid spacings Δ_1, Δ_2, Δ_3, respectively. Circles, squares, and diamonds correspond to the numerical results for the time grid spacings Δ_1, Δ_2, Δ_3, respectively.
Figure 6. [This figure is taken from Ref. [21]]. Shannon capacity and the mutual information I_{P_opt^(m)} for the parameters presented in Table 2, T_0 = 10^{−10} s, and the Gaussian shape (101) of f(t). The black dotted line corresponds to the Shannon limit log[P T_0/(Q L)], the black solid line corresponds to I_{P_opt^(m)}, see Equation (115), and the black dash-dotted line corresponds to the asymptotics (118) for large γLP.
Table 1. Channel parameters for the per-sample model.
γ = 10^{−3} (km·mW)^{−1};  Q = 1.5 × 10^{−7} mW/km;  L = 10^3 km.
Table 2. Channel parameters for the extended model.
γ = 1.25 (km·W)^{−1};  Q = 5.94 × 10^{−21} W/(km·Hz);  L = 800 km.
Table 3. Time grid spacings.
Δ_1 = 9.77 × 10^{−14} s;  Δ_2 = 1.95 × 10^{−13} s;  Δ_3 = 3.91 × 10^{−13} s.

