1. Introduction
Polar codes are the first family of error correcting codes that provably achieve the capacity for symmetric binary-input discrete memoryless channels (B-DMCs) with low complexity successive cancellation (SC) decoding [
1]. Arikan introduced the channel polarization that splits the capacities of information channels to one or zero as block length becomes longer [
1]. As a dual phenomenon of the channel polarization, the conditional entropies in source coding context are also polarized when the joint entropy is broken down by the chain rule (this phenomenon is called the “source polarization”) [
2]. The source coding using polar codes based on the source polarization is also proven to be optimal for both lossless [
3] and lossy source coding [
4]. In addition, polar codes have been extended to multi-terminal problems such as multiple access and broadcast channels in channel coding [
5,
6,
7,
8], and Slepian–Wolf and Gelfand–Pinsker problems in source coding [
9,
10,
11].
From the channel coding perspective, even though the SC decoding of polar codes achieves the capacity, it was not as competitive in finite length performance as low-density parity-check (LDPC) codes and turbo codes [
12,
13]. However, when both the concatenation of polar codes with cyclic redundancy check (CRC) codes and the SC list (SCL) decoding that keeps the best multiple decoding paths are used, the finite-length polar coding performance becomes comparable to the LDPC coding [
12]. Furthermore, a parity-check concatenation was proposed to improve the performance of polar codes based on CRC-aided SCL decoding [
14]. Recently, thanks to their excellent error performance and fine rate flexibility [
15], polar codes have been adopted as a channel code for the control channel of 5G new radio of the 3rd generation partnership project [
16].
In order for polar codes to show comparable error performance to the LDPC codes in moderate block lengths, the SCL decoding requires a rather large list size, e.g.,
. The list size
L determines the maximum number of decoding paths and affects the complexity of log-likelihood ratio (LLR) calculation and sorting of path metrics (PMs). It is a linear factor in the decoding complexity. According to [
17,
18], even for a list size smaller than 10, the SCL decoder has larger complexity than the LDPC decoder, which leads to a considerable gap in power consumption between those decoders. Therefore, a low complexity realization of the SCL decoding is important from a practical perspective, and some recent research has reduced the decoding complexity by efficiently managing decoding paths. In [
19], decoding paths do not split when one of two child-paths is dominant. In addition, the decoding paths that have low reliability are eliminated during decoding in [
20]. The operational decoding complexity of those schemes was lowered by the reduced average number of decoding paths.
Lately, some studies have used multiple CRC codes to address different complexity issues of the decoding [
21,
22,
23]. The scheme in [
21] aims to reduce the memory complexity by selecting only the most likely path after each CRC operation where more than one CRC is applied. In [
22], the complexity reduction due to CRC-based early stopping of decoding was reported. Finally, the method in [
23] reduces the worst decoding latency and space complexity for storing intermediate LLRs.
This paper addresses polar coding with multiple CRC codes similarly to [
21,
22,
23], but we focus more on optimization of the scheme in terms of decoding complexity. First, we define the operational complexity, and then minimize it by optimizing CRC positions. In addition, the optimization of CRC positions is extended to a modified decoding that uses a complexity reduction trick called “instant decision” (first proposed in [
24]). It is shown that the optimized CRC placement considerably lowers the decoding complexity of the multi-CRC scheme in a wide range of signal-to-noise ratio (SNR).
The rest of this paper is organized as follows. In
Section 2, the SCL decoding and its operational complexity are introduced, and in
Section 3, the encoding and decoding methods for the multi-CRC scheme, and the optimization algorithm for CRC positions are explained.
Section 4 provides the performance and complexity of the proposed and conventional schemes, and
Section 5 concludes this paper.
3. Low Complexity SCL Decoding with Multiple CRC Codes
As the list size
L increases, the performance of the SCL decoding is improved, but the operational complexity also increases. Actually, in the LLR-based SCL decoding under consideration, the total operational complexity is the sum of
as in Equation (
5), where each contributing term at
is linearly or slightly superlinearly related with
. Therefore, the goal of this work is to reduce the operational complexity of the SCL decoding by managing the temporary list size lower by the aid of multiple CRC codes.
Previous research utilized multiple CRC codes to reduce the memory and time complexity of the corresponding SCL decoding [
21,
23]. Although multiple CRC codes were used to reduce the operational complexity in [
22], in the complexity evaluation, the temporary list size was not counted and only uniform (equal-length) message partitioning was considered.
However, we consider the operational complexity reduction via nonuniform message partitioning and temporary list size management. If we can keep the temporary list size small, the overall complexity can be reduced. In this paper, in order to minimize the complexity effectively, the positions of CRC codes are optimized and a modified SCL decoding is employed.
3.1. Encoding and Decoding of the J-Tuple CRC Scheme
Let us consider a precoding of a partitioned message block with multiple CRC codes, followed by polar encoding. Before the polar encoding, the message vector
is partitioned into
J subblocks
, and they are encoded serially with
J-tuple CRC codes as shown in
Figure 3. A CRC encoding is carried out recursively to the concatenation of a CRC encoded block and a message subblock. CRC code lengths are defined by
, and the CRC parities are denoted by
. Through this nested encoding structure, undetected error by previous CRCs can be detected by following CRCs. The length of
j-th message subblock is denoted by
and the total sum of
is
K. Likewise, the total sum of
is
P. The CRC encoded message is encoded by a polar code. This scheme is termed the “
J-tuple CRC scheme”. In this scheme, the lengths of
J message subblocks can be determined by optimization of CRC positions that will be explained in
Section 3.2.
We introduce a decoding scheme for the J-tuple CRC scheme that is described in Algorithm 2. The temporary list size update is included in Algorithm 2 for a better presentation of the decoding procedure. The decoding is based on the conventional SCL decoding. However, the decoding is different in two aspects. First, CRC operations are additionally conducted to discard incorrect paths (lines 12–13 in Algorithm 2). Thus, due to the added CRCs, the temporary list size vector varies depending on the decoding instance. Note, however, of the SCRC scheme is invariant because decoding path validity is checked only once at the bottom of the decoding tree. If any of the CRCs fail in all decoding paths during the decoding, then the decoding is terminated and a block error is declared. This event is called “early stopping” that is rare at a high SNR but frequent at a low SNR. If the decoder finds paths with valid CRC, all such paths proceed to the next step. Note that we do not assume that best effort decoding is required.
Algorithm 2 The ID-SCL decoding for the J-tuple CRC scheme |
Require: Received vector, , , , Ensure: Estimated codeword
- 1:
; - 2:
for do - 3:
if then - 4:
if then - 5:
SCL decoding of maintaining number of survival paths and ; - 6:
else - 7:
SCL decoding of with list size of and ; - 8:
end if - 9:
else - 10:
Frozen bit decoding and ; - 11:
end if - 12:
if i indicates the last parity bit of the j-th CRC then - 13:
Discard jth-CRC-invalid paths and update ; - 14:
end if - 15:
end for - 16:
return ML codeword that passes the J-th CRC;
|
At each CRC operation, only valid paths—from a CRC perspective—survive so that the temporary list size is spontaneously reduced to one or just a few. The temporary list grows as the decoding proceeds, but if frozen bits follow consecutively, then the growth can progress slowly. This sporadic reduction of temporary list due to multiple CRC operations induces non-negligible complexity reduction.
Second, we consider a simple complexity reduction criterion for list management that was first proposed in [
24] for reduction of sorting complexity. When a highly reliable bit is processed, the list is forced not to grow (lines 4–5 in Algorithm 2) and the sorting operation is skipped. Let
be the set of indices where the capacity of corresponding split channels exceeds a certain threshold. (
may alternatively be defined in terms of bit error rate.(period position)) When
is processed, if
, we run the SC decoding instead of SCL decoding. The SC decoding decides the value of
so that the temporary list size is maintained as
even if
. Because
for
is highly reliable, the correct path may not be ruled out by the value decision of
. The list decoding with this criterion is termed the “instant-decision-aided SCL (ID-SCL) decoding”. (Here, we view the conventional SCL decoding operation as a delayed decision.)
In fact, the instant decision significantly reduces the complexity even though multiple CRC codes are not applied because it skips sorting operations that account for a large portion of the total complexity. In addition, the instant decision also has an effect of maintaining the temporary list size lower. Therefore, it is expected that the multi-CRC scheme including its optimization can also reduce the complexity under the ID-SCL decoding.
In the next subsection, the method of how to optimize CRC positions and determine is explained based on the aforementioned encoding and decoding methods. Note, however, that the J-tuple CRC scheme can be decoded with or without the instant bit decision. Obviously, CRC positions can be also optimized without the instant decision. The decoding without the instant decision is called the basic SCL decoding for the J-tuple CRC scheme.
3.2. Optimization of J-Tuple CRC Positions with Respect to Complexity
We can optimize the positions of CRCs to minimize the operational decoding complexity of the J-tuple CRC scheme. However, as mentioned above, since the decoding complexity of the J-tuple CRC scheme is variable depending on the decoding instance, we perform the optimization considering the average complexity based on the assumption that errors occur randomly in wrong paths. In order to simplify the problem, we input predetermined lengths of the J-tuple CRC codes into the optimization. Through decoding experiments, we determine empirically with regard to performance and complexity such that , where P is the CRC length of the compared SCRC scheme. In addition, because the minimum distance of the CRC concatenated polar code is affected by the length of the last CRC, we allocate more bits to the last CRC code than others.
Now let us first consider optimization of the positions of
J CRCs under the basic SCL decoding. Note that the CRC positions are determined by the lengths of message subblocks. As mentioned, the basic SCL decoding for the
J-tuple CRC scheme follows Algorithm 2 without instant bit decision. Therefore, for the
J-tuple CRC scheme with a predetermined
, the optimization follows as:
where
is the set of integers. The average temporary list
is obtained by Algorithm 3 with the input
,
and
. Algorithm 3 determines the average temporary list size
by
first, and when the
j-th CRC is possible, reduces the
according to the function of
as shown at line 5. Here, we assume that channel quality is high enough so that the correct path always exists in the list, and many errors randomly occur in wrong paths. Then,
paths survive with probability of
by the
j-th CRC. Through the Monte Carlo simulation introduced in
Section 4, we confirmed that the estimated complexity based on this assumption is very close to the experimental one. The average sorting complexity vector
is generated by Equation (
3) with
obtained by Algorithm 3. Note that the optimization is appropriate for a small
J because it is a joint optimization of
J variables. In order to avoid high complexity search for a large
J, one can apply a greedy search that determines CRC positions, one after another.
For a lower complexity implementation, the ID-SCL decoding is combined with the optimization of the
J-tuple CRC positions of Equation (
6). As mentioned in
Section 3.1, the ID-SCL decoding instantly decides the information bits
, maintaining the temporary list size. Assume the good information set
is defined by
, where
is the bit error probability of the
i-th split channel, and
is a certain threshold. Let us consider the effect of the ID-SCL decoding on the CRC position optimization. For example, assume there exists a burst of good split channels defined by
, and a CRC is located before the burst of good split channels. Then, the CRC reduces the number of decoding paths, and the number of survival paths remains until the end of the burst because of the ID-SCL decoding. Eventually, the interworking of the ID-SCL decoding and CRC position optimization will give obvious complexity reduction. A more detailed example about the temporary list size is presented in
Section 4.
Algorithm 3 Calculation of ’s for the basic SCL decoding of the J-tuple CRC scheme |
Require: , , Ensure:
- 1:
- 2:
for do - 3:
; - 4:
if i indicates the last parity bit of the j-th CRC then - 5:
; - 6:
end if - 7:
end for - 8:
return ;
|
When we employ the ID-SCL decoding, the average temporary list size
is computed by
of Algorithm 4 instead of
. Similarly, let
in Algorithm 4. Entire
is obtained by the modified version of Algorithm 3 in which
at line 3 is replaced with
g(·), and the input of Algorithm 3 is changed to
,
,
and
. In addition, the average complexity of the sorting operation is modified as
where the bitonic sorter is assumed. Eventually, the optimization of CRC positions combined with the ID-SCL decoding is performed by Equation (
6) with new
and
calculated with Algorithm 4 and Equation (
7).
Algorithm 4 Calculation of with for the ID-SCL decoding, g(,,,i) |
Require: , , , i Ensure:
- 1:
if then - 2:
if then - 3:
; - 4:
else - 5:
; - 6:
end if - 7:
else - 8:
; - 9:
end if - 10:
return ;
|
In the next section, we compare the proposed and conventional schemes with mixed criteria of complexity and performance. That is, we assess the complexity reduction with only negligible performance degradation.
4. Simulation Results and Analysis
This section provides performance and complexity of the proposed and conventional schemes. In this paper, a Monte Carlo simulation has been carried out to evaluate the decoding performance and complexity. The block lengths for simulation are 1024, 4096 and only rate one-half is considered. We use a binary-input additive white Gaussian noise channel. The number of total CRC parities
and the list size
for all schemes. For
J-tuple CRC schemes,
are considered, and
are used, respectively. All codes are optimized at
by the density evolution [
28]. As the conventional schemes, the SCRC scheme and the
J-tuple CRC scheme that has
J uniform message subblocks [
22] are evaluated for comparison. The former is abbreviated as “SCRC” and the latter as “U-JCRC”. The
J-tuple CRC scheme is optimized in terms of the basic SCL decoding or the ID-SCL decoding. They are all abbreviated as “O-JCRC”. For all the schemes, simulation is performed under the basic and the ID-SCL decoding.
Table 1 shows the message vector
of O-JCRC and U-JCRC. In general, the optimal
,
, differs depending on whether the ID-SCL decoding is applied or not. The threshold
for the ID-SCL decoding also differs according to the block length and
J because it is determined empirically to minimize the performance loss, and maximize the complexity reduction for the SCRC scheme. The ratio
is about 0.746 and 0.806 for
, respectively.
The comparison of the average temporary list sizes between the proposed and conventional 3-CRC schemes for
,
and
is shown in
Figure 4.
Figure 4a is under the basic SCL decoding, and
Figure 4b under the ID-SCL decoding. For SCRC in
Figure 4a, the temporary list size slowly increases up to
L until the bit index reaches about 220. In the case of U-3CRC, the average temporary list size decreases from
L to
two times due to use of two 4-bit CRC codes. Its temporary list size, however, quickly increases to
L again. For the proposed scheme, the temporary list size of O-3CRC is also dropped two times, but the list size is maintained small for a burst of consecutive frozen bits. In
Figure 4b, the trend of the temporary list size is similar to that of
Figure 4a. However, for O-3CRC, the list size stays small longer due to instant decisions of highly reliable bits on top of frozen operations.
Table 2 and
Table 3 show the complexity ratio of the
J-tuple CRC schemes to the SCRC scheme with the basic SCL and the ID-SCL decoding. The complexity ratios are estimated by the analysis in
Section 3, and the estimation is validated by simulation results. Here, O-2CRC and O-3CRC schemes are optimized for serial and parallel sorters separately. As mentioned, the analysis is done under the assumption that there is no early stopping of decoding that indicates a detected codeword error. This assumption is consistent with a high SNR environment. Thus, the experimental complexity is measured at the high SNR, e.g.,
dB for
, respectively, by averaging the number of addition-equivalent operations. The complexity ratio from the experiment is in the parentheses of
Table 2 and
Table 3. It is observed that the experimental values are very close to the estimated ones. Note that the optimization of
J-CRC scheme is performed based on the estimated complexity and the estimation turned out to be pretty accurate. By the way, a large complexity reduction of the ID-SCL decoding in
Table 3 is obtained due to the skipping of sorting operations.
Figure 5,
Figure 6 and
Figure 7 give the performance and complexity comparison for the optimized 2-CRC and 3-CRC schemes in a wide range of SNR. The complexity ratio includes the reduction from the early stopping in the low SNR region, and it is calculated based on the complexity of SCRC (under the basic SCL decoding). Only the parallel bitonic sorting is assumed in the scheme optimization and complexity evaluation. The average complexity is obtained via simulation.
Figure 5 and
Figure 6 show the performance and complexity ratio of the proposed and conventional 2-CRC schemes. In
Figure 5a and
Figure 6a, the negligible performance gap between the proposed and conventional schemes is observed, and the error performance loss is within 0.05 dB. However,
Figure 5b and
Figure 6b display obvious visible difference in complexity. Especially, when the ID-SCL decoding is applied to the SCRC scheme, about 40% of the complexity is reduced. For
, the complexity reduction of O-2CRC under the basic SCL decoding is about 7%
points compared to U-2CRC at
dB, but when the ID-SCL decoding is used, about 4%
points are reduced compared to “U-2CRC with ID”. In addition, at the low SNR, more complexity reduction is acquired thanks to the early stopping. For
, the complexity reduction of O-2CRC under the basic SCL decoding is about 22%
points compared to U-2CRC at
dB, and about 18%
points are reduced under the ID-SCL decoding compared to “U-2CRC with ID”.
Figure 7 depicts the complexity ratio of the proposed and conventional 3-CRC schemes. The performances for the 3-CRC schemes are not included in this paper because they are virtually the same as for the 2-CRC scheme. For the 3-CRC scheme, the complexity reduction from the early stopping is smaller than the 2-CRC scheme because the CRC of four bits located in the middle of vector
can not eliminate most incorrect paths. Especially, if
and a CRC of four bits is used, then
paths survive on average, which implies a lower probability of early decoding termination than the 2-CRC scheme. However, if the ID-SCL decoding is applied, there is more complexity reduction at the high SNR. For
, about 12%
points (about 9%
points) are reduced compared to U-3CRC (U-3CRC with ID), and about 11%
points (about 9%
points) for
compared to U-3CRC (U-3CRC with ID).
Note that, for the proposed J-tuple CRC scheme, the complexity reduction can be increased, while keeping the performance level by selecting rigorously, not based on a threshold . This paper provides a simple and systematic algorithm that minimizes the complexity and performance loss by adjusting the single parameter .