Article

Bounds on the Probability of Undetected Error for q-Ary Codes

1 School of Mathematical Sciences, Anhui University, Hefei 230601, China
2 State Grid Anhui Electric Power Co., Ltd., Hefei 230601, China
3 I2M, CNRS, Aix-Marseille University, Centrale Marseille, 13009 Marseille, France
* Author to whom correspondence should be addressed.
Entropy 2023, 25(9), 1349; https://doi.org/10.3390/e25091349
Submission received: 13 August 2023 / Revised: 13 September 2023 / Accepted: 15 September 2023 / Published: 17 September 2023
(This article belongs to the Special Issue Discrete Math in Coding Theory)

Abstract

We study the probability of an undetected error for general q-ary codes. We give upper and lower bounds on this quantity, by linear programming and polynomial methods, as functions of the length, size, and minimum distance. Sharper bounds are obtained in the important special case of binary Hamming codes. Finally, several examples are given to illustrate the results of this paper.

1. Introduction

Let $A = \{a_1, \ldots, a_q\}$ be an alphabet with $q$ distinct symbols, where $q \ge 2$ and the alphabet need not have any structure. For instance, $A$ can be $\mathbb{F}_q$, the finite field with $q$ elements, or $\mathbb{Z}_q$, the ring of integers modulo $q$. Moreover, a linear $[n, k]$ code is a subspace of the vector space $\mathbb{F}_q^n$, and $k$ is the dimension of the subspace. For every two vectors $x, y \in A^n$, the (Hamming) distance $d_H(x, y)$ between $x$ and $y$ is defined as the number of coordinates in which they differ. A nonempty subset $C$ of $A^n$ with cardinality $M$ is called a $q$-ary $(n, M)$ code, whose elements are called codewords. The minimum distance $d$ of the code $C$ is the minimum distance between any two different codewords in $C$. The distance distribution of $C$ is defined as
$$A_i = \frac{1}{M}\,\bigl|\{(x, y) : x, y \in C,\ d_H(x, y) = i\}\bigr|, \qquad i = 0, 1, \ldots, n.$$
Assume that the code $C$ is used for error detection on a discrete memoryless channel with $q$ inputs and $q$ outputs. Each symbol transmitted has a probability $1 - p$ of being received correctly and a probability $p_q = p/(q-1)$ of being transformed into each of the $q - 1$ other symbols. It is natural to let $0 \le p \le (q-1)/q$. Such a channel model is called a $q$-ary symmetric channel $qSC(p)$. When a code is used on the $q$-ary symmetric channel $qSC(p)$, each of the $q-1$ possible symbol errors occurs with probability $\frac{p}{q-1}$ per symbol.
Let $x \in C$ be the codeword transmitted and $y = x + e \in \mathbb{F}_q^n$ be the vector received, where $e = y - x$ is the error vector caused by the channel noise. Obviously, $e \in C$ if and only if $y \in C$. Note that the decoder will accept $y$ as error free if $y \in C$. Clearly, this decision is wrong when $e \ne 0$, and such an error is not detected. Thus, when error detection is being used, the decoder will make a mistake and accept a codeword which is not the one transmitted if and only if the error vector is a nonzero codeword [1,2]. The probability that the decoder fails to detect the existence of an error is called the probability of undetected error, denoted by $P_{ue}(C, p)$ and defined as
$$P_{ue}(C, p) = \sum_{j=1}^{n} A_j \left(\frac{p}{q-1}\right)^j (1-p)^{n-j}.$$
In general, the smaller the probability of undetected error $P_{ue}$ is for a given $p$, the better the code performs in error detection. However, this function is difficult to characterize in general.
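Equation (2) is straightforward to evaluate once the distance distribution is known. The following sketch (our own illustration, not part of the paper; the function name is ours) computes $P_{ue}$ from a distance distribution stored as a list:

```python
from math import isclose

def p_ue(A, p, q=2):
    """Probability of undetected error, Eq. (2):
    P_ue(C, p) = sum_{j >= 1} A_j (p/(q-1))^j (1-p)^(n-j),
    where A[j] is the distance distribution and n = len(A) - 1."""
    n = len(A) - 1
    pq = p / (q - 1)
    return sum(A[j] * pq**j * (1 - p)**(n - j) for j in range(1, n + 1))

# The [3, 1, 3] binary repetition code has A_0 = A_3 = 1, so P_ue = p^3.
assert isclose(p_ue([1, 0, 0, 1], 0.1), 0.1**3)
```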
As for the code $C$, comparing its $P_{ue}$ with the average probability $\overline{P_{ue}}$ [3,4] over the ensemble of all $q$-ary linear $[n, k]$ codes is a natural way to decide whether $C$ is suitable for error detection or not, where
$$\overline{P_{ue}}(p) = q^{-(n-k)}\left[1 - (1-p)^k\right].$$
According to [4], there exists a code $C$ such that $P_{ue}(C, p) > q^{-(n-k)}$, and there are many codes whose $P_{ue}$ is smaller than $q^{-(n-k)}$. In fact, it was commonly assumed in [5] that $P_{ue}(C, p) \le q^{-(n-k)}$ for a linear $[n, k]$ code $C$, where $q^{-(n-k)} = q^{-r}$ is called the $q^{-r}$ bound. The $q^{-r}$ bound is satisfied for certain specific codes, e.g., Hamming codes and binary perfect codes, when $0 < p < 1/2$.
For the worst channel condition, i.e., when $p = (q-1)/q$,
$$P_{ue}\left(C, \frac{q-1}{q}\right) = q^{-(n-k)}\left(1 - \frac{1}{q^k}\right) = \overline{P_{ue}}\left(\frac{q-1}{q}\right).$$
From the above formula, a code $C$ is called good if $P_{ue}(C, p) \le P_{ue}(C, (q-1)/q)$ for all $0 < p < (q-1)/q$. In particular, if $P_{ue}(C, p)$ is an increasing function of $p$ on the interval $[0, (q-1)/q]$, then the code is good, and the code is called proper. There are many proper codes [1], for example, perfect codes (and their extended codes and their dual codes), primitive binary 2-error-correcting BCH codes, a class of punctured Simplex codes, MDS codes, and near-MDS codes (see [5,6,7,8,9] for details). Moreover, for practical purposes, a good binary code $C$ may be defined a bit differently, i.e., $P_{ue}(C, p) \le c\,P_{ue}(C, 1/2)$ for every $0 \le p \le 1/2$ and a reasonably small $c \ge 1$. Furthermore, an infinite class $\mathcal{C}$ of binary codes is called uniformly good if there exists a constant $c$ such that for every $0 \le p \le 1/2$ and every $C \in \mathcal{C}$, the inequality $P_{ue}(C, p) \le c\,P_{ue}(C, 1/2)$ holds. Otherwise, it is called ugly; for example, some special Reed–Muller codes are ugly (see [10]).
Another way to assess the performance of a code for error detection is to give bounds on the probability of undetected error. In [11], Abdel-Ghaffar defined the combinatorial invariant $F_j$ of the code $C$ and proved that
$$P_{ue}(C, p) = \sum_{j=1}^{n} F_j \left(\frac{p}{q-1}\right)^j \left(1 - \frac{qp}{q-1}\right)^{n-j},$$
where
$$F_j = \sum_{i=1}^{j} A_i \binom{n-i}{n-j}, \qquad j = 1, 2, \ldots, n.$$
Using combinatorial arguments, Abdel-Ghaffar [11] obtained a lower bound on the undetected error probability $P_{ue}(C, p)$. Later, Ashikhmin and Barg called the $F_j$ the binomial moments of the distance distribution and derived more bounds for $P_{ue}$ (see [12,13]).
In particular, constant weight codes are attractive and many bounds have been developed for them, for example, for binary constant weight codes (see [14,15]) and $q$-ary constant weight codes (see [16]). In fact, the probability of an undetected error for binary constant weight codes has been studied and can be given explicitly (see [14,16]).
Note that when $A = \mathbb{F}_q$ and $p \to 0$, according to Equation (2), we have
$$P_{ue}(C, p) \sim A_d\, p_q^d\,(1-p)^{n-d},$$
where $p_q = p/(q-1)$, $d$ is the minimum distance of $C$, and $A_d$ is called the kissing number of the linear code $C$. In 2021, Solé et al. [17] studied the kissing number by linear programming and the polynomial method. They gave bounds for $A_d$ under different conditions and tabulated them for some special parameters. Motivated by that work, this paper is devoted to studying the function $P_{ue}$ using the same techniques.
The rest of this paper is organized as follows. In Section 2, we briefly give the definition of the (dual) distance distribution of $q$-ary codes and present some trivial bounds on the probability of an undetected error. In Section 3.1, linear programming bounds are discussed. The applications of Krawtchouk polynomials (the polynomial method) to error detection are given in Section 3.2. In Section 4, some bounds better than the $2^{-m}$ bound are given for binary Hamming codes. Finally, we end with some concluding remarks in Section 5.

2. Preliminaries

Recall some basic definitions and notations from [2,18,19,20]. Throughout this paper, to simplify some formulas, we let $p_q = \frac{p}{q-1}$ and $k = \log_q |C|$ for some real $k$. Furthermore, in this paper, it is natural to assume $p < (q-1)(1-p)$, or equivalently, $p_q < 1 - p$.

2.1. Dual Distance Distribution

Assume that $A = \mathbb{F}_q$ is the finite field of size $q$ and $C$ is a subspace of $\mathbb{F}_q^n$, i.e., $C$ is a linear code over $\mathbb{F}_q$. Then, the dual code $C^{\perp}$ of $C$ is the orthogonal complement of the subspace $C$. That is to say,
$$C^{\perp} = \{v \in \mathbb{F}_q^n : v \cdot u = 0\ \text{for all}\ u \in C\},$$
where $v \cdot u = \sum_{i=1}^{n} v_i u_i$, $u = (u_1, \ldots, u_n)$ and $v = (v_1, \ldots, v_n)$. The distance distribution $A_i^{\perp}$ of $C^{\perp}$ can be determined similarly. It is well known (see Chapter 5, §2 in [2]) that
$$A_i^{\perp} = \frac{1}{|C|}\sum_{j=0}^{n} A_j\,P_i(j),$$
where $P_i(j)$ denotes the Krawtchouk polynomial of degree $i$. For each integer $q \ge 2$, the Krawtchouk polynomial $P_k(x; n)$ is defined as
$$P_k(x; n) = \sum_{j=0}^{k} (-1)^j \binom{x}{j}\binom{n-x}{k-j}(q-1)^{k-j}.$$
When there is no ambiguity about $n$, the function $P_k(x; n)$ is often abbreviated to $P_k(x)$.
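As a quick illustration (our own, not part of the paper), the Krawtchouk polynomials and the transform of Equation (4) can be evaluated directly. For the binary $[3,1]$ repetition code, the transform recovers the distance distribution of its dual, the $[3,2]$ even-weight code:

```python
from math import comb

def krawtchouk(k, x, n, q=2):
    """P_k(x; n) = sum_j (-1)^j C(x, j) C(n-x, k-j) (q-1)^(k-j)."""
    return sum((-1)**j * comb(x, j) * comb(n - x, k - j) * (q - 1)**(k - j)
               for j in range(k + 1))

def dual_distribution(A, q=2):
    """Dual distance distribution, Eq. (4): A_i^perp = (1/|C|) sum_j A_j P_i(j)."""
    n = len(A) - 1
    M = sum(A)  # |C|, since the distance distribution sums to the code size
    return [sum(A[j] * krawtchouk(i, j, n, q) for j in range(n + 1)) / M
            for i in range(n + 1)]

# [3,1] repetition code {000, 111}: A = [1, 0, 0, 1].
# Its dual is the [3,2] even-weight code with distribution [1, 0, 3, 0].
assert dual_distribution([1, 0, 0, 1]) == [1, 0, 3, 0]
```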
Note that Equation (4) holds when $C$ is linear. When $C$ is nonlinear, the dual distance distribution $A_i^{\perp}$ is defined by Equation (4). Furthermore, by the MacWilliams–Delsarte inequality,
$$A_i^{\perp} \ge 0$$
holds for all $i = 0, 1, \ldots, n$. Moreover, $A_0 = 1$ and
$$q^k = 1 + \sum_{j=1}^{n} A_j, \qquad \text{when } |C| = q^k.$$

2.2. Probability of Undetected Error

The $q$-ary symmetric channel with symbol error probability $p$, where $0 \le p \le (q-1)/q$, is defined as follows: symbols from an alphabet $A$ with $q$ elements are transmitted over the channel, and
$$P(b \text{ received} \mid a \text{ sent}) = \begin{cases} 1-p, & b = a, \\[2pt] \dfrac{p}{q-1}, & b \ne a, \end{cases}$$
where $P(b \text{ received} \mid a \text{ sent})$ is the conditional probability that $b$ is received, given that $a$ is sent. When a $q$-ary code $C$ is used on such a channel, it is possible that the decoder fails to detect the existence of errors. Thus $P_{ue}$, as a function of the weight distribution of $C$, is given in Equation (2). Clearly, computing it exactly is a difficult problem for large parameters $n$, $k$, $d$, and $q$ (see [2]). Hence, it is better to give bounds for $P_{ue}$. For example, here are some trivial bounds.
Theorem 1.
For every $q$-ary code $C$ with $|C| = q^k$, if $p < (q-1)(1-p)$, then
$$(q^k - 1)\,p_q^n \le P_{ue}(C, p) \le (q^k - 1)\,p_q^d\,(1-p)^{n-d},$$
where $p_q = \frac{p}{q-1}$. In particular, when $q = 2$ and $0 < p < \frac{1}{2}$, we have
$$(2^k - 1)\,p^n \le P_{ue}(C, p) \le (2^k - 1)\,p^d\,(1-p)^{n-d}.$$
Proof. 
It is easy to check that $p_q^j(1-p)^{n-j} > p_q^{j+1}(1-p)^{n-j-1}$ if and only if $p < (q-1)(1-p)$. Hence,
$$P_{ue} = \sum_{j=d}^{n} A_j\,p_q^j(1-p)^{n-j} \le p_q^d(1-p)^{n-d}\sum_{j=d}^{n} A_j = (q^k - 1)\,p_q^d(1-p)^{n-d},$$
since $p_q^j(1-p)^{n-j} \le p_q^d(1-p)^{n-d}$ when $j \ge d$. The lower bound can be obtained similarly. □
The above bounds are trivial. However, both are tight, because simplex codes over the finite field $\mathbb{F}_q$ attain them.
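For instance, the binary simplex code of dimension 3 is a $[7, 3, 4]$ code in which every nonzero codeword has weight 4, so the upper bound of Theorem 1 is attained exactly. A small numerical check (our own, with $q = 2$):

```python
# [7,3,4] simplex code: A_4 = 2^3 - 1 = 7 and all other A_j = 0 for j >= 1,
# so P_ue equals the Theorem 1 upper bound.
n, k, d = 7, 3, 4
for p in (0.01, 0.1, 0.3):
    pue = 7 * p**4 * (1 - p)**3                    # exact P_ue of the simplex code
    upper = (2**k - 1) * p**d * (1 - p)**(n - d)   # Theorem 1 upper bound
    assert abs(pue - upper) < 1e-15
```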

2.3. Some Special Bounds

It is clear that the general bounds given by Theorem 1 may be much larger (or smaller) than the true value of $P_{ue}$ for a fixed code. If the distance distribution is known, one can compute $P_{ue}(C, p)$ exactly (as a function of $p$); and if we know some partial information about the distance distribution, then we may still derive bounds. The following is a special case; further ideas along these lines appear in Section 4.
Theorem 2.
Let $C$ be a binary code with $A_n = 1$ and $A_i = A_{n-i}$ for $1 \le i \le n-1$. Then
$$P_{ue} = \begin{cases} p^n + \displaystyle\sum_{j=d}^{t} A_j\left[p^j(1-p)^{n-j} + p^{n-j}(1-p)^j\right], & n = 2t+1, \\[6pt] p^n + A_t\,p^t(1-p)^t + \displaystyle\sum_{j=d}^{t-1} A_j\left[p^j(1-p)^{n-j} + p^{n-j}(1-p)^j\right], & n = 2t. \end{cases}$$
Moreover, when $d \le t$, we have
$$P_{ue} \le \begin{cases} p^n + \left(2^{k-1} - 1\right)\left[p^d(1-p)^{n-d} + p^{t+1}(1-p)^t\right], & n = 2t+1, \\[6pt] p^n + A_t\,p^t(1-p)^t + \left(2^{k-1} - \frac{A_t}{2} - 1\right)\left[p^d(1-p)^{n-d} + p^{t+1}(1-p)^{t-1}\right], & n = 2t, \end{cases}$$
and
$$P_{ue} \ge \begin{cases} p^n + \left(2^{k-1} - 1\right)\left[p^t(1-p)^{t+1} + p^{n-d}(1-p)^d\right], & n = 2t+1, \\[6pt] p^n + A_t\,p^t(1-p)^t + \left(2^{k-1} - \frac{A_t}{2} - 1\right)\left[p^{t-1}(1-p)^{t+1} + p^{n-d}(1-p)^d\right], & n = 2t, \end{cases}$$
where $0 < p < \frac{1}{2}$ and $d \le t$.
Proof. 
By the definition of P u e , Equation (7) holds if A i = A n i and A n = 1 . Due to 0 < p < 1 2 , It is easy to check that p n j ( 1 p ) j p j ( 1 p ) n j , where 0 j n / 2 . In addition, if n = 2 t + 1 , then j = d t A j = ( 2 k 2 ) / 2 = 2 k 1 1 . Similarly for the case n = 2 t . Hence, we get the bounds. □
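A tiny self-check of Equation (7) on an example of our own: the binary $[4, 2]$ code $\{0000, 1100, 0011, 1111\}$ has $A_0 = 1$, $A_2 = 2$, $A_4 = 1$, so $A_i = A_{n-i}$ and $A_n = 1$ with $n = 2t = 4$ and $d = t = 2$:

```python
C = [(0, 0, 0, 0), (1, 1, 0, 0), (0, 0, 1, 1), (1, 1, 1, 1)]
n, p = 4, 0.1
# distance distribution: A_i = (1/|C|) #{(x, y) : d_H(x, y) = i}
A = [0] * (n + 1)
for x in C:
    for y in C:
        A[sum(a != b for a, b in zip(x, y))] += 1
A = [a / len(C) for a in A]
assert A == [1, 0, 2, 0, 1]          # A_i = A_{n-i} and A_n = 1
direct = sum(A[i] * p**i * (1 - p)**(n - i) for i in range(1, n + 1))
# Theorem 2, case n = 2t with t = d = 2: P_ue = p^n + A_t p^t (1-p)^t (empty sum)
assert abs(direct - (p**4 + 2 * p**2 * (1 - p)**2)) < 1e-12
```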
Remark 1.
If the binary code $C$ satisfies $A_i = A_{n-i}$ and $A_n = 0$, we can get the following bounds:
$$P_{ue} \le \begin{cases} 2^{k-1}\left[p^d(1-p)^{n-d} + p^{t+1}(1-p)^t\right], & n = 2t+1, \\[6pt] A_t\,p^t(1-p)^t + \left(2^{k-1} - \frac{A_t + A_0}{2}\right)\left[p^d(1-p)^{n-d} + p^{t+1}(1-p)^{t-1}\right], & n = 2t, \end{cases}$$
and
$$P_{ue} \ge \begin{cases} 2^{k-1}\left[p^t(1-p)^{t+1} + p^{n-d}(1-p)^d\right], & n = 2t+1, \\[6pt] A_t\,p^t(1-p)^t + \left(2^{k-1} - \frac{A_t + A_0}{2}\right)\left[p^{t-1}(1-p)^{t+1} + p^{n-d}(1-p)^d\right], & n = 2t. \end{cases}$$
Here, $\mathbf{0}$, the all-zero vector, may not be a codeword.
Example 1.
For a binary linear code, if the all-one vector $\mathbf{1}$ is a codeword, then $A_i = A_{n-i}$. So, Theorem 2 can be applied to many codes, for example, Hamming codes. It is known that the binary Hamming code $H_m$ is a linear $[n = 2^m - 1,\ k = 2^m - 1 - m,\ 3]$ code. The distance distribution of the $[15, 11, 3]$ Hamming code $H_4$ is listed in Table 1. According to Theorem 2, the values of the bounds and the true probability can be seen in Figure 1.

3. Universal Bounds for q-Ary Codes

In this section, we discuss bounds for $P_{ue}$ obtained by different methods. These bounds hold for general codes, so they are not expected to be sharp, and compared with some known bounds they do not perform better. However, as far as we know, this is the first time that bounds for $P_{ue}$ have been derived by the following two methods, although the methods themselves already appeared in [21,22] with different motivations.

3.1. Linear Programming Bounds

Consider the linear programming problem $M(n, k, d, p)$ that maximizes the objective function
$$\sum_{j=1}^{n} A_j\,p_q^j\,(1-p)^{n-j}$$
under the constraints:
(1) $A_j \ge 0$;
(2) $\sum_{j=1}^{n} A_j = q^k - 1$;
(3) $\sum_{j=1}^{n} A_j\,P_i(j) \ge -P_i(0)$, for $i = 1, \ldots, n$;
(4) $A_1 = A_2 = \cdots = A_{d-1} = 0$.
Likewise, let $m(n, k, d, p)$ be the minimum of the same objective function under the same constraints.
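The program above can be handed to any LP solver. The sketch below is our own illustration (function names are ours), assuming SciPy is available; constraint (4) is imposed through the variable bounds:

```python
import numpy as np
from math import comb
from scipy.optimize import linprog

def krawtchouk(i, x, n, q=2):
    return sum((-1)**j * comb(x, j) * comb(n - x, i - j) * (q - 1)**(i - j)
               for j in range(i + 1))

def lp_bounds(n, k, d, p, q=2):
    """Return (m(n,k,d,p), M(n,k,d,p)): LP lower/upper bounds on P_ue."""
    pq = p / (q - 1)
    c = np.array([pq**j * (1 - p)**(n - j) for j in range(1, n + 1)])
    A_eq, b_eq = np.ones((1, n)), [q**k - 1]            # constraint (2)
    # constraint (3): sum_j A_j P_i(j) >= -P_i(0)  <=>  -sum_j A_j P_i(j) <= P_i(0)
    A_ub = np.array([[-krawtchouk(i, j, n, q) for j in range(1, n + 1)]
                     for i in range(1, n + 1)])
    b_ub = np.array([krawtchouk(i, 0, n, q) for i in range(1, n + 1)])
    bounds = [(0, 0)] * (d - 1) + [(0, None)] * (n - d + 1)  # (1) and (4)
    lo = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    hi = linprog(-c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return lo.fun, -hi.fun

# The distance distribution of the [7,4,3] Hamming code is feasible for this
# LP, so its true P_ue must lie between m and M.
p = 0.01
m_lp, M_lp = lp_bounds(7, 4, 3, p)
true = 7 * p**3 * (1 - p)**4 + 7 * p**4 * (1 - p)**3 + p**7
assert m_lp - 1e-9 <= true <= M_lp + 1e-9
```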
Theorem 3.
If $C$ is a $q$-ary code with parameters $(n, q^k, d)$, then $m(n, k, d, p) \le P_{ue} \le M(n, k, d, p)$.
Proof. 
The objective function comes from Equation (2). Constraint (1) is immediate from the definition of the distance distribution. Constraints (2) and (3) come from Equations (6) and (5), respectively. Constraint (4) is a consequence of the definition of the minimum distance. □
Remark 2.
Let $f(x)$ and $g(x)$ be two functions of $x$. We write $f \lesssim g$ if $f < g$ or $f \sim g$ as $x \to 0$, where $0 < x < 1$. For example, let $f(x) = x^2 + x$ and $g(x) = x^3 + x$; then $f(x) > g(x)$ when $0 < x < 1$. But $f(x) \sim g(x)$ as $x \to 0$, so $f(x) \lesssim g(x)$ when $0 < x < 1$.
Motivated by Equation (3) and [17], we have the following result.
Theorem 4.
Let $C$ be a $q$-ary $[n, k, d]_q$ linear code. Then, as $p \to 0$,
$$(q^k - 1 - L)\,p_q^d(1-p)^{n-d} \lesssim P_{ue}(C, p) \lesssim (q^k - 1 - S)\,p_q^d(1-p)^{n-d},$$
where $L$ (resp. $S$) denotes the maximum (resp. minimum) of $\sum_{j=d+1}^{n} A_j$ subject to the $2n - d$ constraints
$$-P_i(0) - (q^k - 1)\,P_i(d) \le \sum_{j=d+1}^{n} A_j\left(P_i(j) - P_i(d)\right), \qquad i = 1, 2, \ldots, n,$$
and $A_j \ge 0$ for $j = d+1, d+2, \ldots, n$.
Proof. 
It is clear that $P_{ue}(C, p) \sim A_d\,p_q^d(1-p)^{n-d}$ as $p \to 0$; then by [17], we get the left side of Equation (8). As for the right side, if $A_d < q^k - 1 - S$ and $p$ is small enough, then by Equation (3), $P_{ue}(C, p) < (q^k - 1 - S)\,p_q^d(1-p)^{n-d}$. Otherwise, $A_d = q^k - 1 - S$, and then $P_{ue}(C, p) \sim (q^k - 1 - S)\,p_q^d(1-p)^{n-d}$. □
Table 2 is a part of Table I in [17], which is helpful for giving bounds on $P_{ue}$.
Example 2.
Let $C_1$ be a binary $[15, 4, 8]$ code; then
$$P_{ue}(C_1, p) \sim 15\,p^8(1-p)^7.$$
As for the binary $[12, 4, 6]$ code $C_2$, we have
$$11\,p^6(1-p)^6 \lesssim P_{ue}(C_2, p) \lesssim 14\,p^6(1-p)^6.$$
Obviously, for any $[n, k, d]$ code, one can give bounds for its $P_{ue}$ in this way.
Remark 3.
From the above discussion, it is clear that our bounds depend solely on the three parameters $[n, k, d]$ of the code, and knowing $[n, k, d]$ is the minimal requirement to use a code in practice.

3.2. Polynomial Method

In this section, we give some general bounds on $P_{ue}$ for any binary $(n, 2^k, d)$ code. Recall the definition of the Krawtchouk polynomials and some of their properties. The following identity is a polynomial-method expression of linear programming duality.
Lemma 1.
Let $\beta(x) \in \mathbb{Q}[x]$ be a polynomial whose Krawtchouk expansion is
$$\beta(x) = \sum_{j=0}^{n} \beta_j\,P_j(x).$$
Then we have the following identity:
$$\sum_{i=0}^{n} \beta(i)\,A_i = q^k \sum_{j=0}^{n} \beta_j\,A_j^{\perp}.$$
Proof. 
Immediate from Equation (4), upon swapping the order of summation. □
From now on, we denote the coefficients of the Krawtchouk expansion of a polynomial $f(x)$ of degree at most $n$ by $f_j$, $j = 0, 1, \ldots, n$, i.e., $f(x) = \sum_{j=0}^{n} f_j\,P_j(x)$.
The first main result of this section is inspired by Theorem 1 in [23], and given as follows.
Theorem 5.
Let $\beta(x)$ and $\gamma(x)$ be polynomials over $\mathbb{Q}$ such that $\beta_j \le 0$ and $\gamma_j \ge 0$ for $j \ge 1$, and $\gamma(i) \le p_q^i(1-p)^{n-i} \le \beta(i)$ for all $i$ with $A_i \ne 0$. Then we have the upper bound
$$P_{ue} \le q^k\beta_0 - \beta(0),$$
and the lower bound
$$P_{ue} \ge q^k\gamma_0 - \gamma(0).$$
Proof. 
By Lemma 1, together with $A_j^{\perp} \ge 0$ and $\beta_j \le 0$ for $j \ge 1$, we have
$$\sum_{j=0}^{n} A_j\,\beta(j) \le \beta_0\,q^k.$$
Returning to the definition of $P_{ue}$ and using the property $\beta(j) \ge p_q^j(1-p)^{n-j}$, we get
$$P_{ue} = \sum_{j=1}^{n} A_j\,p_q^j(1-p)^{n-j} \le \sum_{j=1}^{n} A_j\,\beta(j) \le q^k\beta_0 - \beta(0).$$
The proof of the lower bound is analogous and omitted. □
Remark 4.
The above result is a special case of Proposition 5 in [22]. More general settings of the linear programming bounds of Section 3.1 (Theorem 5) were already considered in [21,22].
The following is a useful property of Krawtchouk expansions; we omit the proof, since it is not difficult.
Lemma 2
([24] Corollary 3.13). Let $f(x) = \sum_{j=0}^{n} f_j\,P_j(x)$ and $g(x) = \sum_{j=0}^{n} g_j\,P_j(x)$ be polynomials over $\mathbb{Q}$, where $f_j \ge 0$ and $g_j \ge 0$ for $0 \le j \le n$. Then the coefficients of the Krawtchouk expansion of $\lambda f(x) + \mu g(x)$ are nonnegative, where $\lambda, \mu$ are nonnegative rational numbers.

3.2.1. Upper Bounds

For convenience, let $\delta_{i,j}$ be the Kronecker symbol, i.e.,
$$\delta_{i,j} = \begin{cases} 1, & i = j, \\ 0, & i \ne j. \end{cases}$$
Lemma 3.
For general $q$, the coefficients of the Krawtchouk expansion of the polynomial
$$g_i(x) = \frac{(-1)^{i-1}}{(i-1)!\,(n-i)!}\prod_{\substack{j=1 \\ j \ne i}}^{n}(j - x)$$
are all nonnegative if and only if $i$ is odd, where $1 \le i \le n$ is an integer and $0! = 1$. Moreover, $g_i(j) = \delta_{i,j}$ for $1 \le j \le n$.
Proof. 
Let
$$h(x) = q^{n-d+1}\left(1 - \frac{x}{s}\right)\prod_{j=d}^{n}\left(1 - \frac{x}{j}\right) = \sum_{j=0}^{n} h_j\,P_j(x),$$
where $d \le s \le n$. Then, by Proposition 5.8.2 in [20],
$$h_i = \frac{1}{q^n}\sum_{j=0}^{n} h(j)\,P_j(i) = \frac{1}{q^{d-1}}\sum_{j=0}^{d-1}\frac{\binom{n-j}{n-d+1}}{\binom{n}{d-1}}\cdot\frac{s-j}{s}\,P_j(i) \ge \frac{1}{q^{d-1}\,s}\sum_{j=0}^{d-1}\frac{\binom{n-j}{n-d+1}}{\binom{n}{d-1}}\,P_j(i) = \frac{1}{s}\cdot\frac{\binom{n-i}{d-1}}{\binom{n}{d-1}} \ge 0.$$
Note that if $d = 1$ and we take $s = i$, then
$$h(x) = \frac{q^n}{n!}\cdot\frac{(i-x)^2}{i}\cdot(-1)^{i-1}(i-1)!\,(n-i)!\;g_i(x).$$
According to Lemma 2, the coefficients of the Krawtchouk expansion of $(-1)^{i-1} g_i(x)$ are all nonnegative.
Obviously, for any $j \ne i$ with $1 \le j \le n$, $g_i(j) = 0$, because $j$ is a root of $g_i(x)$. Moreover,
$$g_i(i) = \frac{(-1)^{i-1}}{(i-1)!\,(n-i)!}\prod_{\ell=1}^{i-1}(\ell - i)\prod_{\ell=i+1}^{n}(\ell - i) = \frac{(-1)^{i-1}}{(i-1)!\,(n-i)!}\cdot(-1)^{i-1}(i-1)!\,(n-i)! = 1,$$
which means $g_i(j) = \delta_{i,j}$. □
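The interpolation property and the sign pattern of Lemma 3 can be checked numerically for small parameters (a check we add, for $q = 2$, with exact rational arithmetic; the Krawtchouk coefficients are recovered via the inversion $f_j = 2^{-n}\sum_x f(x)\,P_x(j)$):

```python
from fractions import Fraction
from math import comb, factorial

def krawtchouk(k, x, n):  # binary case q = 2
    return sum((-1)**j * comb(x, j) * comb(n - x, k - j) for j in range(k + 1))

def g(i, x, n):
    """g_i(x) = (-1)^(i-1) / ((i-1)! (n-i)!) * prod_{j != i} (j - x)."""
    prod = Fraction((-1)**(i - 1), factorial(i - 1) * factorial(n - i))
    for j in range(1, n + 1):
        if j != i:
            prod *= (j - x)
    return prod

n = 5
for i in range(1, n + 1):
    # interpolation property: g_i(j) = delta_{i,j} on {1, ..., n}
    assert [g(i, j, n) for j in range(1, n + 1)] == [int(j == i) for j in range(1, n + 1)]
    # Krawtchouk coefficients c_m such that g_i(x) = sum_m c_m P_m(x)
    coeffs = [sum(g(i, x, n) * krawtchouk(x, m, n) for x in range(n + 1)) / 2**n
              for m in range(n + 1)]
    if i % 2 == 1:
        assert all(c >= 0 for c in coeffs)   # odd i: all nonnegative
    else:
        assert all(c <= 0 for c in coeffs)   # even i: all nonpositive
```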
Theorem 6.
Let $C$ be a binary code with distance distribution $A_j$ such that $A_j = 0$ for all odd $j$. Then
$$P_{ue} \le \left(1 + \frac{1}{2^{n-k}}\right)\sum_{\substack{d \le i \le n \\ i\ \mathrm{even}}} \binom{n}{i}\,p^i(1-p)^{n-i},$$
where the sum runs over the even integers $i$ between $d$ and $n$.
Proof. 
According to Lemma 3, the coefficients of the Krawtchouk expansion of the polynomial
$$g_i(x) = \frac{(-1)^{i-1}}{(i-1)!\,(n-i)!}\prod_{\substack{j=1 \\ j \ne i}}^{n}(j - x)$$
are nonnegative if and only if $i$ is odd. Let
$$f(x) = \sum_{\substack{d \le i \le n \\ i\ \mathrm{even}}} p^i(1-p)^{n-i}\,g_i(x) = \sum_{j=0}^{n} f_j\,P_j(x).$$
Hence, $f_j \le 0$ for $j \ge 1$, while $f(i) = p^i(1-p)^{n-i}$ for even $i$ and $f(i) = 0$ for odd $i$, $1 \le i \le n$. By the proof of Theorem 5,
$$P_{ue} \le 2^k f_0 - f(0),$$
where
$$f(0) = -\sum_{\substack{d \le i \le n \\ i\ \mathrm{even}}} \binom{n}{i}\,p^i(1-p)^{n-i}$$
and
$$f_0 = \frac{1}{2^n}\sum_{\substack{d \le i \le n \\ i\ \mathrm{even}}} \binom{n}{i}\,p^i(1-p)^{n-i}.$$
Thus, the upper bound follows from Theorem 5. □
Remark 5.
If $C$ is linear, then $A_i$ is the number of codewords of weight $i$, which implies that $A_i \le \binom{n}{i}$. Hence,
$$P_{ue} \le \sum_{i \in I} \binom{n}{i}\,p^i(1-p)^{n-i},$$
where $I = \{i \ge 1 : A_i \ne 0\}$. Moreover, if $A_i = 0$ for all odd $i$, then
$$P_{ue} \le \sum_{\mathrm{even}\ i \ge 2} \binom{n}{i}\,p^i(1-p)^{n-i}.$$
Example 3.
Consider the Nordstrom–Robinson code: it is a binary nonlinear $(16, 2^8, 6)$ code with the distance distribution in Table 3. Moreover, its weight distribution is the same as its distance distribution. By Equation (2),
$$P_{ue} = 112\,p^6(1-p)^{10} + 30\,p^8(1-p)^8 + 112\,p^{10}(1-p)^6 + p^{16}.$$
According to Theorem 6, the values of the upper bound and the true probability can be seen in Figure 2.
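A numerical check we add (with $n = 16$, $k = 8$, $d = 6$): the exact $P_{ue}$ above indeed stays below the bound of Theorem 6 across the whole range of $p$:

```python
from math import comb

def pue_nr(p):
    """Exact P_ue of the Nordstrom-Robinson code (nonzero weights 6, 8, 10, 16)."""
    return (112 * p**6 * (1 - p)**10 + 30 * p**8 * (1 - p)**8
            + 112 * p**10 * (1 - p)**6 + p**16)

def theorem6_bound(p, n=16, k=8, d=6):
    """(1 + 2^(k-n)) * sum over even i in [d, n] of C(n, i) p^i (1-p)^(n-i)."""
    s = sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(d, n + 1, 2))
    return (1 + 2.0**(k - n)) * s

for p in (0.01, 0.05, 0.1, 0.25, 0.5):
    assert pue_nr(p) <= theorem6_bound(p)
```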
Example 4.
Let $E$ be the set of binary vectors of length $n$ and even weight. It is actually the Reed–Muller code $RM(n-1, n)$ of Problem 5 in [2] and is generated by all the binary vectors of weight 2. Hence,
$$P_{ue}(E, p) = \sum_{i=1}^{\lfloor n/2 \rfloor} \binom{n}{2i}\,p^{2i}(1-p)^{n-2i}.$$
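The closed form can be confirmed by brute force for small $n$ (a check we add): enumerate $E$, build the distance distribution from its definition, and evaluate Equation (2):

```python
from itertools import product
from math import comb

n, p = 6, 0.1
E = [v for v in product((0, 1), repeat=n) if sum(v) % 2 == 0]
A = [0] * (n + 1)                     # distance distribution of E
for x in E:
    for y in E:
        A[sum(a != b for a, b in zip(x, y))] += 1
A = [a / len(E) for a in A]
pue = sum(A[i] * p**i * (1 - p)**(n - i) for i in range(1, n + 1))
closed = sum(comb(n, 2 * i) * p**(2 * i) * (1 - p)**(n - 2 * i)
             for i in range(1, n // 2 + 1))
assert abs(pue - closed) < 1e-12
```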
Remark 6.
The bound applies to many codes at once, and thus it cannot be expected to be sharp. In fact, there exist codes $C$ whose $P_{ue}$ is very large.
Motivated by [17], we have the following upper bounds for linear codes over $\mathbb{F}_q$.
Proposition 1.
When $C$ is a $q$-ary linear $[n, k, d]$ code and $p$ is small enough, we have the following statements:
(1) If $n + 1 + qd - nq > 0$, then
$$P_{ue} \lesssim \frac{(q^k + nq - n - 1)\,n}{n + 1 + qd - nq}\;p_q^d(1-p)^{n-d};$$
(2) If $n + qd - nq - 1 < 0$, then
$$P_{ue} \lesssim q^k\,\frac{2n\,(qn - n - qd + 1) + n(d-1)}{n - d}\;p_q^d(1-p)^{n-d};$$
(3) If $q = 2$, $n - 2d > 0$, $(n - 2d + 2)^2 > n$, and $A_i \ne 0$ only if $d \le i \le n - 2d$, then
$$P_{ue} \lesssim 2^k\,\frac{2\left[(n - 2d + 2)^2 - n\right] + (d-1)(n-d+1)}{n + 1 - 2d}\;p^d(1-p)^{n-d}.$$
Proof. 
These three bounds can be deduced easily from Equation (3) and Corollaries 4–6 in [17]. □
Remark 7.
The results in Corollaries 4–6 in [17] are in fact upper bounds on $A_d$ under different conditions. In view of Equation (3), it is necessary to take $p$ small enough. According to the proof of Theorem 4, if $A_d$ does not attain such a bound, then strict inequality holds.

3.2.2. Lower Bounds

Similarly to Proposition 1, by Corollaries 1–3 in [17], we have:
Proposition 2.
If $C$ is a $q$-ary linear code, then we have the following statements:
(1) If $d = (n-1)(q-1)/q$, then
$$P_{ue} \gtrsim \frac{q^k - nq + n - 1}{(n-d)\,q - n + 1}\;p_q^d(1-p)^{n-d};$$
(2) If $qd > nq - n - 2q + 1$, then
$$P_{ue} \gtrsim q^k\,\frac{2n\,(n + qd + 2q - 1 - nq)}{(n-d)\,n\,\binom{n}{d}}\;p_q^d(1-p)^{n-d};$$
(3) If $q = 2$ and all weights of $C$ lie in $[d, n-d]$, with $n - 2d > 0$ and $(n - 2d - 1)^2 < n + 1$, then
$$P_{ue} \gtrsim \frac{2^k\cdot 2\,(n^2 - 4nd - 3n) + (2^k + 1)\,d(d+1)}{2d\,(n - d - 1)}\;p^d(1-p)^{n-d}.$$
When using quadratic polynomials, we have the following bound.
Proposition 3.
Let $f_0, f_1$ and $f_2$ be nonnegative rational numbers such that
$$f_0 - f_1 n + f_2\binom{n}{2} \le p^d(1-p)^{n-d} \qquad\text{and}\qquad f_1 + n f_2 \le 2 d f_2.$$
Then, for a binary $(n, 2^k, d)$ code, we have
$$P_{ue} \ge 2^k f_0 - p^d(1-p)^{n-d} - 2 f_1 n,$$
where $0 \le p \le \frac{1}{2}$.
Proof. 
It is known that, when $q = 2$, $P_0(x) = 1$, $P_1(x) = n - 2x$, and $P_2(x) = 2x^2 - 2nx + \binom{n}{2}$. Let $f(x) = f_0 P_0(x) + f_1 P_1(x) + f_2 P_2(x)$; it is a quadratic function whose axis of symmetry is $x = \frac{f_1 + n f_2}{2 f_2}$. Considering that $p^{i+1}(1-p)^{n-i-1} \le p^i(1-p)^{n-i}$, it is sufficient to show that
$$f(n) \le p^d(1-p)^{n-d} \qquad\text{and}\qquad \frac{f_1 + n f_2}{2 f_2} \le d,$$
i.e., $f(i) \le f(n) \le p^d(1-p)^{n-d}$ for $d \le i \le n$. Equivalently,
$$f_0 - f_1 n + f_2\binom{n}{2} \le p^d(1-p)^{n-d}, \qquad f_1 + n f_2 \le 2 d f_2.$$
Since $f(0) = f(n) + 2 f_1 n$, the result follows from Theorem 5. □

4. Good Bounds for Hamming Codes

Recall that the weight enumerator of the code $C$ is the homogeneous polynomial
$$W_C(x, y) = \sum_{u \in C} x^{\,n - wt(u)}\,y^{\,wt(u)},$$
where $wt(u)$ denotes the Hamming weight of the codeword $u$. The binary Hamming code $H_m$ is a $[n = 2^m - 1,\ k = n - m,\ d = 3]$ code, with the weight enumerator
$$\frac{(x+y)^n + n\,(x+y)^{(n-1)/2}\,(x-y)^{(n+1)/2}}{n+1},$$
whose distance distribution $A_i$ satisfies
$$\sum_{i=1}^{n} i A_i\,y^{i-1} + \sum_{i=0}^{n} A_i\,y^i + \sum_{i=0}^{n-1} (n-i) A_i\,y^{i+1} = (1+y)^n,$$
and hence the recurrence $A_0 = 1$, $A_1 = 0$,
$$(i+1)\,A_{i+1} + A_i + (n-i+1)\,A_{i-1} = \binom{n}{i}.$$
Moreover,
$$(1+y)^n = \sum_{i=1}^{n} i A_i\,y^{i-1} + \sum_{i=0}^{n} A_i\,y^i + ny\sum_{i=0}^{n-1} A_i\,y^i - y\sum_{i=0}^{n-1} i A_i\,y^i = \sum_{i=1}^{n-1} A_i\left(i\,y^{i-1} - i\,y^{i+1}\right) + (ny+1)\sum_{i=1}^{n-1} A_i\,y^i + y^n + n y^{n-1} + ny + 1.$$
Let $\alpha \in \mathbb{F}_{2^m}$ be a primitive element and let $g(x) \in \mathbb{F}_2[x]$ be the minimal polynomial of $\alpha$ over $\mathbb{F}_2$. According to Exercise 7.20 in [20], $g(x)$ can be regarded as the generator polynomial of a Hamming code. Since $\deg(g(x)) = m > 1$, we have
$$g(x) \,\Big|\, \frac{x^n - 1}{x - 1} = 1 + x + x^2 + \cdots + x^{n-1},$$
which implies that the all-one vector is a codeword of the Hamming code and $A_n = 1$.
Note that
$$P_{ue} = \sum_{i=1}^{n} A_i\,p^i(1-p)^{n-i} = (1-p)^n\sum_{i=1}^{n} A_i\left(\frac{p}{1-p}\right)^i.$$
Hence,
$$\sum_{i=1}^{n-1} A_i\left(\frac{p}{1-p}\right)^i = \frac{P_{ue} - p^n}{(1-p)^n}.$$
Let $y = \varepsilon = \frac{p}{1-p}$, where $p \in (0, 1/2)$. Then
$$(n\varepsilon + 1)\left(P_{ue} - p^n\right) + (1-p)^n\sum_{i=1}^{n-1} A_i\,\varepsilon^i\left(\frac{i}{\varepsilon} - i\varepsilon\right) = 1 - p^n - np(1-p)^{n-1} - np^{n-1}(1-p) - (1-p)^n.$$
According to Chapter 6, Exercise (E2), page 157 in [2], there are $n - 4$ nonzero weights of $H_m$. Considering that $A_n = 1$, we have $A_i = 0$ if and only if $i = 1, 2, n-1, n-2$. Since $0 < p < 1/2$, we have $0 < \varepsilon < 1$ and, for every $i$ with $3 \le i \le n-3$,
$$\frac{3}{\varepsilon} - 3\varepsilon \le \frac{i}{\varepsilon} - i\varepsilon \le \frac{n-3}{\varepsilon} - (n-3)\varepsilon.$$
Obviously,
$$(1-p)^n\sum_{i=1}^{n-1} A_i\,\varepsilon^i\left(\frac{i}{\varepsilon} - i\varepsilon\right) \le \left(\frac{n-3}{\varepsilon} - (n-3)\varepsilon\right)(1-p)^n\sum_{i=1}^{n-1} A_i\,\varepsilon^i = \left(\frac{n-3}{\varepsilon} - (n-3)\varepsilon\right)\left(P_{ue} - p^n\right).$$
Similarly,
$$(1-p)^n\sum_{i=1}^{n-1} A_i\,\varepsilon^i\left(\frac{i}{\varepsilon} - i\varepsilon\right) \ge \left(\frac{3}{\varepsilon} - 3\varepsilon\right)\left(P_{ue} - p^n\right).$$
Thus,
$$P_{ue} \le \frac{1 - p^n - np(1-p)^{n-1} - np^{n-1}(1-p) - (1-p)^n}{\frac{3}{\varepsilon} - 3\varepsilon + n\varepsilon + 1} + p^n = \frac{p(1-p) - p^{n+1}(1-p) - np^2(1-p)^n - np^n(1-p)^2 - p(1-p)^{n+1}}{(n-1)p^2 - 5p + 3} + p^n$$
and
$$P_{ue} \ge \frac{1 - p^n - np(1-p)^{n-1} - np^{n-1}(1-p) - (1-p)^n}{\frac{n-3}{\varepsilon} - (n-3)\varepsilon + n\varepsilon + 1} + p^n = \frac{p(1-p) - p^{n+1}(1-p) - np^2(1-p)^n - np^n(1-p)^2 - p(1-p)^{n+1}}{(n-1)p^2 - (2n-7)p + n - 3} + p^n.$$
Summarizing the above discussion, we get:
Theorem 7.
Let $H_m$ be the binary $[n = 2^m - 1, k = n - m, 3]$ Hamming code. Then, when $0 < p < 1/2$ and $m \ge 3$, the upper bound Equation (14) and the lower bound Equation (15) hold for $P_{ue}$, respectively.
Proof. 
Note that the upper bound should be larger than or equal to the lower bound; the difference of the two denominators is
$$\left[(n-1)p^2 - (2n-7)p + n - 3\right] - \left[(n-1)p^2 - 5p + 3\right] = (n-6)(1-2p) \ge 0.$$
Since $1 - 2p > 0$, it is sufficient to solve the inequality $n = 2^m - 1 > 6$. Hence, $m \ge 3$. □
Remark 8.
The difference between the upper bound and the lower bound is small.
Let $U(n, p) = H/H_1$ and $L(n, p) = H/H_2$ be the bounds given by Equations (14) and (15) (without the $p^n$ term), respectively, where $H_1 = (n-1)p^2 - 5p + 3$, $H_2 = (n-1)p^2 - (2n-7)p + n - 3$, and
$$H = p(1-p) - p^{n+1}(1-p) - np^2(1-p)^n - np^n(1-p)^2 - p(1-p)^{n+1}.$$
In fact, $H$ is a polynomial in $p$ of degree $n + 2$ whose leading coefficient is
$$h_{n+2} = 1 + (-1)^{n+2}\,n - n + (-1)^{n+2} = 1 + (-1)^n + n(-1)^n - n \ne 0,$$
while the product $H_1 H_2$ is just a polynomial of degree 4. Then,
$$U(n, p) - L(n, p) = \frac{(H_2 - H_1)\,H}{H_1 H_2} = \frac{(n-6)(1-2p)\,H}{H_1 H_2} \sim \frac{(n-6)(1-2p)\,h_{n+2}\,p^{n+2}}{(n-1)^2\,p^4} \to 0 \qquad (n \to +\infty).$$
That is to say, the lower bound and the upper bound are very close. On the other hand, for $0 < p < 1/2$,
$$H_1 \ge \frac{12n - 37}{4(n-1)} \qquad\text{and}\qquad H_2 \ge \frac{n+1}{4}.$$
Then,
$$U(n, p) - L(n, p) = \frac{(H_2 - H_1)\,H}{H_1 H_2} = \frac{(n-6)(1-2p)\,H}{H_1 H_2} < \frac{(n-6)(1-2p)\,p(1-p)}{H_1 H_2} < \frac{(n-6)(1-2p)\,p(1-p)}{\frac{12n-37}{4(n-1)}\cdot\frac{n+1}{4}} = \frac{16(n-1)(n-6)}{(n+1)(12n-37)}\,p(1-p)(1-2p) \le \frac{16(n-1)(n-6)}{(n+1)(12n-37)}\cdot\frac{\sqrt{3}}{18} \to \frac{2\sqrt{3}}{27} \approx 0.1283 \qquad (n \to +\infty).$$
Here, let $F(p) = p(1-p)(1-2p)$; then its derivative is $F'(p) = 6p^2 - 6p + 1$. Note that the roots of $F'(p)$ are $\frac{3 \pm \sqrt{3}}{6}$. Since $0 < p < 1/2$, we choose the root $p_0 = \frac{3 - \sqrt{3}}{6}$. Hence,
$$F(p) \le F(p_0) = \frac{\sqrt{3}}{18} \approx 0.0962.$$
Thus the difference between the upper bound and the lower bound is at most about $0.1283$, and it tends to 0 when $n \to +\infty$.
Example 5.
Using the bounds in Theorem 7, the results in Figure 1 can be improved; see Figure 3.
When $m = 5$, the bounds in Equations (15) and (14) are also valid; see Figure 4.
Note that the difference between the bounds in Equations (15) and (14) is about $0.05$, which is much smaller than the $0.1283$ given above.
It is known that the Hamming codes satisfy the $2^{-m}$ bound when $0 < p < 1/2$, i.e., $P_{ue} \le 2^{-m}$; see [5] for more details. In fact, the new bound obtained here is better than the ordinary $2^{-m}$ bound when $p$ is not large.
Theorem 8.
Let $H_m$ be the binary $[n = 2^m - 1, k = n - m, 3]$ Hamming code. Then, when $0 < p < 1/2$ and $m \ge 3$, we have
$$P_{ue} \le \frac{p - p^2}{(n-1)p^2 - 5p + 3} + p^n.$$
Moreover, if $p < p_0$, this upper bound is better than the $2^{-m}$ bound, where $p_0$ is the smaller root of the equation $(2^{m+1} - 2)x^2 - (2^m + 5)x + 3 = 0$.
Proof. 
Assume that
$$\frac{p - p^2}{(n-1)p^2 - 5p + 3} < \frac{1}{2^m};$$
then it is sufficient to solve the inequality
$$(2^{m+1} - 2)\,p^2 - (2^m + 5)\,p + 3 > 0.$$
Obviously, the inequality holds when $p < p_0$, where
$$p_0 = \frac{(2^m + 5) - \sqrt{(2^m + 5)^2 - 12\,(2^{m+1} - 2)}}{2\,(2^{m+1} - 2)}$$
is the smaller root of the equation $(2^{m+1} - 2)x^2 - (2^m + 5)x + 3 = 0$. □
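For instance, for $m = 4$ the quadratic becomes $30x^2 - 21x + 3 = 0$, whose smaller root is $p_0 = 0.2$ exactly. A quick check of our own that the new bound beats $2^{-m}$ below $p_0$:

```python
from math import sqrt

def theorem8_bound(p, m):
    """Upper bound of Theorem 8 for the [2^m - 1, 2^m - 1 - m, 3] Hamming code."""
    n = 2**m - 1
    return (p - p * p) / ((n - 1) * p * p - 5 * p + 3) + p**n

m = 4
a, b = 2**(m + 1) - 2, 2**m + 5       # quadratic a x^2 - b x + 3 = 0
p0 = (b - sqrt(b * b - 12 * a)) / (2 * a)
assert abs(p0 - 0.2) < 1e-12          # smaller root for m = 4
for p in (0.05, 0.1, 0.15, 0.19):     # p < p0: new bound < 2^{-m}
    assert theorem8_bound(p, m) < 2**-m
```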
Example 6.
It is clear that when $p$ is small enough, the new upper bound Equation (14) is smaller than the $2^{-m}$ bound; see Figure 3 and Figure 4.
Remark 9.
Of course, the weight distribution of the binary Hamming codes can be computed and expressed as sums of binomial coefficients, which are usually very large when $m$ is large. So, the method in this section serves to estimate $P_{ue}$ quickly. Compared with the $2^{-m}$ bound, our bounds are better when $p$ is small enough.

5. Conclusions

In this paper, we studied the probability of an undetected error $P_{ue}$ and gave many bounds for it. The main contributions of this paper are the following:
(1)
The bounds obtained from the linear programming problem are given in Theorem 4. The bounds obtained from the polynomial method are also given: from the main Theorem 5, we derive Theorem 6 (applied to codes with even distances) and Proposition 3.
(2)
Combining the results of [17], we give the bounds in Propositions 1 and 2.
(3)
We find sharper bounds for binary Hamming codes (see Theorems 7 and 8).
To the best of our knowledge, this is the very first time that the LP method has been applied to bound $P_{ue}$. Even though computing $P_{ue}$ exactly requires knowledge of the weight spectrum of the code, our bounds depend solely on the three parameters $[n, k, d]$ of the code; the weight frequencies are only used as variables in the LP program. Knowing the three parameters $[n, k, d]$ is the minimal requirement to use a code in applications.
To sum up, our bounds are most useful when the exact weight distribution is too hard to compute. Our bounds perform well when $p$ is small enough and the kissing number $A_d$ is known, and there are many such codes.
We mention the following open problems. Readers interested in Hamming codes may try to derive bounds for general $q$-ary Hamming codes with $q > 2$. Moreover, it is worth mentioning that the linear programming approach works better numerically than the polynomial method; the interest of the latter lies in producing bounds with closed formulas. It is a challenging open problem to derive better bounds with polynomials of degree higher than 2.

Author Contributions

Conceptualization, P.S.; methodology, P.S.; software, H.L.; validation, X.W. and P.S.; formal analysis, X.W. and P.S.; investigation, X.W. and P.S.; resources, H.L.; data curation, H.L.; writing—original draft preparation, X.W.; writing—review and editing, X.W. and P.S.; visualization, H.L.; supervision, P.S.; project administration, P.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Data are available in a publicly accessible repository.

Acknowledgments

The authors are grateful to Minjia Shi, Li Chen, and his colleagues for their helpful suggestions for improving the presentation of the material in this paper and pointing out the references [6,21,22,24].

Conflicts of Interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

References

  1. Dodunekova, R.; Dodunekov, S.M.; Nikolova, E. A survey on proper codes. Discret. Appl. Math. 2008, 156, 1499–1509.
  2. MacWilliams, F.J.; Sloane, N.J.A. The Theory of Error Correcting Codes; Elsevier: Amsterdam, The Netherlands, 1981.
  3. Massey, J. Coding techniques for digital data networks. In Proceedings of the International Conference on Information Theory and Systems, NTG-Fachberichte, Berlin, Germany, 18–20 September 1978; Volume 65.
  4. Wolf, J.K.; Michelson, A.M.; Levesque, A.H. On the probability of undetected error for linear block codes. IEEE Trans. Commun. 1982, 30, 317–324.
  5. Leung-Yan-Cheong, S.K.; Hellman, M.E. Concerning a bound on undetected error probability. IEEE Trans. Inform. Theory 1976, 22, 235–237.
  6. Baldi, M.; Bianchi, M.; Chiaraluce, F.; Kløve, T. A class of punctured Simplex codes which are proper for error detection. IEEE Trans. Inform. Theory 2012, 58, 3861–3880.
  7. Kasami, T.; Lin, S. On the probability of undetected error for the maximum distance separable codes. IEEE Trans. Commun. 1984, 32, 998–1006.
  8. Leung-Yan-Cheong, S.K.; Barnes, E.R.; Friedman, D.U. On some properties of the undetected error probability of linear codes. IEEE Trans. Inform. Theory 1979, 25, 110–112.
  9. Ong, C.; Leung, C. On the undetected error probability of triple-error-correcting BCH codes. IEEE Trans. Inform. Theory 1991, 37, 673–678.
  10. Kløve, T. Reed–Muller codes for error detection: The good, the bad, and the ugly. IEEE Trans. Inform. Theory 1996, 42, 1615–1622.
  11. Abdel-Ghaffar, K.A.S. A lower bound on the undetected error probability and strictly optimal codes. IEEE Trans. Inform. Theory 1997, 43, 1489–1502.
  12. Ashikhmin, A.; Barg, A. Binomial moments of the distance distribution: Bounds and applications. IEEE Trans. Inform. Theory 1999, 45, 438–452.
  13. Barg, A.; Ashikhmin, A. Binomial moments of the distance distribution and the probability of undetected error. Des. Codes Cryptogr. 1999, 16, 103–116.
  14. Xia, S.T.; Fu, F.W.; Jiang, Y.; Ling, S. The probability of undetected error for binary constant weight codes. IEEE Trans. Inform. Theory 2005, 51, 3364–3373.
  15. Xia, S.T.; Fu, F.W.; Ling, S. A lower bound on the probability of undetected error for binary constant weight codes. IEEE Trans. Inform. Theory 2006, 52, 4235–4243.
  16. Xia, S.T.; Fu, F.W. Undetected error probability of q-ary constant weight codes. Des. Codes Cryptogr. 2008, 48, 125–140.
  17. Solé, P.; Liu, Y.; Cheng, W.; Guilley, S.; Rioul, O. Linear programming bounds on the kissing number of q-ary codes. In Proceedings of the 2021 IEEE Information Theory Workshop (ITW), Kanazawa, Japan, 17–21 October 2021; pp. 1–5.
  18. Kløve, T. Codes for Error Detection; Kluwer: Singapore, 2007.
  19. Van Lint, J.H. Introduction to Coding Theory, 3rd ed.; Springer: Berlin/Heidelberg, Germany; New York, NY, USA, 1999.
  20. Xing, C.; Ling, S. Coding Theory: A First Course; Cambridge University Press: Cambridge, UK, 2003.
  21. Boyvalenkov, P.; Dragnev, P.; Hardin, D.; Saff, E.; Stoyanova, M. Energy bounds for codes in polynomial metric spaces. Anal. Math. Phys. 2019, 9, 781–808.
  22. Cohn, H.; Zhao, Y. Energy-minimizing error-correcting codes. IEEE Trans. Inform. Theory 2014, 60, 7442–7450.
  23. Ashikhmin, A.; Barg, A.; Litsyn, S. Estimates on the distance distribution of codes and designs. IEEE Trans. Inform. Theory 2001, 47, 1050–1061.
  24. Levenshtein, V. Universal bounds for codes and designs. In Handbook of Coding Theory; Pless, V.S., Huffman, W.C., Eds.; Elsevier: Amsterdam, The Netherlands, 1998; Chapter 6, pp. 499–648.
Figure 1. Bounds in Theorem 2 of P_ue for the Hamming code H_4.
Figure 2. The probability of undetected error of the Nordstrom–Robinson code.
Figure 3. Bounds in Theorem 7 of P_ue for the Hamming code H_4.
Figure 4. Bounds in Theorem 7 of P_ue for the Hamming code H_5.
Table 1. Distance Distribution of the Hamming Code H_4.

i   | 0 | 3  | 4   | 5   | 6   | 7   | 8   | 9   | 10  | 11  | 12 | 15
A_i | 1 | 35 | 105 | 168 | 280 | 435 | 435 | 280 | 168 | 105 | 35 | 1
Table 2. Bounds of A_d for Some Binary Codes.

Parameters  | [9,4,4] | [10,4,4] | [11,4,5] | [12,4,6] | [13,4,6] | [14,4,7] | [15,4,8]
Upper Bound | 14      | 15       | 7        | 14       | 14       | 8        | 15
Lower Bound | 6       | 12       | 5        | 11       | 4        | 8        | 15
Table 3. Distance Distribution of the Nordstrom–Robinson Code.

i   | 0 | 6   | 8  | 10  | 16
A_i | 1 | 112 | 30 | 112 | 1
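The distribution in Table 3 can be checked numerically. The short sketch below (names are ours, not from the paper) verifies that the A_i sum to the code size M = 256 and evaluates the exact P_ue curve of Figure 2 on a binary symmetric channel:

```python
# Distance distribution of the (16, 256, 6) Nordstrom-Robinson code (Table 3)
A_NR = {0: 1, 6: 112, 8: 30, 10: 112, 16: 1}
n_NR = 16

def pue_nr(p):
    """Exact P_ue of the Nordstrom-Robinson code on a BSC with
    crossover probability p (sum over nonzero distances only)."""
    return sum(Ai * p ** i * (1 - p) ** (n_NR - i)
               for i, Ai in A_NR.items() if i > 0)

M = sum(A_NR.values())   # code size, should be 256
worst = pue_nr(0.5)      # equals (M - 1) / 2**16 at p = 1/2
```

Since every nonzero distance satisfies i >= 6, each term p^i (1 − p)^{16−i} is increasing on [0, 6/16], so P_ue grows monotonically over the small-p range plotted in Figure 2.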

Share and Cite

Wang, X.; Liu, H.; Solé, P. Bounds on the Probability of Undetected Error for q-Ary Codes. Entropy 2023, 25, 1349. https://doi.org/10.3390/e25091349