1. Introduction and Some Definitions
We assume standard details to be found in textbooks [
1,
2] about codes over finite fields, their duals, generator matrices and so on. A subspace of dimension
k of a vector space (with a fixed coordinate system) over field
F is called a
code with parameters
. The starting point here is a kind of linear code called a
Linear Complementary Dual (LCD) code [
3,
4,
5,
6,
7]. These have been used recently in applications involving protection schemes against side-channel and fault injection attacks [
8,
9,
10].
Let C be a subspace of dimension k of a vector space of dimension n (C is a linear code). A complement D of C (or complementary subspace to C) is a subspace of the complementary dimension such that both subspaces together generate the whole space.
The Grassmann dimension rule for any subspaces
X and
Y is that
where
denotes the smallest linear space containing both
X and
Thus, for C of dimension k and for any subspace D of dimension , we have showing that if and only if .
A linear code C over a field F with parameters has the LCD property if its dual code is a complement of C in the vector space . Since , we have an equivalent condition that the only word of C orthogonal to all the words of C is the zero word. Note that for the real field , every code is LCD, but for finite fields this is false.
Before discussing the various algebraic and geometric ideas related to LCD codes, we shall clarify the definitions; for example, the term “invariant” has various meanings depending on the situation.
A square matrix with a single non-zero entry in every row and every column with all the other entries as 0 is a generalised permutation matrix GPM; it is, more precisely, a permutation matrix PM if the non-zero entries are 1.
Suppose is a matrix over F. Let . As usual, denotes the determinant of a square matrix B over F. Consider the following conditions on a polynomial p in the entries of A:
for any matrix B;
for any GPM C of size ;
for any PM C of size .
A polynomial function homogeneous of degree in the variables may be known as the following:
It is clear that if
, there is only one kind of invariant that is
. For non-square matrices, there are quasi-invariants that are constructed, as in [
11], from formulae involving subdeterminants of the matrix
A. One of the authors in [
12], for the cases
being any prime power, constructed invariants
for prime characteristic fields. We shall see that there is a quasi-invariant that determines the binary LCD codes.
Another kind of (non-matrix) invariant is that of Tutte–Grothendieck in the theory of matroids. Here, the Tutte (bivariate) polynomial is connected with the binary LCD codes.
The well-known Reed–Muller codes are certain linear codes defined from polynomial functions [
1,
2]. In
Section 2, we relate the
binary codes to the self-orthogonal codes that satisfy
. There are also finite geometry codes. Given a fixed dimension
d of the subspace of a finite projective space
over the field
, there is a linear code called the
subspace code. This has a length of the number of points
of
and words that are generated by the characteristic functions of the subspaces of dimension
d. The dual code to this is called the
geometric code for dimension
d of subspace. It is the space of functions from the point-set to
such that the sum of the values on any subspace of dimension
d is zero.
The material is arranged as follows.
Section 2 relates the theory of binary LCD codes to the permanent function.
Section 3 considers Grassmannian varieties, linear and quadratic complexes and how they are related to LCD codes, especially looking at the binary case.
Section 4 discusses constructions of LCD codes and the number of them. It uses the counts of polarities in binary space and the numbers of self-orthogonal or certain Reed–Muller codes. At the end, we discuss the Tutte invariant of matroids and how it is used to count bases in the matroid, which, in the binary case, evaluates the permanent of the corresponding binary matrix.
Section 5 summarises this note.
3. Grassmannian Varieties and Geometrical Considerations
Here, we recall some classical geometrical ideas and use them to understand the LCD codes better. In particular, we can enumerate LCD codes in a certain way. The main reference we use here is ([
14], §7), but one should be aware of some differences with fields of prime characteristic, especially when quadrics (degree 2 hypersurfaces) are considered. However, the main results for Grassmannian and other concepts such as null polarities and linear complexes are usually identical.
As established originally by Grassmann (we refer to the elder H.G. Grassman, not his son H.E. Grassman who was also a high-level mathematician) in the 19th century, one can coordinatise subspaces of a particular dimension in projective space , the projective space of dimension over the (commutative) field F, using Grassmann coordinates. We are reminded that Grassmann was one of the first to investigate higher-dimensional geometry in a systematic way. The Grassmann coordinates generalise the Plücker coordinates for lines in 3-dimensional space that were discovered earlier in that century.
We consider a set of k independent points (a basis) in for the particular subspace of dimension . Then, we form a matrix A with the coordinates of these points as the k rows, and then the Grassmann coordinates for the subspace is the list (in a particular fixed order) of subdeterminants of A. Thus, the number of coordinates is . Note that the choice of basis for the subspace does not affect the Grassmann coordinates since multiplying on the left of A by a non-singular matrix B will only multiply the Grassmann coordinates by , and Grassmann coordinates are homogeneous. The same is true for non-zero multiples.
There are many properties of these coordinates for subspaces, including formulae for dual subspaces, for how subspaces intersect, and the quadratic relations between the coordinates. Suffice it to say that the set of Grassmann coordinates, a “Grassmannian variety”, is an algebraic variety that is the precise intersection of a number of quadrics in the space . The smallest non-trivial example is the Klein quadric in with point coordinates , which is the Grassmann coordinate for the lines in 3D space. In particular, if denotes the subdeterminant with columns , then . Note that the dimension of the space of lines in is four, and this is the same as the dimension of the Klein quadric (or any hypersurface) in .
Theorem 4. An -code over a field F is LCD if and only if as a subspace its Grassmann coordinates satisfy ; that is, the norm of the Grassmann coordinates’ vector is non-zero.
Proof. This follows directly from Theorem 2 and the definition of Grassmann coordinates. □
Given that, in general, the Grassmann coordinates satisfy many quadratic conditions we have the following:
Corollary 2. There are many possible equivalent quadratic conditions on the Grassmann coordinates to determine an LCD code (when ).
For example, in the case above, gives the space of all quadratic conditions. In the characteristic two case, in particular when F is the binary field , the condition for LCD is that the sum of the Grassmann coordinates is non-zero, a linear condition. In that case, we can characterise the LCD codes as the points in the complement of the Grassmannian with a certain hyperplane.
In the case
, the intersection of the Grassmannian of lines of
with any hyperplane corresponds, in general, to a “linear complex” of
. This is because any linear complex is just a set of lines satisfying a single non-trivial (homogeneous) linear condition on the Grassmann coordinates of the lines. Then, it is known [
14] that a linear complex determines and is determined by a null polarity: two points are conjugate under the polarity if and only if the line joining them is in the complex.
Conjugate means that one point is on the image (a hyperplane) of the other point under the polarity. Note that
null means that every point is self-conjugate. The image of any point under the null polarity will be the hyperplane that is the union of all the lines of the linear complex passing through that point.
Often, we substitute the word symplectic instead of “null” since symplectic groups can be defined as the non-singular linear transformations that commute with a null polarity.
Theorem 5. In the characteristic two field case, non-LCD codes correspond to lines of a linear complex of and a null polarity. If n is even, then the linear complex and null polarity are non-degenerate, but, if n is odd, they are degenerate.
Proof. The condition of Corollary 1 that the sum of the
subdeterminants in the
generator matrix for
C is zero is a linear condition in the Grassmann coordinates for the lines of
. This translates ([
14], §XI, p. 380) to the null polarity with matrix
, where
is the all ones matrix and
is the identity matrix of size
n. By using row reductions (subtracting the first row from all the others) then adding the first column to all the other columns, we obtain an upper triangular matrix with a main diagonal of
. Thus,
if
n is odd, and 1 if
n is even. Another way to execute this is to calculate
so that modulo two
is self-inverse (
, invertible, for even
n. In addition, nilpotent,
, not invertible, for odd
n. Note that in the odd
n case, the rank of
is
because it contains a non-singular
principal submatrix so that the corresponding linear complex and null polarity are minimally degenerate. □
Since a line of will be either in the linear complex or not, we can count the number of LCD codes . Let us demonstrate this for the binary case, . Suppose n is even and we have a non-degenerate null polarity of . Given a point P, its image will be a hyperplane of . Then, any point is conjugate to P, so every line through P in is in the linear complex corresponding to . There are such lines. By counting point/line incidences in two ways, we obtain the number of lines of the linear complex as . Hence:
Theorem 6. In the case n is even, the number of LCD codes (lines not in the linear complex) is These counts are consistent with the formulae in [
6].
Corollary 3. For , there is a single binary LCD code , which is generated by and . For , there are LCD codes with parameters .
For instance, the [4, 2] codes with generators and are LCD.
A projective code is a linear code which has a dual distance of at least three. Equivalently, any generator matrix of such a code has no zero columns or columns that are multiples of one another. For example, the longest binary projective LCD code will be obtained by puncturing the simplex code at any column so that one could consider a binary matrix as a generator matrix with all the possible non-zero columns except for . We shall learn more about this in the next section.
4. Constructions and Numbers of Projective Binary LCD Codes
Since the main applications have, so far, been in the binary case, we will continue to consider binary LCD codes here. Suppose C is an LCD code over a characteristic two field with generator matrix A. We can assume that A has no zero or repeated columns because of the following:
Lemma 1. If a zero column is deleted from or appended to the matrix A, or if a pair of identical binary columns are deleted from or appended to it, then the product is unchanged.
Proof. The first part is elementary, and the second part uses the fact that in the field. □
Once we assume the simplification of non-zero or repeated columns in a generator matrix, and also the equivalence of codes under permutations of the element positions, we can make a geometrical leap and consider sets of points in projective space of dimension over the finite field GF(2). Every point of has homogeneous coordinates, a non-zero vector in . A subset of n points of then corresponds to a matrix A. There are no zero or repeated columns in this matrix. Thus, it is the generator matrix for some binary code of length n that has a dimension of at most k. The property of this code being LCD, self-orthogonal or similar is an inherent property of the set of points, as is the code’s distance. Equivalent codes will be related by non-singular (row) transformations; that is, collineations of the space .
When considering LCD codes for which is non-singular, the product is important. It can be any symmetric binary matrix. Any set S of n points of gives a matrix with the columns as point coordinates of the set. This, in turn, gives the symmetric binary matrix . If its determinant is 1, then the code generated by the rows of A is LCD.
Suppose we have two sets of points S and T such that their corresponding symmetric matrices Then, the symmetric difference will be another set of points with . Note that by Lemma 1, columns which are the same in both matrices do not contribute because they are repeated in the join of the two matrices and are deleted in the symmetric difference. Thus, is a generator matrix for a self-orthogonal code.
We need the following result of MacWilliams [
15].
Lemma 2. The number of non-singular symmetric matrices over is Note that enlarging the size of the matrix by one (from an even size to an odd size k) multiplies the number of non-singular matrices by the number (in the term with ) of non-zero vectors in the larger space.
There is an interesting relationship between self-orthogonal binary codes (all the codewords are pairwise orthogonal) and certain punctured Reed–Muller codes. A binary Reed–Muller code has parameters , where each position in a word corresponds to a vector in , and each word corresponds to evaluations of a polynomial in k variables of a degree at most i over . We consider punctured (projective) codes so that evaluations at the zero vector are neglected, trimming the code length .
Theorem 7. Every matrix A, with no zero or repeated columns that satisfy (the rows generating a self-orthogonal code) can be formed from a set of planes under symmetric difference (giving a set of points) in the projective space . Each symmetric difference corresponds to a unique word of the punctured Reed–Muller code .
Proof. Note that each plane consists of seven points, and if we write these points as the columns of a generator matrix, the code is the simplex code , which is self-orthogonal, contained in its dual, the Hamming code. In the general case, we form a generator matrix G for the simplex code , which is self-orthogonal for . We then append it to rows, each row corresponding to a polynomial , and, for a column of G, we insert the value into the row with label . This makes a binary matrix H. This method of creating longer vectors is called Veronese mapping in geometry, so the columns of H are coordinates of points on a certain Veronesean variety. If we added the entire ones row to H, we would have a generator matrix for the quadratic punctured code. However, we do not follow that. We consider the dual code to the code generated by H.
If v is any word of of weight n, it is easily checked that the corresponding columns where the ones are form a submatrix A of H that is self-orthogonal. What, however, is ? It is the punctured made from polynomial functions in k variables of a degree at most (including the constant 1 function of degree 0).
The connection with planes of
follows from the Fundamental Theorem of Geometric Codes as found in [
16,
17]. In the binary case, the characteristic functions of the planes of the projective space
generate a binary code, and a basis for this code is precisely the set of all functions from
to
given by the mappings
, a polynomial of degree at most
in the
k variables. This code is called the subspace code
in [
17]. In general, the geometric codes and the subspace codes are classified for any finite field
, but a more general polynomial degree has to be specified that is related to the automorphisms of the field: mindeg for the geometric codes and maxdeg for their dual subspace codes. In the binary case (or any prime case), mindeg = maxdeg = deg, so it corresponds to Reed–Muller codes. □
Note that the matrix A in Theorem 7 may have a rank less than k, but, if it has full rank, then it is a generator matrix for a self-orthogonal code. The case where A has k rows but no columns happens when the symmetric difference of the planes is empty, such as when there are no planes.
Corollary 4. Every self-orthogonal binary code can be constructed from a set of planes of or, equivalently, from the punctured code.
Since the all ones’ words are in the punctured RM code, or, equivalently, the whole set of points of is the symmetric difference of the characteristic functions of a number of planes, we have the following:
Corollary 5. There is a different kind of complement for any LCD code. Assuming that the columns of a (projective) generator matrix G are non-zero and are not repeated (equivalent to saying that the dual code has a distance of at least 3), we now form a new generator matrix from all the non-zero columns that do not appear in G. Thus, from a projective LCD code, we obtain a projective LCD code with . In most cases, .
Theorem 8. When k is even, , the number of sets of points of that correspond to projective LCD codes is given by: When k is odd, , the number of sets of points of that correspond to projective LCD codes is given by: Proof. A binary simplex code is self-orthogonal if its dimension is at least 3. In this smallest case , a corresponding generator matrix, is the matrix of all possible non-zero binary columns. This being the coordinates of the 7 points of . This is a special plane in , but using collineations (non-singular linear mappings), one sees that any plane of (if ) gives a generator matrix for a self-orthogonal code. By using the closure of all planes under symmetric differences, we obtain the “subspace code” on the points of , isomorphic to an elementary abelian group over of dimension , and each of these sets of points gives a self-orthogonal code, also counting the empty set which gives the trivial code . Note that when , the smallest case, the subspace code has a dimension of 1 with two words, which is the whole plane and the empty set.
We check that every (projective) self-orthogonal code contained as a truncation by deleting columns from the simplex code of dimension k is obtained that way because the planes are the subsets of a minimal size 7. There are no self-orthogonal codes of a length less than 7 with no zero or repeated columns when considering all cases.
Each binary symmetric matrix B will correspond to a coset of the above self-orthogonal code subgroup because we can always write for some A; for example, forming an incidence matrix (vertices versus edges) of the graph with B as (vertex) adjacency matrix and appending appropriate unit columns to ensure that the number of ones in each row is even. Any set of points that gives B as its will then be the symmetric difference of a self-orthogonal set of points with a fixed set (e.g., corresponding to the columns of the (appended) incidence matrix just mentioned). Hence, we count the number of symmetric non-singular matrices B using Lemma 2 to finish the proof. □
In [
18], it is shown that a (binary) LCD code with a non-singular symmetric matrix
with a 1 on the leading diagonal (equivalently, the code is “odd-like”, having odd-weight words) has a normalised generator matrix with
(the rows then form an orthonormal basis). In the case of
k being even,
C can be “even-like”, and then the generator matrix can be normalised so that
is the symmetric permutation matrix with
transpositions. If
k is odd, then
C must be odd-like, since a skew-symmetric matrix (with zero diagonal) of an odd size
k is always singular (from the Pfaffian formula).
Let us connect this with finite geometry. See [
19] for the theory of polarities over finite fields. In general, a non-singular symmetric matrix
B corresponds to an
orthogonal polarity (having the absolute points of a quadric) and a non-singular skew-symmetric matrix corresponds to a
symplectic (or null) polarity having every point absolute (lying on its image hyperplane).
B is used to bijectively map a point (column vector)
to the hyperplane with dual (row) coordinates
. Similarly, any hyperplane
. Repeating the polarity twice,
so that the polarity has order two (meaning it is an involution).
The (linear) polarities of (for any prime power q) are of these two standard types. There is also the Hermitian polarity, but this is non-linear depending on a square root subfield not existing in the binary case. The orthogonal (ordinary) type of polarity has the absolute points of a quadric, the set of points . This is non-degenerate in general, but, in the characteristic two case (e.g., binary), the quadric degenerates into a hyperplane (squared) if B has a non-zero main diagonal. If k is even, there can be symplectic polarities such that any point is absolute, lying on its image. Thus, , meaning that B has zero diagonal. Note that the standard skew-symmetric condition for symplectic changes in characteristic two to symmetric and zero diagonal.
From the reduction to just two kinds of symmetric matrices (or linear polarities) we have this:
Proposition 1. When k is even, , the number of sets of points of that correspond to non-equivalent LCD codes is at most: When k is odd, , the number of sets of points of that correspond to non-equivalent LCD codes is at most: To reduce these numbers much more, one could investigate the groups of collineations of
similar to [
18] because sets of points that are mapped to each other by a collineation will give equivalent LCD codes.
The number of LCD codes has been counted in other ways, e.g., in [
20], with mass formulae for
. The number of non-equivalent LCD codes is harder but has also been calculated there for smaller cases.
Given a binary code with a generator matrix A, we construct the matroid which has the column vectors of A as a set of n points. A basis of is a linearly independent set of k columns over the binary field . The property of no zero or repeated columns in A means that is a simple matroid. Note that the matroid is the same no matter what generator matrix is used because multiplying on the left of A by a non-singular matrix does not change the dependence properties of the columns of A.
is called a representable matroid since it corresponds to a set of n points in binary projective space , where each column of C forms the homogeneous coordinates for the corresponding point. Any non-zero codeword of C can be seen in the geometry of by considering a hyperplane of . Then, the positions with non-zero values (ones) in the codeword are those points of that are not in that hyperplane. Thus, the distance of C is given by the complement in of the largest intersection of C with a hyperplane. This gives a word of the smallest Hamming weight in the code.
Theorem 9. A binary code C is an LCD code if and only if the number of bases of is odd.
Proof. This follows directly from the fact that for a binary matrix A, is the sum of the subdeterminants of A, as seen when taking square roots in Corollary 1. □
The number of bases of a matroid has been well studied, and, in particular, it is related to the Tutte polynomial invariant. Here, we discuss some of the ideas which can be found in [
21].
The Tutte polynomial of a matroid
M is defined as
where
is the rank
of the whole matroid minus the rank
of the subset
S, and where
, the nullity of
S, is the number of points
in
S minus
. Note that
, and
. Furthermore,
if and only if
S is a basis of
M, while
if and only if
S is a set of loops of
M (points of rank 0).
if and only if
S is a subset of a basis of
M (an “independent” set).
if and only if
S is a set (perhaps empty) of loops of
M.
Another important property for coding theory is that the Tutte polynomial of the dual matroid satisfies . If the matroid M comes from the columns of a generator matrix of a linear code C, then the dual matroid comes from a generator matrix for the dual code . In general, the dual of a matroid M is defined on the same (ordered) set of points, with the bases being the complements of the bases B in the original matroid. The dual concept of a loop is an isthmus, which is a point of M contained in every basis of M. Thus, adding a loop to a set S never increases its rank, while adding an isthmus always increases the rank of S by one.
The Tutte polynomial can be calculated recursively by the operations of matrioid deletion and contraction, which for generator matrices correspond essentially to the operations of deleting a column or deleting a row of a generator matrix. The main relation is the following:
where
e is a point (neither a loop nor isthmus) of
M,
is the deletion of
e from
M, while
is the projection (or contraction) of
M from
e. The recursion in the case where
f is a loop is
and, if
g is an isthmus, then
Of course, with all these recursions, it turns out that the Tutte polynomial in the general case has a high complexity of calculation. However, we note that it is known that the evaluation at the point gives the number of bases of M.
Theorem 10. is the number of bases of the matroid M.
Corollary 6. The matroid of a binary code has an odd if and only if C is an LCD.
Since the number of maximal non-singular submatrices determines whether a generator matrix gives an LCD code or not, we have the following easy fact which also comes from the evaluation of the Tutte invariant as above.
Corollary 7. Given any (pivot) position with the value 1 of a LCD generator matrix A, either we can delete the column with the pivot to obtain a smaller LCD generator matrix or we can use row operations to make all zeros below the pivot, and then the minor of A will be a LCD generator matrix. One of these cases will occur but not both.
Proof. This follows immediately from the permanent function description of in Theorem 3 because the (odd) number of permutations in A is counted by going through the pivot or not. □