1. Introduction
Nowadays, the volume of processed information is constantly growing, and it is necessary to pay due attention to the security and immutability of transmitted and stored information. The most used method of protecting information is the use of error detection and correction codes. However, hardware implementations of error correction codes, data storage systems, and cryptographic algorithms are vulnerable to malicious analyses [
1,
2]. The class of attacks that exploit the physical properties and peculiar properties of system architecture is called side-channel attacks [
3]. Side-channel attacks are one of the most effective ways to breach the security of information, this class of attacks uses vulnerabilities in the implementation of the algorithm to obtain secrets [
3,
4,
5]. Analysis of system behavior in case of its incorrect work can give to an attacker a great advantage and valuable information [
3,
4,
5].
An attacker can exercise various effects on the hardware component of a cryptographic device to distort information at some stages of coding [
6,
7]. Additionally, an attacker has the capability of adjusting the fault injection mechanisms and techniques to inject errors of almost any multiplicity and any type. An attacker can even change the information transmitted over the channel and do it in such a way that it will be undetected by protective systems. This type of attack is called a calculation error attack [
1,
6]. The model of calculation error attack over an abstract storage device was first described by Cramer et al. in [
1]. This type of attack poses a serious threat to the integrity of information since successful attacks can be used to outflank a protection mechanism [
1]. This work also presented codes that can protect the system from such attacks—algebraic manipulation detection (AMD) codes. Accordingly, traditional error checking mechanisms are not suitable, since the scenario in which the attack develops can be atypical.
To protect systems from this type of attack, robust codes built on non-linear functions are used, because linear functions do not show all errors due to linear properties [
8]. The most promising non-linear functions with higher non-linearity are bent functions [
9].
Another important task for ensuring a high level of work for security code creation of a model, in which the encoding time will be minimal, without reducing the protective functions [
10,
11].
A design of AMD and robust codes was first proposed to protect two-level and three-level NAND flash memory for SSD drives. Due to its high data density, NAND flash memory used in SSDs is characterized by errors of various multiplicity, which directly affect security and performance. The use of hashing or digital signature algorithms is not suitable for flash memory, since the memory changes quickly, the volumes of information are very large, and the number of calculations will greatly slow down the operation of devices, unlike codes that everyone calculates on the fly. Additionally, one of the ways to use these codes is to protect cryptographic primitives from memory changes through algebraic manipulations, as well as to keep public-key encryption resilient against related-key attacks [
12]. Another topical trend in AMD code theory is the protection of implantable medical devices (IMD) from algebraic manipulations, in which errors can cause direct harm to human health [
13].
This article investigates the properties of robust codes constructed on bent functions and wavelet decomposition.
The paper is organized as follows.
Section 2 introduces the terminology used throughout this paper.
Section 3 describes the robust code, based on spline-wavelet decomposition and bent functions.
Section 4 describes the implementation of constructed codes for FPGA, comparing them to each other.
Section 5 describes hardware architectures. Finally,
Section 6 presents the conclusions.
2. Theoretical Assumptions
The linear codes that are used in most protocols and data transfer standards are not suitable for protection against algebraic manipulation attacks [
1] since one can always choose an error that will not be detected by the receiver. The general model of algebraic manipulation (including weak and strong conditions) over an abstract storage device was first presented by Cramer et al. [
1] and is demonstrated in
Figure 1.
Definition 1. AMD codes. Let m and n be two positive integers. An AMD code is a pair of a probabilistic encoding function from a set S of size m into a finite Abelian group G of order n, and a deterministic decoding function such that with probability 1 for every , where ⊥ denotes combinations which are not included in the code. An AMD code is called “systematic” if set S is a group and the encoding function E has the formfor a function F, with t being randomly chosen with uniform probability in . Any protected device in that model is given as an abstract storage device which can hold an element g from a finite Abelian group G. For both strong and weak conditions, an attacker does not know the element g and can change the stored element g only by adding error . Successful error injection is called an algebraic manipulation. In the weak model, the attacker can choose the value of e, but cannot change the value of s and therefore cannot influence the probability of certain input combinations occurring. In the case of a strong attack, the attacker can influence the exits by choosing the inputs. In this case, the attacker knows the value of .
To solve the problem of algebraic manipulation, in 2007 Mark Karpovsky proposed the use of robust codes, which are related to weak AMD codes [
6], and this class of codes was actively used to ensure a high level of information security [
13].
Definition 2. Robust codes are non-linear systematic error-detecting codes that provide uniform protection against all errors without any (or minimize) assumptions about the error and fault distributions, capabilities, and methods of an attacker [6]. Let
, this is the number of codewords in code
C. By the definition of an
, there are no more than
R code words that cannot be detected for any fixed error
e.
The probability of masking any error can be defined as:
In the case of linear codes, it is possible to choose an error whose the masking probability will be equal to 1, and the task of constructing a reliable code is to create the code with minimal masking probability over the entire space of errors. Therefore, the maximum error masking probability is the most important parameter for protecting information in a channel or data storage device and can be defined as
A graphic depiction of the definitions of the properties of a robust code is demonstrated in
Figure 2.
In order to avoid linear properties, robust codes are constructed on the non-linear functions, therefore the parameters of a robust code are dependent on the non-linear properties of the used functions. Under such conditions, the most interesting are non-linear functions for which the property of non-linearity is maximum [
14]. One such function is a bent function.
Definition 3. A bent function is a Boolean function with an even number of variables for which the Hamming distance from the set of affine Boolean functions with the same number of variables is maximal [15]. The bent function can be defined as a function that is extremely poorly approximated by affine functions. For the first time, bent functions were researched by O. Rothhouse in the middle of the 20th century, then they are also mentioned by J. Dillon and R.L. MacFarland [
9]. Currently, the research on bent functions is widespread, however, many questions remain in this subject unexplored and require careful consideration.
Definition 4. The non-linearity of a function f is the distance from f to a class of affine functions. Let us denote the non-linearity of the function f in terms ofwhere is the class of linear functions, . The function
is called maximally non-linear if
[
16].
The properties of bent functions:
Bent functions exist only for even n;
Bent functions are not balanced;
Bent functions depend statistically on all their arguments;
Let f be a bent function, and h belong to the class of linear functions. Then belongs to the class of bent functions;
Let — be functions of disjoint sets of variables. Then is a bent function if and only if f and g are bent functions.
For example, the Kerdock code [
6] is a robust code, and at the same time is a bent function using the above properties:
From the cryptographic point of view, the important criteria that a Boolean function f of n variables must satisfy are the following:
Equilibrium—the function f takes values 0 and 1 equally often;
The propagation criterion of order k—for any non-zero vector weight at most k, the function is balanced;
The maximum non-linearity—the function f is such that the value of its non-linearity is maximal.
The bent function matches the criteria propagation and maximum non-linearity, it allows it to detect all errors in the channel and to have a uniform probability of detecting errors. For example, the Kerdock code detects all errors in the channel due to maximum non-linearity with parameter
. The propagation criterion follows from the criteria of maximum non-linearity and protects the systems against error selection, which can be made by an attacker, each error will be detected after a certain number of injections. However, bent functions are not balanced [
8,
9]. The equilibrium criteria are important in the case of using bent functions in cryptographic S-Boxes, but in coding theory this is not a necessary criterion, since it does not affect the robust parameters, therefore, in the framework of this study, this property was not observed.
3. Spline Wavelet Code with Different Degrees on Bent Function
One of the most well-known methods of dividing information sets into streams is wavelet transformation [
17,
18,
19]. This method is used in many fields of science, including error protection codes [
6,
8,
17,
20,
21,
22,
23,
24,
25,
26]. In this article will be used wavelet transformation or rather first-degree spline-wavelet transformation for creating a robust code [
25,
26].
Definition 5. Let us take function , where t belong to the Hilbert space with the scalar product and the norm .
The idea of the wavelet transform is based on the partition of the signal
into two components, approximating
and detailing
, where
m denotes the decomposition (reconstruction) level.
In this article, will be used spline wavelet transformation for the creation of an error detection code.
Let
X be a non-uniform grid of elements,
, where
,
Z is the set of integers. First-degree splines on the grid
X are defined as follows:
As a part of this research, we considered spline wavelets of a higher degree, but this leads to an increase in mathematical operations without any benefit. Other types of wavelets did not suit us due to the impossibility of converting them to binary form.
In the process of wavelet decomposition, an element
is taken out from the grid
X, after this transformation will be received the new grid
.
where
,
Z is the set of integers.
Based on this new grid new splines
are constructed. New and old splines are interconnected. The relationship between the elements
and
can be shown using the formulas. The new splines
depend on the old
as follows:
In this research, the algorithm of spline wavelet decomposition will be used for building robust codes. Let us rewrite the spline-wavelet decomposition formula in terms of code creation:
where
,
k—the number of characters in code,
—informational element,
.
Spline-wavelet decomposition elements will be used for the construction of redundant symbols of robust codes. This will allow the creation of a large class of robust codes depending on the spline-wavelet grid values. In the case of changing the grid, without changing the construction, it is possible completely to change the properties of a robust code, for example, increasing the parameter R and decreasing the probability of masking any error. Without this method, in case of using only AMD code, the attacker can adapt to the system and find weaknesses.
For example, when the system is implementing a combination of LDPC code [
27] and AMD code for protection against algebraic-manipulation attacks (the structure is shown in
Figure 3), knowledge about the sparseness of the check matrix in LDPC gives an attacker significant advantages in both increasing the error masking probability and simplifying the process of finding undetectable errors [
27]. For a successful algebraic manipulation, an attacker needs to make an error simultaneously in the LDPC codeword, bypassing the check-in part of the AMD code [
27]. Therefore, if there is an algorithm that will change the values of the grid, the security of the code device from a “smart” attacker will increase.
Constructions presented in this work will be compared by the parameter R. In this research, the value is taken since most protocols work with a given number of information symbols. For each number of variables, bent functions were constructed on variables and spline-wavelet decomposition elements. In this construction, for all code, selected grid is . It is based on static values, or on the information part of the codeword.
The grid is very important factor for robust codes, because if the grid is static, the wavelet element will became a simple affine function: , where is 0 or 1, but if the grid is based on information part, wavelet element become a non-linear function of second or third degree.
Let
denotes the codeword of some shared
code. Then
is the information part, and
—additional part,
. For presented construction grid is selected depending on the spline wavelet function,
is a function from
Table 1,
,
m is the number of parameters of the function
f, it can correspond to
n or can be less than
n if we use the function from
Table 1 with the number of variables
. The vector
c belongs to the code if
The results of the analysis designed code show that the parameter
R and, accordingly, the maximum probability of hiding the error
is lower than that of the existing solutions with the same number of additional bits. However, the designed solution also has disadvantages—the time for encoding information can be quite high [
25].
Functions for
n = 8 are presented in
Table 1, which also gives the conditions of the grid and the degree of function.
Let us compile the code constructions for all the above functions with two redundant symbols
,
for
. Calculated parameter
R and the maximum probability of error concealment, the results are listed in
Table 2. A comparison was also made with the Kerdock code, which is the most well-known and used robust code. Additionally,
Table 2 shows the time taken to encode 5000 bytes of information calculated by the software. The value of 5000 bytes is taken to represent the encoding time and to understand the difference between the compared codes.
Lemma 1. With an increase in the number of information symbols, the function stays a bent function.
Proof. Since when adding multiplicative elements , by property 5 of the bent function, the final function will also be a bent function. □
One of the main parameters of code constructions is coding time, but at the same time, code constructions need to be created with a high degree of bent functions, because, for security reasons, we need to lower the maximum probability of error concealment. The solution for it is based on matrices and optimization of hardware implementation.
4. FPGA Implementation of Presented Constructions
In this section, we will show the FPGA implementation of the presented constructions, and demonstrate its matrices representation. Let us try to roughly represent the created constructions in the form of matrix multiplication, to identify new constructions, and further optimization of the algorithms for the (10,8) code. Selected designs are built based on a more convenient implementation for FPGA—the minimum number of connections and calculations FPGA implementation of presented constructions.
For all constructions, the spline-wavelet function is the same and can be described as:
It is possible to use linear components for additional methods of construction change.
Lemma 2. The use of linear components for an additional method of structural change does not reduce the non-linear properties of the functions.
Proof. According to property 4 of the bent functions, the linear component does not affect the non-linear properties. □
4.1. Construction 1
The grid
x is static and can be selected as desired. The matrix function of the code is presented below:
The following is the classic form:
All functions are bent functions of degree 2 constructed from spline wavelets and Kerdock code. Its FPGA implementation is shown in
Figure 4.
4.2. Construction 2
The grid
x:
. The matrix function of the code is presented below:
The following is the classic form:
All functions are bent functions of degree 3 constructed from spline wavelets and Kerdock code. Its FPGA implementation is shown in
Figure 5.
4.3. Construction 3
The grid
x:
. The matrix function of the code is presented below:
The following is the classic form:
All functions are bent functions of degree 4 constructed from spline wavelets and Kerdock code. Its FPGA implementation is shown in
Figure 6.
4.4. Code Construction Comparison
When comparing different codes, bit overhead (BO) is also an important parameter. Bit overhead is defined as the ratio of parity bits to data bits present in a codeword. A better error detection and correction method should have less bit overhead [
28].
In this research, bit overhead is the same for all compared codes. A large number of additional bits does not make much sense in a code structure, since its main task is not to correct errors, but to detect all of them, so it may be a good idea to combine robust and linear codes that will solve different problems—detecting all errors and correcting errors of certain type.
The bit overhead, parameter
R, and maximum probability of error concealment for presented constructions are listed in
Table 3. Additionally,
Table 3 shows the time taken to encode 5000 bytes of information calculated by software.
A degree of the bent function different from 2 gives a better result for the parameter R. The number of calculations and the time spent on coding information is more compared to the codes built on bent functions with a power of 2. When these codes are used in the case of protection against attack, then the parameter R is more important. The best value of the R parameter has the robust duplication code, but the information encoding time is very long and the number of bits is equal to the number of information symbols, which can disrupt the operation of the device where this code can be applied. Using spline wavelets gives the possibility to build a large number of robust codes, and bent functions, and increase their degree, thereby improving the quality of robust codes. Additionally, if the number of information symbols increases for a codeword, the functions will stay bent functions.