1. Introduction
Besides numerous other uses, stream ciphers play a central role in mobile telecommunications. Essentially every hand-held communication device contains a hardware implementation of one or more stream ciphers. Since these devices are readily available to a potential attacker, there is an urgent need to secure those hardware implementations against side-channel attacks targeting the secret keys stored on the device and the keystream produced from them. Therefore it is an important task to analyze the security of current stream ciphers and stream cipher proposals against various types of side-channel attacks, both of the passive type (measuring power consumption, measuring electromagnetic radiation) and of the active type (generating power glitches, laser injections, mechanical disturbances).
In this paper we examine the Enocoro family of stream ciphers developed by Hitachi Ltd. (see [
1]). These stream ciphers are contained in the Japanese government’s CRYPTREC recommended ciphers list (see [
2]) and have been standardized by ISO in ISO/IEC 29192-3:2012 (see [
3]). The Enocoro family consists of the stream ciphers Enocoro-80 (see [
4]) and Enocoro-128
v2 (see [
1,
2]). Their construction is inspired by the PANAMA hash function and stream cipher (see [
5]). The security of Enocoro stream ciphers against various types of attacks has been considered in [
6,
7,
8,
9,
10,
11]. For instance, in [
9], a slide attack is proposed, and [
10] suggests a guess-and-determine attack which, despite its name, is unrelated to the attack studied here. Side channel attacks on the family of Enocoro stream ciphers have not been studied intensively. Only [
7] briefly discusses two types of passive side channel attacks, namely Differential Power Analysis (DPA) and Correlation Power Analysis (CPA). Active side-channel attacks, in particular fault attacks, apparently have not yet been analyzed in this context. Notice that the core part of our attack relies on a guess-and-determine mechanism which has to adapt to the results of previous fault injections. For this reason, the guess-and-determine attack strategies presented in [
6,
10,
11] cannot be adapted to the fault attack setting in a straightforward way and a completely new approach had to be developed in
Section 5.
In general, fault attacks are a well-researched topic (see [
12,
13,
14,
15,
16,
17,
18,
19]). Their main characteristic is the set of assumptions made concerning the capabilities of the attacker: physical capabilities, computing power, or prior knowledge. So, let us start by specifying our
Fault Model: The proposed fault attack has byte-sized fault injections at its core. In addition, we are assuming the setting of a known plaintext attack. Since the bit stream of the plaintext is added to the keystream, this means that we are in possession of the keystream. Moreover, we assume that we can reset the internal state, typically by restarting the keystream generation with the same secret key and the same initialization vector. Of course, we do not assume the knowledge of these values, just the possibility of resets. As for the fault injections themselves, we require only a very mild spatial resolution, namely the injection of a byte fault into an unknown register, as well as a very mild temporal resolution, namely the possibility to time the injection within a small time period, i.e., a small number of clock cycles.
Feasibility: For an actual physical realization of our attack, these assumptions are not particularly hard to fulfill. Given physical hardware access, the first two assumptions can be reduced to the setting of a known plaintext attack and the assumption that we can convince the hardware device to take a certain initialization vector as an input. For the actual injection of faults, there are several different techniques available. These include the usage of electromagnetic or laser pulses, over-and under-clocking of the circuit, and voltage drops. For extensive studies on these techniques and their limitations, we refer to [
20,
21,
22]. Practical implementations of much more demanding fault models have been reported in [
23,
24,
25]. Based on these results, we believe that the injection of byte faults into unspecified registers on an FPGA-implementation is readily achievable with current technology.
The proposed fault attack framework consists of several parts. First we introduce a differential clocking algorithm. By comparing the correct keystream with the corrupted keystream, i.e., by considering the differential keystream, it allows us to determine the precise injection point and to find differentials in certain register values. Then the task is to use these differentials to derive other register values until we are able to completely reconstruct the internal state of the cipher at one point in time. Clocking the keystream back in time, we are then able to reveal the secret key and the cipher is broken.
So, how do we use several differentials in register values to deduce the full internal state? The construction of the cipher allows us to write down equations connecting the bits of the register values and the key stream bits. Then we can make deductions of the following kind:
If we know registers R at time t and at time , we can deduce the value of at time .
(Here “register” has to be replaced by the “keystream” in some cases.) All of these deductions are phrased as propositional logic formulas, more precisely, as Horn clauses. Since there is an efficient algorithm for checking the satisfiability of a set of Horn clauses, called the Marking Algorithm, we can quickly decide if our current knowledge suffices to deduce the entire internal state or not. In actual fact, we may not know the exact value of some register at some point in time, but only a (small) set of candidate values. Then our deduction technique has to be refined as follows. We assign each Horn clause a weight corresponding to the amount of information we have about it, i.e., how large the candidate sets of values are. Then we use a suitably adjusted version of the Marking Algorithm to check if we can reduce the candidate set for the internal state to a manageable size, where “manageable” means that we can find the correct internal state by a small enough exhaustive search. Furthermore, we extend the scope of this procedure by allowing the guessing of certain register value bits (thereby increasing the search space, of course), and finding the optimal guessing path heuristically. In other words, our attack tries to model and follow the information flow during the execution of the stream cipher, and it tries to optimize the information gain which can be derived from each fault injection.
Does this approach work? For the Enocoro family of stream ciphers, it seems to work extremely well. A successful fault attack on Enocoro-128v2 is possible using approximately five fault injections on average, where the fault is assumed to be injected anytime during the first 10 clock cycles. The ensuing computation requires approximately 20 min.
Contents. Now let us describe the course of the paper in somewhat greater detail. In
Section 2 we recall the definition of the family of Enocoro stream ciphers. In particular, we provide the mathematical equations defining the clock and the update functions. In the next section we use this description to make our fault model, and the assumptions underlying it, very explicit. We also introduce some common terminology for fault attacks. The topic of
Section 4 is the
Differential Clocking Algorithm 1. This algorithm allows us to track
differentials, i.e., input and output differences, through a clock cycle. More precisely, it returns a set of linear polynomials that relate distinct differentials with one another and with the keystream output. With these linear polynomials we can trace a generic fault through the internal states and the keystream output on a cycle-by-cycle basis. One of its first uses is the possibility to locate the injection point of a fault injection up to a small number of possible cases (see Propositions 2 and 3). After determining the injection point, we can then try to find as many input and output differences as possible. Concrete results for the cipher Enocoro-128
v2 are given in the last part of this section.
In
Section 5 our attentions shift to combining several fault injections in an optimal way. The basic strategy is to perform a Guess-and-Determine (GD) Attack. We describe logical implications between the knowledge of various register values by Horn clauses. When dealing with fault injections, often the acquired information is not exact, i.e., instead of the precise register value, only a small set containing that value can be found. As indicated above, this is modelled by assigning a weight to the corresponding Horn clause. Now the order in which the registers can be computed, represented by their corresponding Horn clauses, is called a (weighted) guessing path. The weighted clauses here are obtained by carefully tracking how the fault injections propagate through the internal state, and how the various differential in- and outputs that occur due to the faults are related to one another. The
Guessing Path Attack Algorithm 2 allows us to find the initial state of the cipher if the guessing path has just small enough weight. Since finding an optimal guessing path is not feasible, we propose a
Greedy Marking Algorithm 3 as a heuristic to find a guessing path of small weight.
These techniques are put to work in
Section 6 where we apply them to construct actual fault attacks on Enocoro stream ciphers by describing an
Enocoro Fault Attack Framework in Algorithm 4. In the final section we execute this framework in actual simulations and report on the timings and results. Using approximately five fault injections and about 20 min of calculations, we can expect to retrieve the internal state and break the cipher Enocoro-128
v2.
Unless stated otherwise, we use the definitions and notation in [
26,
27]. The algorithms of
Section 4 and
Section 5 were implemented by the first author using the computer algebra system
CoCoA-5 (see [
28]).
2. Description of the Enocoro Stream Cipher Family
This section summarizes the relevant aspects of the family of Enocoro stream ciphers as specified in [
1]. We adhere to the definition of a stream cipher, as in [
29]. Let
and
with
for
and
. Now the stream cipher
is defined as follows:
- (a)
The cipher consists of the registers
and the value of register
at time
is denoted by
. The internal state at time
t is the tuple of the values of the registers at that time and denoted by:
The first entries are also called the buffer, and the remaining two coordinates are known as the state.
- (b)
The next-state function
where
is defined by
for the buffer registers. Notice that restricted to this part, the update function is linear. For updating the remaining two registers, one defines an invertible linear function
and a non-linear S-Box
whose precise construction is not relevant for our attack and is omitted here. However, the following relation is important:
Observe that as a whole is an invertible function, i.e., it is possible to retrieve the so-called initial state from any internal state .
- (c)
The output function, denoted by , is linear and given by .
- (d)
The general specification does not include an explicit description of the initialization function that takes an initialization vector (IV) and a secret key and returns the initial state . Usually this function is invertible, and heavily based on the next-state function .
From this general framework, two specific parameter sets were derived, for 128-bit security [
1] and 80-bit security [
4]. Those are both standardized in ISO 29192-3 [
3] (along the well-studied cipher Trivium [
30]), and the 128-bit version is part of the
Candidate Recommended List of the Japanese CRYPTREC.
Example 1 (Enocoro-128
v2 [
1])
. For 128-bits of security, we let , , , , , , , , , , and . Then the internal state consists of 34
bytes. The initialization function is efficiently invertible and based on 96
applications of the next-state function . It takes a key in and an IV in . The next-state function of Enocoro-128
v2 is also shown schematically in
Figure 1.
From now on we assume that is efficiently invertible. Thus the used key and IV can be derived quickly from the initial state (and any other internal state where t is not too large). The fault attack of this paper computes the initial state of a given Enocoro stream cipher instance. In the last section, results from simulations and experiments with the 128-bit version of Enocoro are presented.
In the following we fix a key and an IV and consider the stream cipher with an (efficiently) invertible initialization function. We also denote the sequence of internal states by and the keystream output by .
3. Our Fault Model
Our fault attack is based on a differential fault analysis which is ultimately reduced to a close inspection of the differential keystream. For this, we clearly need to have access to the keystream at all times.
Assumption 1. The keystream is accessible, i.e., we are in the setting of a known plaintext attack.
In practice, this assumption can be slightly weakened as we do not need to have the whole keystream available. The simulations in
Section 7, for instance, never used more than the first 40 keystream words. In order to be able to combine information of different fault injections, it is necessary that they convey information about the same sequence of internal states.
Assumption 2. The internal state of the cipher can be reset to the initial state, e.g., by restarting the cipher with the same IV and key.
Let us point out here that it would also be sufficient to assume that the attacker is able to force the internal state to be equal to some for and the attack works nonetheless. In practice, however, it is usually easier to force it to by simply re-initializing the cipher, also known as nonce-misuse. If the attacker has hardware access to the device under inspection, this assumption surely poses no problem. For the injection itself we only have small restrictions, namely that the fault must happen in a short time period and only in one single byte representing exactly one of the registers.
Assumption 3. For a known we are able to inject a random byte-fault in a random register of the internal state at a random time during the generation of the keystream. (This means that the value of register R at time t is replaced by .)
To be more precise, we assume that an injection at time t happens before the corresponding keystream output is computed, i.e., an injection at time t can already affect the keystream output of the same time t. Notice that this assumption does not require us to know the injected fault , nor its precise temporal or spacial resolution. The only requirement we have is that the injections happen in a short time period T. For the random distributions of the faults and the affected registers we also impose no restrictions.
Let us point out right away that our experimental results were obtained using uniform distributions and the time period , i.e., the first ten clock cycles. We would expect similarly good results for other distributions and sets T.
In the following, we fix a finite and assume that the above assumptions are satisfied for this set T. Recall that we also fixed an instance of the Enocoro stream cipher with internal state , keystream , and set of registers .
Definition 1. Let , let be the keystream produced by the cipher if a fault is injected into R at time t, and let .
- (a)
The tuple is called aninjection point.
- (b)
The tuple is called afault injection, where is itsinjection point, itsfaulty keystream, and itsdifferential keystream.
If the injected fault is relevant in the context, we also denote the fault injection by .
Under our assumptions an attacker is able to inject a fault at the injection point and observe the faulty keystream , without knowing or . From the original keystream one can then simply determine the differential keystream . Hence, for every fault injection , we assume that an attacker is only in possession of .
In the following section, we discuss the problem of finding the injection point of a fault injection and how one can use this information to gather information on the internal state of the cipher.
4. Differential Clocking Algorithm and Fault Localization
At the heart of every differential fault attack lies the propagation and computation of so-called differentials. These are the in- and output differences of a non-linear function used in the cipher. To precisely track the effect of a fault injection on the internal state and the keystream output we introduce a Differential Clocking Algorithm. This algorithm returns a set of linear polynomials that relate distinct differentials with one another and with the keystream output. As a consequence, it tells us exactly if—and how—certain differentials can be computed from the faulty keystream. Moreover, the output helps to decide whether a given differential keystream was produced by a fault injection at a certain injection point.
We use the following definition of a differential; notice that we only consider differentials in the non-linear function . This is sufficient here as no other non-linear functions are used in the specification of the Enocoro stream ciphers.
Definition 2. Let and with . Then is called adifferentialin x, δ itsinput difference, and Δ its output difference, respectively.
The goal of a fault injection is to find as many differentials as possible, since every differential in x gives us information on x, as the next remark points out.
Remark 1. Let , , and be a differential in the value of the register R at time t. Then we clearly have: So every differential in corresponds to a set of bit-tuples that contains . Given multiple differentials in the same register value , we also have: Table 1 shows the average size of this set for one and more differentials. One can see that in a vast majority of all cases, two differentials suffice to uniquely determine . (Also notice that for we have at least two solutions, as both and are in .) Before we discuss the use of differentials in detail, we present the Differential Clocking Algorithm that forms the basis of all subsequent analyses.
Write the
-function for
as:
Then from the specification (see
Section 2) we know that
are all linear. Additionally, also
is a linear function.
For
we now define the polynomial ring:
To keep the presentation simple, we abbreviate
by
e,
by
, etc.
Algorithm 1: Differential Clocking Algorithm. |
|
Conceptually, this algorithm traces the propagation of a generic fault , injected at the injection point , through the internal state and its differential keystream output, represented by the indeterminates . The algorithm keeps track of the structure of the differential internal state in terms of the fault and its propagations. With every clocking, the (differential) state, gets updated as follows: if a register value is updated linearly, we can use the linear update function for the differential state as well (line 9). If the update function is non-linear, we introduce new indeterminates representing the output difference in that specific part. This is done in line 10 where we replace each output of the function with inputs by a new indeterminate. At the same time, we also keep track of the inputs to these non-linear functions using the indeterminates in line 6. In this way, takes the role of a generic differential in . In that context, line 7 is correct since is bijective: if the input difference is zero, then the output difference has to be zero, too. Finally, line 4 ensures that the differential keystream is related correctly to the differential internal state.
From this, we immediately see that, among others, the linear polynomials in L clearly relate the differential keystream indeterminates with the differentials in the values of the registers , , , and at all the times steps .
For better formalization, we denote the ideal generated by the output of Algorithm 1 by
, where
is the injection point and
. Moreover, the elimination ideal
in the keystream indeterminates will play a major role later on. Notice that a generating set of this ideal can be found via an appropriate Gaussian elimination of the linear polynomials in
. Moreover, for a set of polynomials
we denote its
set of zeros in
by:
Now the next result is an immediate consequence of the above discussion.
Proposition 1. Let be a fault injection and let be the differentials corresponding to , for which appear during the generation of the faulty keystream of the fault injection. Then we haveand thus, in particular Remark 2. In view of the first part of this proposition, we see that one might be able to determine certain input and output differences from the differential keystream , if the injection point is known. For this, we compute a generating set of and consider the substitution for all . If the resulting system of linear equations has a unique solution for some of the indeterminates and , then these solutions correspond exactly to their in- and output differences , and , respectively. Ideally, this happens for matching pairs of differences such that one can obtain a differential in some register value .
For Enocoro-128v2 it turns out that, for each injection point, between four and six differentials in the register can be found. All of this is, however, only possible, if the precise injection point is known. Thus we now focus on how to determine this injection point given just a corresponding faulty keystream.
A straightforward approach for this uses the previous proposition: Given the differential keystream , one can check for all injection points whether is a zero of .
Clearly, a limitation of this method is that if the sets and are equal for two injection points and , the injection points cannot be distinguished. This hints at the more fundamental issue that, in general, there is no unique answer to our question, i.e., there are differential keystreams that can be produced from fault injections into distinct injection points.
Definition 3. Two injection points are calledequivalentif and only if for all .
If two fault injections are equivalent according to this definition, then all the input and output differences are based in the same way on the previous differences and, in particular, the fault . Thus for any two fault injection sequences with the same fault in those injection points, the differential keystreams must be equal. This shows that points in the same equivalence class cannot be distinguished using just the differential keystream. However, since our fault analysis is based on these sets, , it is also sufficient to just determine the equivalence class.
Under certain conditions of the stream cipher’s parameters, the number of time steps it takes for a fault to affect the keystream is determined uniquely by the injection point, independent of the values of the internal state and the injected fault. This allows us to attribute the injection point corresponding to a given differential keystream to one of just nine equivalence classes. In particular, this result is applicable for our attack on Enocoro-128v2 (and also for Enocoro-80).
Proposition 2. Assume that the parameters of the Enocoro stream cipher satisfy: - (a)
For every injection point there is a unique with:such that for every fault injection we have and .
Let be the map assigning each injection point to the index , as above.
- (b)
Let . Then the set of injection points contains at most nine equivalence classes.
Proof. For part (a) it suffices to show that it takes a fixed amount of time steps less than
for a fault in
to reach the keystream. If
, then a non-zero keystream difference can be found in
or
, respectively. Now we can assume that
R is part of the buffer, which we decompose as follows:
By definition of , the fault can only spread to the differential keystream via the registers , , , , and , . Thus we only need to show that a fault in spreads to one of these registers in a small enough fixed number of time steps. By assumption on the parameters of the Enocoro cipher , a fault in one of the above sets stays in the registers of that same set until it reaches exactly one of the registers in . Moreover, the precise number of steps until this happens is determined by , , , and respectively. If the fault is in , , or , then after just one additional clocking it is present in and thus affects the keystream of the same time stamp. In case the fault is in , two additional time steps are required. One for the fault to get to and another one to spread to . As we also know that the fault in must be non-zero, since no other input for the state update of can already be corrupted. The claim is an immediate consequence.
In order to prove (b), notice that the above shows that the set
is a subset of:
A straightforward analysis shows that the injection points in each of those nine sets are indeed equivalent. □
For this proposition the actual equations in are irrelevant, the result is derived only from the structure of the Enocoro stream ciphers. In combination with Proposition 1, this gives us a fairly good tool to find a small set of equivalence classes containing the class of the actual injection point.
Proposition 3 (Locating Injection Points)
. Assume that the parameters of the stream cipher under consideration satisfy: Let . Let be a fault injection and let be minimal with .
Then, belongs to one of the, at most, nine equivalence classes contained in: This proposition admits an efficient implementation: the equivalence classes in can be represented by less than nine elements, and in view of the proof of Proposition 2 we can immediately derive those from s. Now, it suffices to check for each one of these injection points , if the first differential keystream outputs are a zero of the (linear) generators of , as this ideal is equal for equivalent injection points. The parameter should only be chosen large enough that the ideals, , allow us to distinguish distinct equivalence classes as well as possible.
Remark 3. One might think that the larger τ is, the more linear polynomials one can find in . Although a larger τ surely increases the size of , the number of indeterminates that appear in this set increases as well. Close inspection of Algorithm 1 shows that the output contains between and linear equations in indeterminates. Thus, the system never has a unique solution. Furthermore, the more coordinates of the differential state are non-zero (in particular , , , and ), the fewer polynomials are added to L in line 6. Now, it suffices to notice that with each additional loop-iteration, more entries of are assigned to non-zero entries. This suggests that for some large enough the ideal could get stationary.
In the remainder of this section, we show the implications of the previous propositions for Enocoro-128v2. As indicated by the last remark, in practice, linear equations in the keystream indeterminates can only be derived from the first few non-zero differential keystream outputs.
Example 2. For Enocoro-128v2 we have that and this ideal is generated by linear polynomials. For other injection points, we also come to the conclusion that seems to get stationary already for .
The next remark collects some special properties we could observe when applying Proposition 3.
Remark 4. Consider the parameters of Enocoro-128v2, and let .
- (a)
From Proposition 3 (and its proof), we see that contains (at most) nine distinct equivalence classes with representatives , and respectively.
- (b)
For , one can check that the sets for these representatives form only a total of four distinct ideals; these are: Thus, our method cannot distinguish all different equivalence classes. In the best case, however, we can still learn to which of the four groups of equivalence classes , , , or the actual injection point belongs to.
- (c)
Our experiments indicate that we are always in that best case. This means that an injection point is attributed to exactly the correct set of equivalence classes. Or in other words, there seems to be no differential keystream that is a zero of more than one of the above four ideals. So, in practice, we can determine a set comprising of less than three injection points where one of them is equivalent to the real injection point.
This result is also summarized graphically in Figure 2. The injection points with the same base colour share the same ideal , and those that also have the same colour shading are indeed equivalent. - (d)
At last, we want to remark that in each of those four groups of equivalent injection points, the timings of the injection points are distinct. So if the time t of the injection point is known, we can directly infer to the register R, and by that, uniquely identify the injection point.
To put it differently: if an actual physical implementation of our fault model allows us to measure the precise time a fault was injected in an instance of Enocoro-128v2, then, in practice, the corresponding injection point can be found with little effort. Using this additional information, our fault attack becomes considerably more efficient, while requiring the same small number of fault injections.
Example 3. To illustrate this further, consider a fault injection for a given instance of Enocoro-128v2, let , and let . Then the first non-zero differential keystream output is .
- (a)
By Proposition 2, the injection points with can be partitioned into (at most) nine equivalence classes. These are given by:intersected with . After this intersection, five of those sets are empty. The remaining equivalence classes have representatives , , , and , respectively. - (b)
Now we check whether the first elements of are a zero of , , , and , respectively. As indicated in the previous remark, in practice it is only a zero of one of them, i.e., we are likely to observe that it is a zero only of .
This uniquely determines the equivalence class of the injection point, and indeed is equivalent to .
To finish this section, recall Remark 2, where we explained how to compute input and output differences from the set
and the differential keystream
, if we are given a fault injection
.
Table 2 gives an overview of the number of input and output differences that can be computed, as well as the register values in which those can be combined to differentials. Since these numbers only depend on the ideal
, it suffices to list one injection point for each equivalence class.
Remark 5. For an actual physical implementation of a fault injector satisfying our assumptions, the injection points that can be observed adhere to some probability distribution on . If this distribution is known (or can be estimated), one can assign, to every equivalence class, the probability that the injection point lies in it based on the elements that the equivalence class has in common with . This is the basis of a minor optimization in the fault attack where we perform an exhaustive search over the different injection points that come into question after using Proposition 3.
5. Combining Fault Injections
In this section we show how the information gain of several fault injections can be combined in a very structured and automated way. The core idea for this was first developed in [
19]. There, a connection was drawn between Horn clauses that resemble relations between the values of registers at different time steps, guess-and-determine (GD) attacks, and fault attacks. In this work, we extend this approach to not only consider
exact relations among the register values, but also containments of register values in
small sets that are derived from other register values.
For completeness, we first introduce the original approach, before we explain our adjustments.
Construction 1. Let be the number of time steps to consider. Introduce propositional logic variables where corresponds to the truth value of the statement ‘The value of register R at time t is known.’ Now relations among the values of the registers at consecutive times can be derived from the next-state-function . These relations are of the form:with and . This means that if the value of is known at time for all , so can be the value of register R at time t. In the logic setting this corresponds to the implication: In particular, such implications are Horn formulas, and we also denote them by the equivalent formula , or the clause .
We collect all Horn clauses that one can derive in this way from the first time steps using the definition of in the set .
The following remark gives a few more details on how this set can be constructed in the case of the Enocoro stream cipher.
Remark 6. For any Enocoro stream cipher every equation that is used to define the next-state-function admits as many clauses as distinct register values occur in them. As an example, consider which holds for all . By rewriting it to and we get two more equations. Altogether they correspond to the three Horn formulas: By careful counting, we see that this construction admits exactly clauses.
In the following we denote the set of logical variables appearing in a set of clauses
C by
, and for
and
we define
as the truth value of
F under the assignment
, where
and
, as usual. The set of
satisfying assignments of
C is denoted by:
Moreover, we denote the well-known
Marking Algorithm (see [
31]) of Horn logic by
. Recall that it takes a set of Horn formulas as input and returns the unique satisfiable assignment of minimal hamming weight, if it exists.
In [
19], [Corollary 6.9] a deterministic algorithm based on Gröbner basis computations was proposed to find minimal sets
, such that
. Such a set,
G, is called a
guess basis for
C. If
as in the above construction, then the following two statements are equivalent:
is a guess basis for
, and the internal state can be computed from
using the equations corresponding to the clauses in
.
This already indicates that every guess basis
G for
gives rise to a guess-and-determine (GD) attack with an attack complexity of at most
. Using mixed-integer linear programming solvers it was shown in [
11] that for Enocoro-128
v2 the smallest guess basis for
, as in Construction 1, consists of at most 18 logical variables. Thus, the best GD attack requires a guess of no more than 18 register values, which is just slightly more complex than guessing the 16 bytes of the key and IV. This result tells us that the fault injections should provide us with information on the internal state comparable to at least 18 register values in order to be able to retrieve the internal state.
The following remark motivates as to why looking at several fault injections independently is not likely to result in a fault attack with complexity below .
Remark 7. Using Table 2 we see that every fault injection admits the knowledge of up to five differentials for the register . Assume that we repeatedly inject faults where the fault injection point can be determined uniquely, and ultimately we are able to derive so many differentials such that the value of can be determined for m consecutive time steps. Note that this requires at least fault injections, as a single differential can never uniquely determine a registers value. To check whether we can compute the internal state from these register values we can switch to the logical setting: let and as in Construction 1. Moreover, let model the fact that the value of is known for m consecutive time steps.
A simple computation shows that for all . This means that even more than 200 fault injections are not sufficient to determine the internal state directly.
Nonetheless, it might be possible that guessing just some additional register values might make this feasible. As long as we only need to guess a small amount of values this is clearly acceptable. This corresponds exactly to finding an optimal guess basis for . Using the Gröbner basis method of [19], we computed those for where . All of these guess bases contained more than four elements, i.e., the internal state can only be determined if, additional to the fault injections, at least four bytes are guessed. This shows that a search space is still left, even after considering more than 14 fault injections.
The remark indicates that looking at just the differentials that can be deduced from multiple fault injections isolated from the others, is not ideal. Therefore we are in need of a more sophisticated approach that better captures the interplay of various fault injections.
In the following, we take a different perspective on the concept of a guess basis. Instead of looking at the register values that need to be known to be able to determine the internal state, we focus on how this is done, i.e., the order in which the internal state can actually be computed and which formulas (and functions) are used in every step. Moreover, instead of functions that resemble exact relations among the register values, we allow functions that omit a small set in which another registers content is contained. To cope with the fact that these sets may—on average—have different sizes, we accompany the corresponding Horn clauses with a weight resembling the average size of this set.
Using this refined concept, we capture the information we gain on the internal state using multiple fault injections better. This includes modelling all the input and output differences that occur during the propagation of a fault injection by logical variables such that we do not only utilize those which are deduced from the differential keystream right away.
In the next definition, we denote the set of negative literals of a clause F by , and the set of positive literals by .
Definition 4. Let be a set of weighted Horn clauses where is the weight of .
- (a)
A sequence in C is called aguessing path, if and for all .
- (b)
Theweightof a guessing path is defined as .
- (c)
A guessing path in C is calledoptimalif there is no other guessing path in C with .
Informally speaking, a guessing path represents the order in which the Marking Algorithm applied to can choose the clauses for the markings. Thus it is also clear that , if and only if there exists a guessing path in C. In fact, the notion of a guessing path can be seen as a generalization of the concept of a guess basis.
Remark 8. Let G be a guess basis for a set of Horn clauses C, and let be a set of weighted Horn clauses where all have weight 0, and all have weight 1. Then we have:
- (a)
If is a guessing path in , then is a guess basis for C. In particular we have .
- (b)
If is a guess basis for C, let be the order in which the clauses of are processed when computing . Then is a guessing path in with .
- (c)
If G is an optimal guess basis for C, then we have for any optimal guessing path in that .
With this improved notion we are able to adequately treat relations arising from differentials in register values , where we have . More precisely, we construct a set of weighted Horn clauses as follows.
Construction 2. Let be the number of time steps to be considered, and consider a set of fault injections. For every denote the differentials that appear during the fault propagation of the injection in the registers for every and by . Furthermore, denote the set of all input and output differences of this fault injection by .
Introduce a set of propositional logic variables:with the following correspondence: Similar to Construction 1, the next-state function and the fault injections admit relations of the form:with and . This means that if the value of is known at time for all , one can determine the value of register R at time t as an element of a set. We translate this to our logical setting as the Horn clause:with weight equal to the logarithm of the average size of the set the relation is based on. There are various ways to construct such a set of weighted Horn clauses and their associated functions for Enocoro stream ciphers. Clearly, one can adapt the functions that appear in Construction 1 such that they return a set containing exactly one element, and assign their respective clause the weight 0.
In view of Remark 7, it can also be useful to allow the guessing of entire register values. To model this, see Remark 8, we consider the fact clauses for each and of weight 8, whose corresponding functions take no input and return the set .
The clauses so far use no fault information. Now let us describe precisely how to construct clauses encoding the information gain by fault injections and their mutual interplay.
Remark 9. Consider the setting as in Construction 2. The desired weighted Horn clauses and their respective functions can also be found in the following ways:
- (a)
In Remark 2, we explained how certain input and output differences in can be computed from the differential keystream using the set . For each such difference , we can consider the corresponding logical variable as a fact clause of weight 0. The corresponding function takes no input and returns the set containing only the value of the respective difference.
- (b)
From every differential we get the following three relations:which give rise to three weighted Horn formulas:where the weight of the first formula can be deduced from Table 1. Additionally, for every subset we also have which translates to the Horn clause . The corresponding weights can again be deduced from Table 1 if . - (c)
The set also contains linear equations that relate the bits of the input and output differences in to one another. Each subset of those equations that relates all bits of one difference with the bits of other differences can then be translated to a Horn clause of weight 0. The set of all such relations can be computed using linear algebra techniques. However, we do not want to consider all those exponentially many clauses, since we need to quickly process them later on. Thus—for our experiments—we used the following strategy to find a set of such (linear) relations:
- (1)
Replace the keystream variables by the differential keystream in , and let L be a linear system generating the resulting ideal. Apply Gauss-Jordan elimination to L where we use the lexicographic ordering on the indeterminates of .
- (2)
For every construct the set where the eight indeterminates corresponding to d appear. Let be the set of differences whose corresponding indeterminates appear in the set .
- (3)
For each subset check whether the indeterminates corresponding to d are uniquely determined by if the differences are known. In that case we found a linear relation of the form . This corresponds to a Horn formula of weight 0.
From now on, let
be a set of weighted Horn clauses as described above. Notice that the last part of the remark ensures that there has to exist at least one guessing path. Given any such guessing path, Algorithm 2 can be used to determine the internal state of the stream cipher. (Notice the similarity to the GD attack framework in [
19].)
Proposition 4. Algorithm 2 is an algorithm that computes the inital state of the stream cipher.
Algorithm 2: Guessing Path Attack. |
|
Proof. By definition of a guessing path, we have . Thus, the set , in the m-th iteration of the for-loop, can be computed in line 4 as the values have already been chosen in a previous iteration. If the value is chosen as the value of the register (or input/output difference) corresponding to the logical variable in line 6, then, by construction of the set C and the functions , the value of is in . Thus, the algorithm terminates correctly. Since , by construction, lines 6 and 7 are performed less than times in a single iteration. By induction on n, we also see that every step of the algorithm is performed at most times. This shows that the algorithm terminates in finitely many steps. □
Now we can also see why the construction of the weights as the logarithms to base 2 of the average size of the expected size of the sets is indeed useful.
Remark 10. Let C be a set of weighted Horn clauses as in Construction 2, and let be a guessing path in C. Assume that the values corresponding to the variables in behave like independent random variables of a uniform distribution. Then the expected size of the search space considered by Algorithm 2, used with the guessing path , is bounded from above by .
Next, we compare the Guessing Path Attack to an approach using SAT solvers for combining partial information.
Remark 11. Algorithm 2 proceeds in a way which bears some similarity to the approach of a DPLL-based SAT-solver. There we deal with Boolean variables instead of registers values here, the role of unit propagation there is taken by the guessing path here, and conflicts arise as empty sets (in line 5) and incorrect keystreams (in line 14).
Therefore, it appears possible to mimic the Guessing Path Attack with a modern SAT-solver by using the given guessing path as a variable guidance strategy and expressing the functions corresponding to weighted Horn clauses as CNF-formulas. (Here each register would have to be represented by eight Boolean variables.) While such a SAT-based approach could possibly deliver good timings, since it can learn from incorrect intermediate assignments, it also requires a substantial amount of work to construct an appropriate CNF encoding. It is unclear whether one can appropriately adapt the variable selection routine of the chosen SAT solver. In contrast to this, our Algorithm 2 admits a straightforward recursive implementation and proves to be more than sufficient for the requirements of our attack (see Section 7). As a next step towards our fault attack, we need an efficient way to find guessing paths, since our attack must do this quite frequently. In general, finding
optimal guessing paths is a hard problem (in particular harder than computing minimal guess bases, see Remark 8 and [
19]). So, instead of finding an optimal one we restrain ourselves to finding paths of
small weight. For this we propose Algorithm 3 as a greedy heuristic, based on the Marking Algorithm.
Algorithm 3: Greedy Marking Algorithm. |
|
Clearly, this algorithm cannot find optimal guessing paths in general, and should also not be expected to construct good GD attacks all the time. However, in our experiments this simple heuristic proved to be a satisfactory choice which yielded acceptable results.
6. The Fault Attack on Enocoro Stream Ciphers
In this section we combine the algorithms and ideas of the previous two sections and present our fault attack on the Enocoro stream cipher. The attack is constructed in such a way that a new fault injection is only required if the algorithm does not expect to find a solution in a small enough search space. The algorithm takes two parameters as input: determines the maximal search space size that should actually be considered in the final step, and specifies the number of time steps that are taken into account for the internal state retrieval.
Proposition 5. Algorithm 4 is a Las-Vegas algorithm that computes the initial state , i.e., it may not terminate but when it terminates it computes .
Proof. If the algorithm terminates, then it must stop in line 11. Now the output is correct by virtue of Algorithm 2. □
In every iteration of line 2, our fault attack consists of up to two stages of
exhaustive search: the first is to determine equivalent injection points for the
m fault injections (line 6); the second one is outsourced to Algorithm 2 in line 10, but is only performed if it is reasonable to expect a small enough search space by virtue of line 9 (see Remark 10). Notice also that, by Proposition 2, the set
may contain more than 9 elements in every iteration. Therefore, in the worst case, line 6 is executed more than
times. Because of that, our implementation uses (large) lookup-tables for line 7. This is also the reason why we do not recommend to search for optimal guessing paths in line 8, as it would just cost too much time. Instead, Algorithm 3 is a good enough choice. Note also that this heuristic is based on the Marking Algorithm which is known to be a linear-time algorithm [
31]. Thus, it should be investigated if Algorithm 3 can be implemented with linear time complexity as well.
Algorithm 4: Enocoro Fault Attack Framework. |
|
As a minor optimization, the order in which the elements are processed in line 6 can be chosen corresponding to the probabilities that the fault injections lie in the respective equivalence classes, as indicated already in Remark 5. This has no effect on the overall number of fault injections that are required for a successful attack, but ensures that the algorithm terminates faster on average.
7. Experiments and Timings
In this section we present experimental results of our fault attack (Algorithm 4) applied to Enocoro-128
v2 using an implementation in the computer algebra system
CoCoA-5 [
28]. This system has a Python-like language and offers many specialized built-in functions, such as fast Gröbner basis calculations. Moreover, we were able to build upon the implementations for [
19], shortening the implementation time and simplifying the code.
The set T that specifies the temporal resolution of the actual injections was chosen as . The results are based on a total of 500 simulations each, and the computations were carried out on a machine with an Intel Xeon E5-2623 v3 (3.00GHz) processor and 128 GB of RAM.
To aid the selection of the parameters
and
, we first consider a simplified version of the attack where the set
, in line 6, contains only a single element in every iteration. This corresponds to a fault attack with a fault model that also allows the attacker to retrieve the exact injection point. (In light of Remark 4.d this is, in practice, also equivalent to knowing just the time of the injection.) Observe that the number of fault injections required for a successful retrieval of the initial state and the weight of the corresponding guessing paths is the same as with our fault attack using the weaker fault model of
Section 3. Thus,
bad parameters for this simplified version are also
bad ones for the full version.
From Proposition 3 we know that
should be chosen larger than 23. Here we consider
and
.
Table 3 presents the timing of the simplified fault attack for these parameters. There one can see the weight of the final guessing path
of line 10 along with the average
of the exponential size of the search space that actually had to be considered. Furthermore, we provide the cumulative time of the repeated applications of our heuristic (Algorithm 3) used in line 8, the overall running time, and the average number of faults that had to be injected.
For larger parameters we did some preliminary experiments and can report that the variance of the timings increased significantly. Already for and , the timings ranged between a few seconds up to over 10 h. Since we target a fault attack that works well in the clear majority of cases, we did no further investigations. Nonetheless, the actual size of the last search space exceeded only rarely, i.e., Remark 10 seems to be applicable in practice.
As for smaller parameters, we also ran experiments with and . On average, such an attack requires fault injections and takes no more than 90 s to solve for the internal state. This shows that, with the simplified version of our attack corresponding to the stronger fault model, the internal state can be retrieved in negligible time with only about 10 fault injections. (In light of Remark 7 this shows that our structured formalization of the information gain of several fault injections and their interplay pays off very well.)
Looking at the table, we see that the differences between , , and can mainly be described as a significant decrease in the timing and a small (negligible) increase in the number of required fault injections. Only when we consider , the trade-off between timing and number of faults becomes noticeable. Since the complexity of our final fault attack is expected to be exponential in the number of required faults, we consider the trade-off here as acceptable, and favour a slightly larger . This motivates why we only consider for the evaluation of our fault attack.
In
Table 4 the results of 500 simulations of our fault attack on Enocoro-128
v2 (based on the fault model of
Section 3) are shown. As above, we give information on the total time spent on finding light-weighted guessing paths with Algorithm 3, the average weight of the last iteration of line 6, the logarithm of the average search space size that had to be considered, and the average number of faults that needed to be injected.
Just as one could expect, the variance of the timings increased significantly, compared to the simplified attack. Nonetheless, in of all cases the attack finished within h of computation. Moreover, we see that the attack is faster if we choose the larger value of . At first this may seem counter-intuitive, as this parameter bounds the search space size of the inner exhaustive search of the attack. However, the choice of also affects the number of fault injections that are required for the total attack. Consequently, the outer exhaustive search is enlarged. Hence, in view of the fault model, we can see that fewer fault injections actually benefit the timings. Choosing and , we end up with a fault attack that is able to retrieve the internal state, and thereby break the cipher, in about 20 min using less than five fault injections on average.
To conclude, we have presented a fault attack framework for the family of Enocoro stream ciphers that is based on a rather weak fault model, and successfully applied it to the standardized cipher Enocoro-128v2. This clearly shows that appropriate countermeasures should be developed and installed. One suggestion that we would like to offer is to insert a non-linear filter near the keystream output into the definition of the cipher.