1. Introduction
Throughout human history, information has been a valuable resource. Useful information was concealed from competitors. For many years, confidentiality was at the heart of information confrontation, but as warfare developed, the attitude towards information itself changed. In certain situations, a violation of the availability or integrity of information can cause much more harm than its disclosure. A natural question arose: why spend all resources on ensuring confidentiality if the attacker’s goal is solely sabotage? Determining the goals of an attacker is a separate issue, but one cannot focus on only one of the three basic aspects of information security.
Modern society has reached the stage when the need to ensure a constant and uninterrupted connection to the information field to perform everyday and/or work tasks becomes more important than privacy requirements.
The widespread introduction of information technology has also affected document management within and between organizations. Electronic document management, which makes it possible to abandon paper media, is becoming increasingly important in this area. The advantages of this approach are obvious: lower costs of processing and storing documents, and fast search. In the era of the “information boom”, this approach is the only way out of the predicament caused by the growing volume of processed information.
However, the transition from paper to electronic documents creates several problems related to ensuring the integrity of a document and authenticating its author.
Both the sender and the recipient of an electronic message need assurance that the message has not been altered during transmission. Workflow technologies must be implemented in such a way that an attacker cannot deliberately distort the transmitted document. If the document has been distorted, its recipient should be able to recognize this fact. The problem of authenticating the author of a message is to ensure that no subject can sign a message under any name but their own. If a subject signs under a false name, the recipient should again be able to recognize this fact [1].
In a conventional paper workflow, these problems are solved because the information in the document and the author’s handwritten signature are rigidly associated with the physical medium (paper). In this case, the elements that ensure the integrity of transmitted messages and the authenticity of authorship are handwritten signatures, seals, watermarks on paper, holograms, etc. For electronic document management, there is no rigid connection of information with a physical medium, and therefore, the development of other approaches is required to solve the problems listed above. It follows from this that the models of threats to confidentiality and integrity/availability have different justifications, which means that the solutions used to protect information depend on the aspect of information security [2,3].
Before forming a protection system and determining the mechanisms of its work, it is necessary to determine the list of threats. Determination of threats is one of the key stages in the formation of an information security system. At this stage, two points need to be made.
The model of information threats depends on the aspect under study: integrity threats are significantly different from confidentiality threats, while availability threats are a subset of the set of integrity threats.
From the point of view of electronic document management, the transmission channel is just as much a carrier of information as any other: we protect information, but electronic information is essentially an abstract object that is not affected directly, while it is the carrier that takes the whole “blow”.
In this paper, we propose a new way to solve the problem—building a model of threats to integrity and availability, considering information transmission channels.
2. Background and Related Work
In connection with the development of technology and the rapid increase in the number of types of information transmission channels, the problem of accounting for these channels is becoming increasingly critical. In the context of this work, we will introduce the concept of an elementary information flow, which will symbolize a separate data transmission channel. A scheme consisting of such flows will be able to describe an information system in terms of the information circulating in it. Consider how this problem is viewed in various sources. While we will not consider how these threats are defined, we are interested in the attitude towards them, their typification, formulations, and applicability to the elements of the system.
First, we consider those types of information systems in which the channels of information transmission, rather than the elements that process it, play the decisive role. Such systems include, for example, cyber-physical systems (CPS), telemedicine systems, SCADA, IoT, and software development systems. We will not dwell on each type but give a general overview of ideas on this problem.
Information flow is a fundamental concept underlying the security of a system, and the confidentiality of information in a system can be breached through unrestricted information flow [4]. At the same time, access control and information flow-based policies for CPS security should be analyzed [5].
Along with information flow models, data flow diagrams analyzed with the STRIDE or DREAD methodology are often proposed [6,7,8,9,10,11,12,13,14]. More specifically, the authors of [15] report that the network is an important part of the system, along with clients and servers. Some authors ignore threats directed specifically at the flow, although they use this term [16], and others ignore this topic completely [17].
The works [18,19] describe similar solutions that share one drawback: the models consider threats directed at the channel, but the channel itself lacks a sufficiently complete description of its characteristics, which casts doubt on the completeness of the resulting list of threats.
Separately, we singled out IoT systems, where the information transmission channel is an important working element [20]. For such systems, identification, assessment, and mitigation of risk will be more difficult and complex for cloud computing, mobile device toting, and consumerized enterprises [21].
Having studied all the mentioned works, one can notice that research on the problem considered here has been going on for more than ten years, and the disputes in the scientific community on this issue do not subside. First, this is due to the heterogeneity of the information systems themselves. However, even if we set aside the above set of the most popular types of systems, one can find many publications that mention information flow models directly [22,23,24] or indirectly through the network [25,26,27,28,29,30,31,32,33]. The author of this work does not agree with the latter approach: not every information flow is implemented by a network, but all networks implement flows. A network connection is only a particular kind of information flow. The concept of flow is much broader and covers all possible channels of information transmission.
An interesting solution to the problem is the use of Hidden Markov Chains [34]; however, this approach is better suited to identifying attacks than threats. An equally interesting option is the use of Petri nets [35]. However, in the context of the current study they are not applicable: Petri nets describe the process of information transfer, or rather the very fact of information transfer from one vertex to the channel and further, but do not allow the information transfer channel to be described separately. To solve the current problem, a higher level of abstraction is needed, one that can describe a larger number of possible system states depending on the location of the information being processed.
Speaking about the practical applicability of what is being developed, we should mention [36]. This article argues for the unconditional need to apply machine learning methods to the analysis of risks and threats to information security. The model of information flows proposed by the author of the current article implies a high level of abstraction with the ability to control the depth of the system description. As the depth of the system description increases, the number of elements in the scheme of information flows grows in direct proportion. Depending on the number of elements in the information system, the scheme may grow to a scale that cannot be processed by a person. Machine learning methods will come to the rescue here, but this can be discussed in more detail only when the models being developed go beyond theory and find practical application.
Based on the results of the review part, it can be noted that DFD (Data Flow Diagram) is the most popular way to solve the indicated problem. However, this approach has two key drawbacks:
- the model has two separate notations for constructing schemes of internal and external interaction;
- the model does not describe the channels of information transmission and the resulting information flows.
The author agrees that the use of DFD allows us to fully describe the information system; however, the subsequent use of STRIDE takes us in a somewhat different direction, since STRIDE yields a list of possible attacks, not threats. In his works, the author considers threats to be primary in relation to attacks: each threat can be implemented by many attacks. It is necessary to adhere to the principle of “from smallest to largest”. Comprehensive measures to combat threats are preventive in nature, and covering a threat provides protection against a large class of attacks. Therefore, the formation of a threat model is of paramount importance.
3. Information Flow Model
The threat model proposed in this paper is based on the information flow model. This model implies a description of the system using graph theory. Each information transmission channel in the system is represented as an elementary information flow, which includes three elements: a source, an information transmission channel, and a receiver. The elementary information flow is symmetrical and bidirectional, which means that in general there is no division of vertices into sources and receivers in the diagram. We use the following notation: V is a set of information carriers (a set of graph vertices), and E is a set of information transmission channels (a set of graph edges). By associating any two elements from V with one element from E, we get an elementary information flow in the form of an undirected graph with two vertices (Figure 1) [37].
Using the notation of graph theory, we describe the above information flow:
$$g = \{v_i, e_z, v_j\},$$
where $v_i$, $v_j$ are information storages and $e_z$ is the information transmission channel.
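A minimal Python sketch of how an elementary information flow might be represented is given here. The class name `ElementaryFlow` and its field names are illustrative assumptions, not notation from the model; the only property taken from the text is that the flow is undirected, so swapping the two carriers yields the same flow. The example values are elements of the FTP scenario discussed later in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True, eq=False)
class ElementaryFlow:
    """Hypothetical container: two information carriers joined by one channel."""

    carrier_a: str  # v_i, an information carrier (graph vertex)
    channel: str    # e_z, an information transmission channel (graph edge)
    carrier_b: str  # v_j, the other information carrier (graph vertex)

    def key(self):
        # The flow is undirected, so the order of the two carriers is ignored.
        return (frozenset((self.carrier_a, self.carrier_b)), self.channel)

    def __eq__(self, other):
        return isinstance(other, ElementaryFlow) and self.key() == other.key()

    def __hash__(self):
        return hash(self.key())


forward = ElementaryFlow("first user", "first PC's I/O", "first PC's FTP client")
backward = ElementaryFlow("first PC's FTP client", "first PC's I/O", "first user")
print(forward == backward)  # True
```

Comparing flows through an order-independent key is one possible design choice; it lets a set of flows act as the edge list of an undirected multigraph without storing each connection twice.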
Considering the specifics of the study, namely the work with electronic information resources of the organization, certain sets were compiled.
The set of information carriers was divided into three subsets and took the form:
$$V = \{V_1, V_2, V_3\},$$
where
$V_1$ is the users set;
$V_2$ is the software tools set;
$V_3$ is the electronic resources set.
The set of channels has been refined to the following form:
$$E = \{E_1, E_2, E_3, E_4\},$$
where
$E_1$ is the set of transmission channels in the electromagnetic environment;
$E_2$ is the set of transmission channels in a virtual environment;
$E_3$ is the set of remote transmission channels in the electromagnetic environment;
$E_4$ is the set of remote transmission channels in a virtual environment.
Now we need to get the full picture. Having an extended set V and a specified set E, it is possible to construct a set of all elementary information flows G. To do this, it is necessary to indicate some restrictions:
- an element of the set V1 cannot refer to another element of this set;
- an element of the set V3 cannot refer to another element of this set;
- an element of the set V1 cannot directly access an element of the set V3 and vice versa;
- remote information transmission channels are available only when an element of the set V2 is connected to an element of the same set.
Considering all the above, the set of all elementary flows will have the following form:
$$G = \{g_1, g_2, g_3, g_4, g_5, g_6, g_7, g_8\},$$
where
$g_1 = \{V_1, E_1, V_2\}$;
$g_2 = \{V_1, E_2, V_2\}$;
$g_3 = \{V_2, E_1, V_2\}$;
$g_4 = \{V_2, E_2, V_2\}$;
$g_5 = \{V_2, E_3, V_2\}$;
$g_6 = \{V_2, E_4, V_2\}$;
$g_7 = \{V_2, E_1, V_3\}$;
$g_8 = \{V_2, E_2, V_3\}$.
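The four restrictions can be checked mechanically. The following Python sketch, in which the function names and string labels are assumptions made for illustration, enumerates every unordered pair of vertex classes together with every channel class and keeps only the combinations that the restrictions allow.

```python
from itertools import combinations_with_replacement

VERTEX_CLASSES = ("V1", "V2", "V3")  # users, software tools, electronic resources
LOCAL_CHANNELS = ("E1", "E2")        # electromagnetic and virtual
REMOTE_CHANNELS = ("E3", "E4")       # remote electromagnetic and remote virtual


def allowed_channels(a, b):
    """Return the channel classes permitted between vertex classes a and b."""
    if a == b and a in ("V1", "V3"):
        return ()  # V1-V1 and V3-V3 connections are forbidden
    if {a, b} == {"V1", "V3"}:
        return ()  # users cannot access resources directly
    if a == b == "V2":
        return LOCAL_CHANNELS + REMOTE_CHANNELS  # remote channels only for V2-V2
    return LOCAL_CHANNELS


def elementary_flow_types():
    """Enumerate all admissible (vertex class, channel class, vertex class) triples."""
    flows = []
    for a, b in combinations_with_replacement(VERTEX_CLASSES, 2):
        for channel in allowed_channels(a, b):
            flows.append((a, channel, b))
    return flows


flows = elementary_flow_types()
for number, flow in enumerate(flows, start=1):
    print(f"g{number} = {flow}")
print(len(flows))  # 8
```

Under these assumptions the enumeration yields exactly eight flow types, in the same order as the eight triples of the set G.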
The result of combining all the above graphs will be an undirected multigraph (Figure 2) [37], which serves as a model of information flows when accessing electronic information resources. It should be noted that the connections between each pair of vertices are symmetrical and bidirectional. When determining each individual elementary information flow, the direction of information movement in it does not matter, because we are interested only in establishing a connection and transmitting information.
The developed model allows us to build a scheme of information flows, which, in addition to allowed flows, will include all possible manifestations of prohibited ones. We applied the information flow model to describe the process of exchanging electronic documents between two users via an FTP server (the general scheme of the described process is shown in Figure 3).
General description of the document forwarding process:
- the first user creates an electronic document on the FTP server;
- in our case, the FTP server does not process the document, but only stores it;
- the second user accesses the electronic document on the FTP server.
To make a complete list of information flows, we added a few more explanations:
- users interact with the PC using the operating system;
- users interact with the FTP server using standard OS tools without specialized software; for convenience, we combine the local FTP client and the operating system into one object;
- we assume that the user interacts with the PC through the PC’s software and the PC’s I/O, without specifying them further;
- we assume that the remote virtual channel for computer interaction is the TCP/IP protocol family and the remote electromagnetic channel is the Ethernet family of technologies. In turn, the local communication channels inside the server are the server’s software and the server’s hardware, respectively.
Given all the above, the list of information flows will look like this:
First User—First PC’s FTP client;
First PC’s FTP client—Server’s FTP client;
Server’s FTP client—Server’s data storage;
Server’s data storage—Server’s FTP client;
Server’s FTP client—Second PC’s FTP client;
Second PC’s FTP client—Second User.
Now let us build a complete list of elementary information flows from this list of information flows. Each flow is divided into two elementary ones, since the model implies the division of the data transmission channel into electromagnetic and virtual. In addition, we introduce designations for all participants in the process according to the model of information flows.
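Users set:
$$V_1 = \{v_{1,1}, v_{1,2}\},$$
where $v_{1,1}$ is the first user and $v_{1,2}$ is the second user.
Software set:
$$V_2 = \{v_{2,1}, v_{2,2}, v_{2,3}\},$$
where $v_{2,1}$ is the first PC’s FTP client, $v_{2,2}$ is the second PC’s FTP client, and $v_{2,3}$ is the server’s FTP client.
Storages of information:
$$V_3 = \{v_{3,1}\},$$
where $v_{3,1}$ is the server’s data storage.
Set of local transmission channels in the electromagnetic environment:
$$E_1 = \{e_{1,1}, e_{1,2}, e_{1,3}\},$$
where $e_{1,1}$ is the first PC’s I/O, $e_{1,2}$ is the server’s hardware, and $e_{1,3}$ is the second PC’s I/O.
Set of local transmission channels in a virtual environment:
$$E_2 = \{e_{2,1}, e_{2,2}, e_{2,3}\},$$
where $e_{2,1}$ is the first PC’s software, $e_{2,2}$ is the server’s software, and $e_{2,3}$ is the second PC’s software.
Set of remote transmission channels in the electromagnetic environment:
$$E_3 = \{e_{3,1}\},$$
where $e_{3,1}$ is Ethernet.
Set of remote transmission channels in a virtual environment:
$$E_4 = \{e_{4,1}\},$$
where $e_{4,1}$ is TCP/IP.
The final set of elementary information flows will have the following form:
$$S = \{s_1, s_2, \ldots, s_{12}\},$$
where
$s_1 = (v_{1,1}, e_{1,1}, v_{2,1})$;
$s_2 = (v_{1,1}, e_{2,1}, v_{2,1})$;
$s_3 = (v_{2,1}, e_{3,1}, v_{2,3})$;
$s_4 = (v_{2,1}, e_{4,1}, v_{2,3})$;
$s_5 = (v_{2,3}, e_{1,2}, v_{3,1})$;
$s_6 = (v_{2,3}, e_{2,2}, v_{3,1})$;
$s_7 = (v_{3,1}, e_{1,2}, v_{2,3})$;
$s_8 = (v_{3,1}, e_{2,2}, v_{2,3})$;
$s_9 = (v_{2,3}, e_{3,1}, v_{2,2})$;
$s_{10} = (v_{2,3}, e_{4,1}, v_{2,2})$;
$s_{11} = (v_{2,2}, e_{1,3}, v_{1,2})$;
$s_{12} = (v_{2,2}, e_{2,3}, v_{1,2})$.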
As can be seen, the entire process of information transfer can be described by a set of elementary information flows, from which a complete scheme of information flows can be constructed.
Thus, using the example of exchanging documents via FTP, we illustrated the application of the information flow model. This analysis shows that the model allows us to break any information transfer process into a finite set of elementary information flows, and the only difficulty lies in correctly describing the sets of system elements. The more fully and accurately the sets of elements are described, the more detailed the flow diagram will be.
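The splitting of the six information flows of the FTP example into elementary flows can be illustrated with a short Python sketch. The dictionary keys (`first_pc`, `network`, `server`, `second_pc`), the function name, and the convention of listing the electromagnetic channel before the virtual one are assumptions introduced here; the flows and channel names are those given in the text.

```python
# Channel pairs (electromagnetic, virtual) for each kind of link.
# The keys are hypothetical labels used only in this sketch.
CHANNELS = {
    "first_pc": ("First PC's I/O", "First PC's software"),
    "network": ("Ethernet", "TCP/IP"),
    "server": ("Server's hardware", "Server's software"),
    "second_pc": ("Second PC's I/O", "Second PC's software"),
}

# The six information flows of the FTP example: (source, receiver, link kind).
FLOWS = [
    ("First User", "First PC's FTP client", "first_pc"),
    ("First PC's FTP client", "Server's FTP client", "network"),
    ("Server's FTP client", "Server's data storage", "server"),
    ("Server's data storage", "Server's FTP client", "server"),
    ("Server's FTP client", "Second PC's FTP client", "network"),
    ("Second PC's FTP client", "Second User", "second_pc"),
]


def split_into_elementary(flows, channels):
    """Split each flow into one elementary flow per channel of its link."""
    elementary = []
    for source, receiver, link in flows:
        for channel in channels[link]:
            elementary.append((source, channel, receiver))
    return elementary


elementary_flows = split_into_elementary(FLOWS, CHANNELS)
for number, flow in enumerate(elementary_flows, start=1):
    print(f"s{number} = {flow}")
print(len(elementary_flows))  # 12
```

Each of the six flows contributes one electromagnetic and one virtual elementary flow, which gives the twelve elementary flows of the example.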
Keep in mind that FTP is a deliberately simplified example. We chose this process so as not to clutter the article with huge sets. In practice, the scheme of information flows will contain a number of connections beyond the limit of human processing. The scheme of information flows implies a description of all possible connections between the elements of the system. In the most general case, we can use Formula (13) to calculate the total number of connections:
$$N = \frac{n(n-1)}{2}, \tag{13}$$
where $N$ is the total number of connections and $n$ is the number of elements.
For n = 2 we get N = 1, and for n = 100 we have 4950 connections. However, these calculations are valid only if every element really has a connection with every other element. In practice, the scheme is limited to particular cases of element interaction. Modifying the information system by changing the number of elements leads to a complete recalculation of the scheme of information flows. In any case, at this stage, the work is undergoing theoretical discussion and examination. The practical application of this approach requires a software solution, most likely using big data technologies and possibly machine learning methods. This activity is intended as a further development of the theory proposed by the author.
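The two values quoted above can be reproduced with a short Python sketch. It assumes that the formula counts unordered pairs of elements, N = n(n − 1)/2, which agrees with both quoted values; the function name `max_connections` is hypothetical.

```python
from math import comb


def max_connections(n: int) -> int:
    """Number of unordered pairs among n elements (fully connected case)."""
    if n < 0:
        raise ValueError("the number of elements must be non-negative")
    return n * (n - 1) // 2


for n in (2, 100):
    # math.comb(n, 2) counts the same unordered pairs and serves as a cross-check.
    print(n, max_connections(n), comb(n, 2))
```

The count grows quadratically with the number of elements, which is why a fully connected scheme quickly exceeds what a person can review by hand.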
4. Model of Threats to the Integrity and Availability of Information
The problem addressed by this study is that, today, the available models of threats to information security are largely a matter of convention. There is no single principle for constructing a threat model. There are several approaches, and all of them have fundamental shortcomings, namely: the lack of a clear concept of a “threat model”, striking differences in the structures and operating principles of the models and in the methods of applying them, redundancy caused by merging the threat model with the intruder model, and much more.
The presence of these and some other gaps in existing approaches negatively affects the efficiency of the expert’s work with the model itself and the result, due to the lack of standardized final assessments of one threat model relative to another. Therefore, the objective of this study is to create our own model of information threats.
The principle of building a threat model is based on the developed model of information flows, namely on the concept of an elementary information flow. Let us again turn to the definition of an elementary information flow, which is described by the formula:
$$g = \{v_i, e_z, v_j\},$$
where $v_i$, $v_j$ are possible information storages and $e_z$ is a possible communication channel.
In this model, the information transmission channel is not an abstract object but a real element of the system with its own properties. It follows that it can be accessed in the same way as the other two elements of the flow.
Unauthorized access to information is access to protected information in violation of established rights and (or) access rules, leading to leakage, distortion, forgery, destruction, or blocking of access to information, as well as to loss, destruction, or failure of the information carrier (including the information transmission channel).
The very definition of unauthorized access implies the appearance in the system of a new element that will carry out this very access. Using the notation indicated earlier, this situation can be depicted as follows (Figure 4).
A similar situation is possible for any element of the information flow. By analogy with the situation described above (Figure 2), access can be made both to an element of the set Vj and to Ez.
Interaction with the elements of an elementary information flow leads to integrity and availability threats, whereas interaction with the information circulating in this flow leads to confidentiality threats. Not all authors pay attention to this circumstance. In most cases, they discuss the security state of the information flow without classifying possible impacts and consequences, although such a classification is necessary because the impacts differ in origin [38,39,40].
Three possible connections of a foreign element Vj*→Vi, Vj*→Vj, Vj*→Ez describe situations in which there is a direct impact on one of the elements of the information flow, which can lead to distortion of information or its destruction.
From the foregoing, it follows that any of the three types of unauthorized influence can be exerted on any of the elements of an elementary information flow, and therefore on information:
- destruction;
- distortion;
- substitution.
Let us again turn to the concept of an elementary information flow and analyze the relationship between the types of influence on the elements of the flow with the classical aspects of information security: integrity and availability.
Applied to the vertices of the flow:
- destruction of information at one of the vertices leads to a violation of the integrity of information;
- distortion of information at one of the vertices leads to a violation of the integrity of information;
- substitution of information at one of the vertices leads to a violation of the integrity of information.
Applied to the information transmission channel:
- destruction of information in the channel leads to a violation of availability;
- distortion of information in the channel leads to a violation of integrity;
- substitution of information in the channel leads to a violation of availability.
Total: four threats to integrity and two to availability. Note that the information flow has two symmetrical vertices, and either of them can be affected, so the number of integrity threats directed at the vertices doubles, and the total number of integrity threats becomes seven. Thus, having analyzed all possible types of impact on the information flow, we can build a complete set of typical threats to information integrity and availability (Table 1).
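The counting argument can be retraced with a small Python sketch. The dictionaries encode the impact-to-aspect mapping exactly as listed in the text; the names `VERTEX_EFFECT`, `CHANNEL_EFFECT`, and `count_threats` are assumptions made for illustration.

```python
IMPACTS = ("destruction", "distortion", "substitution")

# Security aspect violated by each impact on a vertex of the flow.
VERTEX_EFFECT = {impact: "integrity" for impact in IMPACTS}

# Security aspect violated by each impact on the transmission channel.
CHANNEL_EFFECT = {
    "destruction": "availability",
    "distortion": "integrity",
    "substitution": "availability",
}


def count_threats(vertices_per_flow=2):
    """Count typical threats per aspect for one elementary information flow."""
    counts = {"integrity": 0, "availability": 0}
    for impact in IMPACTS:
        # Each vertex of the flow can be the target of the impact.
        counts[VERTEX_EFFECT[impact]] += vertices_per_flow
        # The single channel of the flow is targeted once.
        counts[CHANNEL_EFFECT[impact]] += 1
    return counts


print(count_threats(1))  # {'integrity': 4, 'availability': 2}
print(count_threats(2))  # {'integrity': 7, 'availability': 2}
```

Counting one vertex gives the intermediate total of four integrity and two availability threats; counting both symmetrical vertices gives seven integrity threats and nine typical threats in all.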
Set of integrity threats:
$$C = \{c_1, c_2, c_3, c_4, c_5, c_6, c_7\},$$
where
$c_1$: substitution of the source $V_i$ (transmission of distorted information to the element $V_j$);
$c_2$: substitution of the source $V_j$ (transmission of distorted information to the element $V_i$);
$c_3$: substitution of the source $V_i$ (destruction of information in the element $V_j$);
$c_4$: substitution of the source $V_j$ (destruction of information in the element $V_i$);
$c_5$: substitution of the source $V_i$ (substitution of information in the element $V_j$);
$c_6$: substitution of the source $V_j$ (substitution of information in the element $V_i$);
$c_7$: impact on information during transmission over the $E_z$ channel (distortion of information in the channel).
Set of availability threats:
$$D = \{d_1, d_2\},$$
where
$d_1$: inoperability of the $E_z$ channel, that is, overload, destruction, or inability to establish communication with the information carrier (complete lack of access to information by an authorized person);
$d_2$: “noisy” channel $E_z$, that is, interference (partial access to information by an authorized person).
Let us go back to the example presented earlier and apply typical threats to it.
Consider the first flow s1, which consists of the first user, the first PC’s I/O, and the first PC’s FTP client.
Let us apply each of the nine threats to this flow. Recall that the connecting channel in the flow is symmetrical and, accordingly, bidirectional.
When the c1 threat is realized, the user is replaced by an unauthorized user, who can then introduce distortions into the information stored in the FTP client. The threat can be realized when the computer is used by a third party. An unauthorized user can, on behalf of an authorized user, upload a document with modified information.
When the c2 threat is realized, the FTP client is replaced by unauthorized software, as a result of which the authorized user can receive distorted information. An example would be installing an application from an unverified source.
When the c3 threat is realized, the user is replaced by an unauthorized user, who can then destroy the information stored in the FTP client. The threat can be realized when the computer is used by a third party. An unauthorized user using an FTP client can delete important documents.
When the c4 threat is realized, the FTP client is replaced by unauthorized software, as a result of which the information with which the user directly interacts is destroyed. An example would be installing an application from an unverified source.
When the c5 threat is realized, the user is replaced by an unauthorized user, who can then replace the information stored in the FTP client. The threat can be realized when the computer is used by a third party. An unauthorized user can, on behalf of an authorized user, upload a document with completely changed information to the server.
When the c6 threat is realized, the FTP client is replaced by unauthorized software, as a result of which the authorized user can receive completely incorrect information. An example would be installing an application from an unverified source.
When the c7 threat is realized, the information in the communication channel is affected. In this case, the communication channel is the I/O device. An example is a hardware implant that distorts the output of information on the screen, for example, by changing the displayed color.
When the d1 threat is realized, the communication channel is affected, as a result of which the authorized user cannot gain access to the information. Taking an information output device as an example, its complete inoperability is a realization of this threat. The information is not compromised, but the user cannot access it.
When the d2 threat is realized, the communication channel is affected, as a result of which the authorized user cannot access the information in full. Returning to the same output device, an example of this threat is its partial inoperability caused by hardware implants.
Similar examples of threat realization can be given for any other flow and its elements; however, we do not present them here, since this process is monotonous and would not better reflect the essence of the threat model.
Let us return to the set of elementary information flows and the sets of threats to integrity and availability. Knowing that both these sets are finite, we can apply each of the threats to each flow, i.e., compare each element of the sets C and D with each element of the set G and get a new set that will consist of all combinations of threats and flows, i.e., be their Cartesian product.
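The Cartesian product described here can be sketched in a few lines of Python. The string labels for threats and flows follow the notation of the text; the variable names are assumptions made for illustration.

```python
from itertools import product

INTEGRITY_THREATS = [f"c{i}" for i in range(1, 8)]  # c1..c7
AVAILABILITY_THREATS = ["d1", "d2"]
FLOW_TYPES = [f"g{i}" for i in range(1, 9)]         # g1..g8

# Every typical threat applied to every elementary flow type.
typical_threats = list(
    product(INTEGRITY_THREATS + AVAILABILITY_THREATS, FLOW_TYPES)
)

# Group the pairs by flow, one group per flow type.
by_flow = {flow: [t for t, f in typical_threats if f == flow] for flow in FLOW_TYPES}

print(len(typical_threats))  # 72
print(by_flow["g1"])
```

Nine typical threats applied to eight flow types give 72 combinations, nine per flow type.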
Now we classify and briefly describe the identified typical threats. For convenience and readability, the set of typical threats was divided and grouped according to the information flows from the set G to which they belong. Tables 2–9 present the grouping and characteristics of the analyzed typical threats.
Thus, a list of 72 typical threats to the integrity and availability of information processed in a computer system was compiled.
It is necessary to clarify once again that this list is not a complete list of threats:
- firstly, within the framework of this study, only threats to integrity and availability are considered;
- secondly, given that technologies are developing at an accelerating pace, we cannot accurately predict which I/O, storage, or transmission devices will exist in a few years, let alone determine the full list of threats to information that will be processed using devices that do not yet exist. This problem is well covered in [41], a review article that shows how the use of various information transfer technologies has changed with the development of information systems. In any case, all the mentioned technologies rely on an existing element base. Therefore, at the abstract level, they can be reduced to the same sets that are indicated in the information flow model.
With all this, we can say with confidence that the set of typical threats will remain unchanged, since the apparatus underlying the threat model has a high degree of abstraction and is based on graph theory rather than on real-world objects. Within the framework of the model, any device is represented as an information transmission channel, regardless of its implementation. The specialist is only required to “not forget” about this channel (device) when describing the entire system. The introduced abstraction allows us to describe the system down to the minimum level of element interaction. The specialist determines the depth of the system description independently, depending on feasibility and requirements.