1. Introduction
In recent years, with the significant increase in computational power and communication technologies, a diverse range of network devices and technological systems has emerged quickly, exposing a wider range of exploitable vulnerabilities [
1]. Consequently, the number of cyber attacks and their costs have also increased [
2]. Also, these new cyber exploits are more complex and targeted [
3], generating sophisticated and improved attacks. Such facts indicate that the cybersecurity spectrum is fundamentally changing and becoming increasingly challenging.
The progressive evolution of the current cyber attacks arises from a cascade of new sophisticated applications that are being developed by attackers and security experts, and the more complex a system gets, the more insecure it becomes [
4]. Another reason attacks have improved is that they are better planned and applied in a more targeted way [
5], which makes them more complex. Most are designed to evade first-level defenses and persist on the system [
6]. Besides that, these new threats are in a constant process of modification and improvement, making their detection and defense more complicated [
5]. The advances and modifications in the cyber attack ecosystem have encouraged changes in the traditional defense model and the search for more efficient and proactive methods [
1,
6].
Considering the presented scenario, the idea of Cyber Threat Intelligence (CTI) has rapidly gained popularity and is often presented as a new solution for applying effective security in enterprises [
7]. Any valuable information that can be used to identify, characterize or assist in the response to cyber threats is commonly referred to as cyber threat information and the analysis of this type of information can produce intelligence to inform the user about threats to their system [
8]. Among the limitations of the CTI approach are the heterogeneity of the data involved [
9,
10], and the massive amount of data for collection [
11]. Therefore, to use cyber threat information effectively, mechanisms capable of consuming, analyzing, evaluating and classifying the information are needed [
12].
Thus, new automated systems with the ability to consume a vast amount of data, provide sophisticated defense capabilities and respond to incidents in real-time are being developed and commonly referred to as Threat Intelligence (TI) platforms [
13,
14]. These platforms should include automatic processes of data transformation and intelligence production to ensure a more efficient, proactive and timely defense model [
15]. In addition, because of the heterogeneity of the data in the CTI context, considerable effort has been made to standardize the data [
13] and make it compatible among different systems [
12]. The interoperability of the data is important to facilitate the automatic gathering and analysis of the data and the sharing of cyber threat intelligence [
9].
However, since the field is growing rapidly [
15], the CTI concept still lacks a consistent definition [
11,
16] and a heterogeneous market of CTI platforms has emerged. Diverse systems and tools, which accomplish different goals, are implemented as threat intelligence platforms [
14]. In addition, capabilities, performance levels, and applicable use cases vary greatly among platforms, and this is not always transparent to the user [
14]. Therefore, there is a clear research gap regarding the analysis of the available CTI systems and tools, both to describe their features in detail and to evaluate the quality of the information that can be produced and disseminated with them.
The goal of this research is to provide a methodology for the evaluation of Cyber Threat Intelligence standards and platforms. The research includes a review of the state of the art regarding the cyber threat intelligence ecosystem and existing TI standards and platforms, presenting the directions the field has followed in recent years and the TI initiatives that have been consolidated or have great potential to be consolidated in this area. To achieve that, first, existing CTI standards and platforms are reviewed to identify relevant research opportunities. Then, a selection strategy is proposed to identify the most popular standards and platforms. The features and usability of the selected ones are analyzed in a practical way. Finally, they are evaluated based on holistic and comprehensive evaluation criteria.
Previous research has focused on comparing a large number of platforms while applying few or trivial criteria in the evaluation process. Our work complements the related research by providing a comprehensive evaluation of TI standards and platforms simultaneously. It presents a strategy for selecting relevant platforms and standards and an integrated methodology whose evaluation criteria cover a wide range of aspects. Finally, the results and discussion of the evaluation process provide a valuable overview of the CTI landscape.
The remainder of this paper is structured as follows.
Section 2 brings an overview of related work and some definitions relevant to understanding the cyber threat intelligence landscape.
Section 3 presents a methodology proposal for evaluating CTI standards and platforms.
Section 4 and
Section 5 present the results of the evaluation.
Section 6 gives a discussion. Finally,
Section 7 concludes the paper.
2. Related Work and Definitions
In this section, a brief background on the CTI panorama is provided. Relevant related work to our research topic is presented along with some definitions and concepts.
Even though a lot of work has been done in the area of Cyber Threat Intelligence in recent years, most of it is not focused on analyzing and evaluating the state-of-the-art of TI standards and platforms. Besides, considering that this type of technology evolves at a rapid pace, some work and results can become out of date. To get a detailed picture of available work in the area and research gaps, a literature review was made.
Much work has been carried out into understanding and presenting the Cyber Threat Intelligence landscape. A survey [
17] provides a broad description of the CTI topic and briefly mentions some platforms and standards. In Reference [
18], the work is more focused on TI platforms and presents a general overview of the threat intelligence platforms landscape, including open source and commercial platforms. Other work [
19] describes some selected open source threat intelligence platforms and standards but no evaluation is done. A recent survey [
20] discusses research opportunities regarding exchange standards and mentions, without any type of analysis, some of the most popular languages for CTI description and sharing, namely Structured Threat Information eXpression (STIX), Trusted Automated Exchange of Intelligence Information (TAXII), Open Threat Partner Exchange (OpenTPX), Malware Attribute Enumeration and Characterization (MAEC), Incident Object Description Exchange Format (IODEF) and Vocabulary for Event Recording and Incident Sharing (VERIS).
Considering the existing limitations to fully implementing CTI mechanisms as a model of defense, some work has been done to propose frameworks and platforms that could overcome these obstacles. An important limitation is the quality of the CTI data produced and shared [
4], which has been addressed in many works through the use of machine learning techniques. For CTI to be effective in defense models, it should be sensitive to the context in which it is applied [
4,
13]. Therefore, initiatives that use machine learning techniques for predicting personalized and context-aware data, such as the ones presented in References [
21,
22], could bring great advances to CTI by assisting in data enrichment processes. Also, as proposed in Reference [
23], intrusion detection systems can benefit from artificial intelligence mechanisms, such as machine learning, to build an intelligent data-driven system. Following this idea, Reference [
24] presents good prospects for the use of artificial intelligence in cyber security. The research discusses the massive amount of threat data available and presents the automation and data analysis capabilities of machine learning techniques as a solution for handling this volume of data. In Reference [
25], using machine learning algorithms, a framework was developed to collect and analyze data and attribute threat incidents to their threat actors. Another work [
26] proposes a threat intelligence platform with an architecture based on state-of-the-art systems such as Malware Information Sharing Platform (MISP) and Collaborative Research into Threats (CRITs). The platform applies machine learning algorithms to analyze and classify email content and actively defends against social engineering. To improve the maliciousness estimation of threat indicators, Kazato et al. [
27] use a method based on a graph convolutional network. The method improved the accuracy of the maliciousness estimation and reduced the time analysts spend manually examining indicators. Also in terms of improving the quality of threat indicators, in Reference [
9] a novel method to automatically extract indicators from social media and apply domain tags to them is proposed. The method includes a convolutional neural network to recognize CTI domains and correctly classify threat data into these domains. The amount of related work that explores the use of artificial intelligence, mainly machine learning techniques, in cyber security methods and frameworks shows its importance and indicates some research opportunities.
Regarding another limitation, in Reference [
8], the difficulty related to the lack of confidence from organizations in sharing sensitive CTI data is addressed. Chadwick et al. [
8] introduce a trust model among organizations combined with a data sharing and analysis framework that allows a confidential exchange of data. Despite some limitations, the results presented show that the expected outcomes were achieved. Along the same lines, Riesco et al. [
11] work to reduce the reluctance to share CTI data. The work provides an extensive list of the open challenges of existing information sharing solutions and proposes a solution to cover all these challenges at the same time. The solution uses a combination of STIX and the Ethereum blockchain to achieve its goal. The results presented showed an improvement in important points such as identification of trusted sources, availability and economic cost, compared to other available solutions. The approach presented in Reference [
28] also addresses the topic of sharing CTI data securely. The method not only provides trust levels among organizations, but also combines the trust model with encryption mechanisms for the data, bringing more confidence to the sharing process. In the same way, Wu et al. [
29] propose a framework for decentralized sharing of data using blockchain. The work considers the trust between participants, the trust in TI quality and the trust in the platforms, and uses encryption of data as one of the mechanisms to ensure confidentiality. Thus, it is clear that the use of encryption systems like the ones presented in References [
30,
31] could bring improvements to CTI processes. Even though encrypting CTI data before sharing it could improve the confidence of organizations in exchanging their sensitive data, the majority of the TI standards use common formats, such as JSON and XML, to provide threat data, relying on the transport mechanism for security. Since most standards follow this approach, the available TI platforms usually do not support encrypted data as import or export formats either. However, considering the benefits that the combination of encryption systems and CTI platforms could bring, this should be considered a productive research gap.
From another perspective of analysis and comparison, some research has addressed the taxonomies and ontologies of CTI standards. The work presented in Reference [
32] aimed to analyze different CTI exchange ontologies. A layered model is proposed and two sharing protocols, together with their respective transport protocols (STIX/TAXII and IODEF/RID), are examined using the model. The work provides a great overview of the analyzed protocols, presenting a detailed schema of each one and leaving research opportunities about the topic. In Reference [
33], taxonomies, sharing standards and ontologies relevant to the CTI scope are evaluated. These are analyzed based on the data and concepts defined by two different CTI models, their relationships with other taxonomies and ontologies, and the description provided by their documentation and source files. The sharing standards evaluated were STIX, MAEC and OpenIOC; however, they were not the main topic of discussion, since the work focused on ontologies. Reference [
34] focuses on semantic ontologies for sharing standards. In this work, STIX and IODEF were mapped into RDF/OWL ontologies and the mappings were analyzed providing an overview of their characteristics and showing differences and similarities between the standards.
Some similar research on the evaluation or comparison of standards or platforms has been conducted. One of the first works on this subject [
35] introduces eight different exchange formats: Common Intrusion Detection Framework (CIDF), IODEF, Common Announcement Interchange Format (CAIF), Intrusion Detection Message Exchange Format (IDMEF), Abuse Reporting Format (ARF), Common Event Expression (CEE), Extended Abuse Reporting Format (X-ARF) and Syslog. These formats are evaluated with a proposed methodology that consists of 10 different criteria, such as interoperability, confidentiality and practical application. The evaluation provides a significant methodology and good results, but some formats that are currently relevant were not addressed, which shows that some results are out of date. In Reference [
36], the complete CTI panorama is considered and some CTI standards are presented. Besides that, a good evaluation of some open source threat intelligence tools is done in order to compare them with the tool proposed in the work. In Reference [
16] the classification and analysis of 23 threat intelligence platforms are made based on the licensing model, supported standards, type of platform and type of information shared. The result of the analysis presents some interesting facts about the CTI panorama, such as the finding that most threat intelligence platforms are closed source, the description of STIX as the de facto standard for describing threat intelligence and the discovery that most platforms prioritize sharing over analyzing the information. However, the results are consolidated in eight key findings, which does not allow an in-depth understanding of the features and operation of the platforms. In Reference [
37] a satisfactory comparative analysis of some important incident reporting formats, including different versions, is performed, revealing strengths, weaknesses and additional information about the formats. To explore the existing interoperability challenges when using specific sharing solutions, Rantos et al. [
12] analyse semantic, syntactic and technical aspects of the most prominent standards considered by the work. Thus, some characteristics, including type of data and supporting sources, were described and can be used as a base for future research in the area.
A recent work [
13] provides a comparative analysis of cyber threat intelligence sources, formats and languages. Several CTI sources are presented and examined, and based on the results of the examination together with literature research, some CTI standards were selected for further analysis. Many criteria and features were considered in the comparison, providing a great and detailed description of the capabilities of some relevant CTI standards. Some specific CTI standards are analyzed in the work, like STIX and MISP, but some common formats like CSV and RSS are also included in the comparison, which differs from the standards selected in this paper that were designed specifically to represent threat intelligence data. With a similar goal to this work, Bauer et al. [
14] present a framework capable of analyzing and comparing threat intelligence sharing platforms. Based on a systematic literature review, 40 different publications that contained characteristics or requirements for Threat Intelligence Sharing Platforms (TISPs) were studied. From these, 62 essential evaluation criteria were determined and divided into six main categories that were used by the proposed framework to evaluate the platforms. The work mentions that the framework was applied to ten different TISPs, but results are described for only three of them. The results revealed interesting information about the described platforms, including some similarities and differences. However, owing to practical limitations, only a small set of platforms was considered.
Most of the research and work developed in the field focuses on comparing a large number of platforms or standards and does not provide a critical analysis. On the other hand, a few works present thorough evaluations or comparisons, but only of a few platforms or standards. Besides, some works focus their efforts on TI initiatives that have not been consolidated in the area or are out of date. So, to the best of our knowledge, no prior research has simultaneously analyzed and evaluated standards and platforms relevant to the TI scenario, based on a methodology that covers a wide range of evaluation criteria.
Some fundamental concepts must be presented to facilitate the understanding of the methodology adopted and the results obtained. These definitions are presented in the following subsections.
2.1. New Threat Landscape
The great evolution of computing in recent years largely stems from the appearance of multiple and heterogeneous devices [
38] that can interact with other objects and applications over the internet [
39]. However, the heterogeneity and interconnectedness of these devices lead to a significant increase in the number of security attacks [
40] and the threat environment is expanding in alarming proportions. This growth comes together with more complex attack scenarios and sophisticated threats. Nowadays, adversary behavior is more focused on the target and it considers multi-staged attacks that aim to persist on the host or system and cause ongoing damage [
41]. Most of these attacks do not generate noise or substantial changes in the environment, making them harder to detect.
Some of these new-generation threats are called Advanced Persistent Threats (APTs). They perform a sophisticated type of attack characterized by establishing a persistent foothold in the target and staying undetected for a long period of time [
5]. There are also polymorphic threats capable of constant modification, which makes their detection a complex task. Another widely used type of threat is the zero-day exploit. Since it exploits unknown software vulnerabilities, it can easily stay undetected for long periods until the flaw is discovered and patched [
4].
2.2. Threat Intelligence
The term intelligence has many different definitions. This is because intelligence is a concept that depends strongly on the context in which it is used. A generalized definition of the term was presented in Reference [
42] and is considered widely applicable. It describes intelligence as the process of taking a topic from a completely unknown state to a state of complete understanding. To achieve this goal, random and generic data must be filtered into a smaller, relevant data set based on the intended context, which is then processed and transformed into information.
In this sense, the information, when analyzed and contextualized, becomes intelligence [
7]. Considering such assumptions, a generic intelligence production process is commonly composed of three main stages: collecting data, processing the data to transform it into information, and analyzing the information to produce intelligence.
The intelligence concept can be divided into different strands, one of which is actionable intelligence. For intelligence to be considered actionable, it must meet the requirements of being timely, accurate and relevant [
43].
Following this perspective, threat intelligence, which is commonly defined as a type of actionable intelligence, should satisfy these characteristics in order to assist in developing efficient mechanisms to respond to threats. Therefore, in addition to the generic intelligence production flow mentioned above, the stages of deploying and disseminating the intelligence are also considered essential to the process of generating threat intelligence. So, in the context of this work, the intelligence process flow is composed of five main stages, as presented in
Figure 1:
Collection: gathers the data, which are simple indicators or facts.
Processing: combines the data to answer specific questions and provide information.
Analysis: evaluates data and information together to uncover patterns and produce actionable intelligence.
Deploy: uses the intelligence produced by making decisions based on it.
Dissemination: expands the intelligence by sharing it with interested parties.
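As an illustration only, the five stages can be sketched as a small Python pipeline. The function names, the `Intelligence` container, the sample records and the threshold are all hypothetical and are not part of the methodology; the sketch merely shows how the output of each stage feeds the next.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Intelligence:
    """Hypothetical container for the output of the analysis stage."""
    indicator: str
    occurrences: int
    recipients: list = field(default_factory=list)


def collect():
    # Collection: gather raw data (simple indicators or facts).
    return [
        {"ip": "198.51.100.7", "event": "failed_login"},
        {"ip": "198.51.100.7", "event": "failed_login"},
        {"ip": "203.0.113.9", "event": "port_scan"},
    ]


def process(raw):
    # Processing: combine the data to answer a specific question,
    # here "how often was each source address observed?".
    return Counter(record["ip"] for record in raw)


def analyze(information, threshold=2):
    # Analysis: uncover a pattern (addresses seen repeatedly).
    return [
        Intelligence(indicator=ip, occurrences=count)
        for ip, count in information.items()
        if count >= threshold
    ]


def deploy(intel):
    # Deploy: decide how to use the intelligence (here, a block list).
    return [item.indicator for item in intel]


def disseminate(intel, partners):
    # Dissemination: share the intelligence with interested parties.
    for item in intel:
        item.recipients.extend(partners)
    return intel


intel = analyze(process(collect()))
block_list = deploy(intel)
shared = disseminate(intel, ["partner-a", "partner-b"])
```

In practice each stage would be far richer (multiple sources, enrichment, analyst review), but the ordering of the stages is the same.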
2.3. Cyber Threat Intelligence
Within the spectrum of threat intelligence is the concept of Cyber Threat Intelligence (CTI), a relatively new approach that has become very broad and is used to describe different types of services. It can be considered a form of actionable intelligence generated from evidence of mechanisms, indicators, implications and context concerning threats or incidents in the cyber domain. It provides knowledge about adversaries and methods that can assist in the decision-making process of responding to threats [
43].
For CTI to be applied correctly and produce effective results, it is necessary to establish a process flow for its production [
1]. First, it is important to understand the needs of the users of the intelligence being developed and the context in which it is inserted [
44]; thus, the requirements must be properly defined.
Once the requirements for the CTI are defined, the data collection stage starts. Data and information without treatment and context are not considered intelligence, but they are the basic materials for its production. Then, some mechanisms consume this information and perform the processing and analysis to generate structured information and find patterns. The treated information can be integrated with other defense mechanisms and then used to develop and apply methods of defense and threat mitigation [
43].
Finally, as organizations lack the ability to understand the cyber threat landscape holistically, the stage of sharing and disseminating threat information between organizations is of utmost importance [
41].
2.4. Threat Intelligence Standards
A crucial aspect of the entire threat intelligence process is the format of the shared data. First, for adequate and automated processing of the collected data, it is important that they are formatted in a structured model and expressed in a common language. In addition, the establishment of standards provides a prior definition of the type of information that will be shared and the density of that information [
37]. As a result, a variety of initiatives have emerged with the aim of standardizing the information collected, consumed and disseminated within the CTI ecosystem [
34].
2.5. Threat Intelligence Platforms
The establishment of a new threat landscape encouraged the change of traditional defense models. New systems with proactive action and capabilities of real-time response to incidents are being developed and commonly referred to as Threat Intelligence Platforms (TIPs) [
34]. They are specialized software systems that implement the processes of collecting, processing, analyzing, producing, deploying and integrating internal and external threat intelligence. The main goal of this type of platform is to assist decision makers involved in incident response [
18].
4. Standards Evaluation Results
As the first step of the evaluation, the standards are selected based on the strategy described in
Section 3.1. Furthermore, the results of the evaluation are based on the evaluation criteria proposed in
Section 3.2.
After gathering the most suitable results from the searching process of TI standards, some relevant initiatives were found. In References [
33,
36] some projects that aim to standardize threat intelligence data are mentioned, such as STIX, TAXII, CybOX, OpenIOC, CAPEC, MAEC and ATT&CK, with STIX considered the most widely used.
In Reference [
37] other standards are mentioned such as VERIS, STATL, ARF and X-ARF and some of them are evaluated. Other works [
16,
19] only mention the standards considered as consolidated, which are OpenIOC, CybOX, STIX, TAXII and IODEF. A survey [
45] presents in statistical terms the most used standards, which are STIX, OpenIOC, CybOX and IODEF. In Reference [
32] a comparison is made between IODEF/RID and STIX/TAXII, which are considered the most popular standards. Finally, some recent works [
9,
12] present standards that are considered prominent nowadays.
Since all the standards found are released for community use, popularity was the key criterion for selecting them. Considering the results obtained from the literature research, complemented by a review of the official web sites and documentation of most standards, the standards were ranked in terms of popularity and the results are presented in
Table 4.
4.1. Standards Selected for Analysis
Given the presented results, the standards selected for further analysis and evaluation are STIX, TAXII, IODEF, RID, CybOX and OpenIOC. A succinct presentation of the selected standards follows.
4.1.1. Structured Threat Information eXpression
STIX is a language created by MITRE and developed to capture, specify, characterize and communicate information in the context of cyber threat intelligence [
41]. It provides mechanisms to represent structured information in different scenarios of the cyber threat ecosystem.
The language was designed with principles such as interoperability, extensibility, focus on automation and machine readability. In the first version, STIX was modeled in the XML format and it was composed of eight core constructs. The second version was developed using serialization in JSON format [
46].
Its structural architecture has been significantly modified and is currently composed of twelve main objects that correspond directly to concepts embedded in the context of CTI [
47]. With its holistic architecture, STIX is able to present information in a standardized, comprehensive and structured manner while allowing application in different use cases. In addition, it is directly integrable with other languages in the TI context [
17].
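To make the JSON serialization concrete, the following sketch builds a single STIX-like indicator object using only the Python standard library. It is a simplified illustration under stated assumptions: the property names follow our reading of the STIX 2.0 indicator object, later revisions of the specification require additional properties, and the object is not validated against any schema. The helper `make_indicator` and the sample address are hypothetical.

```python
import json
import uuid
from datetime import datetime, timezone


def make_indicator(name, pattern):
    """Build a minimal, unvalidated STIX-like indicator (illustrative only)."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z")
    return {
        "type": "indicator",
        # STIX identifiers have the form "<object-type>--<UUID>".
        "id": f"indicator--{uuid.uuid4()}",
        "created": now,
        "modified": now,
        "name": name,
        "labels": ["malicious-activity"],
        # The pattern states which observable data the indicator matches.
        "pattern": pattern,
        "valid_from": now,
    }


indicator = make_indicator(
    "Suspicious address",
    "[ipv4-addr:value = '198.51.100.7']",
)
serialized = json.dumps(indicator, indent=2)
print(serialized)
```

Because the object is plain JSON, any consumer with a JSON parser can read it, which is the machine-readability property the language aims for.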
Cyber Observable eXpression (CybOX) is a language created by MITRE and developed to specify, characterize and communicate information about cyber observables in a standardized way. With the release of the second version of STIX, it is no longer used as an independent language. It was integrated into the second version of STIX, which defines a structured representation for observable objects in the cyber domain, called
Cyber Observable Object [
48].
This standardization can be used to describe different types of data, from a host characterization to information about digital forensics. The objects are represented using serialization in JSON format [
49].
4.1.2. Trusted Automated Exchange of Intelligence Information
TAXII is an application layer protocol created by MITRE that defines a set of services to exchange TI information messages between organizations [
50]. It was designed specifically for the transport of information formatted in the STIX language but is not limited to it. TAXII uses Hypertext Transfer Protocol Secure (HTTPS) as the transport protocol and supports different sharing models, including
hub-and-spoke, P2P and
publish-subscribe.
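Since TAXII runs over HTTPS, a client interaction is an ordinary HTTP request. The sketch below only constructs such a request and does not send it. The server address and collection identifier are hypothetical, and the endpoint layout and media type reflect our understanding of the TAXII 2.1 specification; both should be checked against the specification and the target server before use.

```python
import urllib.request

# Hypothetical values, used only for illustration.
API_ROOT = "https://taxii.example.org/api1"
COLLECTION_ID = "91a7b528-80eb-42ed-a74d-c6fbd5a26116"

# Media type assumed from the TAXII 2.1 specification.
TAXII_MEDIA_TYPE = "application/taxii+json;version=2.1"


def build_objects_request(api_root, collection_id):
    """Prepare (but do not send) a request for the objects of a collection."""
    url = f"{api_root}/collections/{collection_id}/objects/"
    return urllib.request.Request(
        url,
        headers={"Accept": TAXII_MEDIA_TYPE},
        method="GET",
    )


request = build_objects_request(API_ROOT, COLLECTION_ID)
print(request.get_method(), request.get_full_url())
```

A real client would also handle authentication, pagination and error responses, which are omitted here.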
4.1.3. Incident Object Description Exchange Format
IODEF is an Internet Engineering Task Force (IETF) initiative that aims to facilitate information sharing between organizations and increase the possibility of mitigating cyber threats [
51]. In the first version, the data model was focused on representing incidents. Bringing a more holistic approach, the second version of IODEF was designed with significant evolution in its structural part, which now includes structures for the description of indicators, attackers and incident response methodologies [
52]. Both versions use the XML format.
4.1.4. Real-Time Inter-Network Defense
RID is an IETF initiative created to facilitate the process of sharing data about security incidents mainly structured under the IODEF format. It outlines a proactive internal communication network, capable of integration with mechanisms for detecting, identifying, mitigating and responding to incidents, aiming to compose a complete solution in the treatment of security incidents [
53]. RID messages are transported over the HTTPS protocol and, to provide more security, the protocol adds another layer of security to manage sessions.
4.1.5. Open Indicator of Compromise
OpenIOC is a framework that offers a standardized, structured and machine-readable format used to record, define and share information encompassed in the context of cyber attacks and incidents [
54]. This format is written in XML, with a relatively light and small design. Its architecture is composed of more than 500 specific types of data incorporated to represent indicators.
4.2. Evaluation of the Standards
After the selection and definition of the standards, an evaluation was made based on academic literature, the study of official documentation and practical demonstrations.
Taking into account the aforementioned criteria, the evaluation of the standards was made and it is summarized in
Table 5. Regarding the results illustrated in
Table 5, two considerations must be highlighted. First, as explained before in
Section 4.1.1, since CybOX and STIX were usually used together and both standards are maintained by the same organization, CybOX was integrated into the second version of STIX and is no longer used as an independent language. Thus, as CybOX is now part of the STIX structure, it was considered more reasonable to evaluate them as a single standard. Second, it was noticed that even though TAXII and RID are autonomous protocols, they are mostly used in combination with STIX and IODEF, respectively. This stems from the fact that TAXII and RID are protocols designed specifically to facilitate the transport of STIX and IODEF. Hence, these protocols were evaluated as pairs (STIX/TAXII and IODEF/RID), considering that their functions are complementary.
4.2.1. Data Model Architecture
From an architectural perspective, STIX is the language with the most holistic architecture and it is applicable in different use cases. The four entities considered essential to delineate a holistic contextualization of the cyber threat intelligence scenario can be fully represented and characterized by the classes that compose the STIX schema. IODEF and OpenIOC also have a satisfactory architecture, but with some weaknesses. Neither standard has the necessary resources for an adequate definition of defense mechanisms or courses of action. Besides, OpenIOC has some shortcomings when characterizing a threat actor in detail.
4.2.2. Intelligence Process
From the process perspective, STIX is able to meet most of the proposed requirements. Serialization in JSON provides a common, structured and machine-readable format with relatively low overhead. The twelve objects that compose the STIX architecture are well described and documented, providing an unambiguous data model with coupled relationship mechanisms. When used together with TAXII, it offers a reliable transport mechanism. Finally, since it has significant practical application, most TI platforms and tools have integration methods for this standard.
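The JSON serialization and relationship mechanism described above can be illustrated with a minimal sketch. It assumes a simplified, STIX 2.x-style layout built from plain Python dictionaries; it is not a validated STIX document, and the set of required properties differs between STIX versions. The helper names `stix_id` and `build_bundle`, the domain and the malware name are hypothetical and chosen only for illustration.

```python
import json
import uuid
from datetime import datetime, timezone


def stix_id(object_type: str) -> str:
    """Build an identifier of the form '<type>--<uuid>' (STIX 2.x style)."""
    return f"{object_type}--{uuid.uuid4()}"


def build_bundle(domain: str, malware_name: str) -> dict:
    """Return a simplified bundle: an indicator, a malware object and a
    relationship object that links the first to the second.

    Assumption: only a subset of properties is filled in; a real STIX
    document needs further fields depending on the specification version.
    """
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z")
    indicator = {
        "type": "indicator",
        "id": stix_id("indicator"),
        "created": now,
        "modified": now,
        "pattern": f"[domain-name:value = '{domain}']",
        "valid_from": now,
    }
    malware = {
        "type": "malware",
        "id": stix_id("malware"),
        "created": now,
        "modified": now,
        "name": malware_name,
    }
    # The relationship is itself an object that refers to the other two by id.
    relationship = {
        "type": "relationship",
        "id": stix_id("relationship"),
        "created": now,
        "modified": now,
        "relationship_type": "indicates",
        "source_ref": indicator["id"],
        "target_ref": malware["id"],
    }
    return {
        "type": "bundle",
        "id": stix_id("bundle"),
        "objects": [indicator, malware, relationship],
    }


if __name__ == "__main__":
    bundle = build_bundle("malicious.example", "demo-malware")
    print(json.dumps(bundle, indent=2))
```

Because relationships are ordinary objects that reference other objects by identifier, a consumer can parse the bundle with any JSON library and rebuild the graph of connections without format-specific tooling.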
IODEF and OpenIOC are based on the XML format, so they also provide a common and structured format with machine-readability. However, IODEF can present some problems due to the free text fields that compose its data model. Regarding the relationship mechanisms, OpenIOC provides logic operators (AND/OR) to create connections between indicators. IODEF, on the other hand, does not present specific mechanisms to relate information beyond the interconnections within its data model.
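The way AND/OR operators combine individual indicator terms can be sketched as a small recursive evaluation over a nested tree. This is only an illustration of the logic: the dictionary layout, the field names (`file_md5`, `file_name`, `file_size`) and the function `evaluate` are hypothetical and do not reproduce the actual OpenIOC XML schema.

```python
from typing import Any, Dict

# Hypothetical indicator: match a known hash, OR a file name AND size together.
INDICATOR: Dict[str, Any] = {
    "operator": "OR",
    "children": [
        {"field": "file_md5", "value": "d41d8cd98f00b204e9800998ecf8427e"},
        {
            "operator": "AND",
            "children": [
                {"field": "file_name", "value": "dropper.exe"},
                {"field": "file_size", "value": 4096},
            ],
        },
    ],
}


def evaluate(node: Dict[str, Any], observed: Dict[str, Any]) -> bool:
    """Return True if the observed attributes satisfy the indicator tree."""
    if "operator" in node:
        results = (evaluate(child, observed) for child in node["children"])
        if node["operator"] == "AND":
            return all(results)
        if node["operator"] == "OR":
            return any(results)
        raise ValueError(f"unsupported operator: {node['operator']}")
    # Leaf term: a simple equality test on one observed attribute.
    return observed.get(node["field"]) == node["value"]


if __name__ == "__main__":
    sample = {"file_name": "dropper.exe", "file_size": 4096}
    print(evaluate(INDICATOR, sample))
```

Nesting the operators lets a single indicator express alternatives (any one branch of an OR) as well as combinations that must all hold (every branch of an AND).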
IODEF and OpenIOC are supported by many platforms and tools and can be integrated with different systems. IODEF can be used together with the RID protocol, which provides efficient and secure transport, while OpenIOC does not focus on implementing transport mechanisms.
4.3. Synthesis
As a result of the evaluation, it can be said that STIX is the de facto standard in the threat intelligence context. First, STIX is the most popular and compatible standard, being supported by many platforms and tools and used by most organizations. Second, due to its holistic architecture and its capability of addressing many scenarios within the threat intelligence scope, it can be considered the most complete standard.
Even though the other standards are still supported by some platforms and have satisfactory applicability, the features offered by STIX stand out. Considering the characteristics of STIX and the range of results it can produce, it can be said that this standard has the most satisfactory performance.
6. Discussion
To achieve strong cyber threat detection and prevention capabilities, most organizations need to rely on available open source TI platforms. Similarly, these platforms need consolidated standards to provide an automated, shareable and reliable service. Thus, it is essential to analyze the features and operation modes of these two strands of the threat intelligence domain.
By evaluating some common TI standards, it is evident that STIX, combined with TAXII features, can be considered the most holistic and applicable one. In addition, statistics show that it is also the most used standard among organizations [45]. Therefore, an important step is the definitive consolidation of this standard, so that the goal of establishing broad integration and interoperability between organizations can be accomplished. Moreover, the definition of an accepted standard can optimize the processing, analysis and sharing tasks performed by TI platforms, because it focuses efforts on a predefined data model, and one based on STIX would certainly be holistic and widely applicable.
Regarding the TI platform analysis, several interesting solutions were found. Some of them focus on providing speed and performance, others put considerable effort into the visualization of the information, while a few implement a little of each feature. As a matter of fact, diverse types of systems, with different goals, are defined as threat intelligence platforms. This probably derives from the fact that there is currently no standardized definition for the concept or process of cyber threat intelligence. As CTI is a very extensive domain, it would be relevant to establish scopes to better characterize the available platforms, making it easier to decide which platforms are best suited to each use case.
Taking into consideration that TI platforms have different goals, it can be said that there is currently no fully complete platform with the capacity to cover all the CTI processes adopted in this work. Thus, one possibility to expand and optimize the results obtained from applying the CTI processes would be the integration of different TI platforms with complementary objectives. From this perspective, it is possible to reconcile different aspects, such as performance and visualization mechanisms, achieving a fully developed CTI process that provides everything from data collection to the transformation of data into actionable intelligence.
For the reasons discussed, it is still necessary to carry out research in order to characterize the concept of CTI in a more specific way. Not only should a definition be established, but also the processes that are involved in this domain. In this way, the wide range of available systems referred to as TI platforms could be better used and applied, and new systems could be better designed, adopting either specific and more optimized processes or a complete approach that fulfills all predefined requirements for creating threat intelligence.
7. Conclusions and Future Work
As the cyber security landscape is fundamentally changing and a new threat scenario is emerging, the development and investigation of more efficient defense mechanisms has become a necessary task. In this work, an overview of the cyber threat intelligence scenario and of existing standards and platforms in the threat intelligence spectrum was provided. Based on academic literature and official sites and documentation, a group of relevant standards and platforms was defined. Considering the scope of the research, a selection strategy was proposed and applied in order to determine the most popular and efficient standards and platforms that are free or open source. Then, the selected standards and platforms were evaluated based on a developed methodology that covers architectural and process-related criteria.
From the evaluation of TI standards, we concluded that STIX is the most consolidated standard in the area, mainly due to its holistic approach, which makes it applicable in a wide range of scenarios, and compliance with fundamental requirements for a standard, such as interoperability and machine readability. Concerning TI platforms, MISP and OpenCTI were considered the most complete and flexible platforms. Although there are sophisticated solutions available, there is none that addresses the entire CTI process.
To conclude, even though some strong solutions are available on the market, it is still a challenge to find a thorough and complete solution for a defense based on threat intelligence, since the platforms have divergent focuses and consequently cover only a few stages of the threat intelligence production flow.
Future work will assess and validate the methodology and results presented here by carrying out an experimental evaluation and running tests on data sets of cyber threat data. New research will focus on evaluating, in a practical way, the completeness of the CTI process that can be supplied by available platforms, using the benefits of interoperability among platforms. Along the same lines, further research will focus on the integration of complementary platforms in order to provide a more complete solution to manage and use threat intelligence. Finally, the delineation of a standardized definition for the CTI concept and process, to assist in the design of new and optimized threat intelligence systems capable of establishing an efficient defense model, remains a research gap.