In this section, we discuss some compliance and security aspects regarding the observed statistics based on the data. This includes a review of relevant laws and standards applicable to the companies included in the study, exploring security controls to mitigate the underlying causes of breaches, and examining response strategies employed when incidents occur.
5.2. Vulnerabilities Mitigation
To ensure the proper protection of customers’ data and compliance with these regulations, a company must effectively implement security controls that mitigate the vulnerabilities that could lead to a breach. Different incident types require different security countermeasures as mitigation.
As seen in
Figure 15, data breaches in 2023 were most frequently caused by phishing attacks, and the most costly originated with malicious insiders (INSDs) [
58]. However, when these attack vectors are mapped to the types of the dataset, the majority of them relate to hacking activities (HACK). The ‘Accidental’ label in the figure may be associated with the union of the DISC, PHYS, STAT, and PORT types of the dataset, while ‘Physical’ may be associated with the union of PHYS, STAT, and PORT. Social engineering and phishing, which may be considered a type of social engineering, are not strictly related to a category of the dataset.
This also complements the analysis of the dataset by indicating the financial impact of different attack vectors, which is important to consider in a risk assessment and when prioritizing vulnerability mitigation. As an example, according to
Section 4.4, HACK data breaches represent the second-most-frequent cause, and also correspond to significant average costs, as seen in
Figure 15. This suggests a prioritization for mitigating HACK-related vulnerabilities.
This section reviews some suitable security controls that can mitigate these and other vectors. Such countermeasures could have reduced the likelihood and/or the impact of the breaches in the scope of this study.
To improve overall security, companies may adopt a structured framework, such as the NIST Cybersecurity Framework (CSF), which is an agnostic framework that organizes security controls into five core functions, as shown in
Table 7. These core categories provide an organized approach to improving cybersecurity measures.
Especially for vulnerability mitigation, the Protect core function presents some valuable recommendations divided into categories: Identity Management, Authentication and Access Control, Awareness and Training, Maintenance, Protective Technology, Information Protection Processes and Procedures, and Data Security.
The latter, more pertinent to this study, is then divided into subcategories, which are as follows: data at rest are protected; data in transit are protected; assets are formally managed throughout removal, transfers, and disposition; adequate capacity to ensure availability is maintained; integrity-checking mechanisms are used to verify software, firmware, and information integrity; the development and testing environments are separate from the production environment; integrity-checking mechanisms are used to verify hardware integrity; and protections against data leaks are implemented.
Once again, the latter subcategory is more relevant to this work, and for it the NIST CSF references the Center for Internet Security (CIS) Critical Security Controls, COBIT 5, ISA 62443-3-3:2013, ISO/IEC 27001:2013, and NIST SP 800-53 [
59,
60,
61].
Specific to NIST SP 800-53, which reviews security and privacy controls for information systems and organizations, the framework mentions the sections regarding information flow enforcement (AC-4), separation of duties (AC-5), the principle of the least privilege (AC-6), personnel screening (PS-3), access agreements (PS-6), boundary protection (SC-7), transmission confidentiality and integrity (SC-8), cryptographic protection (SC-13), covert channel analysis (SC-31), system monitoring (SI-4), and protection from information leakage due to electromagnetic emanation (PE-19). Regarding the latter, TEMPEST is a valuable specification for shielding equipment against unintentional leakage of radio or electric signals, sounds, and vibrations.
As these countermeasures are closely related to the attack vector, we considered the breach_type of the dataset for reviewing them.
However, it is essential to note that listing all security measures for mitigating data breach-related vulnerabilities is impractical due to the vast number of attack vectors. Therefore, this section provides general good practices against common attack vectors, not an exhaustive list of available security controls.
5.2.1. CARD
The Payment Card Industry Data Security Standard (PCI-DSS) is a key data security standard for debit and credit cards. PCI-DSS requires technical and operational controls to be put in place by any entity that stores, processes, or transmits credit card data.
The PCI-DSS delineates specific requirements for protecting payment card data. These requirements are detailed in
Table 8, providing a comprehensive overview of the PCI-DSS standards and their associated requirements.
This is enforced by three ongoing steps. The first is an assessment, which identifies all locations of cardholder data, takes an inventory of assets, and analyzes them for vulnerabilities that could expose cardholder data. The second is to repair the vulnerabilities found. The last is to report the assessment and remediation details and submit the resulting document to the entities the company does business with.
Regarding encryption, PCI-DSS requires compliance with the Point-to-Point Encryption (P2PE) standard by using one of their listed validated solutions. Two relevant data breaches involving debit card and credit card information were the Home Depot and Target Corp. breaches, displayed in
Figure 6, which represent the most voluminous breaches.
The most probable cause of the breach at Target was an infection by the Citadel malware [
62]. This malware, which is based on its predecessor Zeus, executes a Man-in-the-Browser attack. Another malware used in the attack was BlackPOS, which targets Point-of-Sale (POS) devices [
63].
Section 5.2.2 provides more information on banking malware.
Consequently, approximately 40 million credit card and debit card records were leaked, including their encrypted PINs and other PII.
According to the studied dataset, Home Depot was also infected by BlackPOS [
63], leaking 56 million payment records. The two companies were also PCI-DSS compliant at the time of the breach [
64], although
Table 8 shows some requirements that could have prevented a malware infection if successfully implemented, such as regularly updating antivirus software and maintaining secure systems and applications. Additionally, the Home Depot data breach could have been prevented using P2PE and network segregation. Hence, PCI-DSS serves as a reliable foundation for credit card security, but it should not be the only measure implemented.
One additional technology that may be used to improve transaction security is the Europay, MasterCard, and Visa (EMV) micro-processing chip, increasing complexity and costs for card counterfeiting, which is known as skimming.
Counterfeiting cases are declining steeply in areas where EMV is implemented, but there has been a consequent rise in Card Not Present (CNP) crime [
65]. A CNP crime is the unauthorized use of another individual’s payment details for a transaction, mainly through online means. The payment information may be obtained after a data breach: for example, Bodker et al. [
66] describe the script followed by criminals in a CNP crime, which allows a more reasoned consideration for mitigation strategies.
Furthermore, both Target and Home Depot breaches were initiated with a phishing attack [
63], which reinforces the need for Security Education, Training, and Awareness (SETA), as discussed in
Section 5.2.7.
5.2.2. HACK
As evidenced in the cases examined in this paper, such as eBay, Target, and Home Depot, phishing is a common threat vector used to initiate data breaches. A practical approach for reducing the success rate of these attacks is implementing a robust SETA program, which is discussed in more detail in
Section 5.2.7. Such a program plays a fundamental role in enhancing employees’ ability to recognize and thwart security threats, such as phishing attempts.
Naqvi et al. [
67] reviewed the literature on phishing mitigation procedures through different vectors, such as e-mail and websites. Most proposed techniques rely on Machine Learning or training and awareness. Multifactor Authentication (MFA) may protect the user account even after successful phishing, as the user’s identification and password obtained with the technique would not suffice for logging in, requiring an extra factor.
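To illustrate why a phished password alone does not suffice under MFA, the following minimal sketch derives a time-based one-time password in the style of RFC 6238 and requires it alongside the password check. The function names, the 30-second step, and the six-digit length are assumptions made for illustration; they do not describe any system examined in this study.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226), using HMAC-SHA-1."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, at: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Time-based variant: the counter is the number of elapsed time steps."""
    now = time.time() if at is None else at
    return hotp(secret, int(now // step), digits)


def verify_login(password_ok: bool, submitted_code: str,
                 secret: bytes, at: Optional[float] = None) -> bool:
    """Hypothetical login check: both factors must succeed."""
    expected = totp(secret, at)
    return password_ok and hmac.compare_digest(submitted_code, expected)
```

In this sketch, an attacker who holds the correct password but not the shared secret cannot produce the current code, so `verify_login` returns False.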
As related in
Section 5.2.1, in the Target and Home Depot breaches, after successfully phishing credentials, the attacker used banking malware to exfiltrate data. As the financial sector accounts for the largest share of the dataset, we find it convenient to discuss this type of malware.
Black et al. [
68] surveyed some of these malware families (namely, Zeus V2, Citadel, Carberp, Vawtrak, Dridex, Dyre, and Rovnix), providing Indicators of Compromise (IoC) for identifying infections and evaluating their similarities and differences.
However, it is relevant to note that certain malware strains are region-focused, such as Guildma, Grandoreiro, and Javali, which primarily targeted Brazilian entities [
69]. A threat intelligence project may be needed to identify common malicious activities within the organization’s operational domain.
Even after a successful malware infection, the data breach may be prevented if the company effectively applies other security measures, such as encryption and access control. This was not the case, for example, with LinkedIn, which, as discussed in
Section 4.3, had millions of unsalted password hashes leaked, highlighting the importance of comprehensive security measures to protect sensitive data.
Several factors compounded LinkedIn’s security vulnerabilities. Firstly, the company employed the SHA-1 hashing algorithm, which has been demonstrated to be vulnerable to various attacks. NIST SP 800-131A revision 2 disallows SHA-1 for digital signature generation, still permitting it for non-digital-signature applications only. Currently, SHA-2 and SHA-3 are considered secure message-digest algorithms.
Secondly, LinkedIn’s security was compromised by the absence of salting for the hashed passwords. When the same password is processed using the same message-digest algorithm, it consistently generates the same hash value, which makes the hashes predictable and susceptible to brute-force and precomputed-table attacks. Salting involves combining a unique random string with the password before hashing it, significantly improving its security. Additionally, databases of leaked passwords should be continually monitored in search of credentials in use at the organization.
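The effect of salting can be sketched with the Python standard library. This is a minimal illustration, not a description of any company’s implementation: it uses PBKDF2 with HMAC-SHA-256, and the iteration count, salt length, and function names are assumed values chosen for the example.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 200_000  # illustrative work factor; tune to the hardware in use


def unsalted_sha1(password: str) -> str:
    """The weak pattern: equal passwords always yield equal hashes."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def hash_password(password: str,
                  salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); a fresh random salt is drawn per password."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

Two users who choose the same password obtain identical values from `unsalted_sha1` but different digests from `hash_password`, so a single precomputed table no longer covers both accounts.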
Section 4.1 briefly introduced the Heartland Payment System breach, which relied on SQL injection. For application-level vulnerabilities, such as the one exploited at HPY, the Open Worldwide Application Security Project (OWASP) is a well-known reference. They regularly publish the Top 10 vulnerabilities in the application security scope, along with their mitigation strategies, such as input validation, Web Application Firewall (WAF), and software testing.
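As a minimal sketch of the input-handling mitigation, the example below contrasts a query built by string concatenation with a parameterized query, using an in-memory SQLite database. The table, column, and function names are hypothetical and unrelated to the systems involved in the breach.

```python
import sqlite3


def demo_connection() -> sqlite3.Connection:
    """Build a throwaway in-memory database with two hypothetical users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)]
    )
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    """Vulnerable: user input is pasted into the SQL text."""
    query = f"SELECT id, name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    """Parameterized: the driver treats the input strictly as a value."""
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()
```

With the input `x' OR '1'='1`, the unsafe variant returns every row in the table, while the parameterized variant returns none because no user has that literal name.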
Although not in the studied dataset, attackers successfully intruded on cloud providers in the 2020 SolarWinds hack case, exposing and breaching their customers’ data [
70]. In this incident, adversaries inserted arbitrary code in the source code of a company product called Orion. Afterward, SolarWinds distributed the malicious code to its customers as part of the product, infecting over 18,000, including government entities and private companies [
71]. This example reinforces the importance of Supply Chain Management (SCM) in cybersecurity.
In large and technologically complex companies, keeping the systems up to date may be challenging. As a consequence, attackers may exploit known vulnerabilities in the systems. Thus, it is fundamental to establish a patch-management program to update and secure the organization’s assets in a timely manner.
For zero-day attacks, which exploit previously unknown vulnerabilities, a security patch has not yet been published by the product developer, and signature detection is ineffective. As an alternative, ML methods can detect such intrusions by identifying suspicious activities that differ from the expected baseline, which may improve the detection of novel attacks.
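The idea of flagging deviations from an expected baseline can be illustrated with a deliberately simple statistical stand-in. The sketch below is not an ML model: it is a z-score check over a single hypothetical metric (for example, outbound megabytes per hour), and the threshold of three standard deviations is an assumed value.

```python
import statistics
from typing import Sequence, Tuple


def fit_baseline(samples: Sequence[float]) -> Tuple[float, float]:
    """Summarize normal behaviour as (mean, sample standard deviation)."""
    return statistics.mean(samples), statistics.stdev(samples)


def is_anomalous(value: float, baseline: Tuple[float, float],
                 threshold: float = 3.0) -> bool:
    """Flag values further than `threshold` standard deviations from the mean."""
    mean, std = baseline
    if std == 0:
        return value != mean
    return abs(value - mean) / std > threshold
```

Production anomaly detectors typically combine many features and learned models, but the principle is the same: activity is compared against what has been observed as normal rather than against known signatures.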
Data Loss Prevention (DLP) solutions also contribute to data security and avoidance of data breaches. Such technologies detect and deter unauthorized data transfers, including preventing PII data breaches. However, it is important to note that DLP is ineffective in detecting data exfiltration through steganographic techniques.
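One content-inspection rule that a DLP tool might apply can be sketched as follows: candidate payment card numbers are located in outgoing text with a regular expression and then validated with the Luhn checksum to reduce false positives. The pattern, length limits, and function names are assumptions for illustration and do not reproduce any specific product.

```python
import re
from typing import List

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number: str) -> bool:
    """Luhn checksum: double every second digit counted from the right."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> List[str]:
    """Return digit strings in `text` that look like valid card numbers."""
    hits = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits
```

A rule of this kind inspects only visible content, which is consistent with the limitation noted above: data hidden through steganography would not match the pattern.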
When statistical metrics about a dataset must be published while preserving the privacy of the individuals within it, differential privacy can be an appropriate approach. Differential privacy, introduced by Dwork [
72], provides a framework for releasing aggregate information about a dataset while adding noise or perturbation to the data so that individual records remain private and indistinguishable. This ensures that sensitive information is protected and that statistical knowledge can be derived without compromising the privacy of the data subjects, whilst maintaining the utility of the data [
73].
Because of this balance between utility and privacy enhancement, differential privacy has applicability in several areas, and is used by the US Census and by big companies such as Google, Apple, and Microsoft [
74].
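The noise-adding idea can be sketched with the Laplace mechanism applied to a counting query. This is a minimal illustration under stated assumptions: a counting query changes by at most one when a single record is added or removed, so the noise scale is 1/epsilon; the function names and the record layout are hypothetical, and Python’s general-purpose random generator is used only for demonstration, not for production-grade privacy.

```python
import random
from typing import Callable, Iterable, Optional


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records: Iterable, predicate: Callable[[object], bool],
                  epsilon: float, rng: Optional[random.Random] = None) -> float:
    """Release a noisy count; smaller epsilon means more noise and more privacy."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    # Sensitivity of a counting query is 1, hence scale = 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)
```

Each released value differs from the true count, yet the average over many hypothetical releases stays close to it, which reflects the balance between utility and privacy described above.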
5.2.3. INSD
An insider is anyone with authorized access to or knowledge of an organization’s resources. The company trusts this person, who knows the company’s fundamentals and has access to its assets. Because of that, a malicious insider can potentially cause great damage to the company without being noticed. The average time taken for a company to detect an insider’s malicious actions is 85 days [
75].
An insider may be classified as unintentional or intentional. As unintentional insider threats are, within the scope of this work, more closely related to DISC, PHYS, PORT, and STAT, in this section we discuss mainly intentional malicious insiders.
The main motivations for conducting an insider attack are financial benefits or espionage [
76]. Insider incidents are carried out mainly through privilege abuse. Because of that, the least-privilege policy and Privileged Access Management (PAM) technology are helpful tools for preventing insider leakage.
The leading adopted technologies used for mitigating insider threats are Data Loss Prevention (DLP), Privileged Access Management, User and Entity Behavior Analytics (UEBA), Security Information and Event Management (SIEM), Endpoint Detection and Response (EDR), and Insider Threat Management (ITM) [
75].
Administrative security controls may also be implemented. A background check and an employee screening upon hiring may reveal a problematic history for the candidate, enabling the company to cancel the hiring process. If a person passes this investigation, requiring them to sign a Non-Disclosure Agreement (NDA) is an additional countermeasure, as it establishes their legal liability.
After an employee is hired, other security measures should still be adopted. One is assessing the need to know for each employee, enforced through an access control mechanism. Granting more access to knowledge and data than the employee needs to perform their usual tasks exposes the information unnecessarily. A similar control is based on the principle of the least privilege, which grants a worker the minimum necessary privileges.
Separation of duties is another way of mitigating intentional insider threats. It divides critical tasks among several employees, for example, according to their department in the organization. A job rotation policy, although sometimes infeasible, may also help reveal fraud, sabotage, or espionage. The termination of an employment contract is another critical moment for preventing data leakage. The company must ensure that the user’s accounts are disabled, preferably during the exit interview, in which any equipment belonging to the organization should be returned, and after which the ex-employee should be escorted out of the facility.
5.2.4. PHYS, PORT, STAT: Losses
The mobility of a device makes its loss more likely, as an employee may take it anywhere and then be robbed of, or mislay, the portable equipment or document, which may contain sensitive company data. In that regard, Bring Your Own Device (BYOD) has increased the potential for such occurrences. It refers to using employee-owned mobile devices to access business enterprise content or networks. Similar portability concepts are Choose Your Own Device (CYOD), Company-Owned and Personally Enabled (COPE), and Company-Owned Business Only (COBO), and they all raise security concerns.
Wani et al. [
77] list some challenges these mobile devices bring to hospitals, which may also apply to companies in general. They categorize these challenges as related to technology, human factors, and policies. They also provide possible solutions to these challenges.
Table 9 presents these challenges and solutions.
In addition to BYOD, teleworking and co-working spaces may pose a security threat to companies. These work models, which have emerged since the COVID-19 pandemic, also imply new security gaps similar to the BYOD-related ones.
Geofencing is another suitable security control in this scenario, which refers to triggered actions in response to a device leaving a pre-defined geolocation. Such actions could be, for instance, disabling its network interface card or remotely wiping the device to prevent data leakage upon exiting an authorized area.
On this subject, Uz [
78] evaluated the effectiveness of remotely wiping data, considering the deleted data may be forensically retrieved, as explained in
Section 5.2.5.
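The geofencing control described above can be sketched as a distance check against a circular authorized area. The coordinates, the radius, and the action labels are hypothetical, and a real deployment would rely on a mobile device management platform rather than on this function.

```python
import math
from typing import Tuple

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def geofence_action(device: Tuple[float, float],
                    center: Tuple[float, float],
                    radius_m: float) -> str:
    """Return the action to trigger for the reported device position."""
    distance = haversine_m(device[0], device[1], center[0], center[1])
    return "allow" if distance <= radius_m else "disable_network_and_wipe"
```

Because wiped data may still be forensically recoverable, as noted above, a trigger of this kind is best combined with device encryption.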
Encryption is also recommended for data protection. For mobile devices, File-Based Encryption (FBE) has been mandatory in Android since version 10 [
79]. For notebooks, BitLocker and VeraCrypt are some available options.
One appropriate access control method for BYOD is Attribute-Based Access Control (ABAC). Under this paradigm, the company can grant or deny access to an identity based on attributes of the request, such as location, hour of the day, and object being accessed.
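A minimal sketch of an ABAC decision follows, using the three attributes mentioned above. The attribute values, the business-hours window, and the policy itself are assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class AccessRequest:
    location: str          # e.g. "office" or "remote" (hypothetical values)
    local_time: time       # hour of the day of the request
    classification: str    # sensitivity of the object being accessed


BUSINESS_START = time(8, 0)
BUSINESS_END = time(18, 0)


def is_allowed(request: AccessRequest) -> bool:
    """Evaluate a small attribute-based policy; deny by default."""
    in_hours = BUSINESS_START <= request.local_time <= BUSINESS_END
    on_site = request.location == "office"
    if request.classification == "public":
        return True
    if request.classification == "internal":
        return on_site or in_hours
    if request.classification == "confidential":
        return on_site and in_hours
    return False
```

The decision depends on the context of each request rather than on a static role, which suits devices that move between trusted and untrusted environments.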
5.2.5. PHYS, PORT, STAT: Disposals
When disposing of sensitive data, one must be aware of the possibility of an adversary searching the dustbin, which is known as dumpster diving, a social engineering attack.
With this method, the attacker may access any object the company discards, such as equipment and documents. There may be sensitive data among this disposed-of material, such as employees’ noted passwords or customers’ data. In that case, the malicious actor will have more information to conduct the attack.
It is noteworthy that since the California v. Greenwood case in 1988, the legality of the warrantless search-and-seizure of garbage left in public areas has been established [
80]. Because of that, for one more layer of security against data breaches, companies should keep their waste bins locked in private areas.
As an additional countermeasure against this approach, a company should, at the end of the data life cycle, carry out an adequate disposal of information. To accomplish that, Data Classification and Asset Disposal Policies should be implemented and publicized to raise employees’ awareness. To aid the drafting of these and other policies, several esteemed security organizations provide policy templates, such as the SANS Institute (
sans.org/information-security-policy/ accessed on 7 April 2024) and CIS (
cisecurity.org/ accessed on 7 April 2024).
Before disposal, the media must be sanitized—that is, have its data rendered inaccessible for a given level of the attacker effort, depending on the classification of the data. Proper media-sanitization techniques are presented by NIST SP 800-88 [
81] for different media types. For paper documents, for example, the standard states that they must be shredded in pieces small enough that there is reasonable assurance that the data cannot be reconstructed in proportion to the data confidentiality. To further hinder a malicious reconstruction, sensitive documents may be mixed with public paper in the shredder input. Regarding the size of the shredded pieces for each classification level, the German standard DIN 66399 [
82] provides some valuable guidelines, some of which are summarized in
Table 10.
Similar disposal approaches should be applied to digital devices, such as Hard Drives (HD), Solid-State Drives (SSD), flash drives, and CDs/DVDs. Although physical destruction and shredding remain possible for these types of media, and are indeed recommended for more sensitive cases, the nature of these devices allows for other erasure mechanisms, especially for less sensitive data.
It is known that simply deleting files via the operating system is not an effective way to purge data, as data carving techniques can retrieve said files [
83]. Other techniques, such as zero filling, in which all data are overwritten with zeroes, are effective against commonly available data retrieval mechanisms, according to NIST SP 800-88. Additional filling rounds may be performed to increase security.
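The zero-filling idea can be sketched at the file level as follows. This is an illustration only: media sanitization as discussed here targets the whole storage device, and on SSDs or on journaling and copy-on-write file systems an in-place overwrite may not reach the blocks that originally held the data. The function names and the temporary file are hypothetical.

```python
import os
import tempfile
from typing import Tuple


def overwrite_with_zeros(path: str, passes: int = 1,
                         chunk_size: int = 1024 * 1024) -> None:
    """Overwrite the file's current contents in place with zero bytes."""
    size = os.path.getsize(path)
    with open(path, "r+b") as fh:
        for _ in range(passes):
            fh.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                fh.write(b"\x00" * n)
                remaining -= n
            fh.flush()
            os.fsync(fh.fileno())  # ask the OS to push the zeros to the device


def demo() -> Tuple[int, bytes]:
    """Write a sample file, zero-fill it, and return (original size, new bytes)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "customers.csv")
        with open(path, "wb") as fh:
            fh.write(b"name,card\nalice,4111111111111111\n")
        original_size = os.path.getsize(path)
        overwrite_with_zeros(path, passes=2)
        with open(path, "rb") as fh:
            return original_size, fh.read()
```

After the overwrite, the file keeps its length but contains only zero bytes; deleting it afterwards removes the directory entry as usual.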
Specifically for magnetic HDs, the degaussing technique may be used. It consists of applying a magnetic field to the hard drive, which changes the magnetic patterns on the device, consequently destroying the data.
Degaussing is not effective for SSDs, which do not store data magnetically. For this type of media, a secure way of dealing with data remanence is crypto-shredding, also named crypto-erasure. In this procedure, the data stored in the device are encrypted with a secure algorithm, and then the decryption key is discarded, rendering the data unrecoverable.
In addition to disposals, these sanitization techniques should also be applied when donating or selling the devices if the sensitivity of the data allows the transfer of the property of the media.
5.2.6. PHYS, PORT, STAT: Thefts and Inappropriate Accesses
This section primarily discusses physical security aspects that may be implemented at a company facility to prevent a data breach. We understand that the mitigation approaches related to thefts of the company assets in possession of an employee outside of the company’s premises are embraced in
Section 5.2.4.
The physical-security design in a company starts in the architectural-arrangement stage of the facility construction. Crime Prevention Through Environmental Design (CPTED) strategies may be employed during this phase. Through this approach, criminals are deterred and more easily detected by the physical layout of the space.
The main CPTED principles are natural surveillance, access control, territorial reinforcement, and maintenance [
84]. As an example, it is stated that fences should be at least 3 ft (about 1 m) high to deter casual trespassers and at least 8 ft (approximately 2.5 m) high to deter purposeful infiltrators [
85].
Other physical controls should be implemented to prevent incidents, especially in more sensitive areas, such as data centers. Examples include the use of a keypad (preventive), guards (deterrent), security cameras (detective), and alarms (corrective).
Nonetheless, all these security measures will be rendered useless if the human factor is successfully exploited. An adversary may, for example, covertly sneak through a door opened by authorized personnel, a practice known as tailgating, or they may convince someone to let them enter, for example, by saying that they forgot their badge and are in a hurry. The latter is a social engineering tactic known as piggybacking. An effective countermeasure to these intrusions is the use of a mantrap.
However, intruders do not always perpetrate physical incidents. Authorized guests, for example, may perform unauthorized activities, and, in that case, additional physical countermeasures must be put in place.
One possible gap is the direct observation of devices’ screens and keyboards. In such cases, an adversary may obtain sensitive data such as passwords through shoulder surfing. To hinder this activity, it may be necessary to relocate the devices.
Another security measure regarding the employee’s workstation is the implementation of a Clean-Desk Policy, which requires that all desks within the company be kept clear of objects and documents. With this policy successfully in place, an intruder has far less opportunity to steal a sensitive document from a worker’s desk.
NIST SP 800-12 [
86] Chapter 15 reviews other physical security practices. Physical safety, which aims to protect people’s physical integrity, life, and health, is another relevant topic in this discussion. However, as these incidents do not usually result in data breaches, which are the focus of this work, we do not include them in the discussion.
5.2.7. DISC
Unintended disclosure may be classified as the result of an unintentional insider threat, either due to negligence or recklessness.
An effective Security Education, Training, and Awareness program may be capable of reducing the incidence of these cases and promoting compliance in an organization.
Education aims to equip IT personnel with security skills through methods like cyberattack simulations, targeting a high level of expertise. Training focuses on enhancing security knowledge among all employees through classes, for example. Awareness efforts aim to capture the attention of all employees regarding security concerns through mediums like banners, addressing a basic level of security understanding [
87].
It may be observed that security awareness programs are helpful in mitigating risks associated with the everyday use of technological resources by general users, such as credential compromise and social engineering attacks, like phishing. Conversely, security training and education focus on the prevention of cyber incidents rooted in technical vulnerabilities, such as weaknesses originating from misconfiguration, and they should be directed to IT personnel.
Alyami et al. [
88] assessed the critical factors for deploying a successful SETA program, based on a survey with 65 respondents. They produced a ranked list of essential factors of success. Gamification is also seen as a reasonable way of enhancing engagement in the program.
According to PCR (
Table 2), the DISC type categorization includes information posted publicly or sent to the wrong party. In addition to SETA, a two-person control may also reduce the likelihood of these disclosures. With this approach, two people must authorize an action before its execution.
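A two-person control can be sketched as a guard that refuses to run an action until two distinct people have authorized it. The class name, the user identifiers, and the example action are hypothetical.

```python
from typing import Callable, Set, TypeVar

T = TypeVar("T")


class TwoPersonAction:
    """Wrap a sensitive action so that it needs two distinct authorizers."""

    REQUIRED_APPROVALS = 2

    def __init__(self, description: str) -> None:
        self.description = description
        self._approvers: Set[str] = set()

    def approve(self, user_id: str) -> None:
        # A set ignores repeated approvals from the same person.
        self._approvers.add(user_id)

    def execute(self, action: Callable[[], T]) -> T:
        if len(self._approvers) < self.REQUIRED_APPROVALS:
            raise PermissionError(
                f"'{self.description}' needs {self.REQUIRED_APPROVALS} approvers"
            )
        return action()
```

For instance, publishing a document or sending a bulk e-mail with customer data could be wrapped this way, so that a second person reviews the recipients and the content before release.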
5.3. Containment, Recovery, and Response
The affected company must define a response strategy for when an attacker has circumvented the security controls and a data breach has occurred. From the technical point of view, the company must quickly contain the data leakage to minimize the potential damage and then identify and eradicate the components of the incident. NIST SP 800-61 [
89] provides a more in-depth guide for computer security incident handling.
This NIST publication divides the incident response process into four phases: Preparation, which occurs before an incident and corresponds to preventive security measures; Detection and Analysis, in which the attack vector and the tactics, techniques, and procedures (TTPs) are identified; Containment, Eradication, and Recovery, which comprises an initial restriction of the malicious activity, a subsequent cleansing of malicious artifacts (though keeping them for forensic analysis), and the restoration of the systems’ operation; and Post-Incident Activity, which includes discussing and documenting the incident to understand it better and prevent similar future intrusions.
For a better comprehension of the causes of the incident and of eventual system modifications made by the intruder, forensic tools and techniques may be helpful. NIST SP 800-86 [
90] provides guidelines for integrating forensic techniques into incident response, including data collection, examination, and reporting from different sources, such as files, operating systems, networks, and applications. When performing digital forensics, it is important to maintain a chain of custody and preserve the integrity of the evidence, without altering or removing it.
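One common way to support evidence integrity is to record a cryptographic hash of each item when it is collected and to recompute it at every later handling step. The sketch below assumes SHA-256 and a simple dictionary as the custody record; the field names, the handler identifier, and the sample file are hypothetical.

```python
import hashlib
import os
import tempfile
from datetime import datetime, timezone
from typing import Dict, Tuple


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so that large evidence images fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def custody_entry(path: str, handler: str, action: str) -> Dict[str, str]:
    """Record who did what to which evidence item, with its current hash."""
    return {
        "evidence": path,
        "sha256": sha256_of_file(path),
        "handler": handler,
        "action": action,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


def integrity_preserved(path: str, entry: Dict[str, str]) -> bool:
    """True if the evidence still matches the hash recorded in the entry."""
    return sha256_of_file(path) == entry["sha256"]


def demo() -> Tuple[str, bool, bool]:
    """Collect a sample file, then show that a later change is detected."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "evidence.bin")
        with open(path, "wb") as fh:
            fh.write(b"abc")
        entry = custody_entry(path, "analyst-1", "collected")
        intact = integrity_preserved(path, entry)
        with open(path, "ab") as fh:
            fh.write(b"!")
        return entry["sha256"], intact, not integrity_preserved(path, entry)
```

In practice, analysis is performed on a verified copy, so the original item and its recorded hash remain untouched.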
Specifically, in the data leakage domain, ref. [
91] proposed a data breach response methodology based on ISO 27035 [
92] and NIST 800-61 [
89]. Their study emphasized the importance of automating this process, especially due to the short time required to notify a breach for legislative compliance.
Hillmann et al. [
93] conducted 12 interviews with customers about their expectations of a data breach response. They concluded that expectations vary according to several factors, such as breach severity, data leakage type, and company sector. Hence, a company must adapt its response strategy to the specific scenario to maximize the chance of meeting its customers’ expectations.
For notification, especially for complying with the legislation mentioned in
Section 5.1, the FTC mentions the importance of notifying law enforcement and affected businesses and individuals, specifying what happened, what information was stolen, how the attackers used the information, what remediation measures were taken, and how customers may contact the organization regarding the breach. They also provide a model letter for data breach notification.