Article

Towards a Hybrid Security Framework for Phishing Awareness Education and Defense

by Peter K. K. Loh *, Aloysius Z. Y. Lee and Vivek Balachandran
Singapore Institute of Technology, 172 Ang Mo Kio Ave 8, Singapore 567739, Singapore
*
Author to whom correspondence should be addressed.
Future Internet 2024, 16(3), 86; https://doi.org/10.3390/fi16030086
Submission received: 9 February 2024 / Revised: 27 February 2024 / Accepted: 28 February 2024 / Published: 1 March 2024
(This article belongs to the Special Issue Information and Future Internet Security, Trust and Privacy II)

Abstract

The rise of generative Artificial Intelligence (AI) has led to more sophisticated phishing email attacks, as well as an increase in research on using AI to aid the detection of these advanced attacks. Successful phishing email attacks severely impact businesses, as employees are usually the vulnerable targets. Defense against such attacks therefore requires action along both technological and human vectors. Research on security hardening along the technological vector is sparse and focuses mainly on the use of machine learning and natural language processing to distinguish between machine- and human-generated text. Common existing approaches to harden security along the human vector consist of third-party organized training programmes, the content of which needs to be updated over time. There is, to date, no reported approach that provides both phishing attack detection and progressive end-user training. In this paper, we present the design and development of an integrated approach that employs AI-assisted and generative AI platforms for phishing attack detection and continuous end-user education in a hybrid security framework. This framework supports scenario-customizable and evolving user education in dealing with increasingly advanced phishing email attacks. The technological design and functional details for both platforms are presented and discussed. Performance tests showed that the phishing attack detection sub-system using the Convolutional Neural Network (CNN) deep learning model architecture achieved the best overall results: above 94% accuracy, above 95% precision, and above 94% recall.

1. Introduction

The importance of phishing awareness to businesses and individuals is accentuated by the danger posed by increasingly deceptive Artificial Intelligence (AI)-powered phishing emails. Phishing emails may use fake social media profiles, in-depth research, and more to trick unsuspecting victims into clicking malicious links [1,2,3,4]. End users now find it challenging to identify and steer clear of such sophisticated phishing emails. Phishing attacks also form the foundation for a majority of advanced and potent malware attacks. Globally, 323,972 internet users fell victim to phishing attacks in 2021. This is despite Google’s cyber security measures blocking 99.9% of phishing attempts from reaching users. With an average of 136 USD lost per phishing attack, this amounts to 44.2 million USD stolen by cyber criminals through phishing attacks, mostly through emails [2].
As AI’s popularity grows and its usability expands, it is becoming more embedded in the threat actor’s arsenal [5,6]. Generative AI, in particular, will likely make life more difficult for cybersecurity practitioners and end users alike. However, AI can also be used to bolster defenses. It is uniquely suited to detect AI-powered phishing attempts [3,7]. For this reason, security leaders should consider deploying AI support for email security purposes. That said, operational expenses must also be kept in mind. While using an AI model to monitor all incoming messages could go a long way towards preventing AI phishing attacks, the cost may still be prohibitively high.
While investing in robust security solutions is important for businesses, it is equally important to recognize that employees play a pivotal role as the first line of defense against email phishing attacks. Training via third-party courses or workshops is inadequate in preventing human error or reducing it on a sustainable basis [4,8]. Without proper ongoing learning in place, employees may face risks of becoming victims if training does not sufficiently equip them to recognize diversity in phishing threats or to detect threats of increasing sophistication. In addition, theory and practical experience of email phishing attacks and how to deal with them are very helpful. Numerous sectors, including cybersecurity, internet service providers, web security firms, and online businesses that depend on the security of their customers’ data can gain from this education.
Generative AI models can also make security awareness training more customizable, efficient, and effective. For instance, an AI chatbot could automatically adapt a training curriculum on a user-by-user basis to address an individual’s weak spots, based on historical or real-time performance data. Additionally, the technology can identify lapses in attack-specific awareness over time and deliver refresher-training content accordingly. Phishing campaigns have been employed by organizations not only to train but also to test the awareness of their employees against various phishing attacks [4,8]. Crafted test emails embedded with links, like those used in actual attacks, are sent to employees. Organizations can then identify at-risk employees who click on these links and reveal themselves to be vulnerable to actual attacks. These employees can then be selected for re-education to refresh their awareness and/or enrich their experience.
That said, there is no guarantee that a well-trained workforce will always be safe against phishing attacks. Therefore, to comprehensively protect an organization’s digital well-being, the need arises for a hybrid security framework that accommodates suitable AI-assisted phishing detection technology to reinforce a customizable, evolving security awareness education. An AI-assisted phishing detection technology can be progressively trained to detect phishing emails of increasing sophistication. Concurrently, a customizable, evolving phishing awareness training and testing solution can be tailored to different business models and effectively equip users with the experience and knowledge to identify and deal with harmful phishing attacks.
Our paper presents a hybrid security framework prototype that supports a customizable, evolving phishing awareness training and testing solution reinforced by an AI-assisted phishing detection platform. The prototype is currently used by a government agency and can be employed by other organizations to train and test their employees. The prototype solution offers the following benefits:
  • Adaptable to different business models and operations;
  • Assesses an organization’s security posture and identifies at-risk employees;
  • Able to generate test cases to scale with increasing phishing email deceptiveness;
  • Offers better privacy for an organization’s security posture and vulnerable employees;
  • Reinforces an organization’s security posture with AI-based phishing URL detection.
The remainder of this paper is organized as follows: Section 2 reviews related AI research work in the phishing domain; Section 3 introduces the 3rd generation neural network machine learning model that can be used to support realistic human-like text production; Section 4 provides a background on the practice of designing and refining prompts that are used to elicit responses from AI models; Section 5 discusses the accuracy of the large language model used in the generative AI component in our framework; Section 6 presents the phishing education solution prototype architecture; Section 7 and Section 8 introduce and detail the functionality of the phishing campaign formulation portal of the phishing education solution prototype; Section 9, Section 10 and Section 11 present the training dataset collection and data pre-processing considerations and deep learning model architecture specifications for the AI-assisted phishing detection platform; Section 12 compares the performances of the different deep learning architectures considered for the AI-based phishing URL detection platform; and Section 13 concludes the paper with a view for future research.

2. Related or Existing Work

In advanced phishing, AI can be, and already is being, used by attackers to automate or semi-automate their attacks and significantly reduce the effort needed to carry them out. By spending less time developing successful techniques, attackers can scale up the effectiveness and volume of their attacks and so increase their return. Phishing attacks enabled by AI can mimic trusted user behavior, make calls for the attacker with voice impersonation, and so on [5,6].
A range of research works on AI-based phishing exists. Few of these focus on using AI to detect phishing attacks [3,7,9] while others detail how sophisticated phishing attacks can be developed with AI assistance [10,11,12]. We review and critique both categories.
Ref. [9] provides a critical survey and review of research works from the natural language processing (NLP) and machine learning (ML) communities on building accurate detectors for English. An in-depth error analysis of the state-of-the-art detector was also conducted, and research directions were proposed to guide future work in the detection of disinformation.
Ref. [7] presents an AI-as-a-service (AIaaS)-based defense framework to detect phishing attacks. OpenAI’s GPT-3 API was used to accurately distinguish between machine- and human-generated text. The results demonstrate that there is potential to design a credible defense framework against advanced AI text generators without requiring significant AI expertise or resources. However, the results will need to be validated by a much larger-scale study encompassing multiple contexts and models.
Ref. [3] provides a literature review of AI techniques: Machine Learning, Deep Learning, Hybrid Learning, and Scenario-based techniques for phishing attack detection. The paper also provides a comparison of different studies detecting phishing attacks for each AI technique and examines the benefits and limitations of these methodologies.
Ref. [10] presents a long short-term memory (LSTM) neural network that learns to socially engineer specific users into clicking on deceptive URLs. The model is trained with word vector representations of social media posts. It is also dynamically seeded with topics extracted from the target’s timeline in order to make a click-through more likely. Users are also triaged to determine which ones are more likely to be phished, and vulnerable users are sent highly personalized messages. Research has since progressed with the use of tools like ChatGPT, which obviates the need to first segregate vulnerable targets before launching the phishing attack.
Ref. [11] explores the possibility of using ChatGPT to develop advanced phishing attacks and automate their large-scale deployment. Their threat model involves a sophisticated and fully automated phishing kit. A user (attacker) needs to have a programming foundation, e.g., in Python, and knowledge of OpenAI Codex models and GitHub Copilot. The Codex model integrates with Copilot, which, in turn, is integrated with Integrated Development Environments (IDEs). Attack test cases center on the cloning of websites to create phishing duplicates.
Ref. [12] explores the different applications of generative AI in social engineering attacks. Using the blog mining technique, some insights into the evolving threat landscape are proposed. The rise in generative AI models, like ChatGPT, FraudGPT, and WormGPT, has augmented existing threats and ushered in new dimensions of risk. These range from phishing campaigns that mimic trusted organizations to deepfake technology impersonating authoritative figures.
Ref. [4] provides an overview of how the application of artificial intelligence to phishing significantly impacts the healthcare industry and concludes with efforts that should be made to reduce the success rate of all phishing attacks, including those that have been augmented by the use of AI. The paper emphasizes that awareness training for end users is imperative. It advises that users should be trained to detect phishing e-mails and to manage all e-mails with vigilance. It also proffers that users receive periodic refresher training to maintain and update their capability.
From the above, it can be concluded that an optimal detector for advanced phishing is still in the research and developmental stage and the overall spread of AI-based phishing attacks cannot yet be countered by technical means alone. In the end, the final line of defense would still be the end user. To address this, our research examines the use of generative AI as a basis to develop a customizable and scalable solution prototype to educate and enhance user capability in dealing with increasingly sophisticated email-based phishing attacks. In comparison with privately conducted courses and workshops, our prototype solution offers the following benefits:
  • Adaptable to different business models and operations;
  • Assesses an organization’s security posture and identifies at-risk employees;
  • Able to generate test cases to scale with increasing phishing email deceptiveness;
  • Offers better privacy for an organization’s security posture and vulnerable employees;
  • Reinforces an organization’s security posture with AI-based phishing URL detection.

3. GPT-3

GPT-3 is a third-generation autoregressive language model that uses deep learning to produce human-like text. It is an AI system made to produce sequences of words, lines of code, or other types of data, starting with a source input known as a prompt. For instance, it can be applied to statistically predict word sequences in machine translation. An unlabeled dataset of texts, including those from Wikipedia and many other websites, mostly in English but also in other languages, was used to train the model [13].
The availability of tools like GPT-3 heralds the beginning of a new era in which high-quality, low-cost texts can be generated in large quantities. Translations, summaries, blogs, websites, and many other forms of work will be greatly disrupted. It will take getting used to for readers and text consumers to not be able to tell if a source is written by a human or a bot. With fewer errors and better language, future readers could even notice an improvement in writing [13].
When emails generated by GPT-3 are created in volume with automation tools and sent with a specific purpose, the risk that they will be mistaken for genuine emails increases considerably. GPT-3 produced emails that looked quite convincing, particularly when a targeted email dataset was used for training and keywords were included in the prompts. Even with safeguards set up for GPT-3, phishing emails could still be produced [14].
Of concern would be that a study has found that GPT-3 can produce writings that are more extremist in nature, in comparison to the older GPT-2. GPT-3 was shown to excel in producing text that faithfully imitates radical materials that may be used to radicalize people towards violent far-right extremist views and practices [15].

4. Prompt Engineering

Prompt engineering is a relatively recent field that focuses on creating and improving prompts so that Large Language Models (LLMs) can be employed effectively across a broad range of AI-based applications and tools. GPT-3 relies heavily on the proper usage of prompts as inputs to generate the intended output for the user. The quality of the prompts used is one of the crucial elements that define its success. GPT-3 can stay on topic and cover the subjects the user is interested in if the prompts are clear and concise. On the other hand, poorly specified prompts may result in rambling or unfocused results, which will make the output less meaningful and useful [16].
A clear and straightforward prompt will make it more likely that GPT-3 will comprehend the subject or job at hand and be able to produce a suitable response. Trying to be as explicit as possible with prompts and avoiding using terminology that is unnecessarily complicated or confusing is necessary. To assist and steer the output, a well-defined prompt should have a distinct aim and emphasis. Prompts that are too general or open-ended should be avoided [16].
The Awesome ChatGPT Prompts repository on GitHub is a great resource for prompts that can steer GPT-3 to act as a certain character or role [17]. The following screenshots, Figure 1, Figure 2 and Figure 3, contain examples of prompts that can steer the direction of GPT-3’s character or role. The framework is adaptable to more sophisticated email phishing threats via appropriately defined prompts. Appropriately designed prompt content can guide the GPT-3 engine to act as a specified character or role with user-specified actions or behaviors.

5. Accuracy of GPT-3

GPT-3 has rapidly gained popularity among public users, not just those in the AI and NLP communities who may be more acquainted with LLMs. One of the primary factors is the abundance of GPT-3 use cases that are published online by both academic and non-academic users, in addition to social media. It is therefore important to analyze the accuracy of the system itself.
A study found that on several tasks, GPT-3 scores better than several cutting-edge LLMs, and on a few tasks, it even beats fine-tuned models. Although GPT-3 works well in many of the tasks, some instances of failure do occur for each job. Sometimes GPT-3 produces a summary for a summarization job that is even lengthier than the original text. When performing machine translation, GPT-3 occasionally renders an erroneous translation for some phrases, subtly changing the meaning. Consequently, handling these special situations is a difficult but crucial job [18].
In terms of multilingualism, GPT-3 performs admirably in a few high and medium-resource languages. However, GPT-3 is still unable to comprehend and produce sentences in low-resource languages. Also, despite the high resource demand for non-Latin script languages, GPT-3 is unable to translate phrases in such languages. This brings up the issue of how languages are represented in GPT-3 (Bang et al., 2023). The accessibility of the technology is thus constrained by the difference in performance when dealing with low-resource languages [19].
When asked questions on certain topics, GPT-3 gave confident replies that may seem absurd to experts on the subject. “Artificial hallucination” has been used to characterize this occurrence [20]. While GPT-3 may produce convincing scientific articles, the data it produces are a mixture of real data and made-up data. This raises questions regarding the integrity and correctness of utilizing extensive LLMs, like GPT-3, in academic writing. To uphold strict scientific standards, it is suggested that the procedure for assessing papers for journals and conferences be reconsidered. Moreover, full disclosure of any usage of these technologies in the writing process is recommended [21].

6. Education System Architecture—ph1sher

In this section, we will explore in detail the functionalities of our new phishing detection and training sub-system—ph1sher. The system essentially contains three major functionalities:
  • User awareness: A comprehensive collection of a wide range of available resources for educating and creating awareness about phishing among users.
  • Phishing campaign: A generative-AI-based system that can dynamically create phishing emails based on the specified scenario.
  • URL detection: To identify malicious phishing URLs based on an AI detection model.
The backbone of the ph1sher tool is Python Flask, which communicates with the GPT API and an Amazon Web Services (AWS) server through our web portal or Chrome extension, as shown in Figure 4.
A user can access the system through our web portal to access one of the three services that we provide—User awareness, Phishing campaign, or Phishing URL detection. The User awareness option will redirect the user to a categorized section of resources related to phishing attacks and countermeasures. These educational resources can be used for general awareness about phishing.
The major highlight of our tool is the use of generative AI to launch customized phishing campaigns that cater to various scenarios. The user can choose the desired campaign from the web portal. Our Flask backend will handle this HTTP request and forward it to the external GPT API to generate context-aware customized phishing emails according to the targeted victim profile. Various prompts used in our phishing campaign portal are discussed in detail in Section 7.
For URL detection, the Flask backend will make use of the Chrome extension and the backend detection deep learning model to classify the URL. The Chrome extension intercepts the URL visited by the user, and if it is classified as dangerous, then the user is redirected to a captive portal. The design and training of the detection model are discussed extensively in Section 9, Section 10 and Section 11. Based on the user input via the portal, our Flask backend invokes the appropriate functional unit to perform URL detection, Phishing campaign, or User awareness.
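The URL detection flow described above can be sketched in a few lines. This is only an illustration under stated assumptions: `classify` is a stand-in for the backend deep learning model (here replaced by a trivial stub), and the captive portal address is a hypothetical placeholder. Neither is taken from the ph1sher implementation.

```python
from typing import Callable

# Hypothetical captive portal address, used only for illustration.
CAPTIVE_PORTAL_URL = "https://portal.example/blocked"


def handle_visited_url(url: str, classify: Callable[[str], int]) -> str:
    """Return the address the browser should load after classification.

    `classify` stands in for the detection model and is assumed to return
    1 for a malicious URL and 0 for a benign one.
    """
    if classify(url) == 1:
        return CAPTIVE_PORTAL_URL  # dangerous: send the user to the captive portal
    return url                     # benign: let the visit proceed


def stub_classifier(url: str) -> int:
    """Toy stand-in for the trained model; flags URLs containing 'phish'."""
    return int("phish" in url)


blocked = handle_visited_url("http://phish.example/login", stub_classifier)
allowed = handle_visited_url("https://example.org/", stub_classifier)
```

In a real deployment the interception would happen in the Chrome extension and the classification in the Flask backend; the function above only captures the decision step.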

7. Phishing Campaign Portal

A total of 18 prompts that were used to generate phishing emails with the help of the GPT-3 API have been designed and illustrated in Table 1.
The emails aim to invoke in the target user a mix of anxiety, confusion, curiosity, excitement, fear, hope, trust, or urgency, depending on the theme. Table 1 contains the theme of the prompts that are displayed on the phishing campaign page, as well as the prompts that are supplied to the GPT-3 API.
The web interface allows for phishing campaigns to be easily generated with the aid of the GPT-3 API. The main page of the phishing campaign will have all the point-and-click themes of all the attacks laid out in a grid manner, as shown in Figure 5.

8. Phishing Campaign Details

The gpt-3.5-turbo engine is used for the generation of phishing email content. It takes the role of an assistant and is supplied with the prompt relevant to the theme mentioned above. Figure 6 shows the engine code configuration implemented. The temperature and top_p parameters are both set to one to maximize the randomness of the content that is being generated. The max_tokens parameter is set to 2000 to prevent the output from being truncated. The frequency_penalty and presence_penalty parameters have been set to 0.5 to reduce the amount of repetition in the output.
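As Figure 6 is not reproduced here, the configuration described above might be assembled roughly as follows. The parameter names are those of the OpenAI Chat Completions API and the values are the ones stated in the text; the helper name `build_generation_request`, the message role, and the placeholder prompt are assumptions made for illustration.

```python
from typing import Any


def build_generation_request(theme_prompt: str, role: str = "user") -> dict[str, Any]:
    """Assemble Chat Completions keyword arguments with the values given in the text.

    The message role is an assumption; the original code in Figure 6 may
    structure the messages differently.
    """
    return {
        "model": "gpt-3.5-turbo",
        "messages": [{"role": role, "content": theme_prompt}],
        "temperature": 1,          # sampling temperature, as stated in the text
        "top_p": 1,                # nucleus sampling over the full distribution
        "max_tokens": 2000,        # generous limit so the email is not truncated
        "frequency_penalty": 0.5,  # discourage repeating the same tokens
        "presence_penalty": 0.5,   # discourage returning to the same topics
    }


# Placeholder only; the actual theme prompts are those listed in Table 1.
request = build_generation_request("<theme prompt from Table 1>")
```

The resulting dictionary could then be passed as keyword arguments to a chat completion call in the OpenAI client library; that network call is deliberately left out of the sketch.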
The framework invokes the new message API of an organization’s email system and inputs the generated email. After the email content has been successfully generated, it will be displayed to the user as a new email message. The subject header is in bold to let the user know that it must be placed in the subject field of the email. The user can repeatedly generate new phishing email content with the Generate Output button until they are satisfied with the output. The email content will also contain placeholders that the user can replace with their required details. When the email is finalized, it is sent via the email system to the specified recipient, just like a regular email.

9. Dataset Collection

To train the deep learning model to recognize malicious URLs, a dataset of URLs first had to be collected. Three sources of data were identified, namely PhishTank, PhishStats, and OpenPhish. PhishTank contained both benign and malicious URLs, while PhishStats and OpenPhish only contained malicious URLs. These platforms have APIs for the data, but they are paid services. Therefore, customized scripts, e.g., web scrapers, parsers, and downloaders, were used to obtain the data.

9.1. PhishTank

PhishTank displays its phishing feed on its website (see Figure 7). The URLs are categorized as valid or invalid phishes, in which both benign and malicious URLs can be obtained. For valid phishes, PhishTank collates them into various file formats, which include the CSV file format. URLs are then extracted from the downloaded CSV file using a script.
For invalid phishes or benign URLs, PhishTank does not provide a downloadable file. Therefore, they had to be selectively scraped from the phishing feed that is displayed on the website. URLs that were too long were truncated by the website, so the full URLs had to be scraped from the specific sub-directories.

9.2. PhishStats

PhishStats has a public CSV feed with phishing URLs from the past 30 days that is updated every 90 min. The CSV file was downloaded, and the data were parsed in the desired format, as depicted in Figure 8.

9.3. OpenPhish

OpenPhish publishes its limited data feed of malicious URLs to its website, which is updated every 12 h (see Figure 9). The published data list the URLs in a single column, which makes them easy to extract with a scraper.

9.4. Cloud Server

Due to the phishing feed of PhishStats and OpenPhish being available for only a limited amount of time, a solution was required to constantly extract data from them. Therefore, a cronjob was run every 12 h via a cloud server, which ensured that URLs that were published on these platforms could be automatically gathered. Figure 10 shows this.
An AWS Free Tier cloud server was used to complete this task as a single instance of a Linux server could be run 24/7, without incurring additional costs. PuTTY was used to connect to the Linux server remotely and WinSCP to download the datasets.

10. Data Preprocessing

Building a good deep-learning model for malicious URL recognition requires effective data preprocessing. It guarantees that the model can efficiently be trained from the data and generate accurate predictions. Four main steps were taken to preprocess the data, namely data labeling, removing duplicate rows, removing rows with certain keywords, and feature extraction.

10.1. Data Labeling

Data labeling is the basis of supervised learning, a popular technique for developing deep learning models. The model can learn to differentiate between malicious and benign URLs with the help of properly labeled data, which serves as the foundation for accurate predictions. The URLs were labeled zero if they were benign and one if they were malicious. This was performed based on the data source which the URL originates from. The data were later consolidated to form a single CSV file with the “URL” and “malicious” columns.
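A minimal sketch of this labeling and consolidation step is given below, using only the Python standard library. The function names and the sample URLs are illustrative assumptions; only the 0/1 labels and the “URL” and “malicious” column names come from the text.

```python
import csv
import io


def label_urls(benign_urls, malicious_urls):
    """Label URLs by their source: 0 for benign, 1 for malicious."""
    rows = [{"URL": url, "malicious": 0} for url in benign_urls]
    rows += [{"URL": url, "malicious": 1} for url in malicious_urls]
    return rows


def write_consolidated_csv(rows, handle):
    """Write all labeled rows to a single CSV with the two columns from the text."""
    writer = csv.DictWriter(handle, fieldnames=["URL", "malicious"])
    writer.writeheader()
    writer.writerows(rows)


# Made-up sample URLs; a real run would read the collected feeds instead.
rows = label_urls(["https://example.org/home"], ["http://phish.example/login"])
buffer = io.StringIO()  # stands in for an open file on disk
write_consolidated_csv(rows, buffer)
```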

10.2. Removing Duplicate Rows

Duplicate rows may appear because the data are collected from different sources that may contain the same URL. When training the deep learning model to detect malicious URLs, duplicate rows in the dataset can cause biases and inefficiencies. As the model will give the repeated occurrences excessive weight, duplicate rows have the potential to inject bias into the training data and result in an unbalanced representation of patterns. Furthermore, duplicate rows may cause the model to overfit to the point that it memorizes the duplicates rather than learning patterns that apply to other cases. Consequently, the model may function well on training data but poorly on actual data. Figure 11 shows the script used for this.
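Since the script in Figure 11 is not reproduced here, the following is a hedged sketch of one way to remove duplicates. It assumes rows are dictionaries with a “URL” key and that the first occurrence of each URL is the one to keep; the original script may differ.

```python
def drop_duplicate_urls(rows):
    """Keep only the first row seen for each URL, preserving the original order."""
    seen = set()
    unique_rows = []
    for row in rows:
        if row["URL"] not in seen:
            seen.add(row["URL"])
            unique_rows.append(row)
    return unique_rows


# Made-up sample in which two feeds reported the same malicious URL.
sample = [
    {"URL": "http://phish.example/login", "malicious": 1},
    {"URL": "https://example.org/home", "malicious": 0},
    {"URL": "http://phish.example/login", "malicious": 1},
]
deduplicated = drop_duplicate_urls(sample)
```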

10.3. Removing Rows with Redacted Keywords

Redacted keywords are words or phrases that have been purposefully hidden or masked, usually to safeguard private data. Examples of redacted keywords include “redacted”, “ionos” and [email protected] (see Figure 12). Most of the redacted keywords were found to be email addresses which were obfuscated to hide the original email address. Redacted keywords can occur because of data sensitivity, non-disclosure agreements, or privacy issues, which reduces the dataset’s usefulness for deep learning model training. It may be difficult for the model to efficiently find patterns if redacted keywords affect how other features in the data are interpreted. Therefore, rows containing redacted keywords were removed completely from the dataset. Figure 13 shows the script for this.
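The filtering step can be illustrated as follows. The keyword tuple reuses the examples quoted above; the case-insensitive matching and the function name are assumptions, and the script in Figure 13 may behave differently.

```python
# Keywords quoted in the text; the real list used by the authors may be longer.
REDACTED_KEYWORDS = ("redacted", "ionos", "[email protected]")


def drop_redacted_rows(rows, keywords=REDACTED_KEYWORDS):
    """Remove every row whose URL contains any redacted keyword (case-insensitive)."""
    lowered = [keyword.lower() for keyword in keywords]
    return [
        row for row in rows
        if not any(keyword in row["URL"].lower() for keyword in lowered)
    ]


# Made-up sample rows for illustration.
sample = [
    {"URL": "http://REDACTED.example/a", "malicious": 1},
    {"URL": "http://phish.example/login", "malicious": 1},
]
cleaned = drop_redacted_rows(sample)
```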

10.4. Feature Extraction

Only lexical features were used for the deep learning model’s training. The host-based and content-based features were purposefully left out because of the difficulties and time limits involved in extracting them. Websites are dynamic and subject to quick changes. A once benign URL might later turn malicious or stop working. The host-based and content-based features extracted from URLs would require manual monitoring and verification, and this process would take up a huge amount of time.
Although URLs are often represented as text data, deep learning models require numerical input. By transforming URLs into numerical representations that capture the crucial elements of each URL, feature extraction fills up this gap. The model can learn and recognize patterns linked to URLs with the aid of feature extraction, which improves the accuracy of detection. Table 2 shows the 32 features that were chosen to be extracted from the dataset that was collected.
The extracted features were then inserted into the dataset, together with the ‘URL’ and ‘Malicious’ columns for the training of the deep learning model later. A total of 817,997 rows of URLs remained after preprocessing the data, 468,005 of which were malicious and 349,992 were benign. A sample is depicted in Table 3.
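To make the idea of lexical feature extraction concrete, the sketch below converts a URL into a handful of numerical features. The eight features shown are illustrative assumptions chosen because they can be computed from the URL string alone; they are not the 32 features of Table 2.

```python
from urllib.parse import urlparse


def lexical_features(url: str) -> dict[str, int]:
    """Compute a few numerical features from the URL text only (no network access)."""
    parsed = urlparse(url)
    hostname = parsed.hostname or ""
    return {
        "url_length": len(url),
        "hostname_length": len(hostname),
        "path_length": len(parsed.path),
        "num_dots": url.count("."),
        "num_hyphens": url.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": int("@" in url),
        "uses_https": int(parsed.scheme == "https"),
    }


# Made-up URL for illustration.
sample_url = "http://secure-login.example.com/account/verify?id=123"
features = lexical_features(sample_url)
```

Each feature dictionary can then be appended to its row as additional numeric columns alongside the URL and its label.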

11. URL Detection Model Architecture Specifications

The selection of the model architecture is a crucial step in the design of an accurate deep-learning system for supporting the identification of malicious URLs. A well-designed architecture can have a significant impact on the model’s ability to accurately identify threats while maintaining a low false positive rate. We developed six different models for phishing URL detection, based on the following neural network architectures: Feedforward Neural Network (FNN), Bi-directional RNN (Bi-RNN), Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM), Recurrent Neural Network (RNN), and Convolutional Neural Network (CNN). Performance-related details on these architectures are discussed in Section 12.
The performance of the developed neural network models is measured using standard evaluation metrics: the confusion matrix, accuracy, precision, recall (sensitivity), and F1-score. Together, they provide a comprehensive measure of the quality of a machine learning model.

11.1. Confusion Matrix

The confusion matrix is defined as depicted in Figure 14.
TP (True Positive): The model correctly predicted a positive class.
FN (False Negative): The model incorrectly predicted a negative class when it should have been a positive one.
FP (False Positive): The model incorrectly predicted a positive class when it should have been a negative one.
TN (True Negative): The model correctly predicted a negative class.

11.2. Accuracy

The model’s accuracy indicates how frequently its predictions are correct. It is determined by dividing the number of correctly predicted instances, which include true positives and true negatives, by the total number of cases.
Accuracy = (TP + TN)/(TP + TN + FP + FN)

11.3. Precision

Precision measures the proportion of positive predictions that are correct. It is computed as the ratio of true positives to all cases that were predicted to be positive.
Precision = TP/(TP + FP)

11.4. Recall (Sensitivity)

Recall, or sensitivity, is the proportion of actual positive cases that were predicted correctly. It is computed as the ratio of true positives to the total number of actual positive instances.
Recall (Sensitivity) = TP/(TP + FN)

11.5. F1 Score

The F1 score is the harmonic mean of precision and recall. It balances the two measures in a single value, which is useful when datasets are imbalanced and accuracy alone can be misleading.
F1 Score = 2 × (Precision × Recall)/(Precision + Recall)
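The four formulas can be checked against the CNN row of Table 4. The sketch below assumes that the confusion matrix is printed as [[TN, FP], [FN, TP]]. The text does not state the layout, but this reading reproduces the reported figures. The function name is illustrative.

```python
def classification_metrics(tp, fn, fp, tn):
    """Accuracy, precision, recall, and F1 score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


# CNN confusion matrix from Table 4, read as [[TN, FP], [FN, TP]] (assumed layout).
tn, fp = 66270, 3819
fn, tp = 5452, 88059

accuracy, precision, recall, f1 = classification_metrics(tp, fn, fp, tn)
print(f"accuracy={accuracy:.5f} precision={precision:.5f} "
      f"recall={recall:.5f} f1={f1:.5f}")
```

Under this reading, the computed values agree with the CNN row of Table 4 up to rounding in the last digit. The sketch does not guard against division by zero, which would occur if a model predicted no positives at all.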

12. Architecture Performance Comparisons

Six neural network architectures were developed and compared: Feedforward Neural Network (FNN), Bi-directional RNN (Bi-RNN), Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM), Recurrent Neural Network (RNN), and Convolutional Neural Network (CNN). These architectures were chosen because previous studies have reported promising performance for them in a variety of phishing detection-related applications [22,23,24,25,26].
The training and testing workflow for the models is as follows:
  • Preprocessed dataset discussed in Section 8 and Section 9 is used;
  • Dataset is split into a training set and an evaluation set at an appropriate ratio;
  • Training dataset is used to develop the six deep-learning models;
  • Evaluation dataset is used to test the efficiency of the models;
  • Metrics specified in Section 11 are used for the evaluation of the models;
  • Execution time is measured for each model to evaluate the performance speed.
For instance, Figure 15 shows the code for the FNN implementation. Performance comparisons among the different deep learning architectures are presented in Table 4.
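The workflow steps can be sketched in plain Python as follows. This is only an outline under stated assumptions: the 80/20 split ratio, the random seed, the toy rows, and the majority-class baseline that stands in for model training are all hypothetical, since the text specifies only "an appropriate ratio" and the actual models are built with a deep-learning library.

```python
import random
import time


def split_dataset(rows, eval_fraction=0.2, seed=42):
    """Shuffle rows and split them into (training, evaluation) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_eval = int(len(shuffled) * eval_fraction)
    return shuffled[n_eval:], shuffled[:n_eval]


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def train_majority_baseline(train_rows):
    """Stand-in for model training: remember the most common label."""
    labels = [label for _, label in train_rows]
    return max(set(labels), key=labels.count)


# Toy (features, label) pairs for illustration only.
rows = [([i, i % 7], i % 2) for i in range(1000)]

train_rows, eval_rows = split_dataset(rows)
model, elapsed = timed(train_majority_baseline, train_rows)
accuracy = sum(label == model for _, label in eval_rows) / len(eval_rows)
```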

12.1. Results Discussion

The CNN architecture was selected because it achieved the highest accuracy and F1 score in Table 4 while training in roughly a third of the time required by the Bi-RNN, the only architecture with comparable accuracy. The initial CNN model consists of the following layers, explained with reference to the code in Figure 16:
Input layer: The input layer defines the model’s input shape. The first dimension is the number of features in the input data, and the last dimension, one, indicates a single value per step. This layout is typical for time series or sequence data.
Conv1D layers: The convolutional layer applies filters with a kernel size of three and uses the Rectified Linear Unit (ReLU) activation function. Convolutional layers extract features from the input data.
Max-pooling layers: This layer, which uses max-pooling with a pool size of two, is inserted after the convolutional layer. Max-pooling reduces the size of the feature maps while preserving the most salient information.
Flatten layer: This layer transforms the multi-dimensional output of the previous layer into a one-dimensional vector per sample. This step is necessary to connect the convolutional layers to the fully connected layers.
Dense layer: This fully connected layer has 128 units and a ReLU activation function. Fully connected layers make decisions based on the features that the convolutional layers have extracted.
Output layer: The output layer consists of one dense neuron with a sigmoid activation function. It produces a probability score between zero and one, which is the usual range for binary classification problems.
Model compilation: The model uses the Adam optimizer, a popular choice for optimizing neural networks. The loss function is set to Binary Cross-Entropy, which is well suited to binary classification tasks and quantifies the disparity between the predicted probabilities and the actual labels. Accuracy is reported as the metric monitored throughout training.
The number of convolutional and max-pooling layers was then varied to improve accuracy further, as shown in Table 5. Four convolutional and max-pooling layers were selected: this configuration gave the highest accuracy and F1 score, although the gain over three layers was marginal and came with a longer execution time, and a fifth layer reduced accuracy.
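To see how stacking Conv1D and max-pooling blocks shrinks the sequence, the sketch below tracks the sequence length through each block. It assumes an input length of 32 (the number of features listed in Table 2), a stride of one for the convolution, and a pooling stride equal to the pool size. The padding mode is not stated in the text, so both common options are shown. The function name is illustrative.

```python
def conv_pool_lengths(input_length, blocks, kernel_size=3, pool_size=2,
                      padding="same"):
    """Sequence length after each Conv1D + max-pooling block (conv stride 1)."""
    lengths = []
    length = input_length
    for _ in range(blocks):
        if padding == "valid":
            length = length - kernel_size + 1   # unpadded convolution shortens the sequence
        # with "same" padding the convolution leaves the length unchanged
        if length < pool_size:
            raise ValueError("sequence too short for another block")
        length = length // pool_size            # max-pooling with stride = pool size
        lengths.append(length)
    return lengths


same_padding = conv_pool_lengths(32, 4, padding="same")
valid_padding = conv_pool_lengths(32, 3, padding="valid")
```

Under these assumptions, four blocks fit comfortably with "same" padding, whereas an unpadded convolution runs out of sequence length at the fourth block. Whether the prototype pads its convolutions is not something the text confirms.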
A dropout layer was placed between the flatten and dense layers. Dropout randomly sets a fraction of the input units to zero at each training step, which helps reduce overfitting. It acts as a form of regularization and can encourage the network to learn more robust features. The results for different dropout rates were measured and compared with those of the model without a dropout layer, as shown in Table 6. The model without a dropout layer still yielded slightly better accuracy and F1 score but took longer to train.
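The mechanism can be illustrated with a minimal sketch of inverted dropout, the variant in which the surviving units are rescaled during training so that no adjustment is needed at inference time. The function name, the seed, and the toy activations are illustrative.

```python
import random


def dropout(values, rate, rng, training=True):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        return list(values)                     # dropout is inactive at inference time
    keep = 1.0 - rate
    return [v / keep if rng.random() >= rate else 0.0 for v in values]


rng = random.Random(0)
activations = [1.0] * 10000                     # toy activations, illustration only
dropped = dropout(activations, rate=0.5, rng=rng)
zero_fraction = dropped.count(0.0) / len(dropped)
```

With a rate of 0.5, about half of the units are zeroed and the remaining ones are doubled, so the expected sum of the activations is unchanged.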

12.2. Activation Function

In the context of neural networks, an activation function introduces non-linearity to the model. It helps the network discover intricate relationships and patterns within the data, enabling it to make precise predictions. ReLU, Softmax, Leaky ReLU, Parametric ReLU (PReLU), Exponential Linear Unit (ELU), and ThresholdedReLU were tested during the development of the architecture, as shown in Table 7. The ELU activation function gave the highest accuracy, with a slight decrease in execution time.
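For reference, several of the element-wise activation functions compared here can be written in a few lines. The parameter values (alpha, theta) below are illustrative choices and are not necessarily those used in the experiments. PReLU has the same form as Leaky ReLU, except that alpha is learned during training.

```python
import math


def relu(x):
    """Rectified Linear Unit: pass positives, zero out negatives."""
    return max(0.0, x)


def leaky_relu(x, alpha=0.01):
    """Like ReLU, but negatives keep a small slope alpha."""
    return x if x > 0 else alpha * x


def elu(x, alpha=1.0):
    """Exponential Linear Unit: smooth, saturating at -alpha for large negatives."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def thresholded_relu(x, theta=1.0):
    """Pass x only when it exceeds the threshold theta."""
    return x if x > theta else 0.0


samples = [-2.0, -0.5, 0.0, 0.5, 2.0]
table = {f.__name__: [f(x) for x in samples]
         for f in (relu, leaky_relu, elu, thresholded_relu)}
```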

12.3. Optimizer

Optimization, which entails minimizing or maximizing an objective function, is at the heart of machine learning. In supervised learning, the objective is to minimize a loss function that measures the discrepancy between model predictions and actual data. This minimization procedure iteratively modifies the model’s parameters to find the setting that most closely fits the data.
Optimizers drive this parameter adjustment. They decide how each training iteration updates the model parameters and are, in essence, responsible for exploring the parameter space to arrive at a configuration that minimizes the loss function. Optimizers operate by computing the gradients of the loss function with respect to the parameters of the model. These gradients indicate the direction in which the parameters should be changed to lower the loss. The Adam, Nadam, and Stochastic Gradient Descent (SGD) optimizers were used and compared. The results can be seen in Table 8.
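The difference between a plain gradient step and an adaptive one can be sketched on a one-parameter toy loss, (w - 3)^2, whose minimum lies at w = 3. The learning rates and step counts are illustrative values chosen for this toy problem and are not the settings used in the experiments. Nadam, which adds Nesterov momentum to Adam, is omitted for brevity.

```python
import math


def gradient(w):
    """Gradient of the toy loss (w - 3)^2 with respect to w."""
    return 2.0 * (w - 3.0)


def sgd(w, steps, lr=0.1):
    """Plain gradient descent: step against the gradient, scaled by lr."""
    for _ in range(steps):
        w -= lr * gradient(w)
    return w


def adam(w, steps, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam: gradient step scaled by running estimates of the first and second moments."""
    m = v = 0.0
    for t in range(1, steps + 1):
        g = gradient(w)
        m = beta1 * m + (1 - beta1) * g          # running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g      # running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)             # bias correction for the zero start
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


w_sgd = sgd(0.0, steps=100)
w_adam = adam(0.0, steps=500)
```

Both runs start at w = 0 and approach the minimum at w = 3; on real loss surfaces the two optimizers can behave quite differently, which is why they are compared empirically.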

12.4. Loss Function

A loss function provides a quantifiable measure of the error between the model’s predictions and the actual labels, which the model uses during training to adjust its parameters iteratively and reduce that error. Loss functions thus guide the learning process by measuring how well the model is doing. Various loss functions were tested to see which gives the highest accuracy, and Binary Cross-Entropy performed best, as shown in Table 9.
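As a minimal sketch, Binary Cross-Entropy for labels in {0, 1} and predicted probabilities can be computed as below. The clipping constant eps and the toy predictions are illustrative assumptions.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)          # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# Toy predictions for illustration only.
confident_right = binary_cross_entropy([1, 0], [0.9, 0.1])
confident_wrong = binary_cross_entropy([1, 0], [0.1, 0.9])
```

Confident correct predictions give a loss close to zero, while confident wrong predictions are penalized heavily, which is the property that makes this loss a natural fit for a sigmoid output.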

12.5. Class Imbalance

Oversampling and under-sampling are approaches used to handle class imbalance concerns in the context of machine learning and classification tasks, with the main objective of enhancing model accuracy. When working with datasets where one class considerably outnumbers the other, skewing model performance, these techniques are helpful.
Class imbalance happens when there are disproportionately more instances of one class than the other. This frequently leads to the model in binary classification having a high bias towards the dominant class. As a result, the model may not perform well for the minority class, producing predictions with low accuracy, poor generalization, and bias.
Oversampling balances the class distribution by raising the proportion of minority-class members. We applied the Synthetic Minority Over-sampling Technique (SMOTE), which balances the class distribution by creating synthetic samples for the minority class. Under-sampling, on the other hand, lowers the number of instances in the majority class by choosing a subset of examples at random. The imbalanced-learn library was used to balance the data. The imbalanced data may still yield higher accuracy than the oversampled and under-sampled data, possibly because oversampling or under-sampling can introduce noise into the dataset. In our case, neither under-sampling nor oversampling has much impact on the results compared to the original dataset, as illustrated in Table 10.
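The two ideas can be sketched in plain Python. This is not the imbalanced-learn implementation used in the prototype: the oversampler below is a simplified variant of SMOTE that interpolates towards the single nearest minority neighbour, whereas SMOTE proper picks at random among several nearest neighbours. All names and the toy data are illustrative.

```python
import random


def random_undersample(majority, minority, rng):
    """Keep a random subset of the majority class equal in size to the minority."""
    return rng.sample(majority, len(minority)), list(minority)


def smote_like_oversample(minority, target_size, rng):
    """Add synthetic points on the segment between a minority sample and its
    nearest minority neighbour (a simplified, single-neighbour variant of SMOTE)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    while len(minority) + len(synthetic) < target_size:
        base = rng.choice(minority)
        neighbour = min((p for p in minority if p is not base),
                        key=lambda p: sq_dist(base, p))
        gap = rng.random()                       # position along the segment
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return list(minority) + synthetic


rng = random.Random(0)
majority = [[float(i), 0.0] for i in range(100)]      # toy majority class
minority = [[200.0 + i, 1.0] for i in range(10)]      # toy minority class

under_majority, under_minority = random_undersample(majority, minority, rng)
oversampled = smote_like_oversample(minority, len(majority), rng)
```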

12.6. Batch Size

The batch size is the number of data samples that the neural network processes in a single forward and backward pass during training. To train deep learning models on huge datasets, it is more feasible to divide the data into batches. Choosing the batch size involves a trade-off between computing efficiency, training stability, and convergence speed.
Processing a huge dataset all at once can be computationally demanding, and the dataset may not fit into the memory of the training hardware. Batching makes effective use of resources possible. During training, batching also introduces noise that may have a regularizing effect, which keeps the model from overfitting to the training data. Table 11 shows the performance comparisons over different batch sizes. We conclude that a batch size of 128 works best for this dataset.
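Batching itself amounts to slicing the training data into consecutive chunks, as in the sketch below. The batch size of 128 is the value selected above. The sample count of 1000 is a toy figure and does not reflect the size of the actual training set.

```python
import math


def iter_batches(samples, batch_size):
    """Yield consecutive slices of at most `batch_size` samples."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


samples = list(range(1000))                      # toy stand-in for training rows
batches = list(iter_batches(samples, batch_size=128))
steps_per_epoch = math.ceil(len(samples) / 128)  # parameter updates per pass over the data
```

The last batch is smaller than the others whenever the number of samples is not a multiple of the batch size.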

12.7. Epoch

During a neural network’s training phase, an epoch is a single pass through the complete training dataset. A neural network is usually trained over several epochs, during which the model iteratively adjusts its parameters in response to the training data to enhance performance. Using 12 epochs was found to be the optimal setting, which resulted in the highest accuracy while maintaining fast execution times, as depicted in Table 12.
In summary, the combination of artificial intelligence with phishing detection and training has been both a challenging and promising undertaking. The use of GPT-3, a state-of-the-art language model, for the twin purposes of phishing URL detection and convincing phishing campaign generation, was a noteworthy feature of this research. As the URL detection sub-system uses a neural network, future improvements will depend significantly on the availability of new training datasets. This use of cutting-edge artificial intelligence highlights how such technologies can be used in a scalable way to deal with practical problems in a specified context. It also emphasizes how the cybersecurity landscape is always changing, especially when faced with attacks of progressively increasing sophistication.

13. Conclusions

In this research, we have developed a prototype solution that integrates both phishing attack detection and end-user phishing education. The use of a large language model, GPT-3, together with a Convolutional Neural Network deep learning architecture, enables the prototype to detect email-based phishing attacks and allows phishing campaigns customized to specified business use cases to be generated for end-user training.
By maximizing the defense effectiveness of phishing security at a granular level, AI and generative AI technologies could significantly reduce overall cyber risk. Therefore, to comprehensively mitigate increasingly sophisticated email phishing attacks, businesses and their employees must both understand how cybercriminals may be using the technology and embrace it as part of an evolving hybrid security framework.

Author Contributions

Conceptualization, P.K.K.L. and A.Z.Y.L. with oversight from Cyber Security Agency of Singapore; methodology, A.Z.Y.L.; software, A.Z.Y.L.; validation, A.Z.Y.L., P.K.K.L. and V.B.; formal analysis, A.Z.Y.L.; investigation, A.Z.Y.L.; resources, A.Z.Y.L.; data curation, A.Z.Y.L.; writing—original draft preparation, P.K.K.L.; writing—review and editing, P.K.K.L., A.Z.Y.L. and V.B.; visualization, P.K.K.L.; supervision, P.K.K.L.; project administration, P.K.K.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Training datasets for phishing detection are extracted from public domain phishing link repositories detailed in Section 9 of this paper.

Acknowledgments

We thank the Cyber Security Agency of Singapore for their guidance and advice during the development of the research prototype.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. SlashNext, The State of Phishing 2023. Available online: https://slashnext.com/wp-content/uploads/2023/10/SlashNext-The-State-of-Phishing-Report-2023.pdf (accessed on 26 December 2023).
  2. Griffiths, C. The Latest 2023 Phishing Statistics (Updated December 2023). Available online: https://aag-it.com/the-latest-phishing-statistics/ (accessed on 27 December 2023).
  3. Basit, A.; Zafar, M.; Liu, X.; Javed, A.R.; Jalil, Z.; Kifayat, K. A comprehensive survey of AI-enabled phishing attacks detection techniques. Telecommun. Syst. 2020, 76, 139–154. [Google Scholar] [CrossRef] [PubMed]
  4. U.S. Department of Health and Human Services, Health Sector Cybersecurity Coordination Center (HC3), AI-Augmented Phishing and the Threat to the Health Sector, White Paper, Report: 202310261200, 26 October 2023. Available online: https://www.hhs.gov/sites/default/files/ai-and-phishing-as-a-threat-to-the-hph-white-paper-tlpclear.pdf (accessed on 28 December 2023).
  5. Mirsky, Y.; Demontis, A.; Kotak, J.; Shankar, R.; Gelei, D.; Yang, L.; Zhang, X.; Pintor, M.; Lee, W.; Elovici, Y.; et al. The Threat of Offensive AI to Organizations. Comput. Secur. 2023, 124, 103006. [Google Scholar] [CrossRef]
  6. Jackson, K.A. A Systematic Review of Machine Learning Enabled Phishing. arXiv 2023, arXiv:2310.06998. [Google Scholar]
  7. Lim, E.; Tan, G.; Hock, T.K.; Lee, T. Turing in a Box: Applying Artificial Intelligence as a Service to Targeted Phishing and Defending against AI-generated Attacks; GovTech: Singapore, 2021; Available online: https://i.blackhat.com/USA21/Wednesday-Handouts/US-21-Lim-Turing-in-a-Box-wp.pdf (accessed on 28 December 2023).
  8. Deloitte Risk Advisory, Phishing as a Service. June 2018. Available online: https://www2.deloitte.com/content/dam/Deloitte/in/Documents/risk/in-ra-phishing-as-a-service-noexp.pdf (accessed on 21 December 2023).
  9. Jawahar, G.; Abdul-Mageed, M.; Lakshmanan, L.V.S. Automatic Detection of Machine Generated Text: A Critical Survey. arXiv 2020, arXiv:2011.01314. [Google Scholar]
  10. Seymour, J.; Tully, P. Generative Models for Spear Phishing Posts on Social Media, NIPS Workshop on Machine Deception. arXiv 2018, arXiv:1802.05196. [Google Scholar]
  11. Begou, N.; Vinoy, J.; Duda, A.; Korczyński, M. Exploring the Dark Side of AI: Advanced Phishing Attack Design and Deployment Using ChatGPT. In Proceedings of the IEEE Conference on Communications and Network Security (CNS), Orlando, FL, USA, 2–5 October 2023. [Google Scholar]
  12. Falade, P.V. Decoding the Threat Landscape: ChatGPT, FraudGPT, and WormGPT in Social Engineering Attacks. Int. J. Sci. Res. Comput. Sci. Eng. Inf. Technol. 2023, 9, 185–198. [Google Scholar] [CrossRef]
  13. Floridi, L.; Chiriatti, M. GPT-3: Its Nature, Scope, Limits, and Consequences. Minds Mach. 2020, 30, 681–694. [Google Scholar] [CrossRef]
  14. Karanjai, R. Targeted Phishing Campaigns using Large Scale Language Models. arXiv 2022, arXiv:2301.00665. [Google Scholar]
  15. McGuffie, K.; Newhouse, A. The Radicalization Risks of GPT-3 and Advanced Neural Language Models. arXiv 2020, arXiv:2009.06807. [Google Scholar]
  16. Akın, F.K. The Art of CHATGPT Prompting: A Guide to Crafting Clear and Effective Prompts. Available online: https://fka.gumroad.com/l/art-of-chatgpt-prompting (accessed on 1 January 2024).
  17. Akın, F.K. F/awesome-CHATGPT-Prompts: This Repo Includes CHATGPT Prompt Curation to Use CHATGPT Better. GitHub. Available online: https://github.com/f/awesome-chatgpt-prompts (accessed on 1 January 2024).
  18. Bang, Y.; Cahyawijaya, S.; Lee, N.; Dai, W.; Su, D.; Wilie, B.; Lovenia, H.; Ji, Z.; Yu, T.; Chung, W.; et al. A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity. arXiv 2023, arXiv:2302.04023. [Google Scholar]
  19. Joshi, P.; Santy, S.; Budhiraja, A.; Bali, K.; Choudhury, M. The State and Fate of Linguistic Diversity and Inclusion in the NLP World. arXiv 2020, arXiv:2004.09095. [Google Scholar]
  20. Ji, Z.; Lee, N.; Frieske, R.; Yu, T.; Su, D.; Xu, Y.; Fung, P. Survey of hallucination in natural language generation. ACM Comput. Surv. 2023, 55, 1–38. [Google Scholar] [CrossRef]
  21. Alkaissi, H.; McFarlane, S.I. Artificial Hallucinations in ChatGPT: Implications in Scientific Writing. Cureus 2023, 15, 2. [Google Scholar] [CrossRef] [PubMed]
  22. Rakotoasimbahoaka, A.C.; Randria, L.; Razafindrakoto, N.R. Malicious URL detection Using majority vote method with machine learning and deep learning models. In Proceedings of the 2020 International Conference on Interdisciplinary Cyber Physical Systems (ICPS), Chennai, India, 28–29 December 2020; IEEE: Piscataway, NJ, USA; pp. 37–43. [Google Scholar]
  23. Crişan, A.; Florea, G.; Halasz, L.; Lemnaru, C.; Oprisa, C. Detecting malicious URLs based on machine learning algorithms and word embeddings. In Proceedings of the 2020 IEEE 16th International Conference on Intelligent Computer Communication and Processing (ICCP), Cluj-Napoca, Romania, 3–5 September 2020; IEEE: Piscataway, NJ, USA; pp. 187–193. [Google Scholar]
  24. Mourtaji, Y.; Bouhorma, M.; Alghazzawi, D.; Aldabbagh, G.; Alghamdi, A. Hybrid rule-based solution for phishing URL detection using convolutional neural network. Wirel. Commun. Mob. Comput. 2021, 2021, 1–24. [Google Scholar] [CrossRef]
  25. Yang, P.; Zhao, G.; Zeng, P.P. Phishing website detection based on multidimensional features driven by deep learning. IEEE Access 2019, 7, 15196–15209. [Google Scholar] [CrossRef]
  26. Wei, W.; Ke, Q.; Nowak, J.; Korytkowski, M.; Scherer, R.; Woźniak, M. Accurate and fast URL phishing detector: A convolutional neural network approach. Comput. Netw. 2020, 178, 107275. [Google Scholar] [CrossRef]
Figure 1. GPT-3 acting as a Linux Terminal.
Figure 2. GPT-3 acting as a Football Commentator.
Figure 3. GPT-3 acting as an AI Writing Tutor.
Figure 4. Phishing Training and Testing System Architecture.
Figure 5. Phishing Campaign Grid Layout.
Figure 6. Phishing Campaign Engine Code Configuration.
Figure 7. PhishTank Data Feed.
Figure 8. Downloaded CSV file from PhishStats.
Figure 9. OpenPhish Data Feed.
Figure 10. Crontab in Cloud Server.
Figure 11. Script to Remove Duplicate Rows.
Figure 12. Example of Row with Redacted Keyword.
Figure 13. Script to Remove Rows with Redacted Keywords.
Figure 14. Confusion Matrix.
Figure 15. FNN Architecture Design.
Figure 16. Initial CNN Architecture Design.
Table 1. Phishing Themes and Prompts.
Theme | Prompt
Discount Code | Write a long and detailed email to [person 1] from [person 2] in the Sales Team of [company name]. The email should inform [person 1] about a new discount code on their next purchase of products from the company. The discount code will only last for the next 24 h. Include a [link]. The email should contain a subject.
Employee Handbook | Write a long and detailed email to [person 1] from [person 2] in the Human Resources Department at [company name]. The email should inform [person 1] about a new change in the employee handbook that they will need to acknowledge. Include a [link]. The email should contain a subject.
Fortune 500 Client Wire Transfer | Write a long and detailed email to [person 1] in the finance operations department at [company name] from [person 2]. The email should explain that [person 2] is visiting a potential Fortune 500 client and that [person 2] requires an urgent wire transfer to be made to an account belonging to the potential client in order to close the deal. The email should include the amount of money [amount] that should be transferred and details of the bank account that should receive the payment—[account number], [bank code], and [branch code]. The email should contain a subject.
Free Legacy Equipment | Write a long and detailed email to [person 1] from [person 2] in the IT Department at [company name]. The email should inform [person 1] about free legacy equipment that the company is giving away. Include a [link]. The email should contain a subject.
Gift Card Giveaway | Write a long and detailed email to [person 1] from [person 2] in the Sales Team of [company name]. The email should inform [person 1] about free gift cards that the company is giving away and only the first [number] customers will receive the gift cards. Include a [link]. The email should contain a subject.
Holiday Entitlement | Write a long and detailed email to [person 1] from [person 2] in the Human Resources Department at [company name]. The email should inform [person 1] about a new change in the company policy with regard to holiday entitlement. Include a [link]. The email should contain a subject.
KPI Meeting | Write a long and detailed email to [person 1] from [person 2]. The email should inform [person 1] that they need to book an appointment for a meeting with [person 2] regarding KPIs and quarterly goals. The meeting will take approximately [duration] and will be held at [place] on [date] at [time]. Include a meeting [link]. The email should contain a subject.
Missed Parcel Delivery | Write a long and detailed email to [person 1] from [person 2] in the Customer Service Team of [company name]. The email should inform [person 1] that they missed a parcel delivery. Include a [link]. The email should contain a subject.
NFT Giveaway | Write a long and detailed email to [person 1] from [person 2] in the Sales Team of [company name]. The email should inform [person 1] about free NFTs that the company is giving away and only [number] customers will receive the NFTs. Include a [link]. The email should contain a subject.
Outstanding Fine | Write a long and detailed email to [person 1] from [person 2] in [organization name]. The email should inform [person 1] about an outstanding fine. Include a [link]. The email should contain a subject.
Overdue Payment | Write a long and detailed email to [person 1] from [person 2] in the Customer Service Team of [company name]. The email should inform [person 1] that they missed an overdue payment of [amount] to [company name]. Include a [link]. The email should contain a subject.
Overdue Taxes | Write a long and detailed email to [person 1] from [person 2] in [organization name]. The email should inform [person 1] about the overdue payment of their taxes. Include a [link]. The email should contain a subject.
Pay Raise | Write a long and detailed email to [person 1] from [person 2]. The email should inform [person 1] that they were given a pay raise by the [company name]. The pay raise will be [amount] and it will be effective on [date]. The email will contain an attachment of the pay raise details. The email should contain a subject.
Promotion | Write a long and detailed email to [person 1] from [person 2]. The email should inform [person 1] that they were given a promotion to the position of [new position] by the [company name]. The promotion will be effective on [date]. The email will contain an attachment of the promotion details. The email should contain a subject.
Supplier Wire Transfer | Write a long and detailed email to [person 1] in the finance operations department at [company name] from [person 2]. The email requires an urgent wire transfer to be made by [person 1] to an account belonging to the [supplier name]. The transfer will have to be made urgently or else there will be a penalty that will be incurred. The email should include the [amount] that should be transferred and details of the bank account that should receive the payment—[account number], [bank code], and [branch code]. The email should contain a subject.
Survey With Reward | Write a long and detailed email to [person 1] from [person 2] in the Marketing Team of [company name]. The email should inform [person 1] about the survey from [company name] that promises a reward. Include a [link]. The email should contain a subject.
Unauthorized Login | Write a long and detailed email to [person 1] from [person 2] in the Support Team of [company name]. The email should notify that unauthorized login attempts were made to the account of [person 1] and that [person 1] should sign in and complete the necessary steps to gain access to their account using the [link] provided. Include a customer service hotline [phone number]. The email should contain a subject.
Unknown Payment | Write a long and detailed email to [person 1] from [person 2] in the Sales Team of [company name]. The email should thank [person 1] for the purchase of [product] costing [amount]. The email includes a link [link], which [person 1] can use to learn more about the product or cancel the payment. The email should contain a subject.
Table 2. List of Feature Descriptions.
Feature | Description
URL Length | The total length of the URL string.
Hostname Length | The length of the hostname part of the URL.
Path Length | The length of the path part of the URL.
Digit Count | The count of numeric digits in the URL.
Alphabet Count | The count of alphabetic characters in the URL.
Subdomain Count | The number of subdomains in the URL.
Subdirectory Count | The number of subdirectories in the URL path.
Query Count | The count of query parameters in the URL.
Fragment Count | The count of URL fragments or anchors.
HTTP Scheme | A binary indicator of whether the URL uses ‘http’ or ‘https’.
Is IP Address | A binary indicator of whether the URL is an IP address.
Has Port | A binary indicator of whether the URL includes a port number.
At Count | The count of the ‘@’ symbol in the URL.
Comma Count | The count of commas ‘,’ in the URL.
Double Slash Count | The count of double slashes ‘//’ in the URL.
Equal Count | The count of equal signs ‘=’ in the URL.
Hyphen Count | The count of hyphens ‘-’ in the URL.
Percent Count | The count of percent signs ‘%’ in the URL.
Period Count | The count of periods ‘.’ in the URL.
Question Count | The count of question marks ‘?’ in the URL.
Semicolon Count | The count of semicolons ‘;’ in the URL.
Underscore Count | The count of underscores ‘_’ in the URL.
Account Count | The count of the ‘account’ keyword in the URL.
Admin Count | The count of the ‘admin’ keyword in the URL.
Banking Count | The count of the ‘banking’ keyword in the URL.
Client Count | The count of the ‘client’ keyword in the URL.
Confirm Count | The count of the ‘confirm’ keyword in the URL.
Login Count | The count of the ‘login’ keyword in the URL.
Server Count | The count of the ‘server’ keyword in the URL.
Signin Count | The count of the ‘signin’ keyword in the URL.
Webscr Count | The count of the ‘webscr’ keyword in the URL.
URL Shortener | A binary indicator of whether the URL is shortened by a URL shortening service.
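As a rough illustration of how a subset of these features could be derived from a URL string, the sketch below uses Python’s standard urllib.parse module. The exact counting rules of the prototype are not given in the text, so the definitions here are one plausible reading, and features whose definition is ambiguous (for example, Subdomain Count and URL Shortener) are left out. The function name, dictionary keys, and example URL are hypothetical.

```python
from urllib.parse import urlparse, parse_qsl

KEYWORDS = ("account", "admin", "banking", "client", "confirm",
            "login", "server", "signin", "webscr")


def extract_features(url):
    """Compute a subset of the Table 2 features for one URL (assumed definitions)."""
    parsed = urlparse(url)
    hostname = parsed.hostname or ""
    features = {
        "url_length": len(url),
        "hostname_length": len(hostname),
        "path_length": len(parsed.path),
        "digit_count": sum(ch.isdigit() for ch in url),
        "alphabet_count": sum(ch.isalpha() for ch in url),
        "query_count": len(parse_qsl(parsed.query, keep_blank_values=True)),
        "has_port": int(parsed.port is not None),
        "at_count": url.count("@"),
        "hyphen_count": url.count("-"),
        "period_count": url.count("."),
    }
    lowered = url.lower()
    for keyword in KEYWORDS:
        features[f"{keyword}_count"] = lowered.count(keyword)
    return features


# Hypothetical URL on a reserved example domain, for illustration only.
example = extract_features(
    "http://login.example.com:8080/account/confirm?id=42&next=home")
```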
Table 3. URLs with Extracted Features.
URLURL LengthHostname LengthPath LengthDigit CountAlphabet CountSubdomain CountSubdirectory CountQuery CountFragment CountHTTP SchemeIs IP AddressHas Port
https://www.20-minsw.com/?formCode=share-savings&productId=42&_=%2Fapplication%23KJWqMdlUlBn8PPpbVhTylJ%2FhfYJoHVql5eA%2FIw%3D%3D (accessed on 8 February 2024)137161131011030100
https://magalu.semanadasofertas.site/?category=1&refer=9169593 (accessed on 8 February 2024)662818471020100
https://bufflo.csmarketplace.ink/boX9kxo1BV/9kblhk4ciw/6kuepzgmqs?q=boX9kxo1BV&s=d789f0501227e18a6936a0c7c72eb336 (accessed on 8 February 2024)117243330741320100
https://literate-silent-list.glitch.me/[email protected] (accessed on 8 February 2024)753010621010100
https://info-cnfpt.fr/digi/ext/eml/r?par=aHR0cHM6Ly9saXRlcmF0ZS1zaWxlbnQtbGlzdC5nbGl0Y2gubWU/Z3E9c2hhbm5vbi5kZWx1Y2lhQGRlbGF3YXJlcGFyay5jb20=&emtr=11765-319138-CPujObw-2 (accessed on 8 February 2024)1731315311230420100
https://centralqueenslandweddings.com.au/wp-content/upgrade/%d9%88%d8%ad%d8%af%d8%a9%20%d8%a7%d9%84%d8%aa%d8%ad%d9%83%d9%85/%d9%88%d8%ad%d8%af%d8%a9%20%d8%a7%d9%84%d8%aa%d8%ad%d9%83%d9%85/religion/%d8%b1%d8%a6%d9%8a%d8%b3%d9%8a/zimb/#[email protected] (accessed on 8 February 2024)25832193541341701100
https://srm.dewa.gov.ae/sap/public/bc/icf/logoff?redirecturl=http://ipfs.io/ipfs/bafkreih3ylotprsyicwzo2o5sdbvf5gchtbjzi75wyj6tpdxwptrntt5ie/?af=c2FuZHJhcGVycnlAdHdpdHRlci5jb20=&p2=2019-3-1-Hyderabad-1c (accessed on 8 February 2024)2061525201562520100
https://arweave.net/P8Vo58h4ADQLsx2704M6UTyyF5kY0hN_ZkFVq9AN-QE#[email protected] (accessed on 8 February 2024)87114412650101100
https://s3.amazonaws.com/appforest_uf/f1680189150363x698855450976705300/[email protected] (accessed on 8 February 2024)111165932651310100
https://kayueglobal.com/wp-includes/certificates/.ms/index.html#[email protected]&1c95e6a311558835533c61d96baa7725-787238hjsjgd893aef940f45a46ebda60f5d152f15541-230920n8=1c95e6a311558835533c61d96baa7725-239ngKq0-a97297e51a448db6a3955e02ebb1eb5b (accessed on 8 February 2024)25315401051270401100
http://sainara.com.hk/img/.img/[email protected]&-a9d614bfb4f9b7402a1a67002c1545c2-HDdas-cs0p271vm06y62qj14us29-h3z-3cL8kNZ9QOvbIZ4bAd4JusrVkP8vCgm37QiM-20yvskdu-a9d614bfb4f9b7402a1a67002c1545c2 (accessed on 8 February 2024)2101410651231220000
https://kayueglobal.com/wp-includes/IXR/index%20%2811%29.html#[email protected]&14f8024367861fe177ba566f8fcca9bc-787238hjsjgd81b82ebfc06faeb63c499532245e28d30-230920n8=14f8024367861fe177ba566f8fcca9bc-239ngKq0-7f6412e4051c56c4432bfc636ccdcc04v (accessed on 8 February 2024)25115381061230301100
https://www.hkplasticlymphatic.com/style/stylesheet/.css/[email protected]&-c02f18d8fbb045c266f2d2eac14cd5ba-HDdas-dsk8mhqay9nhkz3mka6t-h3z-0uMMUbLcgU2NsSFy2FxAuqJ1hrIdCsgzZQ4s-20yvskdu-c02f18d8fbb045c266f2d2eac14cd5ba2342623441671320100
https://hangfashion.com.vn/profiles/standard/translations/myid.telstra.com/manage/?view=login&appIdKey=fcd00c0656cc490&country (accessed on 8 February 2024)134185691051530100
https://kayueglobal.com/wp-includes/IXR/index%20%2811%29.html#[email protected]&ea046c47edc7aba836e0ff626708572d-787238hjsjgd8e20e664855f787d2d096095f8671f227-230920n8=ea046c47edc7aba836e0ff626708572d-239ngKq0-7236c4eb8932d7924131ec5603d1cbe9 (accessed on 8 February 2024)25115381111180301100
http://cpmapro.ca/download/.d/[email protected]&-21245748eb0c5ebfd6701664de3ea53c-HDdas-bya2i7nksf8bzq389h1cm13-h3z-It7cS02gu9Kd7MBnGOOSf5SFjHLUWfUofbZX-20yvskdu-21245748eb0c5ebfd6701664de3ea53c (accessed on 8 February 2024)2101013581310220000
https://www.ofertasamericanas.online/298589309/?console-playstation-5-ps5-+-controle-dualsense-playstation-5-+-game-uncharted:-colecao-legado-dos-ladroes-ps5-em-promo%EF%BF%BD%EF%BF%BDo-na-americanas&cod=255035033 (accessed on 8 February 2024)2172811221551120100
https://doximex.vn/wp-includes/Ppouo/login.php?cmd=submit_log&id=OTE1MTc0NTU3OTE1MTc0NTU3&session=OTE1MTc0NTU3OTE1MTc0NTU3 (accessed on 8 February 2024)1301028121000330100
https://exodus-wallet.securityfixes.com/[email protected]&ID=g3Wk37o5d92q2l4t407xDyJ97x1H8 (accessed on 8 February 2024)10731615771120100
Table 4. Performance Comparisons of Deep Learning Architectures.

| Architecture | Accuracy | Precision | Recall | F1 Score | Confusion Matrix | Execution Time (s) |
|---|---|---|---|---|---|---|
| Bi-RNN | 0.94284 | 0.96712 | 0.93166 | 0.94906 | [67,128  2961] [6390  87,121] | 1886.289 |
| FNN | 0.925097 | 0.92400 | 0.94682 | 0.93527 | [62,807  7282] [4972  88,539] | 210.8156 |
| GRU | 0.93113 | 0.93312 | 0.94740 | 0.94021 | [63,740  6349] [4918  88,593] | 248.538 |
| LSTM | 0.93052 | 0.93429 | 0.94490 | 0.93956 | [63,875  6214] [5152  88,359] | 321.976 |
| RNN | 0.92371 | 0.92708 | 0.94049 | 0.93374 | [63,172  6917] [5564  87,947] | 253.741 |
| CNN | 0.94333 | 0.95843 | 0.94169 | 0.94999 | [66,270  3819] [5452  88,059] | 629.896 |
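The four scores in each row follow from that row's confusion matrix. The sketch below is illustrative only: it assumes the matrix is laid out as [TN  FP] over [FN  TP] (rows are actual classes, columns are predicted classes), which is an assumption rather than something stated in the table, although it does reproduce the reported CNN figures. The names `Metrics` and `metrics_from_confusion` are hypothetical.

```python
from typing import NamedTuple


class Metrics(NamedTuple):
    accuracy: float
    precision: float
    recall: float
    f1: float


def metrics_from_confusion(tn: int, fp: int, fn: int, tp: int) -> Metrics:
    """Derive the standard binary-classification scores from raw counts.

    Assumed layout of the published matrices: [TN  FP] / [FN  TP].
    """
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)   # share of positive predictions that are correct
    recall = tp / (tp + fn)      # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)
    return Metrics(accuracy, precision, recall, f1)


# Counts taken from the CNN row of Table 4.
cnn = metrics_from_confusion(tn=66_270, fp=3_819, fn=5_452, tp=88_059)
print(cnn)
```

Under this layout the computed values agree with the CNN row to about four decimal places, and every first matrix row sums to 70,089 while every second row sums to 93,511, which is a quick consistency check for the other rows.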
Table 5. Performance Comparisons among a range of Convolution Layers.

| Architecture | Accuracy | Precision | Recall | F1 Score | Confusion Matrix | Execution Time (s) |
|---|---|---|---|---|---|---|
| 2 Conv1D and max-pooling layers | 0.94333 | 0.95843 | 0.94169 | 0.94999 | [66,270  3819] [5452  88,059] | 629.896 |
| 3 Conv1D and max-pooling layers | 0.946937 | 0.96358 | 0.94279 | 0.95307 | [66,757  3332] [5349  88,162] | 795.653 |
| 5 Conv1D and max-pooling layers | 0.94676 | 0.97074 | 0.93504 | 0.95256 | [67,454  2635] [6074  87,437] | 964.9478 |
| 4 Conv1D and max-pooling layers | 0.94699 | 0.96456 | 0.94187 | 0.95308 | [66,853  3236] [5435  88,076] | 936.976 |
Table 6. Performance Comparisons among different Dropout Rates.

| Architecture | Accuracy | Precision | Recall | F1 Score | Confusion Matrix | Execution Time (s) |
|---|---|---|---|---|---|---|
| With 0.2 dropout rate | 0.94452 | 0.95610 | 0.946391 | 0.95122 | [66,026  4063] [5013  88,498] | 706.858 |
| With 0.5 dropout rate | 0.94671 | 0.96424 | 0.94168 | 0.95283 | [66,824  3265] [5453  88,058] | 689.515 |
| With 0.7 dropout rate | 0.94500 | 0.97129 | 0.93131 | 0.95088 | [67,515  2574] [6423  87,088] | 700.178 |
| Without dropout layer | 0.94699 | 0.96456 | 0.94187 | 0.95308 | [66,853  3236] [5435  88,076] | 936.976 |
Table 7. Performance Comparisons among different Activation Functions.

| Activation Function | Accuracy | Precision | Recall | F1 Score | Confusion Matrix | Execution Time (s) |
|---|---|---|---|---|---|---|
| ReLU | 0.94699 | 0.96456 | 0.94187 | 0.95308 | [66,853  3236] [5435  88,076] | 936.976 |
| Softmax | 0.92949 | 0.93394 | 0.94337 | 0.93863 | [63,850  6239] [5295  88,216] | 865.938 |
| Leaky ReLU | 0.94577 | 0.96861 | 0.93543 | 0.95173 | [67,255  2834] [6038  87,473] | 740.610 |
| PReLU | 0.94328 | 0.96450 | 0.93519 | 0.94962 | [66,871  3218] [6060  87,451] | 901.384 |
| ThresholdedReLU | 0.88110 | 0.86350 | 0.94067 | 0.90043 | [56,185  13,904] [5548  87,963] | 673.648 |
| ELU | 0.94795 | 0.97006 | 0.93788 | 0.95370 | [67,383  2706] [5808  87,703] | 800.498 |
Table 8. Performance Comparisons among different Optimizers.

| Optimizer | Accuracy | Precision | Recall | F1 Score | Confusion Matrix | Execution Time (s) |
|---|---|---|---|---|---|---|
| SGD | 0.93629 | 0.94017 | 0.94892 | 0.94453 | [64,443  5646] [4776  88,735] | 841.052 |
| Nadam | 0.94665 | 0.96549 | 0.94027 | 0.95271 | [66,947  3142] [5585  87,926] | 821.842 |
| Adam | 0.94795 | 0.97006 | 0.93788 | 0.95370 | [67,383  2706] [5808  87,703] | 800.498 |
Table 9. Performance Comparisons among Different Loss Functions.

| Loss Function | Accuracy | Precision | Recall | F1 Score | Confusion Matrix | Execution Time (s) |
|---|---|---|---|---|---|---|
| Mean Squared Error | 0.94589 | 0.97089 | 0.93332 | 0.95174 | [67,473  2616] [6235  87,276] | 789.436 |
| Mean Squared Logarithmic Error | 0.94559 | 0.96170 | 0.94233 | 0.95192 | [66,580  3509] [5392  88,119] | 735.121 |
| Binary Cross-Entropy | 0.94795 | 0.97006 | 0.93788 | 0.95370 | [67,383  2706] [5808  87,703] | 800.498 |
Table 10. Comparison between Different Balances of Data.

| Data | Accuracy | Precision | Recall | F1 Score | Confusion Matrix | Execution Time (s) |
|---|---|---|---|---|---|---|
| Oversampling | 0.94792 | 0.97164 | 0.93621 | 0.95359 | [67,534  2555] [5965  87,546] | 832.164 |
| Under-sampling | 0.94624 | 0.96605 | 0.93893 | 0.95230 | [67,004  3085] [5710  87,801] | 649.639 |
| Imbalanced | 0.94795 | 0.97006 | 0.93788 | 0.95370 | [67,383  2706] [5808  87,703] | 800.498 |
Table 11. Performance Comparisons of Different Batch Sizes.

| Batch Size | Accuracy | Precision | Recall | F1 Score | Confusion Matrix | Execution Time (s) |
|---|---|---|---|---|---|---|
| 32 | 0.94221 | 0.94875 | 0.95021 | 0.94948 | [65,290  4799] [4655  88,856] | 1103.911 |
| 64 | 0.94795 | 0.97006 | 0.93788 | 0.95370 | [67,383  2706] [5808  87,703] | 800.498 |
| 256 | 0.94602 | 0.96081 | 0.94407 | 0.95237 | [66,489  3600] [5230  88,281] | 465.139 |
| 128 | 0.94857 | 0.96575 | 0.94348 | 0.95449 | [66,961  3128] [5285  88,226] | 549.7330 |
Table 12. Performance Comparisons over Different Epochs.

| Epochs | Accuracy | Precision | Recall | F1 Score | Confusion Matrix | Execution Time (s) |
|---|---|---|---|---|---|---|
| 5 | 0.94506 | 0.96583 | 0.93704 | 0.95121 | [66,989  3100] [5887  87,624] | 273.181 |
| 10 | 0.94857 | 0.96575 | 0.94348 | 0.95449 | [66,961  3128] [5285  88,226] | 549.7330 |
| 13 | 0.94832 | 0.96463 | 0.94420 | 0.95431 | [66,852  3237] [5217  88,294] | 789.818 |
| 15 | 0.94663 | 0.96721 | 0.93844 | 0.95261 | [67,114  2975] [5756  87,755] | 761.121 |
| 20 | 0.94711 | 0.96676 | 0.93978 | 0.95308 | [67,068  3021] [5631  87,880] | 996.013 |
| 12 | 0.94908 | 0.96946 | 0.94055 | 0.95479 | [67,319  2770] [5559  87,952] | 618.987 |
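Read together, Tables 5 to 12 look like a one-setting-at-a-time search: each table varies a single hyperparameter and its best row reappears unchanged in the next table. The sketch below restates that reading using the F1 scores transcribed from the tables. The selection rule (highest F1 per stage) is an assumption that happens to be consistent with the rows carried forward, not a procedure confirmed by the tables themselves, and the names `STAGES` and `select_best` are hypothetical.

```python
# F1 scores transcribed from Tables 5-12. Each stage varies one setting
# while the winners of the earlier stages are held fixed.
STAGES = {
    "conv_pool_layers": {2: 0.94999, 3: 0.95307, 5: 0.95256, 4: 0.95308},
    "dropout": {"0.2": 0.95122, "0.5": 0.95283, "0.7": 0.95088, "none": 0.95308},
    "activation": {
        "ReLU": 0.95308, "Softmax": 0.93863, "Leaky ReLU": 0.95173,
        "PReLU": 0.94962, "ThresholdedReLU": 0.90043, "ELU": 0.95370,
    },
    "optimizer": {"SGD": 0.94453, "Nadam": 0.95271, "Adam": 0.95370},
    "loss": {"MSE": 0.95174, "MSLE": 0.95192, "Binary Cross-Entropy": 0.95370},
    "data_balance": {
        "Oversampling": 0.95359, "Under-sampling": 0.95230, "Imbalanced": 0.95370,
    },
    "batch_size": {32: 0.94948, 64: 0.95370, 256: 0.95237, 128: 0.95449},
    "epochs": {5: 0.95121, 10: 0.95449, 13: 0.95431, 15: 0.95261, 20: 0.95308, 12: 0.95479},
}


def select_best(stages):
    """Pick, for every stage, the option with the highest F1 score (assumed rule)."""
    return {name: max(scores, key=scores.get) for name, scores in stages.items()}


selected = select_best(STAGES)
for name, choice in selected.items():
    print(f"{name}: {choice}")
```

Some of the margins are very small (for example 0.95308 versus 0.95307 for four versus three convolution layers), so a different criterion such as execution time could reasonably lead to a different choice.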
