Attention-Based 1D CNN-BiLSTM Hybrid Model Enhanced with FastText Word Embedding for Korean Voice Phishing Detection †
Abstract
1. Introduction
- For the first time, we investigate the performance of a novel hybrid ANN that combines a 1-dimensional Convolutional Neural Network (1D CNN), a Bidirectional Long Short-term Memory (BiLSTM) architecture, and a Hierarchical Attention Network (HAN) for detecting Korean voice phishing attacks.
- We demonstrate the effectiveness of a complementary approach combining data-centric and model-centric AI methodologies to address the challenge of limited dataset availability in the context of voice phishing detection in Korea.
- We train and evaluate the prediction performance of our proposed hybrid detection model and other baseline models using the updated version of the KorCCVi dataset.
- We compare our proposed method with several existing methods for detecting Korean voice phishing.
2. Related Work
2.1. Voice Phishing Detection
2.2. Convolutional Neural Networks and Hybrid Approaches
3. Methodology
3.1. Input Block
- Transcription and Audio Processing: We used Google’s Cloud Speech-to-Text API [53] to transcribe our voice data into textual format. To generate the most accurate transcriptions possible, we converted the audio channel and format and manually adjusted the audio files for optimal audibility and length. If necessary, we segmented audio files containing multiple voice phishing attacks and manually compared optimized transcript versions to select the most accurate version. All detailed processes can be found in our prior work [54].
- Data Cleaning: This step removed irrelevant or redundant information from the raw data, including personal information such as phone numbers, as well as punctuation marks, special characters, and digits, none of which helps characterize voice phishing.
- Tokenization: During cleaning, we tokenized the dataset with the morphological analyzer MeCab-ko [55], chosen for its high-speed morphological analysis. This process breaks the cleaned text into individual tokens or words, which serve as our model’s basic input units. Several tokenization strategies are available for Korean, and their impact on a DL-based voice phishing detection model is an open research question; a comparative study could identify the most effective tokenization method for this application.
- Removal of Stop Words: This step involved the removal of Korean stop words that carry little semantic weight in the context of voice phishing.
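The cleaning, tokenization, and stop-word steps above can be sketched as a small pipeline. This is a minimal illustration, not the authors' code: the regular expression, the `STOP_WORDS` set, and the function names are assumptions, and whitespace splitting stands in for MeCab-ko so that the sketch runs without external dependencies.

```python
import re
from typing import Callable, Iterable, List

# Hypothetical stop-word list for illustration; the list used in the paper is not given here.
STOP_WORDS = {"은", "는", "이", "가", "을", "를", "에", "의"}


def clean_text(text: str) -> str:
    """Replace digits, punctuation, and special characters with spaces.

    Only Hangul syllables, Latin letters, and whitespace are kept, so phone
    numbers and similar digit strings disappear as a side effect.
    """
    text = re.sub(r"[^가-힣a-zA-Z\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def preprocess(
    text: str,
    tokenize: Callable[[str], Iterable[str]] = str.split,  # stand-in for MeCab-ko
    stop_words: Iterable[str] = STOP_WORDS,
) -> List[str]:
    """Clean the transcript, tokenize it, and drop stop words."""
    stop = set(stop_words)
    return [tok for tok in tokenize(clean_text(text)) if tok not in stop]


print(preprocess("계좌 를 확인 해 주세요 010-1234-5678!"))
```

A morphological analyzer can be passed in through `tokenize`, for instance the `morphs` method of KoNLPy's `Mecab` wrapper where it is installed; whitespace splitting leaves Korean particles attached to their stems, which is why a morpheme-level tokenizer matters for the stop-word step.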
3.2. Word Embedding Block
3.3. Features Extraction Block
3.4. Sequence Learning Block
3.5. Attention Mechanism Block
3.6. Classification Block
4. Experiments and Experimental Results
4.1. Dataset Details
4.2. Experimental Setup
4.3. Baseline Models
4.4. Evaluation Metrics
4.5. Experiment Results Analysis
4.6. Comparative Analysis
5. Discussion of the Results
6. Conclusions and Future Works
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A. Extract of the KorCCVi v2 Dataset along with the English Translation
ID | Transcript | Label |
---|---|---|
2403 | 다만 아직까지 피해자라고 증명할 증거가 없으셔서 피해자 입증 조사 도와드리려고 하는 부분이고요. 본인 같은 경우는 1차적인 혐의점이 없으셔서 녹취 조사로 진행이 되실 겁니다. 본인을 대신해 법원에 제출될 서류기 때문에 주위 잡음이나 제3자가 있는 공간에서 녹취조사 하시면 안되시고요. 실례지만 지금 직장이신가요?네 여보세요? 여보세요? 저기 잘 안들리는데요. | 1 |
2403 (English) | However, there is no evidence to prove that you are a victim yet, so I am trying to help you investigate the victim. In your case, there is no primary suspicion, so the recording investigation will proceed. Since it is a document to be submitted to the court on your behalf, do not record and investigate in a space where there is noise around you or a third party. Excuse me, are you at work? hello? hello? I can’t hear you well there. | |
669 | 건물주들이 어 말도 안 되는 가격을 말도 안 되게 가격을 아 보증금을 올리고 있는 그런 얘기가 많이 나오는데요. 그 런 점에서 그게 맞는 합당하다고 생각하시나요? 합당하지 않다고 생각합니다. 건물주들도 세입자들이 있어야만 돈을 벌고 수익을 얻기 때문에 서로 공생관계에 있다고 생각합니다. 앗 너무 지나친 월세 인상은 세입자들에게 큰 부담을 느끼게 됩니다. 더군다나 요즘 같은 코로나 사태 때 소비가 줄어든 가 가게들에게 월세를 그대로 받는다는 것은 큰 재앙과도 같습니다. 그래서 건물주가 내려주는 세 세인 만큼 나라에서 그 세액을 공제해주기도 한 하는 정책을 시행하고 있습니다. [TRUNCATED] | 0 |
669 (English) | There are a lot of stories about landlords raising their deposits at ridiculous prices. Do you think it is reasonable in that respect? I don’t think it’s worthy. I think that building owners are in a symbiotic relationship with each other because they make money and earn profits only when there are tenants. Oh, the excessive monthly rent increase puts a great burden on tenants. Moreover, it is like a big disaster to receive the monthly rent as it is from the shops that have reduced consumption during the corona crisis these days. Therefore, we are implementing a policy that allows the government to deduct the tax amount as much as it is the tax paid by the building owner. [TRUNCATED] | |
2461 | 일단은 교육 김형석 주간문경 되었는데요. 증명서 일당들이 아직 되지 않았어요. 그렇기 때문에 또 다른 대포통장 바닐라 소득이 있어서 저희가 진행 중이고요네 그리고 금융감독원 뒤에서 이제 금일 내로 될 건데요 진행 안 했을 때 보니까 직접 계좌 말고 혹시나 본인께서 모르고 또 발견이 되잖아요. 그러면 금융감독원 보내는 모르는게 사건에 대해서는 불법 계좌로 없고요. 본인이 모르면 불법계좌 건에 대해서는 사건 종결 되기 전까지 진행하도록 할 건데요. | 1 |
2461 (English) | First of all, it became a weekly reading for education Kim Hyung-seok. The certificates haven’t been done yet. That’s why there is another cannon account vanilla income, so we’re in the process. And behind the Financial Supervisory Service, it will be done today. Then, there is no illegal account for the case that the Financial Supervisory Service does not know. If you do not know, we will proceed with the illegal account case until the case is closed. | |
1805 | 그 김정은이 트럼프한테 이 서로 원하는 내용을 논의해 보자고 했다는데 그 그거에 대해서 어떻게 생각해? 서로 원하는 내용을 논의해 보자는 거는 서로가 뭘 원하는지 알고 어느 정도 알고 있는 거 같고 그래서 뭔가 계속 글로 주고받으면은 해결되는 게 없을 것 같으니깐 서로 만나서 대화를 통해서 이 문제를 해결해 나가자는 것 같다고 생각해. 너는 계속 김정은의 이런 행동은 행동이 비핵화를 시키고 우리나라가 통일될 수 있을 거라고 생각해? 나는 잘 모르겠어. 뭔가 김정은이 이렇게 계속 한다고 해서 어떻게 일이 풀릴지도 잘 모르겠고 일이 더 커질 수도 있다고 생각해. 너는 이제 막 협상이 열리고 이러잖아. [TRUNCATED] | 0 |
1805 (English) | He Kim Jong-un asked Trump to discuss what he wanted with each other. He What do you think about that? Let’s discuss what we want with each other seems to know what each other wants and know to some extent, so I think it’s like we’re going to meet each other and solve this problem through conversation because nothing seems to be resolved if we keep exchanging texts. Do you continue to think that Kim Jong-un’s actions will lead to denuclearization and Korea to be reunified? I’m not sure. Just because Kim Jong-un continues like this, I don’t know how things will work out, and I think things can get bigger. You’ve just opened negotiations and you’re like this. [TRUNCATED] | |
References
- Hernandez, J. That Panicky Call from a Relative? It Could Be a Thief Using a Voice Clone, FTC Warns. 2023. Available online: https://www.npr.org/2023/03/22/1165448073/voice-clones-ai-scams-ftc (accessed on 15 March 2023).
- Stupp, C. Fraudsters Used AI to Mimic CEO’s Voice in Unusual Cybercrime Case. 2019. Available online: https://www.wsj.com/articles/fraudsters-use-ai-to-mimic-ceos-voice-in-unusual-cybercrime-case-11567157402 (accessed on 15 March 2023).
- Brewster, T. Fraudsters Cloned Company Director’s Voice In $35 Million Heist, Police Find. 2021. Available online: https://www.forbes.com/sites/thomasbrewster/2021/10/14/huge-bank-fraud-uses-deep-fake-voice-tech-to-steal-millions/?sh=3c9c45107559 (accessed on 15 March 2023).
- Jun-bae, S. [Crime Safety] Voice Phishing Status, Types, Trends and Implications for Countermeasures. 2022. Available online: https://kostat.go.kr/board.es?mid=a90104010311&bid=12312&act=view&list_no=422196&tag=&nPage=1&ref_bid= (accessed on 15 March 2023).
- Tran, M.H.; Hoai, T.H.L.; Choo, H. A Third-Party Intelligent System for Preventing Call Phishing and Message Scams. In Proceedings of the Communications in Computer and Information Science, Online, 5–6 November 2020; Springer: Singapore, 2020; Volume 1306, pp. 486–492.
- Lee, M.; Park, E. Real-time Korean voice phishing detection based on machine learning approaches. J. Ambient. Intell. Humaniz. Comput. 2021, 14, 8173–8184.
- Moussavou Boussougou, M.K.; Park, D.J. A Real-time Efficient Detection Technique of Voice Phishing with AI. In Proceedings of the Korean Institute of Information Scientists and Engineers Korea Computer Congress; Korean Institute of Information Scientists: Jeju, Republic of Korea, 2021; pp. 768–770. Available online: https://www.dbpia.co.kr/Journal/articleDetail?nodeId=NODE10583070 (accessed on 20 May 2022).
- Kim, J.W.; Hong, G.W.; Chang, H. Voice Recognition and Document Classification-Based Data Analysis for Voice Phishing Detection. Hum. Centric Comput. Inf. Sci. 2021, 11.
- Moussavou Boussougou, M.K.; Jin, S.; Chang, D.; Park, D.J. Korean Voice Phishing Text Classification Performance Analysis Using Machine Learning Techniques. In Proceedings of the Korea Information Processing Society Conference, Yeosu, Republic of Korea, 4–6 November 2021; pp. 297–299.
- Alpaydin, E. Introduction to Machine Learning; MIT Press: Cambridge, MA, USA, 2020.
- Jordan, M.I.; Mitchell, T.M. Machine learning: Trends, perspectives, and prospects. Science 2015, 349, 255–260.
- Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016; Available online: http://www.deeplearningbook.org (accessed on 10 June 2023).
- LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
- Sarker, I.H. Machine learning: Algorithms, real-world applications and research directions. SN Comput. Sci. 2021, 2, 160.
- Alom, M.Z.; Taha, T.M.; Yakopcic, C.; Westberg, S.; Sidike, P.; Nasrin, M.S.; Hasan, M.; Van Essen, B.C.; Awwal, A.A.S.; Asari, V.K. A State-of-the-Art Survey on Deep Learning Theory and Architectures. Electronics 2019, 8, 292.
- Pouyanfar, S.; Sadiq, S.; Yan, Y.; Tian, H.; Tao, Y.; Reyes, M.P.; Shyu, M.L.; Chen, S.C.; Iyengar, S.S. A Survey on Deep Learning: Algorithms, Techniques, and Applications. ACM Comput. Surv. 2018, 51, 1–36.
- Schmidhuber, J. Deep learning in neural networks: An overview. Neural Netw. 2015, 61, 85–117.
- Ng, A. A Chat with Andrew on MLOps: From Model-Centric to Data-Centric AI. 2021. Available online: https://www.youtube.com/live/06-AZXmwHjo (accessed on 25 March 2021).
- Hamid, O.H. Data-Centric and Model-Centric AI: Twin Drivers of Compact and Robust Industry 4.0 Solutions. Appl. Sci. 2023, 13, 2753.
- Moussavou Boussougou, M.K.; Park, M.G.; Park, D.J. An Attention-Based CNN-BiLSTM Model for Korean Voice Phishing Detection. In Proceedings of the Korean Institute of Information Scientists and Engineers Korea Computer Congress; Korean Institute of Information Scientists: Jeju, Republic of Korea, 2022; pp. 1139–1141. Available online: https://www.dbpia.co.kr/journal/articleDetail?nodeId=NODE11113590 (accessed on 15 March 2023).
- Basit, A.; Zafar, M.; Liu, X.; Javed, A.R.; Jalil, Z.; Kifayat, K. A comprehensive survey of AI-enabled phishing attacks detection techniques. Telecommun. Syst. Model. Anal. Des. Manag. 2021, 76, 139–154.
- Tang, L.; Mahmoud, Q.H. A Survey of Machine Learning-Based Solutions for Phishing Website Detection. Mach. Learn. Knowl. Extr. 2021, 3, 672–694.
- Goel, D.; Jain, A.K. Mobile phishing attacks and defence mechanisms: State of art and open research challenges. Comput. Secur. 2018, 73, 519–544.
- Aleroud, A.; Zhou, L. Phishing environments, techniques, and countermeasures: A survey. Comput. Secur. 2017, 68, 160–196.
- Das, A.; Baki, S.; El Aassal, A.; Verma, R.; Dunbar, A. SoK: A Comprehensive Reexamination of Phishing Research From the Security Perspective. IEEE Commun. Surv. Tutor. 2020, 22, 671–708.
- Song, J.; Kim, H.; Gkelias, A. iVisher: Real-Time Detection of Caller ID Spoofing. ETRI J. 2014, 36, 865–875.
- Kang, Y.; Kim, W.; Lim, S.; Kim, H.; Seo, H. DeepDetection: Privacy-Enhanced Deep Voice Detection and User Authentication for Preventing Voice Phishing. Appl. Sci. 2022, 12, 11109.
- Derakhshan, A.; Harris, I.G.; Behzadi, M. Detecting Telephone-Based Social Engineering Attacks Using Scam Signatures. In Proceedings of the 2021 ACM Workshop on Security and Privacy Analytics (IWSPA ’21), New York, NY, USA, 26–28 April 2021; pp. 67–73.
- Jeong, E.S.; Lim, J.I. Study on Intelligence (AI) Detection Model about Telecommunication Finance Fraud Accident. J. Korea Inst. Inf. Secur. Cryptol. 2019, 29, 149–164.
- Zhao, Q.; Chen, K.; Li, T.; Yang, Y.; Wang, X. Detecting telecommunication fraud by understanding the contents of a call. Cybersecurity 2018, 1, 8.
- Peng, L.; Lin, R. Fraud Phone Calls Analysis Based on Label Propagation Community Detection Algorithm. In Proceedings of the 2018 IEEE World Congress on Services (SERVICES), San Francisco, CA, USA, 2–7 July 2018; pp. 23–24.
- Kim, W.W.; Kang, Y.J.; Kim, H.J.; Yang, Y.J.; Oh, Y.J.; Lee, M.W.; Lim, S.J.; Seo, H.J. Determination of voice phishing based on deep learning and sentiment analysis. In Proceedings of the Korea Information Processing Society Conference, Online, 14–15 May 2021; pp. 811–814.
- Kale, N.; Kochrekar, S.; Mote, R.; Dholay, S. Classification of Fraud Calls by Intent Analysis of Call Transcripts. In Proceedings of the 2021 12th International Conference on Computing Communication and Networking Technologies (ICCCNT), Kharagpur, India, 6–8 July 2021; pp. 1–6.
- Moussavou Boussougou, M.K.; Park, D.J. Exploiting Korean Language Model to Improve Korean Voice Phishing Detection. KIPS Trans. Softw. Data Eng. 2022, 11, 437–446.
- Yang, J.; Lee, C.; Kim, S.B. Development and Utilization of Voice Phishing Prevention Service through KoBERT-based Voice Call Analysis. KIISE Trans. Comput. Pract. 2023, 29, 205–213.
- Yoon, J.Y.; Choi, B.J. Privacy-Friendly Phishing Attack Detection Using Personalized Federated Learning. In Proceedings of the Intelligent Human Computer Interaction, Copenhagen, Denmark, 23–28 July 2023; Zaynidinov, H., Singh, M., Tiwary, U.S., Singh, D., Eds.; Springer: Cham, Switzerland, 2023; pp. 460–465.
- Rothman, D. Transformers for Natural Language Processing, 2nd ed.; Packt: Birmingham, UK, 2022; p. 564.
- Kiranyaz, S.; Ince, T.; Gabbouj, M. Real-Time Patient-Specific ECG Classification by 1-D Convolutional Neural Networks. IEEE Trans. Biomed. Eng. 2016, 63, 664–675.
- Kiranyaz, S.; Avci, O.; Abdeljaber, O.; Ince, T.; Gabbouj, M.; Inman, D.J. 1D convolutional neural networks and applications: A survey. Mech. Syst. Signal Process. 2021, 151, 107398.
- Junior, R.F.R.; dos Santos Areias, I.A.; Campos, M.M.; Teixeira, C.E.; da Silva, L.E.B.; Gomes, G.F. Fault detection and diagnosis in electric motors using 1d convolutional neural networks with multi-channel vibration signals. Measurement 2022, 190, 110759.
- Fang, Y.; Zhang, C.; Huang, C.; Liu, L.; Yang, Y. Phishing Email Detection Using Improved RCNN Model With Multilevel Vectors and Attention Mechanism. IEEE Access 2019, 7, 56329–56340.
- Huang, Y.; Yang, Q.; Qin, J.; Wen, W. Phishing URL Detection via CNN and Attention-Based Hierarchical RNN. In Proceedings of the 2019 18th IEEE International Conference on Trust, Security and Privacy in Computing and Communications/13th IEEE International Conference on Big Data Science and Engineering (TrustCom/BigDataSE), Rotorua, New Zealand, 5–8 August 2019; pp. 112–119.
- Zhou, Y.; Xu, J.; Cao, J.; Xu, B.; Li, C.; Xu, B. Hybrid Attention Networks for Chinese Short Text Classification. Comput. Sist. 2018, 21, 759–769.
- Hao, M.; Xu, B.; Liang, J.Y.; Zhang, B.W.; Yin, X.C. Chinese Short Text Classification with Mutual-Attention Convolutional Neural Networks. ACM Trans. Asian Low-Resour. Lang. Inf. Process. 2020, 19, 1–13.
- Deng, J.; Cheng, L.; Wang, Z. Attention-based BiLSTM fused CNN with gating mechanism model for Chinese long text classification. Comput. Speech Lang. 2021, 68, 101182.
- Jang, B.; Kim, M.; Harerimana, G.; Kang, S.u.; Kim, J.W. Bi-LSTM Model to Increase Accuracy in Text Classification: Combining Word2vec CNN and Attention Mechanism. Appl. Sci. 2020, 10, 5841.
- Kamyab, M.; Liu, G.; Rasool, A.; Adjeisah, M. ACR-SA: Attention-based deep model through two-channel CNN and Bi-RNN for sentiment analysis. PeerJ Comput. Sci. 2022, 8, e877.
- Bojanowski, P.; Grave, E.; Joulin, A.; Mikolov, T. Enriching Word Vectors with Subword Information. Trans. Assoc. Comput. Linguist. 2017, 5, 135–146.
- Grave, E.; Bojanowski, P.; Gupta, P.; Joulin, A.; Mikolov, T. Learning Word Vectors for 157 Languages. arXiv 2018, arXiv:1802.06893.
- Mikolov, T.; Grave, E.; Bojanowski, P.; Puhrsch, C.; Joulin, A. Advances in Pre-Training Distributed Word Representations. arXiv 2017, arXiv:1712.09405.
- Uysal, A.K.; Gunal, S. The impact of preprocessing on text classification. Inf. Process. Manag. 2014, 50, 104–112.
- Raffel, C.; Shazeer, N.; Roberts, A.; Lee, K.; Narang, S.; Matena, M.; Zhou, Y.; Li, W.; Liu, P.J. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. J. Mach. Learn. Res. 2019, 21, 1–67.
- Google. Speech-to-Text: Automatic Speech Recognition. Available online: https://cloud.google.com/speech-to-text (accessed on 23 May 2022).
- Moussavou Boussougou, M.K. An Artificial Intelligent Approach to Detect Voice Phishing Crime by Analyzing the Call Content: A Case Study on Voice Phishing Crime in South Korea. Master’s Thesis, Soongsil University, Seoul, Republic of Korea, 2021. Available online: https://www.riss.kr/link?id=T15765846 (accessed on 23 May 2022).
- Kudo, T. MeCab: Yet Another Part-of-Speech and Morphological Analyzer. 2005. Available online: http://taku910.github.io/mecab/ (accessed on 23 May 2022).
- Graves, A.; Schmidhuber, J. Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Netw. 2005, 18, 602–610.
- Yang, Z.; Yang, D.; Dyer, C.; He, X.; Smola, A.; Hovy, E. Hierarchical Attention Networks for Document Classification. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, CA, USA, 12–17 June 2016; pp. 1480–1489.
Ref. | Year | Phishing Types | Strategies | Methods | Datasets | Advantages | Limitations |
---|---|---|---|---|---|---|---|
[26] | 2014 | Voice phishing, Spoofing | Caller ID verification | Trace back incoming call to its originating gateway | N/A | Real-time Caller ID Spoofing Detection, Minimal Call Setup Time Impact, Device Compatibility | End-User Response Dependence, Stakeholder Participation Dependence, Lack of Real-World Testing |
[5] | 2020 | Voice phishing | Blacklisting and whitelisting | ML | Global phone book | Caller Identification, Mobile Phishing Attack Prevention, Comprehensive Solution Against Mobile Phishing | Inefficiency Handling New Phone Numbers |
[27] | 2022 | Voice phishing, Deepfake | User authentication and Deep voice detection | AutoEncoder | ASVspoof 2019 | Synthetic Voice Detection, Deepfake Voice Identification, Sender Identity Validation | Generalizability on Non-English Datasets, Computational Demand, Lack of Comparative Analysis with Existing Methods |
[28] | 2021 | Voice phishing | Conversation semantic content analysis | K-Means | Handmade scam conversations and the CallHome dataset | Scam Signatures Introduction, Novel Concept for Scam Calls Detection | Unrepresentative Nature of Human-Generated Telephone Scam Conversation Dataset |
[29] | 2019 | Voice phishing (telecommunication finance fraud) | Hybrid (Blacklisting, Rule, CNN) | Filtering, CNN | Financial transaction data | Rule Models and AI Algorithms Integration, Synergistic Combination of Techniques | Lack of Comparative Analysis with Existing Methods, Lack of Real-World Testing |
[30] | 2018 | Voice phishing (telecommunication fraud) | Call content analysis | ML, NLP, rules | Fraudulent call descriptions crawled from social media | Superior Performance over Blacklisting Strategies, Development of Android Application | Small Dataset Size, Lack of Real-world Samples in Dataset, Inferior Performance of Local-based Speech Recognition vs. Cloud-based |
[31] | 2018 | Voice phishing | Call content analysis | TF-IDF, Label propagation community (LPA) | Call texts | Fraud Calls Detection in Community Network, Non-reliance on Passive Interception on Smartphone Terminals | Potential for Missed Fraudulent Calls, Analysis Indicating Vulnerabilities within Isolated Communities |
[8] | 2021 | Voice phishing | Call content analysis | Latent semantic analysis (LSA), K-means | FSS | Comprehensive Comparison of Speech Recognition Techniques, Detailed Examination of Embedding Techniques | Small Dataset Size, Resulting Low Performance, Lack of Experiment Details, Issues with Replicability |
[7] | 2021 | Voice phishing | Call content analysis | NLP, Random Forest, XGBoost, LGBM, and CatBoost, Linear SVC, RNN, BiLSTM, GRU | KorCCVi v1 | Introduction of KorCCVi Dataset, Inclusion of Real-world Voice Phishing Data, Real-Time Voice Phishing Detection Efficiency | Small Dataset Size, Generalizability on Non-Korean Datasets, Lack of Real-World Testing |
[9] | 2021 | Voice phishing | Call content analysis | NLP, CatBoost, Gradient XGBoost, LGBM, Linear SVC | KorCCVi v1 | Rapid Training Time, Swift Inference Time, Efficiency of ML Model | Small Dataset Size, Lack of Deep Learning Architectures Explored |
[6] | 2021 | Voice phishing | Call content analysis | SVM, Logistic Regression, Decision Tree, Random Forest, XGB | FSS + NIKL | Real-Time Detection Capability, Comparative Evaluation of Two Korean Morpheme Analyzers | Imbalanced Dataset, Lack of DL Architectures Explored, Lack of Comparative Analysis with Existing Methods |
[32] | 2021 | Voice phishing | Call content and sentiment analysis | CNN, BiLSTM | - | Inclusion of Sentiment Analysis for Enhanced Detection, Reliable and Efficient Model Implementation | Insufficient Dataset Details, Requirement of Domain-Independent Sentiment Lexicon |
[33] | 2021 | Voice phishing | Call content analysis | Naive Bayes, CNN | Mix of conversational transcripts and human-made fraud call transcripts | Use of Oversampling Method for Dataset Skewness, High Performance of Intent Analysis Models | Highly Imbalanced Dataset, Lack of Real-world Samples in Dataset, Need for Additional Algorithm Testing |
[34] | 2022 | Voice phishing | Call content analysis | KoBERT | KorCCVi v1 | High Accuracy Achieved with KoBERT-based Model, Extensive Comparison with ML and DL Algorithms | Small Dataset Size, Imbalanced Dataset, Lack of Hyperparameter Optimization |
[35] | 2023 | Voice phishing | Call content analysis | KoBERT | FSS + AI Hub | Impressive Accuracy of KoBERT-based Model, Provision of Educational Content for Potential Victims, Risk Evaluation API Service | Lack of Dataset Details, Overemphasis on Model Accuracy Metric, Overfitting Issue Beyond 10 Epochs |
[36] | 2023 | Voice phishing | Call content analysis | Federated Learning | KorCCVi v2 | User Data Privacy Preservation, Communication Efficiency, Client Grouping Based on Characteristics, Personalized Data Requirement Recommendations | Overemphasis on Model Accuracy Metric, Lack of Comparative Analysis with Existing Methods |
Source | Class (Label) | Samples | Percentage |
---|---|---|---|
FSS | Voice phishing (1) | 695 | 23.7% |
NIKL | Non-voice phishing (0) | 2232 | 76.3% |
Total | | 2927 | 100% |
Training Set | Validation Set | Test Set | Total |
---|---|---|---|
2370 | 264 | 293 | 2927 |
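The class shares and split sizes in the two tables above can be checked with a few lines of arithmetic. The sizes are consistent with holding out 10% for testing and then 10% of the remainder for validation, rounding the held-out count up each time. The tables do not state how the split was made, so this is an inference, and the `hold_out` helper is a hypothetical name.

```python
from typing import Tuple


def hold_out(n: int, pct: int) -> Tuple[int, int]:
    """Split n samples into (kept, held_out), rounding the held-out count up."""
    held = (n * pct + 99) // 100  # integer ceiling of n * pct / 100
    return n - held, held


PHISHING, NON_PHISHING = 695, 2232          # per-class counts from the dataset table
TOTAL = PHISHING + NON_PHISHING

rest, test = hold_out(TOTAL, 10)            # assumed: 10% of all samples for testing
train, val = hold_out(rest, 10)             # assumed: 10% of the remainder for validation

phishing_share = round(100 * PHISHING / TOTAL, 1)
non_phishing_share = round(100 * NON_PHISHING / TOTAL, 1)

print(TOTAL, (train, val, test), phishing_share, non_phishing_share)
```

With roughly one phishing transcript for every three non-phishing ones, the split would normally be stratified by label so that each subset keeps a similar class ratio; whether that was done is not stated in the tables.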
Hyperparameters | Values |
---|---|
Word embedding vector dimension | 300 |
Number of convolution filters | 32 |
Convolutional kernel size | 3 |
Number of pooling layers | 1 |
Pooling size | 2 |
Number of dropout layers | 2 |
Spatial dropout rate | 0.2 |
Dropout rate | 0.1 |
Number of LSTM hidden units | (64, 32) |
Number of attention layers | 1 |
Number of dense layers | 2 |
Activation function type | ReLU, tanh, Softmax |
Number of epochs | 10 |
Batch size | 64 |
Learning rate | |
Learning decay | |
Optimizer | Adam |
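To show how these hyperparameters translate into trainable weights, the sketch below applies the standard parameter-count formulas for a 1D convolution, a bidirectional LSTM, and an additive attention layer in the style of hierarchical attention networks. The wiring is an assumption: the convolution reads 300-dimensional embeddings, two stacked BiLSTM layers use 64 and 32 units per direction, and the attention layer has the same width as its 64-dimensional input. The dense layers are left out because their sizes are not listed, so the sum is not expected to match the reported totals.

```python
def conv1d_params(in_channels: int, filters: int, kernel_size: int) -> int:
    """Weights plus one bias per filter."""
    return kernel_size * in_channels * filters + filters


def bilstm_params(input_dim: int, units: int) -> int:
    """Four gates per direction, each with input weights, recurrent weights,
    and a single bias vector (the Keras convention; an assumption here)."""
    one_direction = 4 * (units * (input_dim + units) + units)
    return 2 * one_direction


def attention_params(input_dim: int, attention_dim: int) -> int:
    """Projection matrix W, bias b, and context vector u."""
    return input_dim * attention_dim + attention_dim + attention_dim


conv = conv1d_params(in_channels=300, filters=32, kernel_size=3)
bilstm_1 = bilstm_params(input_dim=32, units=64)       # fed by the 32 conv filters
bilstm_2 = bilstm_params(input_dim=2 * 64, units=32)   # fed by both directions of layer 1
attention = attention_params(input_dim=2 * 32, attention_dim=64)

# Trainable-parameter totals reported for the models with and without attention.
reported_gap = 153_660 - 149_436

print(conv, bilstm_1, bilstm_2, attention, reported_gap)
```

Under these assumptions the attention layer accounts for 4,224 parameters, which equals the difference between the trainable-parameter counts reported for the attention-based model and the plain 1D CNN-BiLSTM. The match is suggestive rather than conclusive, since other layer configurations could yield the same gap.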
Models | Precision | Recall | F1 Score | Accuracy | Trainable Parameters | Training Time (s) |
---|---|---|---|---|---|---|
1D CNN | 98.35% | 98.29% | 98.31% | 98.29% | 429,756 | 18.61 |
LSTM | 97.30% | 97.27% | 97.22% | 97.27% | 108,098 | 63.07 |
BiLSTM | 84.12% | 84.64% | 84.32% | 84.64% | 232,386 | 143.34 |
1D CNN-BiLSTM | 98.99% | 98.98% | 98.97% | 98.98% | 149,436 | 131.69 |
Attention-based 1D CNN-BiLSTM | 99.32% | 99.32% | 99.31% | 99.32% | 153,660 | 137.83 |
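Recall and accuracy agree in every row of the table above, which is what support-weighted averaging over classes produces, because weighted recall always equals accuracy. The sketch below computes weighted precision, recall, and F1 from a confusion matrix. Whether the authors used weighted averaging is an assumption, and the confusion matrix is hypothetical: it simply places 2 errors among 293 test samples, and the per-class counts are invented for illustration.

```python
from typing import Dict, List


def weighted_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """cm[i][j] counts samples whose true class is i and predicted class is j."""
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n_classes))
    precision = recall = f1 = 0.0
    for k in range(n_classes):
        support = sum(cm[k])                                  # true samples of class k
        predicted = sum(cm[i][k] for i in range(n_classes))   # samples predicted as k
        p = cm[k][k] / predicted if predicted else 0.0
        r = cm[k][k] / support if support else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support / total                              # class share of the test set
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return {"precision": precision, "recall": recall, "f1": f1,
            "accuracy": correct / total}


# Hypothetical confusion matrix: rows and columns are (non-phishing, phishing).
example = weighted_metrics([[222, 1], [1, 69]])
print({name: round(100 * value, 2) for name, value in example.items()})
```

On a 293-sample test set each misclassified transcript moves accuracy by about 0.34 percentage points, so 291 correct predictions correspond to the 99.32% reported for the attention-based model.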
Ref. | Methods | Embeddings | Datasets (Total Samples) | F1 Score | Accuracy |
---|---|---|---|---|---|
[5] | ML | None | Global phone book | - * | - * |
[36] | Federated Learning | None | KorCCVi v2 (2927) | - * | - * |
[32] | CNN, BiLSTM | None | - * | - * | - * |
[6] | SVM, Logistic Regression, Decision Tree, Random Forest, XGB | TF-IDF | FSS + NIKL (2847) | 100% | 100% |
[7] | Random Forest, XGBoost, LGBM, and CatBoost, Linear SVC, RNN, BiLSTM, GRU | TF-IDF, FastText | KorCCVi v1 (1218) | 99.43% | 99.45% |
[8] | LSA, K-means | Doc2Vec, TF-IDF | FSS | 74% | 61% |
[34] | KoBERT | KoBERT | KorCCVi v1 (1218) | 99.57% | 99.60% |
[35] | KoBERT | KoBERT | FSS + AI Hub | - * | 97.86% |
Ours | Attention-based 1D CNN-BiLSTM | FastText | KorCCVi v2 (2927) | 99.31% | 99.32% |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Moussavou Boussougou, M.K.; Park, D.-J. Attention-Based 1D CNN-BiLSTM Hybrid Model Enhanced with FastText Word Embedding for Korean Voice Phishing Detection. Mathematics 2023, 11, 3217. https://doi.org/10.3390/math11143217