Learning and Fusing Multi-View Code Representations for Function Vulnerability Detection
Abstract
1. Introduction
- A novel DL-based vulnerability detection method called SFVD is presented. To accommodate the distinct representations of the code, an adaptive learning model has been devised to capture the multi-faceted aspects of function semantics and fuse them together to ensure the extraction of comprehensive semantic features. This strategy effectively prevents the loss of critical features that are indicative of vulnerability patterns.
- An extended-tree-structured neural network called ERvNN has been designed, which can effectively encode the semantics implied in the abstract syntax tree. With a GRU-style aggregation optimization on the tree nodes, it supports the straightforward and efficient encoding of multi-way tree structures, which would otherwise first have to be converted into binary-tree form.
- Extensive experiments were conducted to evaluate the performance of SFVD. The results demonstrated that SFVD outperformed existing state-of-the-art DL-based methods in terms of accuracy, F score, precision, and recall when detecting the presence of vulnerabilities and pinpointing the specific vulnerability types. Moreover, ablation studies confirmed the effectiveness of the devised ERvNN for encoding AST and the strategy of representation fusion for enhancing the performance of SFVD.
- A new dataset has been constructed to facilitate vulnerability detection research. The dataset consists of 25,333 C functions, each of which is labeled with either a specific CWE ID indicating a vulnerability or a non-vulnerable ground truth. The source implementation of SFVD has also been made publicly available at https://github.com/lv-jiajun/S2FVD (accessed on 22 May 2023) to facilitate future benchmarking and comparisons.
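The GRU-style aggregation that ERvNN performs over multi-way tree nodes is only described at a high level above. One plausible reading can be sketched with NumPy as follows; the gating equations, parameter names, and the use of the mean child state are assumptions for illustration, not the authors' exact formulation. The key point is that a node's token embedding is gated against a summary of its already-encoded children, so arbitrary branching factors are handled without binarizing the tree:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_node_update(x, children, Wz, Uz, Wr, Ur, Wh, Uh):
    """Encode one multi-way tree node: x is the node's token embedding,
    children is a list of already-encoded child state vectors.
    GRU-style gates blend the node input with the mean child state
    (hypothetical formulation; no binary-tree conversion needed)."""
    h_c = np.mean(children, axis=0) if children else np.zeros_like(x)
    z = sigmoid(Wz @ x + Uz @ h_c)            # update gate
    r = sigmoid(Wr @ x + Ur @ h_c)            # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h_c))
    return (1 - z) * h_c + z * h_tilde

# toy usage: encode a node with three children in a 4-dim state space
rng = np.random.default_rng(0)
d = 4
params = [rng.standard_normal((d, d)) * 0.1 for _ in range(6)]
x = rng.standard_normal(d)
children = [rng.standard_normal(d) for _ in range(3)]
h = gru_node_update(x, children, *params)
print(h.shape)  # (4,)
```

Applying this update bottom-up over the AST yields one state vector per node; the root vector can then serve as the tree-view embedding of the function.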
2. Related Work
2.1. Code-Similarity-Based Methods
2.2. Rule-Based Methods
2.3. Learning-Based Methods
2.3.1. Conventional Machine Learning-Based Methods
2.3.2. Deep-Learning-Based Methods
3. The Approach
3.1. Semantic Encoding of Token Sequence
3.1.1. Token Sequence Preparation
3.1.2. Sequence Encoding Network
3.2. Semantic Encoding of ACFG
3.2.1. ACFG Preparation
3.2.2. Graph Encoding Network
3.3. Semantic Encoding of AST
3.3.1. AST Preparation
3.3.2. Tree Encoding Network
3.4. Multi-View Fusion
4. Experiments and Evaluations
- RQ1: Impacts of the Fusion Strategies—which fusion strategy, as discussed in Section 3.4, most effectively blends the semantic features collected from the distinct code perspectives for SFVD to deliver its best vulnerability detection performance?
- RQ2: Performance Comparison with Baseline Methods—how does the performance of SFVD compare to the baseline methods in detecting the presence of vulnerabilities, as well as pinpointing the specific vulnerability types?
- RQ3: Substitutional Study—how does SFVD behave when its constituent neural network structures are substituted with other typical neural networks?
- RQ4: Ablation Study—does fusing multiple semantic features captured from distinct code views help boost the vulnerability detection performance compared with using part of them?
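The concrete fusion strategies compared in Section 3.4 are not enumerated in this excerpt. As a point of reference for RQ1, three strategies commonly compared in such studies (plain concatenation, element-wise averaging, and an attention-weighted sum) can be sketched as follows; the function names and the single learned score vector `w` are illustrative assumptions:

```python
import numpy as np

def fuse_concat(views):
    # simplest strategy: concatenate the per-view feature vectors
    return np.concatenate(views)

def fuse_average(views):
    # element-wise mean; requires all views to share one dimension
    return np.mean(views, axis=0)

def fuse_attention(views, w):
    # score each view with a learned vector, softmax-normalize,
    # and take the weighted sum of the view embeddings
    scores = np.array([v @ w for v in views])
    alphas = np.exp(scores - scores.max())
    alphas /= alphas.sum()
    return sum(a * v for a, v in zip(alphas, views))

# toy per-view embeddings for token-sequence, ACFG, and AST views
seq, graph, tree = np.ones(4), 2 * np.ones(4), 3 * np.ones(4)
print(fuse_concat([seq, graph, tree]).shape)   # (12,)
print(fuse_average([seq, graph, tree]))        # [2. 2. 2. 2.]
att = fuse_attention([seq, graph, tree], np.ones(4))
print(att.shape)  # (4,)
```

Concatenation preserves all per-view information at the cost of a larger fused dimension, while averaging and attention keep the dimension fixed; which trade-off wins is exactly what RQ1 measures empirically.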
4.1. Experimental Setup
4.1.1. Datasets
4.1.2. Experiment Settings
4.1.3. Baseline Methods
- VulDeePecker proposes to extract code gadgets, which consist of code statements that exhibit control dependency relationships with respect to certain code elements of interest (such as library/API calls and array usage), to represent programs. Recurrent neural networks are then trained on these gadgets to detect vulnerabilities.
- SySeVR further enriches the concept of code gadgets. It proposes SeVCs (semantic-based vulnerability candidates) to represent the code by taking into account the data dependencies among the code statements in addition to the control dependencies.
- Reveal is an approach that operates on the graph-based representation of code known as the code property graph (CPG). It uses a GGNN (gated graph neural network) to extract features that are indicative of vulnerabilities present in the code.
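Reveal's GGNN encoder, mentioned above, amounts to iterated gated message passing over the code property graph. The single propagation step below is an illustrative sketch only; the parameter shapes and the plain-adjacency message function are simplifying assumptions, not Reveal's actual implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ggnn_step(H, A, W, Wz, Uz, Wr, Ur, Wh, Uh):
    """One gated-graph propagation step: each node aggregates neighbor
    states along adjacency A, then a GRU cell updates its own state."""
    M = A @ H @ W                     # neighbor messages, shape (n, d)
    z = sigmoid(M @ Wz + H @ Uz)      # update gate
    r = sigmoid(M @ Wr + H @ Ur)      # reset gate
    H_tilde = np.tanh(M @ Wh + (r * H) @ Uh)
    return (1 - z) * H + z * H_tilde

# toy 3-node path graph with 4-dim node states
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
H = rng.standard_normal((3, 4))
mats = [rng.standard_normal((4, 4)) * 0.1 for _ in range(7)]
H_next = ggnn_step(H, A, *mats)
print(H_next.shape)  # (3, 4)
```

Running several such steps and pooling the final node states gives a fixed-size graph embedding from which vulnerability-indicative features are extracted.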
4.1.4. Evaluation Metrics
4.2. Experimental Results
4.2.1. RQ1: Impacts of the Fusion Strategies
4.2.2. RQ2: Performance Comparison with Baseline Methods
4.2.3. RQ3: Substitutional Analysis
4.2.4. RQ4: Ablation Study
5. Discussion
5.1. Threats to Validity
5.2. Limitations
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- CVE. Available online: https://cve.mitre.org/ (accessed on 24 April 2023).
- Scandariato, R.; Walden, J.; Hovsepyan, A. Predicting vulnerable software components via text mining. IEEE Trans. Softw. Eng. 2014, 40, 987–1001. [Google Scholar] [CrossRef]
- Shaukat, K.; Luo, S.; Varadharajan, V. A novel deep learning-based approach for malware detection. Eng. Appl. Artif. Intell. 2023, 122, 106030. [Google Scholar] [CrossRef]
- Tian, Z.; Wang, Q.; Gao, C.; Chen, L.; Wu, D. Plagiarism detection of multi-threaded programs via siamese neural networks. IEEE Access 2020, 8, 160802–160814. [Google Scholar] [CrossRef]
- Tian, Z.; Huang, Y.; Xie, B.; Chen, Y.; Chen, L.; Wu, D. Fine-grained compiler identification with sequence-oriented neural modeling. IEEE Access 2021, 9, 49160–49175. [Google Scholar] [CrossRef]
- Tian, Z.; Tian, J.; Wang, Z.; Chen, Y.; Xia, H.; Chen, L. Landscape estimation of solidity version usage on ethereum via version identification. Int. J. Intell. Syst. 2021, 37, 450–477. [Google Scholar] [CrossRef]
- Russell, R.; Kim, L.; Hamilton, L. Automated vulnerability detection in source code using deep representation learning. In Proceedings of the 2018 17th IEEE International Conference on Machine Learning and Applications, Orlando, FL, USA, 17–20 December 2018; pp. 757–762. [Google Scholar]
- Li, Z.; Zou, D.; Xu, S. SySeVR: A Framework for Using Deep Learning to Detect Software Vulnerabilities. IEEE Trans. Dependable Secur. Comput. 2021, 19, 2244–2258. [Google Scholar] [CrossRef]
- Zhou, Y.; Liu, S.; Siow, J. Devign: Effective vulnerability identification by learning comprehensive program semantics via graph neural networks. Adv. Neural Inf. Process. Syst. 2019, 32, 1–11. [Google Scholar]
- Sun, H.; Cui, L.; Li, L. VDSimilar: Vulnerability detection based on code similarity of vulnerabilities and patches. Comput. Secur. 2021, 110, 102417. [Google Scholar] [CrossRef]
- Jang, J.; Agrawal, A.; Brumley, D. ReDeBug: Finding Unpatched Code Clones in Entire OS Distributions. In Proceedings of the 2012 IEEE Symposium on Security and Privacy, San Francisco, CA, USA, 20–23 May 2012; pp. 48–62. [Google Scholar]
- FlawFinder. Available online: https://dwheeler.com/flawfinder/ (accessed on 10 April 2023).
- Younis, A.; Malaiya, Y.; Anderson, C. To fear or not to fear that is the question: Code characteristics of a vulnerable function with an existing exploit. In Proceedings of the Sixth ACM Conference on Data and Application Security and Privacy, New Orleans, LA, USA, 9–11 March 2016; pp. 97–104. [Google Scholar]
- Hin, D.; Kan, A.; Chen, H. LineVD: Statement-level vulnerability detection using graph neural networks. In Proceedings of the 19th International Conference on Mining Software Repositories, Pittsburgh, PA, USA, 23–24 May 2022; pp. 596–607. [Google Scholar]
- Yang, S.; Cheng, L.; Zeng, Y. Asteria: Deep Learning-based AST-Encoding for Cross-platform Binary Code Similarity Detection. In Proceedings of the 2021 51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), Taipei, China, 21–24 June 2021; pp. 154–196. [Google Scholar]
- Vadayath, J.; Eckert, M.; Zeng, K.; Weideman, N.; Menon, G.P.; Fratantonio, Y.; Balzarotti, D.; Doupé, A.; Bao, T.; Wang, R.; et al. Arbiter: Bridging the static and dynamic divide in vulnerability discovery on binary programs. In 31st USENIX Security Symposium (USENIX Security 22); USENIX Association: Boston, MA, USA, 2022; pp. 413–430. [Google Scholar]
- Beaman, C.; Redbourne, M.; Mummery, J.D.; Hakak, S. Fuzzing vulnerability discovery techniques: Survey, challenges and future directions. Comput. Secur. 2022, 120, 102813. [Google Scholar] [CrossRef]
- Zheng, P.; Zheng, Z.; Luo, X. Park: Accelerating smart contract vulnerability detection via parallel-fork symbolic execution. In Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis; Ser. ISSTA 2022; Association for Computing Machinery: New York, NY, USA, 2022; pp. 740–751. [Google Scholar]
- D’Silva, V.; Kroening, D.; Weissenbacher, G. A survey of automated techniques for formal software verification. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 2008, 27, 1165–1178. [Google Scholar] [CrossRef]
- Li, Z.; Zou, D.Q.; Xu, S.H. VulPecker: An automated vulnerability detection system based on code similarity analysis. In Proceedings of the 32nd Annual Conference on Computer Security Applications (ACSAC ’16). Association for Computing Machinery, New York, NY, USA, 5–8 December 2016; pp. 201–213. [Google Scholar]
- Cui, L.; Hao, Z.; Jiao, Y. Vuldetector: Detecting vulnerabilities using weighted feature graph comparison. IEEE Trans. Inf. Forensics Secur. 2020, 16, 2004–2017. [Google Scholar] [CrossRef]
- Li, Z. Survey on static software vulnerability detection for source code. Chin. J. Netw. Inf. Secur. 2019, 5, 1–14. [Google Scholar]
- Kim, S.; Woo, S.; Lee, H.; Oh, H. Vuddy: A scalable approach for vulnerable code clone discovery. In Proceedings of the 2017 IEEE Symposium on Security and Privacy (SP), San Jose, CA, USA, 22–26 May 2017; pp. 595–614. [Google Scholar]
- Infer: A Tool to Detect Bugs in Java and C/C++/Objective-C Code before It Ships. 2013. Available online: https://fbinfer.com (accessed on 12 April 2023).
- CodeChecker. 2013. Available online: https://codechecker.readthedocs.io/en/latest (accessed on 20 April 2023).
- Checkmarx. 2022. Available online: https://www.checkmarx.com (accessed on 28 April 2023).
- Lipp, S.; Banescu, S.; Pretschner, A. An Empirical Study on the Effectiveness of Static C Code Analyzers for Vulnerability Detection. In Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis, Virtual, Republic of Korea, 18–22 July 2022; pp. 544–555. [Google Scholar]
- Goseva-Popstojanova, K.; Perhinschi, A. On the capability of static code analysis to detect security vulnerabilities. Inf. Softw. Tech. 2015, 68, 18–33. [Google Scholar] [CrossRef]
- Ghaffarian, S.M.; Shahriari, H.R. Software vulnerability analysis and discovery using machine-learning and data-mining techniques: A survey. ACM Comput. Surv. 2017, 50, 1–36. [Google Scholar] [CrossRef]
- Perl, H.; Dechand, S.; Smith, M. Vccfinder: Finding potential vulnerabilities in open-source projects to assist code audits. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, Denver, CO, USA, 12–16 October 2015; pp. 426–437. [Google Scholar]
- Bosu, A.; Carver, J.C.; Hafiz, M. Identifying the characteristics of vulnerable code changes: An empirical study. In Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, Hong Kong, China, 16–21 November 2014; pp. 257–268. [Google Scholar]
- Lin, G.; Wen, S.; Han, Q.L. Software Vulnerability Detection Using Deep Neural Networks: A Survey. Proc. IEEE 2020, 108, 1825–1848. [Google Scholar] [CrossRef]
- Li, Z.; Zou, D.; Xu, S. Vuldeepecker: A deep learning-based system for vulnerability detection. In Proceedings of the 2018 25th Annual Network and Distributed System Security Symposium (NDSS’18), San Diego, CA, USA, 18–21 February 2018; pp. 1–15. [Google Scholar]
- Dam, H.K.; Pham, T.; Ng, S.W. A deep tree-based model for software defect prediction. arXiv 2018, arXiv:1802.00921. [Google Scholar]
- Li, Y.; Wang, S.; Nguyen, T.N. Vulnerability detection with fine-grained interpretations. In Proceedings of the 2021 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Athens, Greece, 23–28 August 2021; pp. 292–303. [Google Scholar]
- Johnson, R.; Zhang, T. Deep pyramid convolutional neural networks for text categorization. In Proceedings of the 2017 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, BC, Canada, 30 July–4 August 2017; pp. 562–570. [Google Scholar]
- Mikolov, T.; Sutskever, I.; Chen, K. Distributed representations of words and phrases and their compositionality. Adv. Neural Inf. Process. Syst. 2013, 26, 3111–3119. [Google Scholar]
- Wolf, L.; Hanani, Y.; Bar, K. Joint word2vec Networks for Bilingual Semantic Representations. Int. J. Comput. Linguist. Appl. 2014, 5, 27–42. [Google Scholar]
- He, K.; Zhang, X.; Ren, S. Identity mappings in deep residual networks. In Proceedings of the 14th European Conference on Computer Vision (ECCV 2016), Amsterdam, The Netherlands, 11–14 October 2016; pp. 630–645. [Google Scholar]
- Joern. Available online: https://joern.readthedocs.io/en/latest/ (accessed on 20 April 2023).
- Wang, X.; Ji, H.; Shi, C. Heterogeneous graph attention network. In Proceedings of the World Wide Web Conference (WWW 2019), San Francisco, CA, USA, 13–17 May 2019; pp. 2022–2032. [Google Scholar]
- Defferrard, M.; Bresson, X.; Vandergheynst, P. Convolutional neural networks on graphs with fast localized spectral filtering. Adv. Neural Inf. Process. Syst. 2016, 29, 3844–3852. [Google Scholar]
- Baxter, I.D.; Yahin, A.; Moura, L. Clone detection using abstract syntax trees. In Proceedings of the 1998 International Conference on Software Maintenance, Bethesda, MD, USA, 16–19 March 1998; pp. 368–377. [Google Scholar]
- Tang, Z.; Shen, X.; Li, C. AST-trans: Code summarization with efficient tree-structured attention. In Proceedings of the 2022 44th International Conference on Software Engineering, Pittsburgh, PA, USA, 21–29 May 2022; pp. 150–162. [Google Scholar]
- Zhang, J.; Wang, X.; Zhang, H. A novel neural source code representation based on abstract syntax tree. In Proceedings of the 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE), Montreal, QC, Canada, 25–31 May 2019; pp. 783–794. [Google Scholar]
- Pycparser. Available online: https://pypi.org/project/pycparser/ (accessed on 1 April 2023).
- Ma, J.; Gao, W.; Wong, K.F. Rumor detection on twitter with tree-structured recursive neural networks. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, Melbourne, Australia, 15–20 July 2018; pp. 1980–1986. [Google Scholar]
- SARD. Available online: https://samate.nist.gov/SARD/ (accessed on 5 September 2022).
- ANTLR. Available online: https://www.antlr.org/ (accessed on 5 October 2022).
- Zheng, Y.; Pujar, S.; Lewis, B. D2A: A dataset built for ai-based vulnerability detection methods using differential analysis. In Proceedings of the 43rd International Conference on Software Engineering: Software Engineering in Practice, Virtual Event, Spain, 25–28 May 2021; pp. 111–120. [Google Scholar]
- Vaswani, A.; Shazeer, N.; Parmar, N. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 6000–6010. [Google Scholar]
- Mou, L.; Li, G.; Zhang, L. Convolutional neural networks over tree structures for programming language processing. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016; pp. 1287–1293. [Google Scholar]
- Croft, R.; Babar, M.A.; Kholoosi, M. Data quality for software vulnerability datasets. In Proceedings of the 2023 IEEE/ACM International Conference on Software Engineering (ICSE’23), Melbourne, Australia, 14–20 May 2023; pp. 1–13. [Google Scholar]
- Jimenez, M.; Rwemalika, R.; Papadakis, M.; Sarro, F.; Traon, Y.L.; Harman, M. The importance of accounting for real-world labelling when predicting software vulnerabilities. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Tallinn, Estonia, 26–30 August 2019; pp. 695–705. [Google Scholar]
- Shaukat, K.; Luo, S.; Chen, S.; Liu, D. Cyber threat detection using machine learning techniques: A performance evaluation perspective. In Proceedings of the 2020 International Conference on Cyber Warfare and Security (ICCWS), Islamabad, Pakistan, 20–21 October 2020; pp. 1–6. [Google Scholar]
- Shaukat, K.; Luo, S.; Varadharajan, V.; Hameed, I.A.; Xu, M. A survey on machine learning techniques for cyber security in the last decade. IEEE Access 2020, 8, 222310–222354. [Google Scholar] [CrossRef]
- Shaukat, K.; Luo, S.; Varadharajan, V.; Hameed, I.A.; Chen, S.; Liu, D.; Li, J. Performance comparison and current challenges of using machine learning techniques in cybersecurity. Energies 2020, 13, 2509. [Google Scholar] [CrossRef]
- Shaukat, K.; Luo, S.; Varadharajan, V. A novel method for improving the robustness of deep learning-based malware detectors against adversarial attacks. Eng. Appl. Artif. Intell. 2022, 116, 105461. [Google Scholar] [CrossRef]
- Cheng, X.; Nie, X.; Li, N.; Wang, H.; Zheng, Z.; Sui, Y. How about bug-triggering paths?—Understanding and characterizing learning-based vulnerability detectors. IEEE Trans. Dependable Secur. Comput. 2022, 1–18. [Google Scholar] [CrossRef]
CWE ID | Number | Label | CWE ID | Number | Label |
---|---|---|---|---|---|
CWE78 | 1243 | 0 | CWE197 | 231 | 13 |
CWE90 | 142 | 1 | CWE252 | 205 | 14 |
CWE114 | 166 | 2 | CWE253 | 329 | 15 |
CWE121 | 1599 | 3 | CWE369 | 233 | 16 |
CWE122 | 1135 | 4 | CWE400 | 195 | 17 |
CWE124 | 485 | 5 | CWE401 | 361 | 18 |
CWE126 | 407 | 6 | CWE427 | 136 | 19 |
CWE127 | 485 | 7 | CWE457 | 259 | 20 |
CWE134 | 530 | 8 | CWE590 | 233 | 21 |
CWE190 | 1138 | 9 | CWE606 | 160 | 22 |
CWE191 | 881 | 10 | CWE690 | 320 | 23 |
CWE194 | 297 | 11 | CWE761 | 190 | 24 |
CWE195 | 297 | 12 | CWE789 | 135 | 25 |
(a) Performance on Vulnerability Presence Detection

Method | Dataset | Accuracy | F | Precision | Recall
---|---|---|---|---|---
SFVD | SARD | 96.04 | 96.11 | 96.34 | 95.88
SFVD | SARD | 96.58 | 96.58 | 96.99 | 96.17
SFVD | SARD | 96.30 | 96.30 | 96.63 | 95.98
SFVD | SARD | 97.28 | 97.28 | 97.63 | 96.93
SFVD | SARD | 98.07 | 98.14 | 98.41 | 97.88
SFVD | D2A | 58.97 | 58.12 | 56.46 | 59.88
SFVD | D2A | 58.87 | 61.04 | 60.00 | 62.12
SFVD | D2A | 58.10 | 55.99 | 52.85 | 59.53
SFVD | D2A | 60.69 | 66.17 | 71.61 | 61.50
SFVD | D2A | 63.02 | 68.99 | 76.30 | 62.95

(b) Performance on Vulnerability Type Detection

Method | Dataset | Accuracy | Weighted F | Weighted Precision | Weighted Recall
---|---|---|---|---|---
SFVD | SARD | 95.19 | 95.04 | 95.26 | 95.19
SFVD | SARD | 96.49 | 96.49 | 96.54 | 96.49
SFVD | SARD | 96.02 | 96.02 | 96.19 | 96.02
SFVD | SARD | 97.11 | 97.11 | 97.15 | 97.11
SFVD | SARD | 97.93 | 97.94 | 97.97 | 97.93
(a) Performance on Vulnerability Presence Detection

Method | Dataset | Accuracy | F | Precision | Recall
---|---|---|---|---|---
VulDeePecker | SARD | 95.06 | 95.01 | 95.41 | 94.62
SySeVR | SARD | 96.57 | 96.67 | 96.88 | 96.46
Reveal | SARD | 96.55 | 96.42 | 96.64 | 96.21
SFVD | SARD | 98.07 | 98.14 | 98.41 | 97.88
VulDeePecker | D2A | 56.47 | 60.86 | 66.11 | 56.39
SySeVR | D2A | 60.50 | 60.76 | 59.74 | 61.82
Reveal | D2A | 60.12 | 58.65 | 55.24 | 62.50
SFVD | D2A | 63.02 | 68.99 | 76.30 | 62.95

(b) Performance on Vulnerability Type Detection

Method | Dataset | Accuracy | Weighted F | Weighted Precision | Weighted Recall
---|---|---|---|---|---
VulDeePecker | SARD | 95.38 | 95.34 | 95.60 | 95.38
SySeVR | SARD | 96.20 | 96.08 | 96.34 | 96.20
Reveal | SARD | 95.79 | 95.69 | 95.97 | 95.79
SFVD | SARD | 97.93 | 97.94 | 97.97 | 97.93
(a) Results for Vulnerability Presence Detection

Dataset | Method Pair | Wilcoxon Rank-Sum Acc. p-Value | Wilcoxon Rank-Sum F p-Value | t-Test Acc. p-Value | t-Test F p-Value
---|---|---|---|---|---
SARD | SFVD vs. VulDeePecker | 0.0090 | 0.0090 | 0.0003 | 0.0008
SARD | SFVD vs. SySeVR | 0.0163 | 0.0163 | 0.0123 | 0.0139
SARD | SFVD vs. Reveal | 0.0090 | 0.0090 | 0.0002 | 0.0004
D2A | SFVD vs. VulDeePecker | 0.0090 | 0.0162 | 0.0001 | 0.0308
D2A | SFVD vs. SySeVR | 0.0090 | 0.0472 | 0.0001 | 0.0280
D2A | SFVD vs. Reveal | 0.0090 | 0.0125 | 0.0004 | 0.0362

(b) Results for Vulnerability Type Detection

Dataset | Method Pair | Wilcoxon Rank-Sum Acc. p-Value | Wilcoxon Rank-Sum F p-Value | t-Test Acc. p-Value | t-Test F p-Value
---|---|---|---|---|---
SARD | SFVD vs. VulDeePecker | 0.0090 | 0.0090 | 0.0002 | 0.0002
SARD | SFVD vs. SySeVR | 0.0080 | 0.0080 | 0.0004 | 0.0003
SARD | SFVD vs. Reveal | 0.0090 | 0.0090 | 0.0002 | 0.0002
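The Wilcoxon rank-sum p-values reported above can be reproduced in spirit with a small standard-library sketch; the per-run accuracy lists below are invented for illustration, and in practice a routine such as `scipy.stats.ranksums` would be used. One observation worth noting: with five repeated runs and no overlap between the two methods' scores, the normal-approximation p-value saturates near 0.009, which may explain the repeated 0.0090 entries.

```python
import math

def rank_sum_p(a, b):
    """Two-sided Wilcoxon rank-sum p-value via the normal approximation,
    assuming no tied observations (scipy.stats.ranksums in practice)."""
    pooled = sorted(a + b)
    r1 = sum(pooled.index(v) + 1 for v in a)   # rank sum of sample a
    n1, n2 = len(a), len(b)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (r1 - mean) / sd
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# hypothetical accuracies over five repeated runs per method
sfvd = [98.0, 98.1, 97.9, 98.2, 98.3]
reveal = [96.5, 96.4, 96.6, 96.3, 96.7]
p = rank_sum_p(sfvd, reveal)
print(round(p, 4))  # ~0.009: five non-overlapping runs saturate the test
```

Because the rank-sum statistic depends only on orderings, it cannot fall further once every run of one method beats every run of the other; the t-test p-values in the table keep discriminating beyond that point.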
(a) Performance for Vulnerability Presence Detection

Method | Dataset | Accuracy | F | Precision | Recall
---|---|---|---|---|---
SFVD | SARD | 96.53 | 96.61 | 96.79 | 96.44
SFVD | SARD | 96.96 | 96.69 | 97.03 | 96.35
SFVD | SARD | 97.18 | 97.04 | 97.26 | 96.82
SFVD | SARD | 96.26 | 96.35 | 96.69 | 96.02
SFVD | SARD | 97.58 | 97.78 | 97.94 | 97.62
SFVD | SARD | 98.07 | 98.14 | 98.41 | 97.88
SFVD | D2A | 60.57 | 63.73 | 67.56 | 60.31
SFVD | D2A | 60.41 | 63.96 | 67.97 | 60.40
SFVD | D2A | 61.05 | 64.13 | 66.07 | 62.31
SFVD | D2A | 61.00 | 62.92 | 64.09 | 61.79
SFVD | D2A | 62.15 | 63.15 | 63.90 | 62.42
SFVD | D2A | 63.02 | 68.99 | 76.30 | 62.95

(b) Performance for Vulnerability Type Detection

Method | Dataset | Accuracy | Weighted F | Weighted Prec. | Weighted Rec.
---|---|---|---|---|---
SFVD | SARD | 96.20 | 96.24 | 96.39 | 96.20
SFVD | SARD | 95.94 | 95.99 | 96.16 | 95.94
SFVD | SARD | 96.32 | 96.35 | 96.53 | 96.32
SFVD | SARD | 96.09 | 96.14 | 96.29 | 96.09
SFVD | SARD | 96.24 | 96.25 | 96.48 | 96.24
SFVD | SARD | 97.93 | 97.94 | 97.97 | 97.93
(a) Performance for Vulnerability Presence Detection

Method | Dataset | Accuracy | F | Precision | Recall
---|---|---|---|---|---
SFVD | SARD | 87.95 | 89.15 | 92.31 | 86.20
SFVD | SARD | 88.60 | 88.58 | 90.65 | 86.60
SFVD | SARD | 89.06 | 90.35 | 97.01 | 84.55
SFVD | SARD | 92.12 | 91.99 | 91.66 | 92.33
SFVD | SARD | 92.67 | 92.47 | 92.21 | 92.74
SFVD | SARD | 94.33 | 94.20 | 93.88 | 94.53
SFVD | SARD | 98.07 | 98.14 | 98.41 | 97.88
SFVD | D2A | 56.95 | 49.94 | 42.59 | 60.38
SFVD | D2A | 57.62 | 58.22 | 58.56 | 57.90
SFVD | D2A | 57.24 | 60.04 | 63.69 | 56.78
SFVD | D2A | 58.20 | 59.70 | 57.68 | 61.88
SFVD | D2A | 59.25 | 65.08 | 74.16 | 57.98
SFVD | D2A | 60.11 | 61.27 | 61.61 | 60.93
SFVD | D2A | 63.02 | 68.99 | 76.30 | 62.95

(b) Performance for Vulnerability Type Detection

Method | Dataset | Accuracy | Weighted F | Weighted Prec. | Weighted Rec.
---|---|---|---|---|---
SFVD | SARD | 87.56 | 86.82 | 89.78 | 87.56
SFVD | SARD | 88.25 | 88.40 | 88.76 | 88.25
SFVD | SARD | 91.74 | 91.77 | 91.91 | 91.74
SFVD | SARD | 93.31 | 93.31 | 93.40 | 93.31
SFVD | SARD | 92.41 | 92.18 | 92.82 | 92.41
SFVD | SARD | 94.78 | 94.79 | 94.86 | 94.78
SFVD | SARD | 97.93 | 97.94 | 97.97 | 97.93
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Tian, Z.; Tian, B.; Lv, J.; Chen, L. Learning and Fusing Multi-View Code Representations for Function Vulnerability Detection. Electronics 2023, 12, 2495. https://doi.org/10.3390/electronics12112495