An Innovative Framework for Coincidental Correctness Impacting on Fault Localization
Abstract
1. Introduction
- We developed a function-theoretic framework that compares the derivative of a formula on the original test suite with its derivative on the test suite from which coincidentally correct test cases have been removed, in order to determine the suspiciousness value of the formula.
- Using this framework, we investigated 30 SBFL formulas under the removal of coincidental correctness. We found that 23 of the 30 formulas are improved, five are unaffected, and the remaining two are uncertain.
- We conducted experiments on four open-source C programs to evaluate the impact of coincidental correctness on the effectiveness of SBFL. The results show that the effectiveness of some suspiciousness formulas was indeed enhanced, while that of others was unchanged.
2. Background
2.1. Spectrum-Based Fault Localization (SBFL)
2.2. Coincidental Correctness (CC)
2.3. Assumptions
- We assume that the SBFL formulas are applied to a program with a test oracle; that is, for any test case, the execution result of the program is known to be either failed or passed.
- We assume that the program contains only one fault. In other words, we investigate the 30 suspiciousness formulas under the single-fault scenario, because multiple faults interfere with each other.
- We assume that the test suite contains at least one failed test case, because a failed test case is needed to trigger the fault in the program. For any program statement s, we have .
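The three assumptions can be made concrete with a small sketch that derives the per-statement spectrum counts from a coverage matrix and the oracle verdicts. The names `spectrum_counts`, `coverage`, and `failed`, as well as the `a_ef`/`a_ep`/`a_nf`/`a_np` notation (borrowed from the wider SBFL literature), are assumptions made for illustration and are not taken from the text above.

```python
from typing import List, Tuple


def spectrum_counts(coverage: List[List[int]],
                    failed: List[bool],
                    stmt: int) -> Tuple[int, int, int, int]:
    """Return (a_ef, a_ep, a_nf, a_np) for one statement.

    coverage[t][s] is 1 if test t executes statement s, else 0.
    failed[t] is the oracle verdict for test t (True means failed).
    """
    # Third assumption: at least one failed test case must exist.
    if not any(failed):
        raise ValueError("test suite must contain at least one failed test case")

    a_ef = a_ep = a_nf = a_np = 0
    for row, is_failed in zip(coverage, failed):
        executed = bool(row[stmt])
        if executed and is_failed:
            a_ef += 1          # executed by a failed test
        elif executed:
            a_ep += 1          # executed by a passed test
        elif is_failed:
            a_nf += 1          # not executed by a failed test
        else:
            a_np += 1          # not executed by a passed test
    return a_ef, a_ep, a_nf, a_np


# Hypothetical suite: three tests, two statements, one failed test.
example_coverage = [[1, 1], [1, 0], [0, 1]]
example_failed = [True, False, False]
example_counts = spectrum_counts(example_coverage, example_failed, 0)
```

In this hypothetical suite, statement 0 is executed by the failed test and by one passed test, and skipped by the other passed test.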
3. Our Framework
3.1. Motivation
3.2. Theoretical Analysis
- “—improved type”. When , is a monotonically decreasing function to discriminate , i.e., . That is, when coincidentally correct test cases are removed, the suspiciousness value of the SBFL formula increases; therefore, the effectiveness of the SBFL formula is improved.
- “—invariant type”. When is equal to 0, is to discriminate , i.e., . That is, when coincidentally correct test cases are removed, the suspiciousness value of the SBFL formula is unchanged; therefore, the effectiveness of the SBFL formula is unchanged.
- “—uncertain type”. It is uncertain whether the value of is more than 0 or less than 0. The value of needs to be discussed partition by partition. That is, when coincidentally correct test cases are removed, the change in the suspiciousness value of the SBFL formula is uncertain; therefore, the effect on the effectiveness of the SBFL formula is uncertain.
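As a worked illustration of the derivative criterion, consider Tarantula evaluated at the faulty statement. The notation is an assumption borrowed from the wider SBFL literature rather than from the text above: $a_{ef}$ and $a_{ep}$ count the failed and passed test cases that execute the statement, $a_{np}$ counts the passed test cases that do not, and $F$ and $P = a_{ep} + a_{np}$ are the totals of failed and passed test cases. The sketch further assumes that every failed test case executes the fault ($a_{ef} = F$) and that removing a coincidentally correct test case lowers $a_{ep}$ by one while leaving $a_{np}$ fixed.

```latex
\begin{align*}
T(a_{ep}) &= \frac{a_{ef}/F}{a_{ef}/F + a_{ep}/P}
           = \frac{1}{1 + \dfrac{a_{ep}}{a_{ep} + a_{np}}}
           = \frac{a_{ep} + a_{np}}{2a_{ep} + a_{np}}, \\[4pt]
\frac{\mathrm{d}T}{\mathrm{d}a_{ep}}
          &= \frac{(2a_{ep} + a_{np}) - 2(a_{ep} + a_{np})}{(2a_{ep} + a_{np})^{2}}
           = -\frac{a_{np}}{(2a_{ep} + a_{np})^{2}} \le 0 .
\end{align*}
```

Under these assumptions the derivative is strictly negative whenever $a_{np} > 0$, so the suspiciousness value rises as $a_{ep}$ falls, which is the behaviour described for the improved type. For example, with $a_{np} = 1$, $T(2) = 3/5 = 0.6$ and $T(0) = 1$.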
3.2.1. The Improved Formulas by Removing Coincidental Correctness Test Cases
3.2.2. The Unaffected Formulas by Removing Coincidental Correctness Test Cases
3.2.3. The Uncertain Formulas by Removing Coincidental Correctness Test Cases
- (1) Suppose , we have , i.e., , thus, , that is, ;
- (2) Suppose , we have , i.e., , thus, , that is, ;
- (3) Suppose , we have , i.e., , thus, , that is, .
- (1) Assume , then , i.e., , we have , that is, ;
- (2) Assume , then , i.e., , we have , that is, ;
- (3) Assume ro , then , i.e., , we have , that is, .
3.3. Empirical Study
3.3.1. Subject Programs
3.3.2. Results and Analysis
3.3.3. Discussion
- (1) In the proof, we assume that the program contains only one fault. Accordingly, we investigated the 30 suspiciousness formulas under the single-fault scenario, because multiple faults interfere with each other.
- (2) We conducted experiments to verify the effectiveness of the “—improved type” and “—invariant type” suspiciousness formulas. However, we could not verify the effectiveness of the “—uncertain type” formulas in Table 3, because the value of depends on the number of passed test cases and requires further discussion of the inter-partition of .
- (3) The four subject programs, gzip, grep, sed and flex, each contain between 5000 and 10,000 lines of code. We apply the whole test suite as the input to each subject program. These programs have been adopted to evaluate fault localization techniques in previous work [1,29]. Some of them are relatively large real-world programs with real-life scales [30]. This suggests that our approach can be integrated into a larger-scale code analysis tool.
3.3.4. Summary
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Vessey, I. Expertise in debugging computer programs: A process analysis. Int. J. Man-Mach. Stud. 1985, 23, 459–494.
- Voas, J.M. PIE: A Dynamic Failure-based Technique. IEEE Trans. Softw. Eng. 1992, 18, 717–727.
- Masri, W.; Abou-Assi, R.; El-Ghali, M.; Fatairi, N. An empirical study of the factors that reduce the effectiveness of coverage-based fault localization. In Proceedings of the 2nd International Workshop on Defects in Large Software Systems, Chicago, IL, USA, 19 July 2009; ACM: New York, NY, USA, 2009; pp. 1–5.
- Hierons, R.M. Avoiding Coincidental Correctness in Boundary Value Analysis. ACM Trans. Softw. Eng. Methodol. 2006, 15, 227–241.
- Wang, X.; Cheung, S.C.; Chan, W.K.; Zhang, Z. Taming coincidental correctness: Coverage refinement with context patterns to improve fault localization. In Proceedings of the 31st International Conference on Software Engineering, Vancouver, BC, Canada, 16–24 May 2009; pp. 45–55.
- Feyzi, F.; Parsa, S. A program slicing-based method for effective detection of coincidentally correct test cases. Computing 2018, 100, 927–969.
- Assi, R.A.; Masri, W.; Trad, C. How detrimental is coincidental correctness to coverage-based fault detection and localization? An empirical study. Softw. Test. Verif. Reliab. 2021, 31, 113–124.
- Masri, W.; Podgurski, A. An empirical study of the strength of information flows in programs. In Proceedings of the International Workshop on Dynamic Systems Analysis, Shanghai, China, 20–28 May 2006; ACM: New York, NY, USA, 2006; pp. 73–80.
- Masri, W.; Assi, R.A. Prevalence of Coincidental Correctness and Mitigation of its Impact on Fault Localization. ACM Trans. Softw. Eng. Methodol. 2014, 23, 1–28.
- Feyzi, F. CGT-FL: Using cooperative game theory to effective fault localization in presence of coincidental correctness. Empir. Softw. Eng. 2020, 25, 3873–3927.
- Hofer, B. Removing Coincidental Correctness in Spectrum-Based Fault Localization for Circuit and Spreadsheet Debugging. In Proceedings of the 2017 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW), Toulouse, France, 23–26 October 2017; pp. 23–33.
- Lee, H.J.; Naish, L.; Ramamohanarao, K. Study of the relationship of bug consistency with respect to performance of spectra metrics. In Proceedings of the 2nd IEEE International Conference on Computer Science and Information Technology, Beijing, China, 8–11 August 2009; pp. 501–508.
- Naish, L.; Lee, H.J.; Ramamohanarao, K. A model for spectra-based software diagnosis. ACM Trans. Softw. Eng. Methodol. 2011, 20, 11–32.
- Xie, X.Y.; Chen, T.Y.; Kuo, F.C.; Xu, B.W. A theoretical analysis of the risk evaluation formulas for spectrum-based fault localization. ACM Trans. Softw. Eng. Methodol. 2013, 22, 31–70.
- Golagha, M.; Pretschner, A.; Briand, L.C. Can we predict the quality of spectrum-based fault localization? In Proceedings of the 2020 IEEE 13th International Conference on Software Testing, Validation and Verification, Porto, Portugal, 24–28 October 2020; pp. 4–15.
- Dutta, A.; Kunal, K.; Srivastava, S.S.; Shankar, S.; Mall, R. FTFL: A Fisher’s test-based approach for fault localization. Innov. Syst. Softw. Eng. 2021, 17, 1–25.
- Ghosh, D.; Singh, J. Spectrum-based multi-fault localization using chaotic genetic algorithm. Inf. Softw. Technol. 2021, 133, 106512–106524.
- Zhao, G.; He, H.; Huang, Y. Fault centrality: Boosting spectrum-based fault localization via local influence calculation. Appl. Intell. 2022, 52, 7113–7135.
- Wu, Y.H.; Li, Z.; Liu, Y.; Chen, X. Fatoc: Bug isolation based multi-fault localization by using optics clustering. J. Comput. Sci. Technol. 2020, 35, 979–998.
- Jones, J.A.; Harrold, M.J. Empirical evaluation of the tarantula automatic fault-localization technique. In Proceedings of the 20th IEEE/ACM International Conference on Automated Software Engineering, Long Beach, CA, USA, 7–11 November 2005; pp. 273–282.
- Jones, J.A.; Harrold, M.J.; Stasko, J. Visualization of test information to assist fault localization. In Proceedings of the 24th International Conference on Software Engineering, Orlando, FL, USA, 25 May 2002; pp. 467–477.
- Chen, M.Y.; Kiciman, E.; Fratkin, E.; Fox, A.; Brewer, E. Pinpoint: Problem determination in large, dynamic internet services. In Proceedings of the International Conference on Dependable Systems and Networks, Goteborg, Sweden, 1–4 July 2002; pp. 595–604.
- Abreu, R.; Zoeteweij, P.; Van Gemund, A.J.C. An evaluation of similarity coefficients for software fault localization. In Proceedings of the 12th Pacific Rim International Symposium on Dependable Computing, Riverside, CA, USA, 18–20 December 2006; pp. 39–46.
- Masri, W.; Assi, R.A.; Zaraket, F.; Fatairi, N. Enhancing fault localization via multivariate visualization. In Proceedings of the 5th International Conference on Software Testing, Verification and Validation, Montreal, QC, Canada, 17–21 April 2012; pp. 737–741.
- Abreu, R.; Zoeteweij, P.; Golsteijn, R.; Van Gemund, A.J. A practical evaluation of spectrum-based fault localization. J. Syst. Softw. 2009, 82, 1780–1792.
- Software Infrastructure Repository. Available online: http://sir.unl.edu/php/index.php (accessed on 29 April 2022).
- Gcc Compiler. Available online: http://gcc.gnu.org/onlinedocs/gcc/Gcov.html (accessed on 29 April 2022).
- Ju, X.; Jiang, S.; Chen, X.; Wang, X.; Zhang, Y.; Cao, H. HSFal: Effective fault localization using hybrid spectrum of full slices and execution slices. J. Syst. Softw. 2014, 90, 3–17.
- Zhang, L.; Yan, L.; Zhang, Z.; Zhang, J.; Chan, W.K.; Zheng, Z. A theoretical analysis on cloning the failed test cases to improve spectrum-based fault localization. J. Syst. Softw. 2017, 129, 35–57.
- Yoo, S.; Xie, X.; Kuo, F.C.; Chen, T.Y.; Harman, M. Human competitiveness of genetic programming in spectrum-based fault localisation: Theoretical and empirical analysis. ACM Trans. Softw. Eng. Methodol. 2017, 26, 1–30.

Statements | Coverage (covering/all test cases) | Tarantula sus. | Tarantula Rank | Naish1 sus. | Naish1 Rank | Coverage (CC) (covering/all test cases) | Tarantula sus. (CC) | Tarantula Rank (CC) | Naish1 sus. (CC) | Naish1 Rank (CC) |
---|---|---|---|---|---|---|---|---|---|---|
 | 6/6 | 0.5 | 4 | 0 | 3 | 4/4 | 0.5 | 4 | 0 | 3 |
 | 6/6 | 0.5 | 4 | 0 | 3 | 4/4 | 0.5 | 4 | 0 | 3 |
 | 6/6 | 0.5 | 4 | 0 | 3 | 4/4 | 0.5 | 4 | 0 | 3 |
 | 6/6 | 0.5 | 4 | 0 | 3 | 4/4 | 0.5 | 4 | 0 | 3 |
(fault) | 5/6 | 0.6 | 2 | 1 | 1 | 3/4 | 1 | 1 | 1 | 1 |
 | 5/6 | 0.6 | 2 | 1 | 1 | 3/4 | 1 | 1 | 1 | 1 |
 | 2/6 | 1 | 1 | −1 | 8 | 2/4 | 1 | 1 | −1 | 8 |
 | 6/6 | 0.5 | 4 | 0 | 3 | 4/4 | 0.5 | 4 | 0 | 3 |
Result | F T T F F T | | | | | F T F F | | | | |
Fault rank | | | 2–3 | | 1–2 | | | 1–3 | | 1–2 |

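
The suspiciousness values and fault ranks in the table above can be reproduced with a short sketch. Everything specific here is an assumption for illustration: the exact coverage pattern (only the number of covering test cases per statement is taken from the table), the reading of "F" as failed, the simplified notion of a coincidentally correct test case as a passed test that executes the faulty statement, and the helper names. Tarantula and Naish1 are written as they are commonly defined in the SBFL literature.

```python
from typing import Callable, List, Set, Tuple

Formula = Callable[[int, int, int, int], float]


def tarantula(a_ef: int, a_ep: int, F: int, P: int) -> float:
    fail_ratio = a_ef / F                      # F >= 1 by assumption
    pass_ratio = a_ep / P if P else 0.0
    total = fail_ratio + pass_ratio
    return fail_ratio / total if total else 0.0


def naish1(a_ef: int, a_ep: int, F: int, P: int) -> float:
    # -1 if some failed test misses the statement, otherwise P - a_ep.
    return -1.0 if a_ef < F else float(P - a_ep)


def scores(covers: List[Set[int]], failed: List[bool], formula: Formula) -> List[float]:
    """covers[s] is the set of test indices that execute statement s."""
    F = sum(failed)
    P = len(failed) - F
    result = []
    for tests in covers:
        a_ef = sum(1 for t in tests if failed[t])
        a_ep = len(tests) - a_ef
        result.append(formula(a_ef, a_ep, F, P))
    return result


def remove_cc(covers: List[Set[int]], failed: List[bool],
              fault: int) -> Tuple[List[Set[int]], List[bool]]:
    """Drop passed tests that execute the faulty statement (simplified CC)."""
    keep = [t for t in range(len(failed)) if failed[t] or t not in covers[fault]]
    new_index = {old: new for new, old in enumerate(keep)}
    new_covers = [{new_index[t] for t in tests if t in new_index} for tests in covers]
    return new_covers, [failed[t] for t in keep]


def rank_range(values: List[float], idx: int) -> Tuple[int, int]:
    """Best and worst rank of values[idx] when ties are left unresolved."""
    best = 1 + sum(v > values[idx] for v in values)
    worst = sum(v >= values[idx] for v in values)
    return best, worst


# Assumed layout: 8 statements, 6 tests with results F T T F F T.
FAILED = [True, False, False, True, True, False]
ALL = {0, 1, 2, 3, 4, 5}
FAULT = 4
COVERS = [ALL, ALL, ALL, ALL, {0, 2, 3, 4, 5}, {0, 2, 3, 4, 5}, {3, 4}, ALL]

before_t = scores(COVERS, FAILED, tarantula)
before_n = scores(COVERS, FAILED, naish1)
cc_covers, cc_failed = remove_cc(COVERS, FAILED, FAULT)
after_t = scores(cc_covers, cc_failed, tarantula)
after_n = scores(cc_covers, cc_failed, naish1)
```

Under this assumed layout, Tarantula's value for the faulty statement rises from 0.6 to 1 after the two coincidentally correct test cases are removed, while Naish1's value stays at 1.
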

Function | Variable | Derivative | Monotonic | Function Relationship | Formula Type |
---|---|---|---|---|---|
 | | less than 0 | Monotonically decreasing | | |
 | | — | — | | |
 | | less than 0 or more than 0 | Monotonically decreasing or increasing | | |

Name | Formula Expression | Type |
---|---|---|
Jaccard | ||
Anderberg | ||
Sørensen-Dice | ||
Dice | ||
qe | ||
Simple Matching | ||
Sokal | ||
Rogers&Tanimoto | ||
Russel&Rao | ||
M2 | ||
Kulczynski2 | ||
Rogot1 | ||
Goodman | ||
Hamann | ||
Naish2 | ||
AMPLE2 | ||
Wong3 | ||
Wong2 | ||
Fleiss | ||
Tarantula | ||
Ochiai | ||
Arithmetic Mean | ||
Mean | ||
Naish1 | ||
Wong1 | ||
Binary | ||
Hamming etc. | ||
Euclid | ||
CBI Inc. | ||
Cohen |
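
For orientation, four of the formulas named in the table above are given here as they are commonly defined in the SBFL literature (for example, in [13,14]). The notation is an assumption, not taken from the table: $a_{ef}$ and $a_{ep}$ count the failed and passed test cases that execute a statement, and $a_{nf}$ and $a_{np}$ count the failed and passed test cases that do not.

```latex
\begin{align*}
\text{Jaccard}      &= \frac{a_{ef}}{a_{ef} + a_{nf} + a_{ep}}, &
\text{Ochiai}       &= \frac{a_{ef}}{\sqrt{(a_{ef} + a_{nf})(a_{ef} + a_{ep})}}, \\[4pt]
\text{Russel\&Rao}  &= \frac{a_{ef}}{a_{ef} + a_{nf} + a_{ep} + a_{np}}, &
\text{Wong1}        &= a_{ef}.
\end{align*}
```

Jaccard and Ochiai depend on $a_{ep}$, whereas Wong1 does not; Russel&Rao depends on the passed test cases only through the total number of test cases.
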
Program Name | Description | Lines of Code | Number of Faulty Versions | Test Suite Size | Fault Type |
---|---|---|---|---|---|
gzip | data compression/decompression | 5365 | 20 | 211 | real |
grep | search for a pattern in a file | 9205 | 14 | 806 | real |
sed | a stream text editor | 6763 | 22 | 360 | real, seeded |
flex | a fast lexical analyzer generator | 9766 | 20 | 567 | real |

Program | Trace and Cleaning (s): SBFL(CC) | Trace and Cleaning (s): SBFL | Susp. Computation (s): SBFL(CC) | Susp. Computation (s): SBFL | Total (s): SBFL(CC) | Total (s): SBFL | #SBFL(CC)/#SBFL |
---|---|---|---|---|---|---|---|
gzip | 6912.36 | 6738.39 | 303.27 | 291.06 | 2405.21 | 2343.15 | 1.03 |
grep | 33,617.52 | 28,025.28 | 1512.96 | 1386.80 | 4391.31 | 3676.51 | 1.19 |
sed | 6922.23 | 6541.32 | 288.15 | 282.06 | 2403.46 | 2274.46 | 1.06 |
flex | 44,072.20 | 35,001.90 | 1950.20 | 1752.40 | 4602.24 | 3675.43 | 1.25 |
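
The last column of the table above appears to be the ratio of the two Total columns. That reading is an inference from the numbers rather than something the table states, and the short check below, with the hypothetical names `totals` and `ratios`, recomputes it from the reported totals.

```python
# Total time in seconds as reported in the table: (SBFL(CC), SBFL).
totals = {
    "gzip": (2405.21, 2343.15),
    "grep": (4391.31, 3676.51),
    "sed": (2403.46, 2274.46),
    "flex": (4602.24, 3675.43),
}

# Assumed meaning of the last column: Total SBFL(CC) divided by Total SBFL.
ratios = {name: round(with_cc / plain, 2) for name, (with_cc, plain) in totals.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```

On this reading, removing coincidental correctness adds between roughly 3% (gzip) and 25% (flex) to the total running time.
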
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Cao, H.; Li, L.; Sun, Y. An Innovative Framework for Coincidental Correctness Impacting on Fault Localization. Symmetry 2022, 14, 1267. https://doi.org/10.3390/sym14061267