Integrated Clinical Environment Security Analysis Using Reinforcement Learning
Abstract
1. Introduction
1.1. Related Work
1.1.1. RL in the Medical Field
1.1.2. RL for Cybersecurity
2. Preliminaries
2.1. Attack Graph
2.2. Common Vulnerability Scoring System
- For the (Attack Vector), our input was Local, meaning that the vulnerable component is not bound to the network stack and the attacker's path is via read/write/execute capabilities.
- For the (Attack Complexity), the Low input was fed to the calculator, indicating that there are no special access requirements or mitigating circumstances; an attacker can expect repeatable success when attacking the vulnerable component.
- The input for the (Privileges Required) is None, meaning that the attacker is unauthorized prior to the attack and therefore does not need any access to the vulnerable system's settings or files to carry it out.
- Required was the input for the (User Interaction) field, meaning that a user must take some action before the vulnerability can be successfully exploited.
- The (Scope) field was answered as Unchanged, indicating that only resources managed by the same security authority can be impacted by an exploited vulnerability.
- The Low answer was fed to the two fields (Confidentiality) and (Integrity), indicating the following: confidentiality is compromised and the attacker gains access to some restricted information, but has no control over what information is obtained or how much of it; for integrity, modification of data is possible, but the attacker does not control the consequence of a modification, or the amount of modification is limited.
- The (Availability) is None because there is no impact on availability within the impacted component.
- The (Exploit Code Maturity) input is Proof-of-Concept, which means that proof-of-concept exploit code is available, or an attack demonstration is not practical for most systems; the code or technique may not work in all situations, and a competent attacker may need to modify it significantly.
- Not Defined was the answer for both (Remediation Level) and (Report Confidence); this input signals that there is insufficient information to choose one of the other values and has no effect on the overall Temporal Score.
- For the Security Requirements (Confidentiality, Integrity, and Availability), our input was Medium, because the loss of confidentiality, integrity, or availability is likely to have serious consequences for the organization.
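Plugging the metric answers above into the CVSS v3.1 equations makes the scoring concrete. The sketch below hard-codes the published v3.1 numeric weights for exactly these inputs (AV:L/AC:L/PR:N/UI:R, Scope Unchanged, C:L/I:L/A:N, E:PoC, RL and RC Not Defined); it reproduces the 4.4 base and 4.2 temporal scores that the Results section lists for the SP_APC and SP_APS attacks.

```python
import math

# CVSS v3.1 numeric weights for the metric answers described above.
AV_LOCAL, AC_LOW, PR_NONE, UI_REQUIRED = 0.55, 0.77, 0.85, 0.62
C_LOW, I_LOW, A_NONE = 0.22, 0.22, 0.0
E_POC, RL_ND, RC_ND = 0.94, 1.0, 1.0      # Exploit Code Maturity: Proof-of-Concept

def roundup(x):
    """CVSS v3.1 rounding: smallest number with one decimal that is >= x."""
    return math.ceil(x * 10 - 1e-9) / 10

# Scope Unchanged: Impact = 6.42 * ISS, Base = roundup(min(Impact + Expl., 10)).
iss = 1 - (1 - C_LOW) * (1 - I_LOW) * (1 - A_NONE)
impact = 6.42 * iss
exploitability = 8.22 * AV_LOCAL * AC_LOW * PR_NONE * UI_REQUIRED
base = roundup(min(impact + exploitability, 10)) if impact > 0 else 0.0
temporal = roundup(base * E_POC * RL_ND * RC_ND)
print(base, temporal)                      # -> 4.4 4.2
```

The same routine, fed the answers for the other attacks, yields the remaining base and temporal scores tabulated in the Results section.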
2.3. Reinforcement Learning
Algorithm 1. Used Approach
Result: Best Solution (Route) and Cumulative Reward
Initialization;
1. Generate Attack Graph Using the Architecture Analysis and Design Language (AADL), the JKind Checker Tool, and Graphviz;
2. Convert Attack Graph to Refinement Graph;
3. Formulate the RL Problem: Define Environment, Agent, States, Actions, and Rewards;
4. Train RL Agent in MDP Environment;
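Step 3 of Algorithm 1 can be illustrated with a minimal, hypothetical fragment of an attack graph. The node names and edge rewards below are invented placeholders (not the paper's full ICE model): a Graphviz-style edge list is turned into the state set, the per-state actions, and the reward table of the RL formulation.

```python
# Hypothetical attack-graph fragment: (source node, target node, reward).
# Names and scores are illustrative only, not taken from the paper's graph.
edges = [
    ("start", "spoof_sensor", 3.6),
    ("spoof_sensor", "tamper_controller", 7.6),
    ("tamper_controller", "dos_network_controller", 10.0),
]

# RL formulation: each graph node is a state, each outgoing edge an action,
# and the CVSS-derived edge score becomes the transition reward.
states = sorted({node for e in edges for node in e[:2]})
actions = {s: [(dst, r) for src, dst, r in edges if src == s] for s in states}
rewards = {(src, dst): r for src, dst, r in edges}
print(states)
print(actions["start"])
```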
Algorithm 2. Train RL Agent in MDP Environment
Result: The Agent Successfully Finds the Optimal Path, Which Results in the Cumulative Reward
Initialization;
Create MDP Environment;
1. Create MDP Model with Identified States and Actions;
2. Specify the State Transition and Reward Matrices for the MDP;
3. Specify the Terminal States of the MDP;
4. Create the RL MDP Environment for This Process Model;
5. Specify the Initial State of the Agent by Specifying a Reset Function;
Create Q-Learning Agent;
1. Create a Q Table Using the Observation and Action Specifications from the MDP Environment;
2. Set the Learning Rate of the Representation;
3. Create a Q-Learning Agent;
Train Q-Learning Agent;
1. Specify the Training Options (Episodes, Stop-Training Criteria);
2. Train the Agent Using the ‘train’ Function;
Validate Q-Learning Results;
1. Simulate the Agent in the Training Environment Using the ‘sim’ Function.
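The ‘train’ and ‘sim’ steps above correspond to MATLAB Reinforcement Learning Toolbox workflow; the same loop can be sketched in Python. This is a toy stand-in, not the paper's implementation: a hypothetical 3-state chain MDP with a "stay"/"advance" action pair and a single terminal reward, trained with tabular epsilon-greedy Q-learning and then validated by a greedy rollout.

```python
import random

# Toy MDP standing in for the ICE attack model: 3 states, actions 0 (stay) /
# 1 (advance), reward 1 for reaching terminal state 2. Sizes are illustrative.
N_STATES, N_ACTIONS, TERMINAL = 3, 2, 2

def step(s, a):
    """State-transition and reward function of the toy MDP."""
    s2 = min(s + a, N_STATES - 1)
    r = 1.0 if (s2 == TERMINAL and s != TERMINAL) else 0.0
    return s2, r

# Q table from the state/action spaces; learning rate of the representation.
Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
alpha, gamma, epsilon = 0.1, 0.9, 0.2

rng = random.Random(1)
# Train: fixed episode count as the stop-training criterion.
for _ in range(2000):
    s = 0                                 # reset function: start in state 0
    while s != TERMINAL:
        if rng.random() < epsilon:
            a = rng.randrange(N_ACTIONS)  # explore
        else:
            a = max(range(N_ACTIONS), key=lambda x: Q[s][x])  # exploit
        s2, r = step(s, a)
        target = 0.0 if s2 == TERMINAL else max(Q[s2])
        Q[s][a] += alpha * (r + gamma * target - Q[s][a])
        s = s2

# Validate: simulate the greedy policy from the initial state.
s, route = 0, [0]
while s != TERMINAL:
    s, _ = step(s, max(range(N_ACTIONS), key=lambda x: Q[s][x]))
    route.append(s)
print(route)  # greedy route after training
```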
3. Results and Discussion
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
Attack Name | Base Score | Temporal Score | Environmental Score | Overall Score |
---|---|---|---|---|
SP_APC | 4.4 | 4.2 | 3.6 | 3.6 |
IG_APS | 3.5 | 3.1 | 3.1 | 3.1 |
SP_APS | 4.4 | 4.2 | 3.6 | 3.6 |
IG_CS | 3.5 | 3 | 4 | 4 |
TH_CS | 8 | 7.5 | 7.6 | 7.6 |
TH_APS | 7.6 | 7.1 | 7.2 | 7.2 |
BOF_SNC | 8 | 8.1 | 8.1 | 8.1 |
DoS_SNC | 8 | 8 | 8.1 | 8.1 |
DoS_NCMD | 7.5 | 7.5 | 10 | 10 |
R (from \ to) | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
---|---|---|---|---|---|---|---|
1 | −1 | 3.6 | 3.1 | 3.6 | −1 | −1 | −1 |
2 | 0 | −1 | 4 | −1 | −1 | −1 | −1 |
3 | 0 | 0 | −1 | 7.6 | −1 | −1 | −1 |
3 | 0 | 0 | −1 | 7.2 | −1 | −1 | −1 |
4 | 0 | −1 | 0 | −1 | 8.1 | 8.1 | −1 |
5 | −1 | −1 | −1 | 0 | −1 | −1 | 10 |
6 | −1 | −1 | −1 | 0 | −1 | −1 | 10 |
7 | −1 | −1 | −1 | −1 | 0 | 0 | −1 |
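As a sanity check, the route implied by this reward matrix can be recovered with plain value iteration, a model-based counterpart of the Q-learning agent used in the paper. The sketch below is illustrative only: the discount factor gamma = 0.6 is our assumption rather than the paper's setting, state 7 is treated as the terminal state, −1 entries mark impossible transitions, and the two rows labelled "3" are merged by taking their elementwise maximum.

```python
# Reward matrix from the table above; rows = current state 1-7, cols = next
# state. -1 marks an impossible transition; the two rows labelled "3" are
# merged by elementwise maximum (an assumption on our part).
R = [
    [-1, 3.6, 3.1, 3.6,  -1,  -1, -1],
    [ 0,  -1,   4,  -1,  -1,  -1, -1],
    [ 0,   0,  -1, 7.6,  -1,  -1, -1],
    [ 0,  -1,   0,  -1, 8.1, 8.1, -1],
    [-1,  -1,  -1,   0,  -1,  -1, 10],
    [-1,  -1,  -1,   0,  -1,  -1, 10],
    [-1,  -1,  -1,  -1,   0,   0, -1],
]
N, GAMMA, TERMINAL = 7, 0.6, 6            # state 7 (index 6) is absorbing

# Value iteration: V(s) <- max over feasible s' of R[s][s'] + gamma * V(s').
V = [0.0] * N
for _ in range(100):                      # gamma^100 is negligible
    V = [0.0 if s == TERMINAL else
         max(R[s][a] + GAMMA * (0.0 if a == TERMINAL else V[a])
             for a in range(N) if R[s][a] >= 0)
         for s in range(N)]

# Greedy rollout from state 1 gives the highest-value attack route.
path, s = [1], 0
while s != TERMINAL and len(path) < 10:
    s = max((a for a in range(N) if R[s][a] >= 0),
            key=lambda a: R[s][a] + GAMMA * (0.0 if a == TERMINAL else V[a]))
    path.append(s + 1)
print(path)                               # -> [1, 3, 4, 5, 7] under these assumptions
```

Note that the recovered route depends on the discount factor: values of gamma closer to 1 increasingly favour the longer route through state 2, whose undiscounted cumulative reward (33.3) is the largest in the graph.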
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Ibrahim, M.; Elhafiz, R. Integrated Clinical Environment Security Analysis Using Reinforcement Learning. Bioengineering 2022, 9, 253. https://doi.org/10.3390/bioengineering9060253