The Data Heterogeneity Issue Regarding COVID-19 Lung Imaging in Federated Learning: An Experimental Study
Abstract
1. Introduction
- Defining data heterogeneity: We provide mathematical definitions and illustrate the effects of various types of skewness on both generalization and personalization metrics.
- Interpreting results: We highlight the implications of real-world data heterogeneity for FL model performance across all participating institutions and evaluation metrics.
- Identifying research opportunities: We outline areas requiring further investigation for each type of skewness to guide future research in optimizing FL systems for medical imaging.
2. Related Work
2.1. Evaluation-Metric-Based Studies
2.2. Skewness-Study-Based Research
3. Background
3.1. Federated Learning
3.2. Data Heterogeneity
3.2.1. Quantity Skew/Label Distribution Skew
3.2.2. Extreme Label Skew
3.2.3. Data Acquisition Skew
3.2.4. Modality Skew
4. System Design
- Central server
- Hospital nodes
- Models
- Distributed local datasets
- Step 1: The central server sends a simple CNN model architecture and its configuration settings to the hospitals or medical institutions in parallel.
- Step 2: Once a participant node receives the model, it trains the CNN on its local data, updating the received weights over the configured number of local training epochs.
- Step 3: After the last local epoch, each participating node evaluates the final version of its local model to gauge locality performance.
- Step 4: All distributed nodes send their locality evaluation metrics, local training sizes, and latest model versions back to the central server.
- Step 5: After the training session, the central manager aggregates the local models into a new global model using FedAvg and evaluates it against the central testing data to determine the generalization metric.
- Step 6: The computed global model is shared with all participating nodes.
- Step 7: Finally, each local site evaluates the global model using its local testing data to measure the personalization metric. The round number is then increased by one, and a new training session starts from step 2.
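The FedAvg aggregation performed at the central server can be illustrated with a minimal sketch. It assumes that each site's model is represented as a dictionary of flat parameter lists and that each site is weighted by its local training size; the `Weights` type, the `fedavg` function name, and the toy values are hypothetical and are not taken from the study's implementation.

```python
from typing import Dict, List, Sequence

# Hypothetical flat representation of model parameters: name -> list of values.
Weights = Dict[str, List[float]]


def fedavg(local_weights: Sequence[Weights], train_sizes: Sequence[int]) -> Weights:
    """Average client parameters, weighting each client by its local training size."""
    if not local_weights or len(local_weights) != len(train_sizes):
        raise ValueError("need one training size per client model")
    total = sum(train_sizes)
    if total <= 0:
        raise ValueError("total training size must be positive")

    global_weights: Weights = {}
    for name in local_weights[0]:
        averaged = [0.0] * len(local_weights[0][name])
        for weights, size in zip(local_weights, train_sizes):
            share = size / total  # this client's contribution to the average
            for i, value in enumerate(weights[name]):
                averaged[i] += share * value
        global_weights[name] = averaged
    return global_weights


# Toy example (made-up numbers): two hospitals with 100 and 300 training images.
site_a = {"conv.weight": [1.0, 2.0], "fc.bias": [0.0]}
site_b = {"conv.weight": [3.0, 6.0], "fc.bias": [4.0]}
global_model = fedavg([site_a, site_b], [100, 300])
print(global_model)  # {'conv.weight': [2.5, 5.0], 'fc.bias': [3.0]}
```

Because the larger site contributes three quarters of the samples, the averaged parameters sit closer to its values, which is one reason quantity skew can bias the global model toward data-rich sites.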
5. Methodology
5.1. Experiment Settings
5.1.1. Data Settings
- Data Quantity Skew Experiment
- Extreme-Label-Skew Experiment
- Experiments Involving Acquisition Skew with and without Extreme Label Skew
- a. Extreme label skew: Each site contains a different number of labels. The data for each label are split into 90% for training and 10% for local validation, as shown in Figure 7. For central testing, an external dataset of COVID-19 Pakistani patients, comprising two labels, was utilized, as illustrated in Figure 8.
- b. Fixed label distribution: The same experiment was repeated with a standardized label set across all sites, retaining only “normal” and “COVID” cases. This fixed-label setup ensured a consistent evaluation of the model’s ability to handle reduced label variability, as shown in Figures 9 and 10.
- Experiments Regarding Modality Skew with Internal and External Data for Central Test Set
- a. Internal data testing: In the first step of this experiment, the four datasets mentioned were distributed across four hospital sites, as shown in Figure 11. Subsequently, 20% of each dataset was reserved for testing at the central server using internal data settings, allowing the evaluation of the global model’s performance, as depicted in Figure 12.
- b. External data testing: Here, the same experimental setup was repeated, but this time the training data across the four sites consisted of the full datasets with common labels (normal and COVID-19), as shown in Figure 13. External datasets, comprising images obtained via both X-ray and CT modalities, were then used for testing at the central server, as illustrated in Figure 14. This approach evaluated the global model’s ability to generalize across modalities when tested on external data sources.
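The per-label splits described in these experiments (90%/10% for local training and validation, or a 20% hold-out for central testing) can be sketched as follows. This is only an illustration under stated assumptions: the `split_per_label` helper, the fixed seed, and the sample identifiers are hypothetical, and the study's actual splitting code is not given in the text.

```python
import random
from collections import Counter, defaultdict
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[str, str]  # (image identifier, label); identifiers are hypothetical


def split_per_label(
    samples: Sequence[Sample], holdout_fraction: float, seed: int = 0
) -> Tuple[List[Sample], List[Sample]]:
    """Hold out the same fraction of every label; return (kept, held_out)."""
    if not 0.0 <= holdout_fraction <= 1.0:
        raise ValueError("holdout_fraction must lie in [0, 1]")
    by_label: Dict[str, List[Sample]] = defaultdict(list)
    for sample in samples:
        by_label[sample[1]].append(sample)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    kept: List[Sample] = []
    held_out: List[Sample] = []
    for label in sorted(by_label):
        items = by_label[label]
        rng.shuffle(items)
        n_holdout = round(len(items) * holdout_fraction)
        held_out.extend(items[:n_holdout])
        kept.extend(items[n_holdout:])
    return kept, held_out


# Toy site with 20 "normal" and 10 "COVID" images (made-up identifiers).
site = [(f"normal_{i}", "normal") for i in range(20)]
site += [(f"covid_{i}", "COVID") for i in range(10)]

# 90% training / 10% local validation, applied label by label.
train, local_val = split_per_label(site, holdout_fraction=0.10)
train_counts = Counter(label for _, label in train)
val_counts = Counter(label for _, label in local_val)
print(train_counts, val_counts)
```

Calling the same helper with `holdout_fraction=0.20` would mimic reserving 20% of a dataset for the central test set in the internal-data setting.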
5.1.2. Hyperparameter Settings
5.2. Evaluation Metrics
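- TP (true positive): a positive case correctly predicted as positive;
- FP (false positive): a negative case incorrectly predicted as positive;
- TN (true negative): a negative case correctly predicted as negative;
- FN (false negative): a positive case incorrectly predicted as negative.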
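As a worked illustration of how these four counts are typically combined, the sketch below computes accuracy, precision, recall, and F1 using their standard definitions. The choice of these four metrics is an assumption (the list above names only the counts), and the function name and example counts are hypothetical.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up counts for a 100-image test set.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
print({name: round(value, 4) for name, value in metrics.items()})
```

In a federated setting the same function could be applied to the central test set (generalization) and to each site's local test set (personalization), so the metrics remain directly comparable.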
6. Results
6.1. Group A Results (IID Setting)
6.2. Group B Results (Simple Skew)
6.2.1. Results Regarding Quantity/Label Distribution Skew
6.2.2. Results Regarding Extreme Label Skew
6.3. Group C Results (Hyper-Skewness)
6.3.1. Results Regarding Acquisition Skew with and Without Extreme Labels
6.3.2. Results Regarding Modality Skew with Internal and External Data
7. Discussion
8. Conclusions and Future Work
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Halawa, S.; Pullamsetti, S.S.; Bangham, C.R.M.; Stenmark, K.R.; Dorfmüller, P.; Frid, M.G.; Butrous, G.; Morrell, N.W.; de Jesus Perez, V.A.; Stuart, D.I.; et al. Potential Long-Term Effects of SARS-CoV-2 Infection on the Pulmonary Vasculature: A Global Perspective. Nat. Rev. Cardiol. 2022, 19, 314–331.
- Dayan, I.; Roth, H.R.; Zhong, A.; Harouni, A.; Gentili, A.; Abidin, A.Z.; Liu, A.; Costa, A.B.; Wood, B.J.; Tsai, C.S.; et al. Federated Learning for Predicting Clinical Outcomes in Patients with COVID-19. Nat. Med. 2021, 27, 1735–1743.
- Hryniewska, W.; Bombiński, P.; Szatkowski, P.; Tomaszewska, P.; Przelaskowski, A.; Biecek, P. Checklist for Responsible Deep Learning Modeling of Medical Images Based on COVID-19 Detection Studies. Pattern Recognit. 2021, 118, 108035.
- Mondal, M.R.H.; Bharati, S.; Podder, P.; Kamruzzaman, J. Deep Learning and Federated Learning for Screening COVID-19: A Review. BioMedInformatics 2023, 3, 691–713.
- Kaissis, G.A.; Makowski, M.R.; Rückert, D.; Braren, R.F. Secure Privacy-Preserving and Federated Machine Learning in Medical Imaging. Nat. Mach. Intell. 2020, 2, 305–311.
- Thompson, P.M.; Stein, J.L.; Medland, S.E.; Hibar, D.P.; Vasquez, A.A.; Renteria, M.E.; Toro, R.; Jahanshad, N.; Schumann, G.; Franke, B.; et al. The ENIGMA Consortium: Large-Scale Collaborative Analyses of Neuroimaging and Genetic Data. Brain Imaging Behav. 2014, 8, 153–182.
- Florescu, L.M.; Streba, C.T.; Şerbănescu, M.S.; Mămuleanu, M.; Florescu, D.N.; Teică, R.V.; Nica, R.E.; Gheonea, I.A. Federated Learning Approach with Pre-Trained Deep Learning Models for COVID-19 Detection from Unsegmented CT Images. Life 2022, 12, 958.
- Feki, I.; Ammar, S.; Kessentini, Y.; Muhammad, K. Federated Learning for COVID-19 Screening from Chest X-Ray Images. Appl. Soft Comput. 2020, 106, 107330.
- Nguyen, D.C.; Ding, M.; Pathirana, P.N.; Jin, Y. Federated Learning for COVID-19 Detection with Generative Adversarial Networks in Edge Cloud Computing. IEEE Internet Things J. 2020, 9, 10257–10271.
- Kaissis, G.; Ziller, A.; Passerat-Palmbach, J.; Ryffel, T.; Usynin, D.; Trask, A.; Lima, I.; Mancuso, J.; Jungmann, F.; Steinborn, M.M.; et al. End-to-End Privacy-Preserving Deep Learning on Multi-Institutional Medical Imaging. Nat. Mach. Intell. 2021, 3, 473–484.
- Zhou, J.; Zhou, L.; Wang, D.; Xu, X.; Li, H.; Chu, Y.; Han, W.; Gao, X. Personalized and Privacy-Preserving Federated Heterogeneous Medical Image Analysis with PPPML-HMI. Comput. Biol. Med. 2024, 169, 107861.
- Bai, X.; Wang, H.; Ma, L.; Xu, Y.; Gan, J.; Fan, Z.; Yang, F.; Ma, K.; Yang, J.; Bai, S.; et al. Advancing COVID-19 Diagnosis with Privacy-Preserving Collaboration in Artificial Intelligence. Nat. Mach. Intell. 2021, 3, 1081–1089.
- Siddique, A.A.; Talha, S.U.; Aamir, M.; Algarni, A.D.; Soliman, N.F.; El-Shafai, W. COVID-19 Classification from X-Ray Images: An Approach to Implement Federated Learning on Decentralized Dataset. Comput. Mater. Contin. 2023, 75, 3883–3901.
- Bhattacharya, A.; Gawali, M.; Seth, J.; Kulkarni, V. Application of Federated Learning in Building a Robust COVID-19 Chest X-Ray Classification Model. arXiv 2022, arXiv:2204.10505.
- Peng, L.; Luo, G.; Walker, A.; Zaiman, Z.; Jones, E.K.; Gupta, H.; Kersten, K.; Burns, J.L.; Harle, C.A.; Magoc, T.; et al. Evaluation of Federated Learning Variations for COVID-19 Diagnosis Using Chest Radiographs from 42 US and European Hospitals. J. Am. Med. Inform. Assoc. 2023, 30, 54–63.
- Kumar, R.; Kumar, J.; Aman, A.; Ali, H.; Bernard, C.M.; Ullah, R.; Zeng, S. Blockchain and Homomorphic Encryption Based Privacy-Preserving Model Aggregation for Medical Images. Comput. Med. Imaging Graph. 2022, 102, 102139.
- Xu, Y.; Ma, L.; Yang, F.; Chen, Y.; Ma, K.; Yang, J.; Yang, X.; Chen, Y.; Shu, C.; Fan, Z.; et al. A Collaborative Online AI Engine for CT-Based COVID-19 Diagnosis. medRxiv 2020.
- Dong, N.; Voiculescu, I. Federated Contrastive Learning for Decentralized Unlabeled Medical Images. In Proceedings of the Medical Image Computing and Computer Assisted Intervention–MICCAI 2021, Strasbourg, France, 27 September–1 October 2021; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2021; Volume 12903, pp. 378–387.
- Yang, D.; Xu, Z.; Li, W.; Myronenko, A.; Roth, H.R.; Harmon, S.; Xu, S.; Turkbey, B.; Turkbey, E.; Wang, X.; et al. Federated Semi-Supervised Learning for COVID Region Segmentation in Chest CT Using Multi-National Data from China, Italy, Japan. Med. Image Anal. 2021, 70, 101992.
- Yan, B.; Wang, J.; Cheng, J.; Zhou, Y.; Zhang, Y.; Yang, Y.; Liu, L.; Zhao, H.; Wang, C.; Liu, B. Experiments of Federated Learning for COVID-19 Chest X-Ray Images. Commun. Comput. Inf. Sci. 2021, 1423, 41–53.
- Duan, M.; Liu, D.; Ji, X.; Liu, R.; Liang, L.; Chen, X.; Tan, Y. FedGroup: Efficient Clustered Federated Learning via Decomposed Data-Driven Measure. In Proceedings of the 2021 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom), New York, NY, USA, 30 September–3 October 2021.
- Li, X.; Jiang, M.; Zhang, X.; Kamp, M.; Dou, Q. FedBN: Federated Learning on Non-IID Features via Local Batch Normalization. arXiv 2021, arXiv:2102.07623.
- Zhang, L.; Shen, B.; Barnawi, A.; Xi, S.; Kumar, N.; Wu, Y. FedDPGAN: Federated Differentially Private Generative Adversarial Networks Framework for the Detection of COVID-19 Pneumonia. Inf. Syst. Front. 2021, 23, 1403–1415.
- Prayitno; Shyu, C.R.; Putra, K.T.; Chen, H.C.; Tsai, Y.Y.; Hossain, K.S.M.T.; Jiang, W.; Shae, Z.Y. A Systematic Review of Federated Learning in the Healthcare Area: From the Perspective of Data Properties and Applications. Appl. Sci. 2021, 11, 11191.
- Aich, S.; Sinai, N.K.; Kumar, S.; Ali, M.; Choi, Y.R.; Joo, M., II; Kim, H.C. Protecting Personal Healthcare Record Using Blockchain Federated Learning Technologies. In Proceedings of the International Conference on Advanced Communication Technology (ICACT), PyeongChang, Republic of Korea, 13–16 February 2021; pp. 109–112.
- Dou, Q.; So, T.Y.; Jiang, M.; Liu, Q.; Vardhanabhuti, V.; Kaissis, G.; Li, Z.; Si, W.; Lee, H.H.C.; Yu, K.; et al. Federated Deep Learning for Detecting COVID-19 Lung Abnormalities in CT: A Privacy-Preserving Multinational Validation Study. NPJ Digit. Med. 2021, 4, 60.
- Ho, T.T.; Tran, K.D.; Huang, Y. FedSGDCOVID: Federated SGD COVID-19 Detection under Local Differential Privacy Using Chest X-Ray Images and Symptom Information. Sensors 2022, 22, 3728.
- Lo, S.K.; Liu, Y.; Lu, Q.; Wang, C.; Xu, X.; Paik, H.-Y.; Zhu, L. Blockchain-Based Trustworthy Federated Learning Architecture. arXiv 2021, arXiv:2108.06912.
- Jabłecki, P.; Ślazyk, F.; Malawski, M. Federated Learning in the Cloud for Analysis of Medical Images–Experience with Open Source Frameworks. In Proceedings of the Clinical Image-Based Procedures, Distributed and Collaborative Learning, Artificial Intelligence for Combating COVID-19 and Secure and Privacy-Preserving Machine Learning (DCL 2021, PPML 2021, LL-COVID19 2021, CLIP 2021), Strasbourg, France, 27 September–1 October 2021; Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Berlin/Heidelberg, Germany, 2021; Volume 12969, pp. 111–119.
- Malik, H.; Naeem, A.; Naqvi, R.A.; Loh, W.K. DMFL_Net: A Federated Learning-Based Framework for the Classification of COVID-19 from Multiple Chest Diseases Using X-Rays. Sensors 2023, 23, 743.
- Adhikari, R.; Settles, C. Secure Federated Learning Approaches to Diagnosing COVID-19. arXiv 2024, arXiv:2401.12438.
- Qayyum, A.; Ahmad, K.; Ahsan, M.A.; Al-Fuqaha, A.; Qadir, J. Collaborative Federated Learning for Healthcare: Multi-Modal COVID-19 Diagnosis at the Edge. IEEE Open J. Comput. Soc. 2022, 3, 172–184.
- Zhang, W.; Zhou, T.; Lu, Q.; Wang, X.; Zhu, C. Dynamic Fusion-Based Federated Learning for COVID-19 Detection. IEEE Internet Things J. 2021, 8, 15884–15891.
- McMahan, B.; Moore, E.; Ramage, D.; Hampson, S.; y Arcas, B.A. Communication-Efficient Learning of Deep Networks from Decentralized Data. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, Fort Lauderdale, FL, USA, 20–22 April 2017; Volume 54, p. 10.
- Zhou, T.; Lin, Z.; Zhang, J.; Tsang, D.H.K. Understanding and Improving Model Averaging in Federated Learning on Heterogeneous Data. IEEE Trans. Mob. Comput. 2024, 23, 12131–12145.
- Ma, T.; Chen, J.; Hoang, T.N. Federated Learning of Models Pretrained on Different Features with Consensus Graphs. Springer Optim. Its Appl. 2023, 213, 289–319.
- Loddo, A.; Pili, F.; di Ruberto, C. Deep Learning for COVID-19 Diagnosis from CT Images. Appl. Sci. 2021, 11, 8227.
- Naz, S.; Phan, K.T.; Chen, Y.P.P. A Comprehensive Review of Federated Learning for COVID-19 Detection. Int. J. Intell. Syst. 2022, 37, 2371–2392.
- Chowdhury, M.E.H.; Rahman, T.; Khandakar, A.; Mazhar, R.; Kadir, M.A.; Mahbub, Z.B.; Islam, K.R.; Khan, M.S.; Iqbal, A.; Emadi, N.A.; et al. Can AI Help in Screening Viral and COVID-19 Pneumonia? IEEE Access 2020, 8, 132665–132676.
- Rahman, T.; Khandakar, A.; Qiblawey, Y.; Tahir, A.; Kiranyaz, S.; Kashem, S.B.A.; Islam, M.T.; Al Maadeed, S.; Zughaier, S.M.; Khan, M.S.; et al. Exploring the Effect of Image Enhancement Techniques on COVID-19 Detection Using Chest X-Ray Images. Comput. Biol. Med. 2021, 132, 104319.
- Vantaggiato, E.; Paladini, E.; Bougourzi, F.; Distante, C.; Hadid, A.; Taleb-Ahmed, A. COVID-19 Recognition Using Ensemble-Cnns in Two New Chest X-Ray Databases. Sensors 2021, 21, 1742.
- Tahir, A.M.; Chowdhury, M.E.H.; Khandakar, A.; Qiblawey, Y.; Khurshid, U.; Kiranyaz, S.; Ibtehaz, N.; Rahman, M.S.; Al-Madeed, S.; Mahmud, S.; et al. COVID-QU-Ex. Kaggle. 2021. Available online: https://www.kaggle.com/datasets/anasmohammedtahir/covidqu (accessed on 10 January 2025).
- Umair, M.; Khan, M.S.; Ahmed, F.; Baothman, F.; Alqahtani, F.; Alian, M.; Ahmad, J. Detection of COVID-19 Using Transfer Learning and Grad-Cam Visualization on Indigenously Collected X-Ray Dataset. Sensors 2021, 21, 5813.
- Maftouni, M.; Law AC, C.; Shen, B.; Grado ZJ, K.; Zhou, Y.; Yazdi, N.A. A Robust Ensemble-Deep Learning Model for COVID-19 Diagnosis Based on an Integrated CT Scan Images Database. In Proceedings of the 2021 IISE Annual Conference, Montreal, QC, Canada, 22–25 May 2021.
- Soares, E.; Angelov, P.; Biaso, S.; Froes, M.H.; Abe, D.K. SARS-CoV-2 CT-Scan Dataset: A Large Dataset of Real Patients CT Scans for SARS-CoV-2 Identification. medRxiv 2020.
- Ng, D.; Lan, X.; Yao, M.M.S.; Chan, W.P.; Feng, M. Federated Learning: A Collaborative Effort to Achieve Better Medical Imaging Models for Individual Sites That Have Small Labelled Datasets. Quant. Imaging Med. Surg. 2021, 11, 852–857.
- Hernandez-cruz, N.; Saha, P.; Sarker, M.K.; Noble, J.A. Review of Federated Learning and Machine Learning-Based Methods for Medical Image Analysis. Big Data Cogn. Comput. 2024, 8, 99.
- Li, Q.; Diao, Y.; Chen, Q.; He, B. Federated Learning on Non-IID Data Silos: An Experimental Study. In Proceedings of the 2022 IEEE 38th International Conference on Data Engineering (ICDE), Kuala Lumpur, Malaysia, 9–12 May 2022; pp. 965–978.
- Abdul Salam, M.; Taha, S.; Ramadan, M. COVID-19 Detection Using Federated Machine Learning. PLoS ONE 2021, 16, e0252573.
- Kumar, R.; Khan, A.A.; Kumar, J.; Zakria; Golilarz, N.A.; Zhang, S.; Ting, Y.; Zheng, C.; Wang, W. Blockchain-Federated-Learning and Deep Learning Models for COVID-19 Detection Using CT Imaging. IEEE Sens. J. 2021, 21, 16301–16314.
- Rao, A.; Wang, X.; Wen, Y. Challenges in Medical Imaging Analysis with Heterogeneous Datasets. Med. Image Anal. 2021, 72, 102101.
- Alhafiz, F.S.; Basuhail, A.A. Non-IID Medical Imaging Data on COVID-19 in the Federated Learning Framework: Impact and Directions. COVID 2024, 4, 1985–2016.
Ref. | Heterogeneity Type | Evaluated Factor | Findings |
---|---|---|---|
[1] | Acquisition skew | Personalization metric | Focusing on the personalization metric, tuned via an adaptive number of local epochs, is an effective method. |
[2] | Acquisition skew | Generalization and personalization metrics | Distributed data used for training were made visible by pretrained CNN models. |
[3,4,5,6] | Quantity and label distribution skew | Generalization metric | Maximizing the size of data is an effective way of improving the generalization metric. |
[7] | Extreme label skew | Generalization metric | Skew can be managed via hyperparameter settings. |
[8] | Feature skew | Generalization metric | Non-IID data had a significant negative impact. |
[9] | Data acquisition and modality skew | Generalization and localization metrics | The global model could be successful regardless of the image modality. |
Our work | IID vs. 6 different skewness types | Generalization, personalization, and localization metrics | FL exhibited effective performance when there was a maximum of one skew type. However, a mixture of different skew types caused high divergence of the training model for all metrics. |
Parameters | Values |
---|---|
Round number | 15 |
Local epoch | 10 |
Train/Test batch size | 32 for distributed training/64 for central testing |
Learning rate (ƛ) | 0.001 |
Optimizer function | SGD |
Number of participants | 4, assumed to be working in parallel |
Aggregation strategy | FedAvg |
Image size | 224 × 224 |
Augmentation methods | Training set: Images undergo random horizontal flips and are normalized with the specified mean and standard deviation for each channel (RGB). Testing set: The procedures applied are similar to those used in training but exclude random horizontal flips. |
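For readers who want to reproduce the setup, the hyperparameters in the table above can be collected in a single configuration object. The sketch below is an assumption about how such a configuration might be organized: the `FLConfig` class and its field names are hypothetical, while the values are copied from the table.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FLConfig:
    """Hypothetical container for the experiment hyperparameters listed in the table."""

    rounds: int = 15                # communication rounds
    local_epochs: int = 10          # local epochs per round at each site
    train_batch_size: int = 32      # distributed training
    test_batch_size: int = 64       # central testing
    learning_rate: float = 0.001
    optimizer: str = "SGD"
    num_participants: int = 4       # assumed to work in parallel
    aggregation: str = "FedAvg"
    image_size: Tuple[int, int] = (224, 224)

    @property
    def total_local_epochs(self) -> int:
        """Local epochs each site runs over the whole experiment."""
        return self.rounds * self.local_epochs


config = FLConfig()
print(config.total_local_epochs)  # 150
```

With 15 rounds of 10 local epochs, each site performs 150 local epochs in total; a relatively large number of local epochs per round is one setting under which heterogeneous sites may drift apart between aggregations.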
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Alhafiz, F.; Basuhail, A. The Data Heterogeneity Issue Regarding COVID-19 Lung Imaging in Federated Learning: An Experimental Study. Big Data Cogn. Comput. 2025, 9, 11. https://doi.org/10.3390/bdcc9010011