Machine Learning in Data Science

A special issue of Machine Learning and Knowledge Extraction (ISSN 2504-4990). This special issue belongs to the section "Data".

Deadline for manuscript submissions: closed (31 December 2024) | Viewed by 6956

Special Issue Editors


Dr. Elias Dritsas
Guest Editor
Industrial Systems Institute (ISI), Athena Research and Innovation Center, 26504 Patras, Greece
Interests: artificial intelligence; big data; data analysis; databases; data mining; data structures; machine learning; privacy; security; trust

Dr. Phivos Mylonas
Guest Editor
Department of Informatics and Computer Engineering, University of West Attica, 12243 Egaleo, Greece
Interests: knowledge management; context representation and analysis; knowledge-assisted multimedia analysis

Special Issue Information

Dear Colleagues,

Data science is a field of study that focuses on extracting valuable information from noisy data. It incorporates various disciplines, such as data engineering, data preprocessing, visualization, predictive analytics, data mining, machine learning, and statistics. In recent years, interest in the mathematical and theoretical aspects of data science has grown rapidly. This interest is reflected in deterministic and non-deterministic models (the latter being probabilistic models, including the family known as statistical models) that aim to provide performance guarantees, robustness, and reusable, interpretable results.

The digital transformation of information systems has made it feasible to apply data science techniques such as artificial intelligence (AI) and machine learning (ML) effectively in various applications. In addition, the use of sensor technology together with AI/ML is expected to lead to more objective assessment, improved performance, lower cost, and more effective system management overall.

The aim of this Special Issue is to present original, high-quality, innovative ideas and research solutions, addressing both theoretical and practical challenges, for data analysis and modeling with the aid of artificial intelligence and machine learning in various domains and applications.

Dr. Elias Dritsas
Dr. Phivos Mylonas
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Machine Learning and Knowledge Extraction is an international peer-reviewed open access quarterly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • data science
  • data mining
  • artificial intelligence
  • machine learning
  • statistics
  • predictive modeling
  • monitoring
  • data analytics

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies is available on the MDPI website.

Published Papers (4 papers)

Research

19 pages, 3484 KiB  
Article
Efficient Visual-Aware Fashion Recommendation Using Compressed Node Features and Graph-Based Learning
by Umar Subhan Malhi, Junfeng Zhou, Abdur Rasool and Shahbaz Siddeeq
Mach. Learn. Knowl. Extr. 2024, 6(3), 2111-2129; https://doi.org/10.3390/make6030104 - 15 Sep 2024
Cited by 1 | Viewed by 1266
Abstract
In fashion e-commerce, predicting item compatibility using visual features remains a significant challenge. Current recommendation systems often struggle to incorporate high-dimensional visual data into graph-based learning models effectively. This limitation presents a substantial opportunity to enhance the precision and effectiveness of fashion recommendations. In this paper, we present the Visual-aware Graph Convolutional Network (VAGCN). This novel framework helps improve how visual features can be incorporated into graph-based learning systems for fashion item compatibility predictions. The VAGCN framework employs a deep-stacked autoencoder to convert the input image’s high-dimensional raw CNN visual features into more manageable low-dimensional representations. In addition to improving feature representation, the GCN can also reason more intelligently about predictions, which would not be possible without this compression. The GCN encoder processes nodes in the graph to capture structural and feature correlation. Following the GCN encoder, the refined embeddings are input to a multi-layer perceptron (MLP) to calculate compatibility scores. The approach extends to using neighborhood information only during the testing phase to help with training efficiency and generalizability in practical scenarios, a key characteristic of our model. By leveraging its ability to capture latent visual features and neighborhood-based learning, VAGCN thoroughly investigates item compatibility across various categories. This method significantly improves predictive accuracy, consistently outperforming existing benchmarks. These contributions tackle significant scalability and computational efficiency challenges, showcasing the potential transformation of recommendation systems through enhanced feature representation, paving the way for further innovations in the fashion domain.
(This article belongs to the Special Issue Machine Learning in Data Science)

16 pages, 1999 KiB  
Article
Insights from Augmented Data Integration and Strong Regularization in Drug Synergy Prediction with SynerGNet
by Mengmeng Liu, Gopal Srivastava, J. Ramanujam and Michal Brylinski
Mach. Learn. Knowl. Extr. 2024, 6(3), 1782-1797; https://doi.org/10.3390/make6030087 - 29 Jul 2024
Cited by 1 | Viewed by 1213
Abstract
SynerGNet is a novel approach to predicting drug synergy against cancer cell lines. In this study, we discuss in detail the construction process of SynerGNet, emphasizing its comprehensive design tailored to handle complex data patterns. Additionally, we investigate a counterintuitive phenomenon when integrating more augmented data into the training set results in an increase in testing loss alongside improved predictive accuracy. This sheds light on the nuanced dynamics of model learning. Further, we demonstrate the effectiveness of strong regularization techniques in mitigating overfitting, ensuring the robustness and generalization ability of SynerGNet. Finally, the continuous performance enhancements achieved through the integration of augmented data are highlighted. By gradually increasing the amount of augmented data in the training set, we observe substantial improvements in model performance. For instance, compared to models trained exclusively on the original data, the integration of the augmented data can lead to a 5.5% increase in the balanced accuracy and a 7.8% decrease in the false positive rate. Through rigorous benchmarks and analyses, our study contributes valuable insights into the development and optimization of predictive models in biomedical research.

Review

55 pages, 2140 KiB  
Review
A Review on Machine Learning Deployment Patterns and Key Features in the Prediction of Preeclampsia
by Louise Pedersen, Magdalena Mazur-Milecka, Jacek Ruminski and Stefan Wagner
Mach. Learn. Knowl. Extr. 2024, 6(4), 2515-2569; https://doi.org/10.3390/make6040123 - 5 Nov 2024
Viewed by 970
Abstract
Previous reviews have investigated machine learning (ML) models used to predict the risk of developing preeclampsia. However, they have not addressed the intended deployment of these models throughout pregnancy, nor have they detailed feature performance. This study aims to provide an overview of existing ML models and their intended deployment patterns and performance, along with identified features of high importance. This review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 guidelines. The search was performed in January and February 2024. It included all papers published before March 2024 obtained from the scientific databases: PubMed, Engineering Village, the Association for Computing Machinery, Scopus, and Web of Science. Of a total of 198 identified studies, 18 met the inclusion criteria. Among these, 11 showed the intent to use the ML model as a single-use tool, two intended a dual-use, and two intended multiple-use. Ten studies listed the features of the highest importance, with systolic and diastolic blood pressure, mean arterial pressure, and hypertension frequently mentioned as critical predictors. Notably, three of the four studies proposing dual or multiple-use models were conducted in 2023 and 2024, while the remaining study is from 2009. No single ML model emerged as superior across the subgroups of PE. Incorporating body mass index alongside hypertension and either mean arterial pressure, diastolic blood pressure, or systolic blood pressure as features may enhance performance. The deployment patterns mainly focused on single use during gestational weeks 11+0 to 14+1.

Other

21 pages, 748 KiB  
Systematic Review
Tertiary Review on Explainable Artificial Intelligence: Where Do We Stand?
by Frank van Mourik, Annemarie Jutte, Stijn E. Berendse, Faiza A. Bukhsh and Faizan Ahmed
Mach. Learn. Knowl. Extr. 2024, 6(3), 1997-2017; https://doi.org/10.3390/make6030098 - 30 Aug 2024
Cited by 2 | Viewed by 1969
Abstract
Research into explainable artificial intelligence (XAI) methods has exploded over the past five years. It is essential to synthesize and categorize this research and, for this purpose, multiple systematic reviews on XAI mapped out the landscape of the existing methods. To understand how these methods have developed and been applied and what evidence has been accumulated through model training and analysis, we carried out a tertiary literature review that takes as input systematic literature reviews published between 1992 and 2023. We evaluated 40 systematic literature review papers and presented binary tabular overviews of researched XAI methods and their respective characteristics, such as the scope, scale, input data, explanation data, and machine learning models researched. We identified seven distinct characteristics and organized them into twelve specific categories, culminating in the creation of comprehensive research grids. Within these research grids, we systematically documented the presence or absence of research mentions for each pairing of characteristic and category. We identified 14 combinations that are open to research. Our findings reveal a significant gap, particularly in categories like the cross-section of feature graphs and numerical data, which appear to be notably absent or insufficiently addressed in the existing body of research and thus represent a future research road map.