Advances in Machine Learning and Applications

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "Mathematics and Computer Science".

Deadline for manuscript submissions: closed (31 October 2024) | Viewed by 83625

Special Issue Editors


Prof. Dr. Ravil Muhamedyev
Guest Editor
1. Department of Computer Technologies and Natural Sciences, ISMA University, Riga, Latvia
2. Department of Software Engineering, Institute of Automation and Information Technologies, Satbayev University, Satpayev str., 22A, Almaty 050013, Kazakhstan
Interests: applications of machine learning; data processing; scientometrics and decision support systems

Prof. Dr. Evgeny Nikulchev
Guest Editor
Department of Digital Technologies of Data Processing, MIREA – Russian Technological University, 119454 Moscow, Russia
Interests: symmetry groups; Lie groups; dynamic systems modeling; experimental data processing; artificial intelligence technologies; information systems

Special Issue Information

Dear Colleagues,

Machine learning realizes the potential inherent in the idea of artificial intelligence. Its central promise is flexible, adaptive, “teachable” computational methods and algorithms that give programs and systems new capabilities.

Machine learning is widely used in practical applications: an incomplete list includes medicine, biology, chemistry, agriculture, mining, finance, industry, natural language processing, and astronomy. Alongside its applications, this field is characterized by highly dynamic theoretical research, especially in deep learning. Machine learning methods and algorithms can be divided into classical and new ones. Classical algorithms and methods are described in sufficient detail and are widely used in practice. Where researchers have access to large amounts of data, impressive results are emerging from deep learning methods, and new deep neural network architectures and their modifications for various applications appear almost daily. At the same time, despite significant differences in algorithms and methods, many practical applications are developed using similar techniques.

The purpose of this Special Issue is to gather a collection of articles reflecting the similarities and differences among the latest applied implementations of machine learning in different areas. This will allow researchers to adapt the machine learning cases developed here to obtain new results in their own application areas. We look forward to receiving your contributions.

Prof. Dr. Ravil Muhamedyev
Prof. Dr. Evgeny Nikulchev
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • machine learning
  • deep learning
  • regression
  • classification
  • unsupervised learning
  • supervised learning
  • semi-supervised learning
  • reinforcement learning
  • transfer learning
  • transformers
  • natural language processing
  • speech processing
  • image processing
  • machine vision
  • convolutional neural networks
  • recurrent neural networks

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (19 papers)


Research


17 pages, 1832 KiB  
Article
Improving Systematic Generalization of Linear Transformer Using Normalization Layers and Orthogonality Loss Function
by Taewon Park and Hyun-Chul Kim
Mathematics 2024, 12(21), 3390; https://doi.org/10.3390/math12213390 - 30 Oct 2024
Viewed by 450
Abstract
A Linear Transformer linearizes the attention mechanism of the vanilla Transformer architecture, significantly improving efficiency and achieving linear theoretical complexity with respect to sequence length. However, few studies have explored the capabilities of the Linear Transformer beyond its efficiency. In this work, we investigate the systematic generalization capability of the Linear Transformer, a crucial property for strong generalization to unseen data. Through preliminary experiments, we identify two major issues contributing to its unstable systematic generalization performance: (i) unconstrained norms of Queries and Keys, and (ii) high correlation among Values across the sequence. To address these issues, we propose two simple yet effective methods: normalization layers for Queries and Keys, and an orthogonality loss function applied to Values during training. In experiments, we demonstrate that applying these methods to the Linear Transformer significantly improves its stability and systematic generalization performance across several well-known tasks. Furthermore, our proposed methods outperform the vanilla Transformer on specific systematic generalization tasks, such as the sort-of-CLEVR and SCAN tasks.
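For readers unfamiliar with the second of these methods, the sketch below shows one plausible form of an orthogonality loss on Values in PyTorch; the authors' exact formulation may differ, and the weighting of the term is an assumption.

```python
import torch
import torch.nn.functional as F

def orthogonality_loss(values: torch.Tensor) -> torch.Tensor:
    """Penalize correlation among Value vectors across a sequence.

    values: (seq_len, d_model) tensor of Value vectors.
    Returns the mean squared off-diagonal cosine similarity, which is
    zero exactly when the Values are mutually orthogonal.
    """
    v = F.normalize(values, dim=-1)                  # unit-normalize each Value
    gram = v @ v.transpose(0, 1)                     # pairwise cosine similarities
    off_diag = gram - torch.eye(v.size(0), device=v.device)
    return off_diag.pow(2).mean()

# Hypothetical use during training:
# loss = task_loss + 0.1 * orthogonality_loss(values)
```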

11 pages, 5336 KiB  
Article
State-of-the-Art Results with the Fashion-MNIST Dataset
by Ravil I. Mukhamediev
Mathematics 2024, 12(20), 3174; https://doi.org/10.3390/math12203174 - 11 Oct 2024
Viewed by 1150
Abstract
In September 2024, the Fashion-MNIST dataset turned 7 years old. Proposed as a replacement for the well-known MNIST dataset, it continues to be used to evaluate machine learning model architectures. This paper describes new results achieved with the Fashion-MNIST dataset using classical machine learning models and a relatively simple convolutional network. We present the state-of-the-art results obtained using the CNN-3-128 convolutional network and data augmentation. The developed CNN-3-128 model, containing three convolutional layers, achieved an accuracy of 99.65% on the Fashion-MNIST test image set. In addition, this paper presents the results of computational experiments demonstrating the dependence between the number of adjustable parameters of the convolutional network and the maximum acceptable classification quality, which allows us to optimise the computational cost of model training.
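The paper's exact CNN-3-128 configuration is not reproduced here; the Keras sketch below is an assumed reading of the name (three convolutional layers, the widest with 128 filters) and is illustrative only.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Hypothetical CNN-3-128-style model: three conv layers, the last with 128 filters.
model = models.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, 3, padding="same", activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, padding="same", activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(128, 3, padding="same", activation="relu"),
    layers.GlobalAveragePooling2D(),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
model.fit(x_train[..., None] / 255.0, y_train, epochs=5, validation_split=0.1)
```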

27 pages, 3028 KiB  
Article
A New Predictive Method for Classification Tasks in Machine Learning: Multi-Class Multi-Label Logistic Model Tree (MMLMT)
by Bita Ghasemkhani, Kadriye Filiz Balbal and Derya Birant
Mathematics 2024, 12(18), 2825; https://doi.org/10.3390/math12182825 - 12 Sep 2024
Viewed by 1434
Abstract
This paper introduces a novel classification method for multi-class multi-label datasets, named multi-class multi-label logistic model tree (MMLMT). Our approach supports multi-label learning to predict multiple class labels simultaneously, thereby enhancing the model’s capacity to capture complex relationships within the data. The primary goal is to improve the accuracy of classification tasks involving multiple classes and labels. MMLMT integrates the logistic regression (LR) and decision tree (DT) algorithms, yielding interpretable models with high predictive performance. By combining the strengths of LR and DT, our method offers a flexible and powerful framework for handling multi-class multi-label data. Extensive experiments demonstrated the effectiveness of MMLMT across a range of well-known datasets with an average accuracy of 85.90%. Furthermore, our method achieved an average of 9.87% improvement compared to the results of state-of-the-art studies in the literature. These results highlight MMLMT’s potential as a valuable approach to multi-label learning.
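As a point of reference only, the sketch below shows a plain binary-relevance multi-label baseline in scikit-learn; it is not the authors' MMLMT, which instead grows a tree with logistic models at its leaves.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputClassifier

# Synthetic multi-label data standing in for the paper's benchmark datasets.
X, Y = make_multilabel_classification(n_samples=600, n_classes=5, random_state=0)
X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, random_state=0)

# One logistic model per label (binary relevance); MMLMT couples LR with a DT.
clf = MultiOutputClassifier(LogisticRegression(max_iter=1000)).fit(X_tr, Y_tr)
print("subset accuracy:", clf.score(X_te, Y_te))
```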

25 pages, 19187 KiB  
Article
A Graph-Based Keyword Extraction Method for Academic Literature Knowledge Graph Construction
by Lin Zhang, Yanan Li and Qinru Li
Mathematics 2024, 12(9), 1349; https://doi.org/10.3390/math12091349 - 29 Apr 2024
Viewed by 1509
Abstract
In this paper, we construct an academic literature knowledge graph based on the relationships between documents to facilitate the storage and research of academic literature data. Keywords are an important type of node in the knowledge graph. To address the problem that some documents lack keywords, an improved keyword extraction algorithm called TP-CoGlo-TextRank is proposed, using word frequency, position, word co-occurrence frequency, and a word embedding model. By combining word frequency and position in the document, the importance of words is distinguished. By introducing the GloVe word-embedding model, which brings knowledge external to the documents into the TextRank algorithm, and combining it with the internal word co-occurrence frequency, the word-adjacency relationship is transferred non-uniformly. Finally, the highest-scoring words are combined into phrases if they are adjacent in the original text. The validity of the TP-CoGlo-TextRank algorithm is verified by experiments. On this basis, the Neo4j graph database is used to store and display the academic literature knowledge graph, providing data support for research tasks such as text clustering, automatic summarization, and question-answering systems.
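As background for the proposed variant, the sketch below implements plain TextRank over a word co-occurrence graph; TP-CoGlo-TextRank additionally weights the walk by word frequency, position, co-occurrence counts, and GloVe similarity.

```python
import networkx as nx

def textrank_keywords(tokens, window=3, top_k=5):
    """Plain TextRank: rank words by PageRank over a co-occurrence graph."""
    graph = nx.Graph()
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:   # words within the window co-occur
            if word != other:
                graph.add_edge(word, other)
    scores = nx.pagerank(graph)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

tokens = "knowledge graph stores academic literature and keyword nodes".split()
print(textrank_keywords(tokens))
```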

20 pages, 4473 KiB  
Article
Determination of Reservoir Oxidation Zone Formation in Uranium Wells Using Ensemble Machine Learning Methods
by Ravil I. Mukhamediev, Yan Kuchin, Yelena Popova, Nadiya Yunicheva, Elena Muhamedijeva, Adilkhan Symagulov, Kirill Abramov, Viktors Gopejenko, Vitaly Levashenko, Elena Zaitseva, Natalya Litvishko and Sergey Stankevich
Mathematics 2023, 11(22), 4687; https://doi.org/10.3390/math11224687 - 17 Nov 2023
Cited by 2 | Viewed by 1199
Abstract
Approximately 50% of the world’s uranium is mined in a closed way using underground well leaching. In the process of uranium mining at formation-infiltration deposits, an important role is played by the correct identification of the formation of reservoir oxidation zones (ROZs), within which the uranium content is extremely low and which affect the determination of ore reserves and subsequent mining processes. The currently used methodology for identifying ROZs requires the use of highly skilled labor and resource-intensive studies using neutron fission logging; therefore, it is not always performed. At the same time, the available electrical logging measurements data collected in the process of geophysical well surveys and exploration well data can be effectively used to identify ROZs using machine learning models. This study presents a solution to the problem of detecting ROZs in uranium deposits using ensemble machine learning methods. This method provides an index of weighted harmonic measure (f1_weighted) in the range from 0.72 to 0.93 (XGB classifier), and sufficient stability at different ratios of objects in the input dataset. The obtained results demonstrate the potential for practical use of this method for detecting ROZs in formation-infiltration uranium deposits using ensemble machine learning.
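For orientation, the snippet below shows how a weighted-F1 score of the kind reported here is computed for an XGBoost classifier on synthetic, imbalanced data; it is a sketch of the evaluation, not the authors' pipeline.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Synthetic imbalanced data standing in for electrical-logging features.
X, y = make_classification(n_samples=2000, weights=[0.85], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

model = XGBClassifier(n_estimators=300, max_depth=4, eval_metric="logloss")
model.fit(X_tr, y_tr)
print("f1_weighted:", f1_score(y_te, model.predict(X_te), average="weighted"))
```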

12 pages, 1926 KiB  
Article
Particle Swarm Training of a Neural Network for the Lower Upper Bound Estimation of the Prediction Intervals of Time Series
by Alexander Gusev, Alexander Chervyakov, Anna Alexeenko and Evgeny Nikulchev
Mathematics 2023, 11(20), 4342; https://doi.org/10.3390/math11204342 - 19 Oct 2023
Cited by 1 | Viewed by 1283
Abstract
Many time series forecasting applications use ranges rather than point forecasts. Producing forecasts in the form of Prediction Intervals (PIs) is natural, since intervals are an important component of many mathematical models. The LUBE (Lower Upper Bound Estimation) method is aimed at finding ranges based on solving optimization problems that take into account interval width and coverage. Using Particle Swarm Training of a simple neural network, we look for a solution to the optimization problem of the Coverage Width-based Criterion (CWC), which is the exponential convolution of the conflicting criteria PICP (Prediction Interval Coverage Probability) and PINRW (Prediction Interval Normalized Root-mean-square Width). Based on the concept of the Pareto compromise, the relationship between these conflicting criteria is represented as a Pareto front in the space of the specified criteria, constructed from the found solutions to the optimization problem. The data under consideration are the financial time series of MOEX closing prices. Our findings reveal that a relatively simple neural network, comprising eight neurons and their corresponding 26 parameters (weights of neuron connections and neuron signal biases), is sufficient to yield reliable PIs for the investigated financial time series. The novelty of our approach lies in the use of a simple network structure (containing fewer than 100 parameters) to construct PIs for a financial time series. Additionally, we offer an experimental construction of the Pareto frontier, formed by the PICP and PINRW criteria.
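The criteria named in this abstract have standard definitions in the LUBE literature; the sketch below implements PICP and PINRW and one common exponential form of CWC (the paper's exact convolution and constants are assumptions here).

```python
import numpy as np

def picp(y, lower, upper):
    """Prediction Interval Coverage Probability: share of targets inside the PI."""
    return np.mean((y >= lower) & (y <= upper))

def pinrw(y, lower, upper):
    """Prediction Interval Normalized Root-mean-square Width."""
    return np.sqrt(np.mean((upper - lower) ** 2)) / (np.max(y) - np.min(y))

def cwc(y, lower, upper, eta=50.0, mu=0.9):
    """A common Coverage Width-based Criterion: the normalized width is
    inflated exponentially whenever coverage falls below the nominal level mu."""
    p, w = picp(y, lower, upper), pinrw(y, lower, upper)
    return w * (1.0 + (p < mu) * np.exp(-eta * (p - mu)))
```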

18 pages, 6674 KiB  
Article
Real-Time Video Smoke Detection Based on Deep Domain Adaptation for Injection Molding Machines
by Ssu-Han Chen, Jer-Huan Jang, Meng-Jey Youh, Yen-Ting Chou, Chih-Hsiang Kang, Chang-Yen Wu, Chih-Ming Chen, Jiun-Shiung Lin, Jin-Kwan Lin and Kevin Fong-Rey Liu
Mathematics 2023, 11(17), 3728; https://doi.org/10.3390/math11173728 - 30 Aug 2023
Viewed by 1386
Abstract
Leakage with smoke is often accompanied by fire and explosion hazards. Detecting smoke helps gain time for crisis management. This study aims to address this issue by establishing a video smoke detection system, based on a convolutional neural network (CNN), with the help of smoke synthesis, auto-annotation, and an attention mechanism by fusing gray histogram image information. Additionally, the study incorporates the domain adversarial training of neural networks (DANN) to investigate the effect of domain shifts when adapting the smoke detection model from one injection molding machine to another on-site. It achieves the function of domain confusion without requiring labeling, as well as the automatic extraction of domain features and automatic adversarial training, using target domain data. Compared to deep domain confusion (DDC), naïve DANN, and the domain separation network (DSN), the proposed method achieves the highest accuracy rates of 93.17% and 91.35% in both scenarios. Furthermore, the experiment employs t-distributed stochastic neighbor embedding (t-SNE) to facilitate fast training and smoke detection between machines by leveraging domain adaption features.
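The core of DANN, referenced in this abstract, is the gradient reversal layer; a minimal PyTorch version is sketched below (the paper builds a full smoke detector around it, which is not reproduced here).

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; gradient multiplied by -lambda on the
    backward pass, so the feature extractor learns to *confuse* the domain
    classifier attached after this layer."""

    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

features = torch.randn(8, 128, requires_grad=True)   # e.g., CNN features of frames
domain_input = GradReverse.apply(features, 1.0)      # fed to the domain classifier
```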

21 pages, 4970 KiB  
Article
Cyberbullying Detection on Twitter Using Deep Learning-Based Attention Mechanisms and Continuous Bag of Words Feature Extraction
by Suliman Mohamed Fati, Amgad Muneer, Ayed Alwadain and Abdullateef O. Balogun
Mathematics 2023, 11(16), 3567; https://doi.org/10.3390/math11163567 - 17 Aug 2023
Cited by 18 | Viewed by 4534 | Correction
Abstract
Since social media platforms are widely used and popular, they have given us more opportunities than we can even imagine. Despite all of the known benefits, some users may abuse these opportunities to humiliate, insult, bully, and harass other people. This issue explains why there is a need to reduce such negative activities and create a safe cyberspace for innocent people by detecting cyberbullying activity. This study provides a comparative analysis of deep learning methods used to test and evaluate their effectiveness regarding a well-known global Twitter dataset. To recognize abusive tweets and overcome existing challenges, attention-based deep learning methods are introduced. Word2vec CBOW embeddings formed the weights of the embedding layer and were used to extract features. The feature vector was input into a convolution and pooling mechanism, reducing the feature dimensionality while learning position-invariant features of the offensive words. A SoftMax function performs the final classification. The proposed cyberbullying detection methods were evaluated using benchmark experimental datasets and well-known evaluation measures. The results demonstrated the superiority of the attention-based 1D convolutional long short-term memory (Conv1DLSTM) classifier over the other implemented methods.
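Read as an architecture description, the abstract corresponds roughly to the Keras sketch below; the layer sizes and sequence length are assumptions, and the paper initializes the embedding weights from CBOW word2vec rather than training them from scratch.

```python
from tensorflow.keras import layers, models

# Illustrative Conv1D + LSTM text classifier; hyperparameters are assumed.
model = models.Sequential([
    layers.Input(shape=(100,)),            # padded token ids of a tweet
    layers.Embedding(20000, 128),          # in the paper, CBOW word2vec weights
    layers.Conv1D(64, 5, activation="relu"),
    layers.MaxPooling1D(2),                # position-invariant pooling of offensive cues
    layers.LSTM(64),
    layers.Dense(2, activation="softmax"), # bullying vs. non-bullying
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```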

36 pages, 3838 KiB  
Article
GERPM: A Geographically Weighted Stacking Ensemble Learning-Based Urban Residential Rents Prediction Model
by Guang Hu and Yue Tang
Mathematics 2023, 11(14), 3160; https://doi.org/10.3390/math11143160 - 18 Jul 2023
Cited by 1 | Viewed by 1534
Abstract
Accurate prediction of urban residential rents is of great importance for landlords, tenants, and investors. However, existing rents prediction models face challenges in meeting practical demands due to their limited perspectives and inadequate prediction performance. The existing individual prediction models often lack satisfactory accuracy, while ensemble learning models that combine multiple individual models to improve prediction results often overlook the impact of spatial heterogeneity on residential rents. To address these issues, this paper proposes a novel prediction model called GERPM, which stands for Geographically Weighted Stacking Ensemble Learning-Based Urban Residential Rents Prediction Model. GERPM comprehensively analyzes the influencing factors of residential rents from multiple perspectives and leverages a geographically weighted stacking ensemble learning approach. The model combines multiple machine learning and deep learning models, optimizes parameters to achieve optimal predictions, and incorporates the geographically weighted regression (GWR) model to consider spatial heterogeneity. By combining the strengths of deep learning and machine learning models and taking into account geographical factors, GERPM aims to improve prediction accuracy and provide robust predictions for urban residential rents. The model is evaluated using housing data from Nanjing, a major city in China, and compared with representative individual prediction models, the equal weight combination model, and the ensemble learning model. The experimental results demonstrate that GERPM outperforms other models in terms of prediction performance. Furthermore, the model’s effectiveness and robustness are validated by applying it to other major cities in China, such as Shanghai and Hangzhou. Overall, GERPM shows promising potential in accurately predicting urban residential rents and contributing to the advancement of the rental market.
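Setting the geographic weighting aside, the stacking component of GERPM resembles the scikit-learn sketch below; the member models and meta-learner shown are placeholders, and the paper's GWR stage is not reproduced.

```python
from sklearn.ensemble import (GradientBoostingRegressor, RandomForestRegressor,
                              StackingRegressor)
from sklearn.linear_model import LinearRegression

# Plain (non-geographic) stacking ensemble for rent prediction.
stack = StackingRegressor(
    estimators=[
        ("rf", RandomForestRegressor(n_estimators=200, random_state=0)),
        ("gb", GradientBoostingRegressor(random_state=0)),
    ],
    final_estimator=LinearRegression(),  # GERPM additionally weights outputs by GWR
)
# Usage: stack.fit(X_train, y_train); y_pred = stack.predict(X_test)
```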

26 pages, 4383 KiB  
Article
Early Identification of Risk Factors in Non-Alcoholic Fatty Liver Disease (NAFLD) Using Machine Learning
by Luis Rolando Guarneros-Nolasco, Giner Alor-Hernández, Guillermo Prieto-Avalos and José Luis Sánchez-Cervantes
Mathematics 2023, 11(13), 3026; https://doi.org/10.3390/math11133026 - 7 Jul 2023
Cited by 1 | Viewed by 2980
Abstract
Liver diseases are a widespread and severe health concern, affecting millions worldwide. Non-alcoholic fatty liver disease (NAFLD) alone affects one-third of the global population, with some Latin American countries seeing rates exceeding 50%. This alarming trend has prompted researchers to explore new methods for identifying those at risk. One promising approach is using Machine Learning Algorithms (MLAs), which can help predict critical factors contributing to liver disease development. Our study examined nine different MLAs across four datasets to determine their effectiveness in predicting this condition. We analyzed each algorithm’s performance using five important metrics: accuracy, precision, recall, f1-score, and roc_auc. Our results showed that these algorithms were highly effective when used individually and as part of an ensemble modeling technique such as bagging or boosting. We identified alanine aminotransferase (ALT), aspartate aminotransferase (AST), alkaline phosphatase (ALP), and albumin as the top four attributes most strongly associated with non-alcoholic fatty liver disease risk across all datasets. Gamma-glutamyl transpeptidase (GGT), hemoglobin, age, and prothrombin time also played significant roles. In conclusion, this research provides valuable insights into how we can better detect and prevent non-alcoholic fatty liver diseases by leveraging advanced machine learning techniques. As such, it represents an exciting opportunity for healthcare professionals seeking more accurate diagnostic tools while improving patient outcomes globally.
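The five metrics listed in the abstract map directly onto scikit-learn's cross-validation scoring interface, as the sketch below illustrates on synthetic data (the model and data here are placeholders, not the study's).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate

X, y = make_classification(n_samples=500, random_state=0)
scoring = ["accuracy", "precision", "recall", "f1", "roc_auc"]
results = cross_validate(RandomForestClassifier(random_state=0), X, y,
                         cv=5, scoring=scoring)
for metric in scoring:
    print(metric, results[f"test_{metric}"].mean().round(3))
```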

19 pages, 346 KiB  
Article
Some Modified Ridge Estimators for Handling the Multicollinearity Problem
by Nusrat Shaheen, Ismail Shah, Amani Almohaimeed, Sajid Ali and Hana N. Alqifari
Mathematics 2023, 11(11), 2522; https://doi.org/10.3390/math11112522 - 31 May 2023
Cited by 5 | Viewed by 1689
Abstract
Regression analysis is a statistical process that utilizes two or more predictor variables to predict a response variable. When the predictors included in the regression model are strongly correlated with each other, the problem of multicollinearity arises in the model. Due to this problem, the model variance increases significantly, leading to inconsistent ordinary least-squares estimators that may lead to invalid inferences. There are numerous existing strategies used to solve the multicollinearity issue, and one of the most used methods is ridge regression. The aim of this work is to develop novel estimators for the ridge parameter “γ” and compare them with existing estimators via extensive Monte Carlo simulation and real data sets based on the mean squared error criterion. The study findings indicate that the proposed estimators outperform the existing estimators.
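For reference, the ordinary ridge estimator that such modified estimators build on has the standard closed form below; the paper's novel choices of γ are not reproduced here.

```latex
% Standard ridge estimator; \gamma = 0 recovers ordinary least squares.
% Increasing \gamma shrinks the coefficients, trading bias for variance.
\hat{\beta}(\gamma) = \left( X^{\top} X + \gamma I_p \right)^{-1} X^{\top} y,
\qquad \gamma > 0
```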

28 pages, 6177 KiB  
Article
Optimized-Weighted-Speedy Q-Learning Algorithm for Multi-UGV in Static Environment Path Planning under Anti-Collision Cooperation Mechanism
by Yuanying Cao and Xi Fang
Mathematics 2023, 11(11), 2476; https://doi.org/10.3390/math11112476 - 27 May 2023
Cited by 6 | Viewed by 1976
Abstract
With the accelerated development of smart cities, the concept of a “smart industrial park”, in which unmanned ground vehicles (UGVs) have wide application, has entered the industrial field of vision. When faced with multiple and heterogeneous tasks, the task execution efficiency of a single UGV is low, so task planning research under multi-UGV cooperation has become more urgent. In this paper, under the anti-collision cooperation mechanism for multi-UGV path planning, an improved algorithm with optimized-weighted-speedy Q-learning (OWS Q-learning) is proposed. The slow convergence speed of the Q-learning algorithm is overcome to a certain extent by changing the update mode of the Q function. By improving the selection mode of the learning rate and the selection strategy of actions, the relationship between exploration and exploitation is balanced, and the learning efficiency of multi-agent systems in complex environments is improved. The simulation experiments in a static environment show that the designed anti-collision coordination mechanism effectively solves the coordination problem of multiple UGVs in the same scenario. In the same experimental scenario, compared with the Q-learning algorithm and other reinforcement learning algorithms, only the OWS Q-learning algorithm achieves the convergence effect, and the OWS Q-learning algorithm yields the shortest collision-free path for UGVs and the least time to complete the planning. Compared with the Q-learning algorithm, the calculation time of the OWS Q-learning algorithm in the three experimental scenarios is improved by 53.93%, 67.21%, and 53.53%, respectively. This effectively advances the intelligent development of UGVs in smart parks.
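As context, the standard tabular Q-learning update that OWS Q-learning modifies is sketched below; the paper's changes to the update rule, learning-rate schedule, and action selection are not reproduced.

```python
import numpy as np

def q_learning_step(Q, s, a, r, s_next, alpha=0.1, gamma=0.95):
    """Standard tabular Q-learning update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
    return Q

Q = np.zeros((25, 4))                 # toy 5x5 grid world, 4 actions
Q = q_learning_step(Q, s=0, a=1, r=-1.0, s_next=5)
```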

14 pages, 1963 KiB  
Article
A Machine Learning Approach for Improving Wafer Acceptance Testing Based on an Analysis of Station and Equipment Combinations
by Chien-Chih Wang and Yi-Ying Yang
Mathematics 2023, 11(7), 1569; https://doi.org/10.3390/math11071569 - 23 Mar 2023
Viewed by 2711
Abstract
Semiconductor manufacturing is a complex and lengthy process. Even with their expertise and experience, engineers often cannot quickly identify anomalies in an extensive database. Most research into equipment combinations has focused on the manufacturing process’s efficiency, quality, and cost issues. There has been little consideration of the relationship between semiconductor station and equipment combinations and throughput. In this study, a machine learning approach that allows for the integration of control charts, clustering, and association rules was developed. This approach was used to identify equipment combinations that may harm production processes by analyzing the effect on Vt parameters of the equipment combinations used in wafer acceptance testing (WAT). The results showed that when the support is between 70% and 80% and the confidence level is 85%, it is possible to quickly select the specific combinations of 13 production stations that significantly impact the Vt values of all 39 production stations. Stations 046000 (EH308), 049200 (DW005), 049050 (DI303), and 060000 (DC393) were found to have the most abnormal equipment combinations. The results of this research will aid the detection of equipment errors during semiconductor manufacturing and assist the optimization of production scheduling.
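The support and confidence thresholds quoted here correspond to standard association-rule mining; the sketch below uses mlxtend on a toy lot-by-station matrix (station names are reused from the abstract purely as column labels, and the data are invented for illustration).

```python
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# Toy one-hot matrix: rows are wafer lots, columns are station/equipment combos.
lots = pd.DataFrame({
    "046000_EH308": [1, 1, 1, 1, 1],
    "049200_DW005": [1, 1, 1, 1, 0],
    "049050_DI303": [1, 1, 1, 0, 1],
}, dtype=bool)

frequent = apriori(lots, min_support=0.7, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.85)
print(rules[["antecedents", "consequents", "support", "confidence"]])
```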

28 pages, 911 KiB  
Article
Ensemble Methods in Customer Churn Prediction: A Comparative Analysis of the State-of-the-Art
by Matthias Bogaert and Lex Delaere
Mathematics 2023, 11(5), 1137; https://doi.org/10.3390/math11051137 - 24 Feb 2023
Cited by 14 | Viewed by 7350
Abstract
In the past, several single classifiers, homogeneous ensembles, and heterogeneous ensembles have been proposed to detect the customers who are most likely to churn. Despite the popularity and accuracy of heterogeneous ensembles in various domains, they have not yet been picked up in customer churn prediction. Moreover, there are other developments at the performance evaluation and model comparison level that have not been introduced in a systematic way. Therefore, the aim of this study is to perform a large-scale benchmark study in customer churn prediction implementing these novel methods. To do so, we benchmark 33 classifiers, including 6 single classifiers, 14 homogeneous ensembles, and 13 heterogeneous ensembles, across 11 datasets. Our findings indicate that heterogeneous ensembles are consistently ranked higher than homogeneous ensembles and single classifiers. A heterogeneous ensemble with simulated annealing classifier selection is ranked the highest in terms of AUC and expected maximum profits. For accuracy, F1 measure, and top-decile lift, a heterogeneous ensemble optimized by non-negative binomial likelihood and a stacked heterogeneous ensemble are, respectively, the top-ranked classifiers. Our study contributes to the literature by being the first to include such an extensive set of classifiers, performance metrics, and statistical tests in a benchmark study of customer churn.

14 pages, 9632 KiB  
Article
Transformer-Based Seq2Seq Model for Chord Progression Generation
by Shuyu Li and Yunsick Sung
Mathematics 2023, 11(5), 1111; https://doi.org/10.3390/math11051111 - 23 Feb 2023
Cited by 5 | Viewed by 4497
Abstract
Machine learning is widely used in various practical applications, with deep learning models demonstrating advantages in handling huge data. Treating music as a special language and using deep learning models to accomplish melody recognition, music generation, and music analysis has proven feasible. In some music-related deep learning research, recurrent neural networks have been replaced with Transformers, with significant results; in traditional approaches with recurrent neural networks, input sequences are limited in length. This paper proposes a method to generate chord progressions for melodies using a transformer-based sequence-to-sequence model, which is divided into a pre-trained encoder and a decoder. The pre-trained encoder extracts contextual information from melodies, while the decoder uses this information to produce chords asynchronously and finally outputs chord progressions. The proposed method addresses length limitation issues while considering the harmony between chord progressions and melodies. Chord progressions can be generated for melodies in practical music composition applications. Evaluation experiments are conducted using the proposed method and three baseline models: bidirectional long short-term memory (BLSTM), bidirectional encoder representations from transformers (BERT), and the generative pre-trained transformer (GPT-2). The proposed method outperformed the baseline models in Hits@k (k = 1) by 25.89%, 1.54%, and 2.13%, respectively.

17 pages, 2434 KiB  
Article
Unsupervised Representation Learning with Task-Agnostic Feature Masking for Robust End-to-End Speech Recognition
by June-Woo Kim, Hoon Chung and Ho-Young Jung
Mathematics 2023, 11(3), 622; https://doi.org/10.3390/math11030622 - 26 Jan 2023
Cited by 1 | Viewed by 2192
Abstract
Unsupervised learning-based approaches for training speech vector representations (SVR) have recently been widely applied. While pretrained SVR models excel in relatively clean automatic speech recognition (ASR) tasks, such as those recorded in laboratory environments, they are still insufficient for practical applications with various types of noise, intonation, and dialects. To cope with this problem, we present a novel unsupervised SVR learning method for practical end-to-end ASR models. Our approach involves designing a speech feature masking method to stabilize SVR model learning and improve the performance of the ASR model in a downstream task. By introducing a noise masking strategy into diverse combinations of the time and frequency regions of the spectrogram, the SVR model becomes a robust representation extractor for the ASR model in practical scenarios. In pretraining experiments, we train the SVR model using approximately 18,000 h of Korean speech datasets that included diverse speakers and were recorded in environments with various amounts of noise. The weights of the pretrained SVR extractor are then frozen, and the extracted speech representations are used for ASR model training in a downstream task. The experimental results show that the ASR model using our proposed SVR extractor significantly outperforms conventional methods.
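The time-and-frequency masking strategy described here is in the same family as SpecAugment; the sketch below is an illustrative NumPy version, not the authors' exact scheme, and the mask counts and widths are assumptions.

```python
import numpy as np

def mask_spectrogram(spec, n_time=2, n_freq=2, max_width=8, seed=0):
    """Zero out random time and frequency bands of a (time, freq) spectrogram,
    forcing the representation model to rely on the surrounding context."""
    rng = np.random.default_rng(seed)
    spec = spec.copy()
    n_t, n_f = spec.shape
    for _ in range(n_time):
        t0 = rng.integers(0, n_t - max_width)
        spec[t0 : t0 + rng.integers(1, max_width + 1), :] = 0.0
    for _ in range(n_freq):
        f0 = rng.integers(0, n_f - max_width)
        spec[:, f0 : f0 + rng.integers(1, max_width + 1)] = 0.0
    return spec

masked = mask_spectrogram(np.random.rand(400, 80))  # e.g., 400 frames x 80 mel bins
```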

20 pages, 3814 KiB  
Article
Implementing Magnetic Resonance Imaging Brain Disorder Classification via AlexNet–Quantum Learning
by Naif Alsharabi, Tayyaba Shahwar, Ateeq Ur Rehman and Yasser Alharbi
Mathematics 2023, 11(2), 376; https://doi.org/10.3390/math11020376 - 10 Jan 2023
Cited by 14 | Viewed by 3841
Abstract
Classical neural networks have provided remarkable results for diagnosing neurological disorders from neuroimaging data. However, in terms of efficient and accurate classification, some standpoints need to be improved by utilizing high-speed computing tools. By integrating quantum computing phenomena with deep neural network approaches, this study proposes an AlexNet–quantum transfer learning method to diagnose neurodegenerative diseases using a magnetic resonance imaging (MRI) dataset. The hybrid model is constructed by extracting an informative feature vector from high-dimensional data using a classical pre-trained AlexNet model and feeding this vector to a quantum variational circuit (QVC). The quantum circuit leverages quantum computing phenomena, quantum bits, and different quantum gates, such as the Hadamard and CNOT gates, for transformation. The classical pre-trained model extracts 4096 features from the MRI dataset using the AlexNet architecture and gives this vector as input to the quantum circuit. The QVC generates a 4-dimensional vector; to transform this vector into a 2-dimensional vector, a fully connected layer is attached at the end to perform the binary classification task for a brain disorder. Furthermore, the classical–quantum model employs a quantum depth of six layers on PennyLane quantum simulators, presenting a classification accuracy of 97% for Parkinson’s disease (PD) and 96% for Alzheimer’s disease (AD) over 25 epochs. In addition, pre-trained classical neural models are implemented for the classification of the disorders, and the performance of the classical transfer learning model and the hybrid classical–quantum transfer learning model is compared. This comparison shows that the AlexNet–quantum learning model achieves beneficial results for classifying PD and AD. This work thus leverages high-speed computational power, using deep network learning and quantum circuit learning, to offer insight into the practical application of quantum computers that speed up the performance of models on real-world data in the healthcare domain.
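A quantum variational circuit of the kind described (Hadamard embedding, CNOT entanglement, depth six, 4-dimensional Z readout) can be sketched in PennyLane roughly as follows; the gate placement and rotation choices here are assumptions, not the paper's exact circuit.

```python
import pennylane as qml
from pennylane import numpy as np

n_qubits, depth = 4, 6
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def qvc(inputs, weights):
    """Toy variational circuit: Hadamard + angle embedding, then `depth`
    layers of CNOT entanglement and trainable rotations, Pauli-Z readout."""
    for w in range(n_qubits):
        qml.Hadamard(wires=w)
        qml.RY(inputs[w], wires=w)
    for layer in range(depth):
        for w in range(n_qubits - 1):
            qml.CNOT(wires=[w, w + 1])
        for w in range(n_qubits):
            qml.RY(weights[layer, w], wires=w)
    return [qml.expval(qml.PauliZ(w)) for w in range(n_qubits)]  # 4-dim output

inputs = np.random.uniform(0, np.pi, n_qubits)    # stand-in for compressed features
weights = np.random.uniform(0, np.pi, (depth, n_qubits))
print(qvc(inputs, weights))
```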

29 pages, 8936 KiB  
Article
XAI for Churn Prediction in B2B Models: A Use Case in an Enterprise Software Company
by Gabriel Marín Díaz, José Javier Galán and Ramón Alberto Carrasco
Mathematics 2022, 10(20), 3896; https://doi.org/10.3390/math10203896 - 20 Oct 2022
Cited by 15 | Viewed by 4130
Abstract
The literature related to Artificial Intelligence (AI) models and customer churn prediction is extensive and rich in Business to Customer (B2C) environments; however, research in Business to Business (B2B) environments is not sufficiently addressed. Customer churn in the business environment, and more so in a B2B context, is critical, as the impact on turnover is generally greater than in B2C environments. On the other hand, the data used in the context of this paper point to the importance of the relationship between customer and brand through the Contact Center. Therefore, the recency, frequency, importance, and duration (RFID) model used to obtain the customer’s assessment from the point of view of their interactions with the Contact Center is a novelty and an additional source of information to traditional models based on purchase transactions (recency, frequency, and monetary value: RFM). The objective of this work is the design of a methodological process that contributes to analyzing the explainability of AI algorithm predictions, Explainable Artificial Intelligence (XAI), for which we analyze the binary target variable of abandonment in a B2B environment, considering the relationships that the partner (customer) has with the Contact Center, and focusing on a business software distribution company. The model can be generalized to any environment in which classification or regression algorithms are required.

Review


25 pages, 6391 KiB  
Review
Review of Artificial Intelligence and Machine Learning Technologies: Classification, Restrictions, Opportunities and Challenges
by Ravil I. Mukhamediev, Yelena Popova, Yan Kuchin, Elena Zaitseva, Almas Kalimoldayev, Adilkhan Symagulov, Vitaly Levashenko, Farida Abdoldina, Viktors Gopejenko, Kirill Yakunin, Elena Muhamedijeva and Marina Yelis
Mathematics 2022, 10(15), 2552; https://doi.org/10.3390/math10152552 - 22 Jul 2022
Cited by 101 | Viewed by 34570
Abstract
Artificial intelligence (AI) is an evolving set of technologies used for solving a wide range of applied issues. The core of AI is machine learning (ML): a complex of algorithms and methods that address the problems of classification, clustering, and forecasting. The practical application of AI&ML holds promising prospects, and research in this area is therefore intensive. However, industrial applications of AI and its more intensive use in society are not yet widespread. The challenges of widespread AI application need to be considered from both the AI (internal problems) and the societal (external problems) perspectives. This consideration will identify the priority steps for more intensive practical application of AI technologies and their introduction and involvement in industry and society. The article identifies and discusses the challenges of employing AI technologies in the economy and society of resource-based countries. The systematization of AI&ML technologies is implemented based on publications in these areas. This systematization allows for the specification of the organizational, personnel, social, and technological limitations. The paper outlines the directions of studies in AI and ML that will allow us to overcome some of these limitations and expand the scope of AI&ML applications.
