1. Introduction
Artificial intelligence (AI) is transforming our world by advancing systems’ capabilities to perform tasks traditionally demanding human intelligence, such as analyzing data, understanding patterns, and predicting complex behaviors. Machine learning (ML), a key discipline within AI, empowers systems to derive insights from data and continually enhance their performance. Rooted in mathematics, ML utilizes statistics, algebra, and optimization to develop robust algorithms that identify patterns, make predictions, and learn from extensive data. It contributes to solving a wide array of problems and driving innovation across several industries [
1].
Supervised learning (SL) in ML is used when labeled data are available, enabling applications such as medical diagnostics, image recognition, and fraud detection. Within the realm of supervised learning, classification aims at assigning labels or categories to input data instances based on their features. The process involves training a model on a labeled dataset, where the correct output is known, allowing the model to learn patterns and relationships within the data. Once trained, the model can predict labels for new, unseen data instances. Classification is broadly used in several applications, such as image recognition [
2], medical diagnosis [
3], crop identification [
4], fault detection [
5], and quality control [
6]. Common algorithms for classification tasks include the decision tree, support vector machine (SVM), neural network (NN), logistic regression (LR), k-nearest neighbors (KNN), reduced error pruning tree (REPTree), and ensemble methods like random forest (RF) and extreme gradient boosting (XGBoost), each offering unique strengths. Additionally, the convolutional neural network (CNN), recurrent neural network (RNN), long short-term memory network (LSTM), and deep belief network (DBN) are used, along with random tree (RT) and k-star. Probabilistic graphical models such as the Bayesian network (BN) and Gaussian mixture model (GMM) are applied for tasks requiring the capture of relationships between variables and handling mixed data types, as well as naïve Bayes (NB). For instance-based learning, locally weighted learning (LWL) is utilized, and the Hoeffding tree (HT) is employed for various scenarios.
The logistic model tree (LMT) [
7] is another significant classification algorithm in machine learning, which employs logistic regression models at the leaves of a decision tree. This integration allows LMT to benefit from the transparency of decision trees and the probabilistic modeling proficiency of logistic regression simultaneously. The hybrid structure of LMT contributes to achieving both linear and nonlinear relationships between inputs and the target feature. Additionally, LMT has the capability to generate probabilistic outputs rather than simple class labels, which is suitable for problems that require such predictions and threshold adjustments. LMT inherently deals with both continuous and categorical data, without the need for encoding during preprocessing. Furthermore, LMT typically applies regularization techniques, such as pruning to reduce overfitting and increase generalization. These advantages make the LMT algorithm noteworthy in practical applications across multiple domains, e.g., hydrology [
8], seismology [
9], geography [
10], healthcare [
11], forestry [
12], biometrics [
13], energy [
14], cybersecurity [
15], agriculture [
16], and more.
Classification tasks in machine learning can also be broadly categorized based on the number of targets and the range of possible values for each target. Binary classification deals with assigning instances to one of two possible categories for a single target variable. Multi-class classification extends this by allowing the single target to belong to one of several distinct categories. Multi-label classification is characterized by the assignment of multiple binary labels to each instance, reflecting that an instance can belong to several categories simultaneously. Multi-class multi-label classification (also known as multi-target classification or multi-output classification) is a task in which the dataset has multiple target attributes, where each target attribute in the dataset can have two or more different values. The versatility of multi-label classification proves valuable in actual circumstances where objects belong to multiple categories or exhibit diverse attributes concurrently [
17].
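The four task types above differ only in how many target attributes there are and how many values each can take. The following minimal sketch makes that distinction concrete; the function name `task_type` and the dictionary layout are illustrative assumptions, not part of the proposed method.

```python
def task_type(target_domains):
    """Name the classification task implied by a set of target attributes.

    target_domains: dict mapping each target attribute name to the list of
    class values it may take (hypothetical layout, for illustration only).
    """
    n_targets = len(target_domains)
    max_values = max(len(values) for values in target_domains.values())
    if n_targets == 1:
        return "binary" if max_values == 2 else "multi-class"
    # Several targets: all binary -> multi-label, otherwise the general case.
    return "multi-label" if max_values == 2 else "multi-class multi-label"


print(task_type({"spam": ["no", "yes"]}))
print(task_type({"scene": ["beach", "urban", "sunset"]}))
print(task_type({"beach": [0, 1], "urban": [0, 1]}))
print(task_type({"drug_a": ["never", "last year", "last week"], "drug_b": ["never", "last year"]}))
```

Under this view, multi-label classification is the special case of the multi-class multi-label setting in which every target attribute is binary.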
The multi-label paradigm is widely applied in domains where instances possess multiple inherent characteristics, such as physical activity recognition [
18], image identification [
19], text classification [
20], mathematics [
21], educational application [
22], recommendation systems [
23], sentiment analysis [
24], radiology [
25], and more. For example, genomics aids in classifying genes into multiple functional categories based on their sequences, reflecting their involvement in diverse biological processes. Medical imaging assists in diagnosing diseases by analyzing images that might contain multiple abnormalities or conditions, each requiring a separate label concurrently. In text classification, multiple tags or categories are assigned to documents, optimizing the precision of content classification across numerous topics or themes. In recommendation systems, multi-label classification plays a pivotal role in suggesting multiple items or objects adapted to the multi-dimensional preferences and interests of users. In e-commerce platforms, it recommends a variety of products based on customers’ browsing history and past purchases, involving multiple product categories simultaneously. These applications highlight the efficacy of multi-label classification in different environments, where the richness of the data cannot be captured by a single label.
Combining multi-class multi-label classification with logistic model trees enriches the classification process by leveraging the strengths of both approaches. Multi-class multi-label classification involves assigning multiple classes–labels to each instance. Logistic model trees enhance the predictive performance and robustness of each classification problem by integrating the decision tree’s ability to model complex interactions with logistic regression’s capacity to manage various data types. Each logistic model tree provides comprehensible models for individual labels, facilitating easier understanding and trust in predictions. LMTs usually achieve high predictive accuracy, with their pruning mechanism contributing significantly to the optimization of classification performance. This combination offers flexibility in model design, allowing customization of the logistic model trees to match the specific characteristics of the data distribution for each label. This suitability for different scenarios provides an efficient solution for multi-label classification tasks.
This study introduces a new method, multi-class multi-label logistic model tree (MMLMT), designed to address complex classification problems. MMLMT offers a promising avenue for advancing the field of multi-label classification. This study is distinguished from the previous studies through the following key contributions:
- (i) Novel method: It proposes a novel approach (MMLMT) that integrates a logistic model tree with a multi-class multi-label classification technique, marking its first introduction in the literature.
- (ii) Comprehensive integration: MMLMT incorporates both multi-class and multi-label classifications, offering a more comprehensive approach to handling multiple target attributes and predicting one class among the multiple potential classes for each target attribute. This integration offers a more detailed categorization, enhancing the model’s ability to address complex multi-label problems effectively. The proposed method enhances the model’s capability to manage a wide range of label combinations within multi-class multi-label datasets.
- (iii) Performance improvement over counterparts: Achieving an average accuracy of 85.90% across a range of well-known datasets, our method demonstrated a 3.91% improvement compared to the counterpart methods, based on mathematical evaluations.
- (iv) Higher accuracy results than the state-of-the-art studies: Experiments on the same datasets showed that MMLMT achieved a 9.87% improvement in average accuracy compared to the results of state-of-the-art studies in the literature.
- (v) Methodological advantages: It implements the LMT classifier to provide model interpretability, understandability, and explainability while maintaining high predictive accuracy. Since it constructs a decision tree-based model like a flowchart, it can be considered a highly explainable artificial intelligence (XAI) method.
- (vi) Practical implications: Demonstrating MMLMT’s effectiveness across a diverse set of eight datasets spanning multiple domains underscores its applicability and reliability in undertaking multi-class multi-label classification challenges in various fields. This highlights its potential for advanced multi-label learning applications.
This paper is structured as follows: Section 2 covers a review of related works, followed by Section 3 detailing the methods and materials. Section 4 presents the experimental studies that were conducted on eight publicly available datasets, while Section 5 discusses the comparative results. Section 6 outlines the conclusions drawn from the findings and provides future research directions.
2. Related Works
Although many successful algorithms have been proposed for single-label learning problems in the literature, different algorithms should be implemented to provide effective solutions for multi-label problems, so research on multi-label learning (MLL) problems is needed [
26]. MLL has been applied in different areas such as disease prediction [
27,
28], drug repurposing [
29], biomedical applications [
30], image classification [
31,
32,
33,
34], natural language processing [
35,
36,
37,
38], education [
39], industry [
40], and transportation [
41].
In [
27], researchers applied multi-label active learning algorithms on a heart disease dataset. The adaptive synthetic data-based multi-label classification (ASDMLC) approach was proposed in [
28] and it was tested with three different disease datasets. Although the computation time was longer, higher performance was achieved than other models. In another study [
29], a multi-label learning framework was proposed for drug repurposing. Their results showed that it could generalize well in a huge drug space without the information of drug target protein and chemical structures. In [
30], researchers used multi-label extreme learning machines for taxonomic categorization of DNA sequences. Their study drew attention to the deep learning methods to increase classification performance in biological taxonomy.
In the field of image classification, a number of multi-label learning methods have been successfully developed. For example, the classification task was performed on noisy multi-label food images [
31]. The data were evaluated with three different deep neural networks: ResNet-50, attentive feature mixup (AFM), and the proposed attentive feature cam-driven mixup (AFCM). With the AFCM method, more successful results were obtained compared to other methods in noisy multi-label image data. In [
32], the researchers focused on the problem of classifying multi-label chest X-ray images. In their study, MLL was performed with the EfficientNet transfer learning technique. In another study [
33], deep learning frameworks were used to classify movie genres. Successful results were achieved with VGG16, ResNet, DenseNet, Inception, MobileNet, and ConvNeXt models that were trained on a dataset consisting of multi-label movie posters. In [
34], researchers proposed an ensemble model based on gradient-weighted class activation mapping (Grad-CAM) and they applied it to the retinal fundus multi-disease image dataset (RFMiD). It was stated in the study that the Grad-CAM method was effective in lesion detection.
One of the MLL studies conducted in the field of natural language processing is by [
35], which was carried out to evaluate comments with more than one tag. In [
36], MLL techniques were used to analyze people’s emotional states based on their posts on social media. A deep learning-based approach was used as a solution to the multiple emotion classification problem. In [
37], an effective MLL approach was proposed for multi-label Arabic text classification with ensemble learning and the genetic algorithm-based metaheuristic feature selection method.
MLL has also been used in the field of education to predict students’ learning styles. For example, in [
39], the authors used a multi-label classification method by considering that a student may have more than one dominant learning style. In [
40], the problem of classifying multi-label unbalanced big data in the industrial field has been addressed and evaluated with the extreme gradient boosting classifier and the histogram gradient boosting classifier. It was stated that their study provided practical insights into solving problems that may be encountered in real industrial data. Deep learning studies have been conducted on multi-label datasets to support transportation in smart cities. For instance, in [
41], which was carried out with nine different datasets, it was concluded that YOLO versions (YOLOv8, v7, v6, and v5) were more successful than other deep learning methods, with an overall accuracy rate of 90%.
Multi-class multi-label classification tasks have garnered significant attention in recent research due to their ability to handle complex datasets that have multiple target attributes, each of which can have two or more different values. Traditional classification methods often fall short in these scenarios, necessitating the development of specialized approaches in various domains such as natural disaster management [
42], ophthalmology [
43], medical diagnosis [
44], and pedestrian attribute recognition [
45]. In [
42], a two-stage BERT-based model was proposed for multi-class multi-label classification of typhoon damage in social media texts. The first stage identifies damage-related texts using sentence vectors, and the second stage classifies these texts into damage categories with word matrices. In [
43], CNN-based models were developed for multi-class multi-label classification of ophthalmological diseases using fundus images, and the experiments showed that the VGG16 architecture with a stochastic gradient descent optimizer yielded the best performance.
In [
44], the Stacked Dark COVID-Net was introduced, which preprocesses images using contrast-limited adaptive histogram equalization (CLAHE) and classifies them to assist in COVID-19 detection. The model addresses challenges like overfitting and computational overhead, achieving high accuracy in diagnosing COVID-19 from chest X-rays. In [
45], a CNN-based approach was presented for recognizing pedestrian attributes from images. The authors experimented with various convolutional layer depths, demonstrating that deeper models achieved higher performance on the dataset with a range of features.
In studies comparing the performances of machine learning algorithms in the literature, it has been indicated that the LMT algorithm for classification tasks was more successful than other machine learning techniques [
46,
47,
48,
49]. To create a landslide susceptibility map, it was suggested to use the LMT algorithm in [
46], where the performances of five different machine learning algorithms, namely SVM, ANN, LR, NBT, and LMT, were compared.
Similarly, the authors emphasized that the LMT approach in landslide susceptibility mapping is more promising than other machine learning approaches, according to the results obtained from the AUC metric [
47]. In [
48], where four different machine learning algorithms were compared to create flash flood susceptibility maps, it was stated that the LMT method displayed the highest performance. In [
49], where the LMT algorithm was evaluated to determine underground column stability, more successful results were obtained compared to the other machine learning algorithms in the literature. Based on the success of the LMT method in these previous studies, we preferred to use this method in our study.
In summary,
Figure 1 illustrates the timeline of the investigated related works from 2019 to 2024. This figure includes the references, highlighting the authors’ names and the domain of study for each, providing a comprehensive overview of the developments and contributions in this field over the specified period. Despite advancements in classification techniques, developing methods for multi-class multi-label datasets is still limited in the literature. There is a need for novel approaches that provide satisfactory results in different domains. The combination of multi-class multi-label classification with LMT remains unexplored in the existing literature. Implementing LMT for multi-class multi-label classification tasks could offer significant improvements in accuracy and insights for complex datasets.
3. Materials and Methods
3.1. Proposed Method
This paper proposes a new method for multi-class multi-label datasets, named the multi-class multi-label logistic model tree (MMLMT). Our approach leverages the strengths of multi-label learning, which permits the simultaneous prediction of multiple classes–labels, enhancing the ability of the model to capture complex relationships within the data. The core of MMLMT is the logistic model tree (LMT) algorithm, which combines logistic regression and decision tree algorithms. LMT offers the advantage of producing understandable models for explainable artificial intelligence with high predictive accuracy. By combining the strengths of decision trees and logistic regression, our method provides a flexible and powerful framework for handling multi-class multi-label data, resulting in an efficient model for classification tasks. MMLMT, supported by strong mathematical foundations, is crafted to make precise predictions. Extensive experiments demonstrated the efficiency of MMLMT across a range of well-known datasets, showcasing its potential as a valuable approach in multi-label learning.
The general structure of the proposed MMLMT method is illustrated in Figure 2 and described as follows. The multi-class multi-label dataset, consisting of $n$ samples with $m$ features and $q$ target attributes, undergoes a data preparation phase such as data cleaning. After that, it is converted into $q$ individual datasets, each of which includes the $m$ features and one target attribute. Each of these datasets is then used to train an LMT model. Specifically, a logistic model tree is constructed for each dataset, resulting in $q$ distinct trees corresponding to the $q$ target attributes. These trees are then aggregated to form the final predictive model. The model aggregation step combines the outputs of the individual trees to generate the final prediction. MMLMT tends to make accurate multi-label classifications, benefiting from the strengths of both logistic regression and decision trees. The next step evaluates the aggregated model’s performance, confirming the capability of the MMLMT in dealing with complex multi-class multi-label datasets and making accurate predictions. In this step, diverse mathematical metrics such as accuracy, sensitivity, and precision–recall curve area can be used to comprehensively evaluate performance. The final step involves the usage of the model to make predictions based on the query input.
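As an illustration of the evaluation step, the sketch below scores each target attribute separately and then averages the results. Averaging per-target accuracies is an assumption made here for illustration; the function names and the toy labels are invented and do not come from the experiments.

```python
def per_target_accuracy(y_true, y_pred):
    """Return one accuracy value per target attribute.

    y_true, y_pred: lists of equal length; each element is a list holding
    one class value per target attribute.
    """
    n_samples = len(y_true)
    n_targets = len(y_true[0])
    return [
        sum(t[j] == p[j] for t, p in zip(y_true, y_pred)) / n_samples
        for j in range(n_targets)
    ]


def mean_accuracy(y_true, y_pred):
    """Average the per-target accuracies into a single score."""
    scores = per_target_accuracy(y_true, y_pred)
    return sum(scores) / len(scores)


# Invented toy labels: 4 instances, 2 target attributes.
y_true = [["a", "x"], ["b", "y"], ["a", "y"], ["b", "x"]]
y_pred = [["a", "x"], ["b", "x"], ["a", "y"], ["a", "x"]]
print(per_target_accuracy(y_true, y_pred))  # [0.75, 0.75]
print(mean_accuracy(y_true, y_pred))        # 0.75
```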
The logistic model tree (LMT) algorithm can be effectively employed in multi-label classification tasks by applying it individually to each set of class labels. The MMLMT approach allows LMT to efficiently address multi-class multi-label datasets and can potentially improve classification accuracy across multiple labels. The LMT algorithm integrates logistic regression with decision tree principles, forming a flexible tree structure that adapts well to various data types, including binary, nominal, and numeric data. LMT is capable of predicting class labels using both qualitative and quantitative predictors. Additionally, it allows for the extraction of rule sequences from the tree to generate predictions based on input values, making it a versatile approach to handling complex datasets. Moreover, LMT can be considered a highly explainable artificial intelligence (XAI) method, since it constructs a flowchart-like decision tree model that is easy to interpret. A limitation of the LMT method lies in its handling of missing values, since it utilizes a simple global imputation scheme to fill them in [7]. Even though our experimental studies did not show any drawback in this regard, a more advanced scheme for handling missing values might be used for further analysis in domains where they frequently appear.
3.2. Formal Expression
In this section, we detail the theoretical foundation of the proposed MMLMT method for classification tasks in machine learning. Traditional supervised learning algorithms are designed for single-label scenarios, where each training sample is associated with only one label that defines its characteristics. Conversely, multi-label learning algorithms deal with training samples linked to multiple labels and the aim is to accurately predict the set of labels for new samples. The final goal of multi-label learning algorithms is to develop a machine learning classifier that, for a given unlabeled instance $x$, predicts its subset of labels accurately. This prediction is indicated as $h(x) = Y$, where $Y$ consists of the labels associated with instance $x$. From the perspective of mathematics, a multi-label classification algorithm aims to learn the function $h$ that maps feature vectors to subsets of labels such that $h$ predicts the correct subset for unseen instances. The concept of the proposed MMLMT method is formally defined as follows.
Let $D$ denote the multi-class multi-label dataset, consisting of $n$ instances $(x_i, Y_i)$, where $i = 1, 2, \ldots, n$. Each instance $(x_i, Y_i)$ includes a feature vector $x_i$ with $m$ elements and is associated with a subset of labels $Y_i \subseteq L$. Here, $L$ symbolizes a total set of $k$ possible class labels organized into $q$ target attributes. This data representation is illustrated in Table 1. It shows a multi-class multi-label dataset in which each instance $x_i$ is linked to a subset of labels represented by $Y_i$. For example, the
instance is linked to the label set
, indicating that this instance has labels
and
. Here, the label set
comprises the concatenation of labels, including
,
, and
. These representations underscore the dataset’s multi-label characteristics, illustrating instances that can be assigned multiple labels concurrently. Moreover, the classes of target attributes involve
and
. Therefore,
is the total set of classes of labels for all
instances, where each triple subset of
, e.g.,
, is assigned to the first instance,
is assigned to the second instance, and so on. Here,
$L$ is the total set of labels. In this example, the number of target attributes is $q = 3$ and the total number of class labels is $k = 10$.
In the MMLMT method, the original task, which has $k$ class labels in set $L$ organized into $q$ target attributes, is decomposed into multiple classification tasks. Essentially, this method transforms the original multi-class multi-label training dataset $D$ into individual datasets $D_j$, for $j = 1, 2, \ldots, q$. Each dataset $D_j$ includes all instances from the original dataset but only the corresponding target attribute ($y_j$).
Table 2 presents three individual training datasets obtained after converting the multi-class multi-label dataset given in
Table 1. Each dataset $D_j$ includes a different target attribute $y_j$ from the original dataset. Each row in Table 2 pertains to an instance $x_i$ from the initial dataset, while each outcome column is a distinct class–label $y_j$. For each dataset $D_j$, the instances are labeled with the corresponding class values in $y_j$. For example, in
Table 1,
is linked to the label set
, indicating that this instance has labels
in
,
in
, and
in
. This is reflected in
Table 2 by the presence of
,
, and
, in
, and
output columns, respectively. This transformation allows traditional classification algorithms to be applied to each individual dataset, simplifying the multi-class multi-label classification task.
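The transformation described above can be sketched in a few lines. This is a minimal illustration under assumed data structures: the function name `decompose`, the list-of-pairs layout, and the toy values are all invented here and are not taken from Table 1 or Table 2.

```python
def decompose(dataset, n_targets):
    """Split a multi-target dataset into one single-target dataset per attribute.

    dataset: list of (features, labels) pairs, where `labels` holds one class
    value per target attribute.
    Returns a list of `n_targets` datasets; dataset j pairs every feature
    vector with the class value of target attribute j only.
    """
    per_target = [[] for _ in range(n_targets)]
    for features, labels in dataset:
        for j in range(n_targets):
            per_target[j].append((features, labels[j]))
    return per_target


# Invented toy data: 2 features, 3 target attributes.
toy = [
    ([0.1, 1.0], ["a", "x", "p"]),
    ([0.7, 0.2], ["b", "y", "p"]),
]
d1, d2, d3 = decompose(toy, n_targets=3)
print(d2)  # [([0.1, 1.0], 'x'), ([0.7, 0.2], 'y')]
```

Every derived dataset keeps all instances and all features, so any single-target classifier can be trained on it without modification.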
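After the data transformation process, $q$ independent classifiers $h_j$ for $j = 1, 2, \ldots, q$ are established using the datasets $D_j$. These models are then combined to form the general classifier $H$, as presented in Equation (1).
$$H(x) = \left(h_1(x), h_2(x), \ldots, h_q(x)\right) \tag{1}$$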
LMT employs the LogitBoost [50] algorithm for building additive logistic regression functions at each tree node by selecting the most relevant attributes in the data. LMT also uses the classification and regression tree (CART) algorithm for pruning, leading to improved classification performance. A key advantage of LMT is its combination of logistic regression and classification with a validation technique to determine the optimal number of LogitBoost iterations, thereby preventing overfitting. The algorithm employs a least-squares fit $F_c(x)$ for each class $c$, as mathematically detailed in Equation (2).
$$F_c(x) = \beta_0 + \sum_{i=1}^{m} \beta_i x_i \tag{2}$$
where
$x$: the input vector containing features or predictors;
$\beta_i$: the coefficient associated with the $i$th feature $x_i$ (fitted separately for each class $c$);
$\beta_0$: the intercept term added to the weighted sum;
$F_c(x)$: the logit or linear combination of input features weighted by coefficients for class $c$.
The algorithm also employs logistic regression to calculate the probabilities assigned after observing the data at each node of the tree. This process is defined mathematically by Equation (3).
$$p(c \mid x) = \frac{e^{F_c(x)}}{\sum_{l=1}^{C} e^{F_l(x)}} \tag{3}$$
where
$C$: the total number of possible labels;
$p(c \mid x)$: the probability of the instance belonging to class $c$ given the input vector $x$, normalized by the sum of the exponential values of the logits $F_l(x)$ across all possible labels $l = 1, \ldots, C$.
This theoretical framework is straightforwardly applicable for configuring the prediction process in machine learning models. It is supported by strong mathematical foundations to make precise predictions.
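To make the linear logit and the normalized class probability concrete, the following sketch evaluates both for a single leaf model. It is only an illustration: the function names and the coefficient values are hypothetical, and subtracting the maximum logit is a standard numerical-stability step that does not change the result.

```python
import math


def logit(x, coefficients, intercept):
    """Linear score for one class: intercept plus the weighted sum of features."""
    return intercept + sum(b * v for b, v in zip(coefficients, x))


def class_probabilities(x, models):
    """Turn per-class logits into probabilities that sum to one (softmax).

    models: list of (coefficients, intercept) pairs, one pair per class.
    """
    scores = [logit(x, coef, b0) for coef, b0 in models]
    top = max(scores)  # subtract the max logit for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical coefficients for a 3-class leaf model over 2 features.
leaf_models = [([1.0, -0.5], 0.0), ([0.2, 0.3], 0.1), ([-1.0, 0.8], -0.2)]
probs = class_probabilities([0.5, 1.5], leaf_models)
print(probs, sum(probs))
```

Because the output is a probability vector rather than a hard label, a decision threshold can be adjusted afterwards if an application requires it.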
3.3. Algorithm
The proposed multi-class multi-label logistic model tree (MMLMT) algorithm is designed to tackle multi-label classification problems by transforming the problem into multiple classification tasks. This approach leverages logistic model trees to handle each classification task separately, thus efficiently managing the complexity associated with multi-label learning. The detailed process of the MMLMT algorithm is outlined in Algorithm 1. The algorithm operates on dataset $D$, which consists of $n$ instances $(x_i, Y_i)$, where each instance has a feature vector $x_i$ and a corresponding set of labels. The labels are subsets of a larger set of all possible labels $L$, which are grouped into $q$ target attributes. The main goal of the MMLMT algorithm is to predict the class labels for a test set. This is achieved by first generating an individual dataset $D_j$ for each target attribute. Each dataset corresponds to a classification problem focused on predicting the different possible values of the label. The algorithm proceeds as follows:
Dataset generation: The algorithm iterates through each of the $q$ target attributes to generate individual datasets $D_j$. This is accomplished by scanning the entire dataset $D$ and, for each instance, adding the feature vector $x_i$ to $D_j$ along with the corresponding label $y_{ij}$. This step effectively transforms the original multi-label problem into $q$ separate classification tasks, each focusing on a specific target attribute.
Model training: For each individual dataset $D_j$, the algorithm builds a logistic model tree classifier $h_j$. By training a separate classifier for each target attribute, the algorithm ensures that each label is predicted independently.
Classification phase: During the classification phase, the algorithm applies each classifier $h_j$ to predict the corresponding label for each instance in the test set $T$. Specifically, for each test instance $x$, each classifier $h_j$ generates a predicted label $\hat{y}_j$. The outputs of all classifiers are aggregated to form a vector $\hat{Y}$ for each test instance. This aggregation step effectively reconstructs the multi-label nature of the problem by combining the individual predictions into a final set of predicted labels.
Final output: The final set of predicted labels for the test set $T$ is stored in a list $P$. This list represents the algorithm’s best estimate of the true labels for each test instance, with the labels treated as independent through the individual classification tasks.
Overall, the MMLMT algorithm provides a structured approach to managing the complexity of multi-label classification by decomposing it into more manageable sub-problems. This method not only simplifies the problem but also utilizes the strength of logistic model trees to provide accurate and interpretable predictions.
Algorithm 1: Multi-class Multi-label Logistic Model Tree (MMLMT)
Inputs:
   $D = \{(x_i, Y_i)\}_{i=1}^{n}$: dataset with $n$ instances, containing features $x_i$ and labels $Y_i$
   $L$: set of all possible class labels, grouped into $q$ target attributes
   $q$: the number of target attributes
   $k$: the number of total class labels
   $T$: testing set to be predicted
Outputs:
   $P$: predicted labels for the instances in $T$
Begin:
   for each $(x_i, Y_i)$ in $D$        // Generation of individual datasets
      for $j$ = 1 to $q$
         $D_j = D_j \cup \{(x_i, y_{ij})\}$
      end for
   end for each
   for $j$ = 1 to $q$
      $h_j$ = LMT($D_j$)        // Build classifiers
   end for
   for each $x$ in $T$        // Classification
      for $j$ = 1 to $q$
         $\hat{y}_j = h_j(x)$
         $\hat{Y} = \hat{Y} \cup \{\hat{y}_j\}$
      end for
      $P = P \cup \{\hat{Y}\}$
   end for each
End
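The control flow of Algorithm 1 can be sketched as a small wrapper that trains one base learner per target attribute and concatenates their predictions. This is a hedged illustration rather than the authors' implementation: the class names are invented, and `MajorityClassifier` is a deliberately trivial stand-in that is not an LMT. In practice, an LMT implementation would be supplied through the `make_base` factory.

```python
from collections import Counter


class MajorityClassifier:
    """Stand-in base learner (NOT an LMT): always predicts the most common class."""

    def fit(self, features, labels):
        self.prediction_ = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, x):
        return self.prediction_


class MultiTargetClassifier:
    """One independent base classifier per target attribute."""

    def __init__(self, make_base):
        self.make_base = make_base  # factory returning a fresh base learner

    def fit(self, features, labels):
        n_targets = len(labels[0])
        self.models_ = []
        for j in range(n_targets):
            # Dataset for target j: same features, only the j-th label column.
            column = [row[j] for row in labels]
            self.models_.append(self.make_base().fit(features, column))
        return self

    def predict(self, test_features):
        # Aggregate the per-target predictions into one label vector per instance.
        return [[m.predict(x) for m in self.models_] for x in test_features]


# Invented toy data: 3 instances, 2 features, 2 target attributes.
X = [[0.1, 1.0], [0.7, 0.2], [0.4, 0.4]]
Y = [["a", "x"], ["b", "x"], ["a", "y"]]
model = MultiTargetClassifier(MajorityClassifier).fit(X, Y)
print(model.predict([[0.5, 0.5]]))  # [['a', 'x']]
```

Keeping the base learner behind a factory makes the decomposition independent of the classifier used, which mirrors how each target attribute is handled separately in the algorithm.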
In practice, for $n$ samples, the computational cost of MMLMT grows linearly with the number of target attributes $q$, since one LMT is trained per target attribute. Because the complexity is governed by the base classifier LMT, denoted by $O(\mathrm{LMT})$, the overall complexity of MMLMT is therefore $O(q \cdot \mathrm{LMT})$. Hence, the MMLMT method is particularly well suited for scenarios where the number of labels is relatively moderate.
4. Experimental Studies
4.1. Dataset Description
This research utilizes eight real-world multi-class multi-label datasets [
51,
52,
53,
54,
55,
56,
57,
58] to showcase the functionalities of the presented MMLMT method.
Table 3 provides an overview of the characteristics of these datasets. Each dataset—Drug-Consumption, Enron, HackerEarth-Adopt-A-Buddy, Music-Emotions, Scene, Solar-Flare-2, Thyroid-L7, and Yeast—is publicly available from various machine learning repositories. These datasets encompass a wide range of features, spanning from 10 to 1001, with instances varying from 593 to 18,834, and label numbers from 2 to 53. They originate from diverse domains including drugs, text, animals, music, image, physics, healthcare, and biology. The datasets contain categorical, numerical, and mixed-type values, reflecting their varied nature and suitability for different analytical machine-learning techniques.
4.1.1. Drug-Consumption Dataset
The Drug-Consumption dataset consists of 1885 records, each representing an individual respondent characterized by 12 real-valued features. These features primarily include personality measurements to evaluate the risk factors associated with drug consumption. The dataset addresses 18 classification tasks, targeting different drug usage behaviors. Each classification task involves categorizing respondents into one of seven classes based on their reported drug usage frequencies such as never used, used in last year, used in last week, and so on.
4.1.2. Enron Dataset
The Enron dataset is primarily concerned with classifying emails into various categories. This dataset comprises 1702 instances and 53 labels with a cardinality of 3.78. The dataset has 1001 attributes, making it a valuable resource for research in machine learning and natural language processing. It encompasses the content and metadata of emails, making it useful for tasks like text classification, social network analysis, anomaly detection, and studying organizational communication patterns.
4.1.3. HackerEarth-Adopt-A-Buddy
The HackerEarth-Adopt-A-Buddy dataset, sourced from Kaggle, is a multi-label dataset consisting of 18,834 instances with 11 features. This dataset was designed to support virtual pet adoption efforts during the pandemic, helping to engage potential pet owners by providing a virtual experience of animals available for adoption. Machine learning methods can utilize this dataset to predict pet types and breeds based on various attributes. This comprehensive dataset offers a robust foundation for developing and evaluating predictive models.
4.1.4. Music-Emotions Dataset
The Music-Emotions dataset was designed to study the classification of emotions elicited by music. It comprises 593 instances, each characterized by 72 attributes that are divided into 64 timbre features and 8 rhythmic features. Each instance in the dataset has been labeled by a group of experts based on the emotions the music produces such as amazed–surprised, relaxing–calm, happy–pleased, sad–lonely, and so on. This dataset is an outstanding resource in the field of music information retrieval and affective computing. It supports applications in automatic music emotion recognition, personalized music recommendation, and adaptive music therapy systems.
4.1.5. Scene Dataset
The Scene dataset consists of 2407 instances labeled with 6 distinct classes: beach, mountain, urban, field, fall foliage, and sunset. Each image is subdivided into 49 blocks, and the dataset includes 294 numeric attributes derived from spatial color moments. With a label cardinality of 1.074, this dataset is utilized for multi-label classification tasks in computer vision and pattern recognition. Researchers preprocessed the dataset by normalizing feature values to ensure consistency across images and optimize classification accuracy. Its applications span various domains, such as remote sensing, image retrieval, and environmental monitoring, emphasizing its relevance in academic research and practical implementations.
4.1.6. Solar-Flare-2 Dataset
The Solar-Flare-2 dataset is a multivariate dataset used to predict solar flare activities. It consists of 1066 instances, each representing features extracted from active regions on the sun. These features, categorized into ten attributes, include sunspot area, magnetic field complexity, and historical flare occurrences data. Each categorical feature records the frequency of specific types of solar flares within 24 h. Researchers utilize this dataset to explore correlations between solar activities and flare occurrences, contributing to advancement in solar physics and space weather forecasting capabilities.
4.1.7. Thyroid-L7 Dataset
The Thyroid-L7 dataset focuses on various aspects of thyroid health, involving a range of medical attributes and conditions. It consists of 29 attributes, 9172 instances, and seven labels for thyroid-related conditions: hyperthyroid, hypothyroid, binding protein disorders, general health issues, replacement therapy needs, anti-thyroid treatments, and discordant results. The dataset also encompasses various demographic and medical attributes such as age, sex, and measurements of thyroid-related hormones. The referral source for each instance is categorized into six possible values, including WEST, STMW, and SVI. The dataset structure supports comprehensive analysis and prediction of thyroid disorders, focusing on identifying multiple co-occurring issues within individual patients.
4.1.8. Yeast Dataset
The Yeast dataset aims to predict functional classes in the genome of the yeast Saccharomyces cerevisiae and has been used for physiological data modeling contests. It contains a total of 2417 instances, each representing a gene characterized by 103 numeric feature attributes. In this dataset, biological functions are represented by 14 target variables that indicate various gene functional groups, such as metabolism and energy. This dataset provides information about several types of genes within particular organisms, facilitating multi-label classification tasks and enabling researchers to identify and categorize the functional roles of different genes within the yeast genome.
4.2. Experiment Details
This study introduces a novel method, named MMLMT, specifically designed for multi-class multi-label classification tasks. The method builds on logistic model trees to handle the inherent structure of multi-class multi-label classification models while remaining efficient to implement. Its effectiveness was validated on eight datasets: Drug-Consumption, Enron, HackerEarth-Adopt-A-Buddy, Music-Emotions, Scene, Solar-Flare-2, Thyroid-L7, and Yeast. Our method was implemented in the C# programming language using the Weka library [59]. To ensure the reproducibility of our results, the source code is publicly available in the GitHub repository (https://github.com/BitaGhasemkhani/MMLMT, accessed on 4 July 2024). This repository includes all relevant code and documentation, providing the complete implementation of the algorithm along with detailed instructions for replicating the experiments. In our experiments, the LMT classifier was constructed with the hyperparameter settings listed in Table 4.
One of the hyperparameters of the LMT algorithm is “MinNumInstances”, the minimum number of instances at which a tree node is considered for splitting. For instance, if it is set to 15, a node is not split if it contains fewer than 15 samples. Changing this value can therefore affect the size of the tree (i.e., the number of nodes). Another hyperparameter is “NumBoostingIterations”, which specifies the number of boosting iterations to perform. It has been reported that the appropriate value depends on the domain but varies little across different subsets of a particular dataset, such as those encountered at lower levels of the tree [7]. Although a large number of iterations may not change the accuracy much, it increases the computational cost. For this reason, it is advisable to set this hyperparameter to −1, which lets the algorithm determine the number of boosting iterations automatically.
During the experiments, we utilized 10-fold cross-validation to train and evaluate our classification model. In the evaluation step, we employed a range of standard metrics to assess the performance of the MMLMT method across various dimensions. These metrics include accuracy, sensitivity, and precision–recall curve (PRC) area. Each metric provides insight into a different aspect of the model’s behavior in multi-label classification scenarios. The mathematical formulas for these metrics, which use true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN), are presented in Equations (4) to (6) as follows:
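For reference, the standard textbook definitions of these three metrics can be written as below. This is a sketch under the usual conventions: the symbols P (precision) and r (recall) are assumptions introduced here, and the exact notation and numbering of Equations (4) to (6) may differ.

```latex
\begin{align}
\text{Accuracy}    &= \frac{TP + TN}{TP + TN + FP + FN} \\[4pt]
\text{Sensitivity} &= \frac{TP}{TP + FN} \\[4pt]
\text{PRC Area}    &= \int_{0}^{1} P(r)\,\mathrm{d}r,
\qquad P = \frac{TP}{TP + FP}, \quad r = \frac{TP}{TP + FN}
\end{align}
```

Sensitivity and recall are the same quantity, so the PRC area summarizes how precision changes as sensitivity is traded off across decision thresholds.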
Furthermore, we employed several non-parametric statistical analyses, including the Friedman Aligned Ranks, Quade, and Wilcoxon tests, to validate the significance of the experimental results with acceptable p-values.
4.3. Results
Table 5 summarizes the accuracy of eight classification algorithms across eight datasets, highlighting the superior performance of our proposed MMLMT algorithm. The seven baseline methods are multi-class multi-label random tree (MMRT), multi-class multi-label naïve Bayes (MMNB), multi-class multi-label k-nearest neighbors (MMKNN), multi-class multi-label logistic regression (MMLR), multi-class multi-label k-star (MMK-Star), multi-class multi-label locally weighted learning (MMLWL), and multi-class multi-label Hoeffding tree (MMHT). These algorithms were selected for their popularity and proven effectiveness in classification tasks, particularly within the context of multi-class and multi-label problems.
The MMRT algorithm utilizes the random tree [60] classifier, known for its robustness and ability to handle large datasets with high-dimensional feature spaces. MMNB is based on the naïve Bayes [61] classifier, favored for its simplicity and computational efficiency. The MMKNN algorithm employs the k-nearest neighbors [62] method, a non-parametric approach widely used due to its effectiveness in capturing local patterns in data. MMLR leverages logistic regression [63], a linear model well regarded for its interpretability and reliable performance in both binary and multi-class classification tasks. The MMK-Star algorithm involves k-star [64], which offers a flexible similarity measure based on entropy. MMLWL implements locally weighted learning [65], an instance-based method that adapts well to varying data distributions by giving more weight to nearby instances. Lastly, MMHT utilizes the Hoeffding tree [66], a decision tree algorithm designed for streaming data classification.
These methods were chosen not only for their prevalence in the literature but also for their complementary strengths across different types of datasets. By applying these algorithms within a multi-class multi-label framework, we provide a thorough comparison, ensuring that the evaluation of our MMLMT method is both robust and comprehensive. The proposed MMLMT method outperformed the other algorithms, achieving the highest average accuracy of 85.90%, which is a 3.91% improvement on average over the other methods.
Additionally, the computational complexity of the discussed classification algorithms is summarized in Table 6. This table outlines the training and prediction complexities of each method, offering a clear view of their computational demands. The mathematical symbols used in the table denote the number of instances, the number of features, and the number of classes, respectively. Understanding these complexities helps in evaluating the practical applicability of the algorithms, especially when dealing with large datasets or resource constraints. This comparative overview clarifies the balance between the computational requirements and the performance of the algorithms, providing context for their use in various scenarios.
The experimental results revealed that the MMLMT method achieved accuracy equal to or higher than that of the other methods on every dataset. Notably, the highest accuracy (99.03%) was obtained by MMLMT on the Thyroid-L7 dataset, followed by accuracies of 95.40% and 93.03% on the Enron and Solar-Flare-2 datasets, respectively. MMLMT also showed strong performance on the other datasets, reinforcing its adaptability. Comparative analysis shows that MMLR, MMLWL, and MMHT also performed well, with average accuracies of 83.86%, 83.27%, and 83.10%, respectively, but were outperformed by MMLMT. Algorithms such as MMNB and MMRT had lower average accuracies of 78.15% and 80.81%, respectively, demonstrating their relative inefficacy. These results underscore the ability of MMLMT to attain high accuracy in multi-class multi-label classification tasks across diverse datasets.
The sensitivity results obtained for the MMLMT method across the datasets are presented in Figure 3. The sensitivity metric indicates the true positive rate for each dataset, ranging from 0 to 1. The highest sensitivity (0.99) was achieved on the Thyroid-L7 dataset, followed closely by Enron with a sensitivity of 0.95. The Scene and Solar-Flare-2 datasets also have high sensitivity values, in the range of 0.90 to 0.93. The HackerEarth-Adopt-A-Buddy, Yeast, and Music-Emotions datasets have sensitivity values between 0.80 and 0.87, while the Drug-Consumption dataset has the lowest sensitivity of 0.63. These results show how well the MMLMT method identifies true positives across the datasets.
The results for the PRC Area metric using the MMLMT method revealed varying levels of performance across different datasets, as presented in Figure 4. The highest value (0.99) was obtained on the Thyroid-L7 dataset, followed closely by the HackerEarth-Adopt-A-Buddy, Scene, and Enron datasets with scores of approximately 0.95. This demonstrates the effectiveness of the MMLMT method in handling multi-class multi-label classification tasks for these datasets. High scores were also observed for Yeast, Music-Emotions, and Solar-Flare-2, ranging from 0.78 to 0.88, reflecting acceptable performance. Conversely, the Drug-Consumption dataset exhibited the lowest PRC Area value.
4.4. Statistical Analysis of Results
In this study, we rigorously assess the performance of our proposed MMLMT method against several established classification approaches, including MMRT, MMNB, MMKNN, MMLR, MMK-Star, MMLWL, and MMHT. Our comparative analysis spans multiple datasets to ensure reliable and comprehensive results. To substantiate our findings, we applied a series of non-parametric statistical tests, including Friedman Aligned Ranks [67], Quade [68], and Wilcoxon [69]. The Friedman Aligned Ranks test aligns dataset rankings to compare multiple algorithms, while the Quade test adjusts ranks to account for variations between distinct blocks of data. With a significance level of 0.05, we obtained p-values of 0.00086 and 0.00013 for the Friedman Aligned Ranks and Quade tests, respectively. The results underscore the robustness of our findings, indicating that MMLMT offers a statistically significant enhancement in predictive accuracy compared to the other approaches tested in our experimental setup.
The mathematical formulas of Friedman Aligned Ranks and Quade tests are presented in Equations (7) and (8), respectively.
The quantities used in Equation (7) are as follows:
the number of blocks in the experimental design;
the number of methods being compared;
the sum of ranks for each method, which is calculated based on how well the method performs across the blocks;
the Friedman Aligned Ranks test statistic.
The quantities used in Equation (8) are as follows:
the total number of observations or data points;
the number of methods (treatments) being compared;
the average rank of each block, indicating how well the methods perform within that block;
the overall average rank across all blocks, providing a baseline for comparison;
the average rank of each method across all blocks, indicating its overall performance;
the Quade test statistic.
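To make the aligned-ranks procedure concrete, the following minimal sketch computes the Friedman Aligned Ranks statistic in the formulation commonly given in the non-parametric testing literature, which may differ in notation from Equation (7). The function names, the variable names `n` (blocks) and `k` (methods), and the accuracy values are illustrative assumptions; the numbers are hypothetical and are not taken from Table 5.

```python
def average_ranks(values):
    """Rank values ascending from 1, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def friedman_aligned_ranks(scores):
    """scores[i][j] is the score of method j on block (dataset) i.

    Returns (statistic, rank sum per method).
    """
    n, k = len(scores), len(scores[0])
    # Align each observation by subtracting the mean of its block.
    aligned = []
    for row in scores:
        block_mean = sum(row) / k
        aligned.extend(value - block_mean for value in row)
    # Rank all n*k aligned observations together.
    ranks = average_ranks(aligned)
    method_sums = [sum(ranks[i * k + j] for i in range(n)) for j in range(k)]
    block_sums = [sum(ranks[i * k:(i + 1) * k]) for i in range(n)]
    numerator = (k - 1) * (
        sum(r * r for r in method_sums) - (k * n ** 2 / 4) * (k * n + 1) ** 2
    )
    denominator = (
        k * n * (k * n + 1) * (2 * k * n + 1) / 6
        - sum(r * r for r in block_sums) / k
    )
    return numerator / denominator, method_sums


# Hypothetical accuracies: 4 datasets (rows) x 3 methods (columns).
toy_scores = [
    [0.90, 0.85, 0.80],
    [0.75, 0.70, 0.72],
    [0.88, 0.86, 0.81],
    [0.95, 0.90, 0.91],
]
statistic, method_rank_sums = friedman_aligned_ranks(toy_scores)
print(statistic, method_rank_sums)
```

The statistic is usually compared against a chi-square distribution with k − 1 degrees of freedom to obtain a p-value; that step is omitted here to keep the sketch free of external dependencies.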
The Wilcoxon test evaluates whether the mean ranks of related samples differ significantly. This test validates the superior performance of MMLMT, ensuring that the observed advances in predictive accuracy are statistically significant and reflect genuine enhancements rather than random chance. The result of the Wilcoxon test is shown in Table 7, which demonstrates that MMLMT consistently outperformed its counterparts, with all p-values below the significance level of 0.05. Specifically, the Wilcoxon test yields a p-value of 0.01172 for MMRT, MMNB, MMKNN, MMLR, and MMK-Star, and a p-value of 0.01796 for both MMLWL and MMHT. This underscores the robustness and statistical significance of MMLMT’s performance advancements.
The mathematical formula for the Wilcoxon test is presented in Equation (9).
The quantities used in Equation (9) are as follows:
the total number of paired observations or data points;
the rank assigned to each positive difference between paired observations, indicating how much each pair contributes to the test statistic;
the Wilcoxon test statistic, which is calculated as the sum of the ranks of the positive differences between paired observations.
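The sketch below illustrates the Wilcoxon signed-rank computation under stated assumptions: zero differences are dropped, the two-sided p-value uses the normal approximation without continuity or tie corrections, and all accuracy values are hypothetical. The function name `wilcoxon_signed_rank` is illustrative. Under these assumptions, eight paired comparisons in which one method always wins give a p-value of about 0.0117, and seven such comparisons (one tie dropped) give about 0.0180, which is consistent with the values reported above, although the exact variant used for Table 7 is not confirmed here.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test via the normal approximation.

    Zero differences are dropped; no continuity or tie-variance correction.
    Returns (sum of ranks of positive differences, p-value).
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank the absolute differences, averaging ranks over ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        while end + 1 < n and abs(diffs[order[end + 1]]) == abs(diffs[order[start]]):
            end += 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = (start + end) / 2 + 1
        start = end + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    std = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / std
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, p_value


# Hypothetical accuracies (%) of two methods on eight datasets.
method_a = [70.0, 95.0, 90.0, 82.0, 91.0, 93.0, 99.0, 80.0]
method_b = [69.5, 93.0, 87.0, 78.0, 86.0, 87.0, 92.0, 72.0]
w_plus, p_value = wilcoxon_signed_rank(method_a, method_b)
print(w_plus, round(p_value, 4))  # 36.0 and approximately 0.0117
```

Because the rank sum depends only on the ordering of the absolute differences, any eight comparisons in which one method wins every time produce the same statistic and p-value.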
4.5. LMT Structure Analysis
Figure 5 shows an example logistic model tree built by the proposed MMLMT method for one target attribute of the Thyroid-L7 dataset. This LMT has a height of 7 and contains 27 nodes, comprising 13 internal nodes and 14 leaves. The tree was built with no limit on the number of iterations and with the minimum number of instances set to 15. The structure begins with a split on thyroid-stimulating hormone (TSH) values at the root, leading to further splits on attributes such as free thyroxine index (FTI), query_hypothyroid, T3, and on_thyroxine. Each internal node represents a decision based on an attribute, and each leaf node contains a logistic regression model that predicts class probabilities. For example, if TSH ≤ 6 and FTI ≤ 61, the tree further splits on query_hypothyroid and T3, leading to the leaf nodes LM_1 and LM_2. In LMT, each leaf node is denoted by a logistic model identifier such as LM_1, LM_2, and so on. The numbers in a leaf give statistical information about it, such as the total number of instances reaching this leaf. If TSH > 6, the tree splits on FTI and further on T3, on_antithyroid_medication, thyroid_surgery, and TT4, leading to logistic models such as LM_5, LM_6, and LM_7. This hybrid approach offers the understandability of decision trees and the predictive power of logistic regression, handling diverse attributes and providing valuable insights into medical diagnoses.
A significant advantage of a tree model is its easy interpretability. A path in a tree essentially corresponds to a conjunction of Boolean expressions of the form ‘attribute ≤ value’ (for numeric attributes) or ‘attribute = value’ (for nominal attributes). Therefore, a tree model can be seen as a collection of rules that explain how to classify instances. This explanatory power is particularly beneficial in applications, where understanding the rationale behind predictions is crucial for gaining the trust of professionals and ensuring alignment with informed decisions. By examining the specific paths and splits in the tree structure, experts can identify the most influential factors in the data. In real-world scenarios, LMT has demonstrated its interpretability across various domains, providing clear, rule-based insights that enhance decision-making processes. For instance, in finance, LMT can assist institutions in creating transparent models for credit scoring, where the model clearly outlines the factors leading to loan approval or rejection. This transparency ensures that stakeholders understand the reasoning behind financial decisions, fostering trust and compliance with regulatory standards. Such capabilities make LMT a versatile tool in many fields where clear, understandable decision-making is essential. The easy comprehensibility of the model is an important property in terms of explainable artificial intelligence (XAI) requirements. The LMT’s ability to provide concise rules, combined with robust prediction capabilities, makes it an excellent tool for applications where both accuracy and interpretability are paramount.
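The reading of a tree path as a conjunction of conditions can be sketched in a few lines. In the snippet below, the representation of a path as a list of (attribute, operator, value) triples and the function name `path_matches` are assumptions made for illustration; the thresholds echo the TSH and FTI splits mentioned earlier, but the third condition and the patient record are hypothetical and are not read from Figure 5.

```python
import operator

# Comparison operators allowed in a path condition.
OPS = {"<=": operator.le, ">": operator.gt, "==": operator.eq}


def path_matches(path, instance):
    """Return True if the instance satisfies every condition on the path."""
    return all(OPS[op](instance[attr], value) for attr, op, value in path)


# Hypothetical path from the root to one leaf.
path_to_leaf = [
    ("TSH", "<=", 6),
    ("FTI", "<=", 61),
    ("query_hypothyroid", "==", "f"),
]

# Hypothetical patient record.
patient = {"TSH": 2.1, "FTI": 55, "query_hypothyroid": "f"}
print(path_matches(path_to_leaf, patient))  # True
```

Each root-to-leaf path yields one such rule, so the whole tree can be listed as a set of mutually exclusive rules, each ending in the logistic model of its leaf.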
In the context of the proposed MMLMT method, the logistic regression model applied to the target attribute of the Thyroid-L7 dataset offers a clear and quantitative representation of the predictive factors influencing diseases. As an example, the mathematical expression of logistic regression for Class NEG in LM_1 is given in Equation (10). This logistic model captures the complex relationships among various medical attributes, quantifying their contributions to the likelihood of a disease diagnosis. For instance, being on thyroxine significantly increases the log odds of a positive classification (6.15), whereas higher T3 levels decrease these odds (−15.62). The model’s parameters provide valuable insights, such as the strong positive impact of querying hyperthyroidism (38.41) and the substantial negative influence of T4U (−7.39). These coefficients, derived from the logistic regression model at leaf nodes of the decision tree, exemplify the hybrid approach’s strength. Notably, Equation (10) is a specific instance of the more general logit formula presented in Equation (2), mathematically outlining the general form of logistic regression used across various leaf nodes in the LMT structure.
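How a leaf model turns attribute values into class probabilities can be illustrated with a short sketch. It assumes that the leaf holds one linear function per class and that probabilities follow from a softmax over these scores, which is the usual multinomial logistic form. Only the four coefficients quoted above are used; the intercept, the single competing class, and the patient values are hypothetical placeholders, so the printed numbers do not reproduce Equation (10).

```python
import math


def linear_score(intercept, weights, x):
    """Linear function of the attributes: intercept + sum of weight * value."""
    return intercept + sum(w * x[name] for name, w in weights.items())


def softmax(scores):
    """Convert per-class scores into probabilities that sum to 1."""
    top = max(scores.values())
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


# Coefficients quoted in the text for class NEG in LM_1 (partial list).
neg_weights = {
    "on_thyroxine": 6.15,
    "T3": -15.62,
    "query_hyperthyroid": 38.41,
    "T4U": -7.39,
}
HYPOTHETICAL_INTERCEPT = 0.0  # the real intercept is not given in the text

# Hypothetical patient; attribute scaling is an assumption.
patient = {"on_thyroxine": 0, "T3": 0.02, "query_hyperthyroid": 0, "T4U": 0.1}
treated = dict(patient, on_thyroxine=1)

score_untreated = linear_score(HYPOTHETICAL_INTERCEPT, neg_weights, patient)
score_treated = linear_score(HYPOTHETICAL_INTERCEPT, neg_weights, treated)

# A single hypothetical competing class with score 0 stands in for the rest.
probabilities = softmax({"NEG": score_untreated, "OTHER": 0.0})
print(score_treated - score_untreated, probabilities)
```

Switching on_thyroxine from 0 to 1 raises the class score by exactly its coefficient, which is what makes the leaf models directly readable.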
Table 8 illustrates an example confusion matrix for the target attribute with the class labels L, M, N, and NEG. The diagonal elements give the number of instances correctly classified for each label; e.g., M was correctly classified 121 times and NEG 8792 times. Off-diagonal elements indicate misclassifications; for example, 8 instances of M were classified as NEG. The confusion matrix provides a detailed view of how the model performs across these labels, clarifying both its accuracy and its areas of misclassification. This analysis is essential for understanding the model’s effectiveness in managing the label structures of the dataset.
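As a minimal sketch of how accuracy and per-class sensitivity are read off a confusion matrix, the snippet below uses a small hypothetical three-class matrix rather than the values of Table 8. It assumes that rows hold the actual class and columns the predicted class; the function names are illustrative.

```python
def accuracy(matrix):
    """Share of instances on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def sensitivity(matrix, class_index):
    """TP / (TP + FN) for one class; rows = actual, columns = predicted."""
    row = matrix[class_index]
    return row[class_index] / sum(row)


# Hypothetical confusion matrix for three classes (10 instances each).
toy_matrix = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 0, 10],
]
toy_accuracy = accuracy(toy_matrix)
toy_sensitivities = [sensitivity(toy_matrix, i) for i in range(3)]
print(toy_accuracy, toy_sensitivities)  # 0.8 [0.8, 0.6, 1.0]
```

A class with many off-diagonal entries in its row, such as the second class here, has low sensitivity even when overall accuracy is high.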
5. Discussion
In this section, MMLMT is compared with several state-of-the-art methods [70,71,72,73,74,75,76,77,78,79,80,81] in the field. Our analysis covers the accuracy metric on the Drug-Consumption, Enron, Music-Emotions, Scene, Solar-Flare-2, Thyroid-L7, and Yeast datasets, as shown in Table 9. According to the results, our method showed a 9.87% improvement on average compared to the state-of-the-art methods on the same datasets. MMLMT achieved higher accuracy than the naïve Bayes method [70] on the Drug-Consumption dataset. Similarly, the MMLMT method outperformed its counterpart [71], namely lblMLTC, with an accuracy of 95.40% on the Enron dataset. MMLMT also showed its superiority over a wide range of techniques [72,73,74,75], including ML-kNN, DASMLKNN, PLDLDSA, and more, on the Music-Emotions dataset, with a substantial improvement of 8.19% on average. Likewise, MMLMT reached 93.03% accuracy on the Solar-Flare-2 dataset, a 16.11% enhancement over previous approaches such as 1R, KNN, RIPPER, and more [77,78,79,80], whose accuracies ranged from 66.04% to 85.40%. This outcome highlights the superior performance of the proposed method in correctly classifying instances in the dataset. The MMLMT method attained a small improvement on the Thyroid-L7 dataset when compared to its state-of-the-art peer, ELM. Finally, MMLMT presented a considerable improvement of 5.75% over previous multi-label techniques (e.g., PLDLDSA, ML-kNN) on the Yeast dataset. These results underscore the efficacy of MMLMT across various datasets, demonstrating important improvements in accuracy over the state-of-the-art methods.
6. Conclusions and Future Works
This paper presents a novel classification method, named the multi-class multi-label logistic model tree (MMLMT). This method enhances flexibility in multi-label classification by integrating a multi-class approach. By utilizing the logistic model tree (LMT) algorithm, MMLMT effectively addresses the challenges inherent in multi-class multi-label classification tasks. This method combines the interpretability of decision trees with the predictive accuracy of logistic regression, providing an efficient solution for complex classification problems. In the experimental studies, a comprehensive evaluation using metrics such as accuracy, sensitivity, and PRC Area, along with non-parametric statistical analyses including the Friedman Aligned Ranks, Quade, and Wilcoxon tests, confirmed the reliability and significance of our results with acceptable p-values.
The main findings of this study can be summarized as follows:
This study proposes the MMLMT method, which uniquely integrates logistic model trees with a multi-class multi-label classification technique, establishing a novel approach in the literature.
MMLMT addresses the classification tasks in which the dataset has multiple target attributes, where each target attribute in the dataset can have two or more different values.
According to the experimental results, the MMLMT method achieved an average accuracy of 85.90% across eight well-known datasets: Drug-Consumption, Enron, HackerEarth-Adopt-A-Buddy, Music-Emotions, Scene, Solar-Flare-2, Thyroid-L7, and Yeast.
MMLMT showed a 3.91% improvement over previous methods (MMRT, MMNB, MMKNN, MMLR, MMK-Star, MMLWL, and MMHT), reflecting its enhanced performance.
Comparative research with existing studies provided valuable benchmarks, demonstrating the superior performance of MMLMT. It achieved a 9.87% higher average accuracy compared to the results reported in the state-of-the-art studies, underscoring its advanced predictive capability.
The use of the logistic model tree classifier ensures a balance between high accuracy and model interpretability, making MMLMT a notably explainable artificial intelligence (XAI) method.
Demonstrating effectiveness across a range of datasets from various domains, the applicability and reliability of MMLMT have been proven in multi-class multi-label classification tasks.
In conclusion, our method offers a practical and accessible solution for tackling multi-label problems, effectively balancing interpretability and accuracy in classification tasks. The MMLMT method has been proven to have robust performance across various domains, making it a versatile approach for diverse applications in advanced multi-label learning. Its ability to handle different datasets with high accuracy while maintaining transparency in its decision-making process emphasizes its potential as a valuable approach for researchers and practitioners alike. This method contributes significantly to multi-label classification and establishes a solid foundation for addressing real-world challenges in the machine learning field.
While the MMLMT method has shown promising results, several avenues for future research remain. First, ensemble learning solutions can be integrated with MMLMT by aggregating the outputs of multiple models to handle more complex datasets. Moreover, a web application can be developed to provide an interface to the MMLMT model that enables users to perform analyses. Another potential area of exploration is the application of MMLMT to other real-world problems in diverse fields beyond those addressed in this study. Conducting domain-specific studies could provide valuable insights into the practical benefits of implementing the MMLMT method in different areas.