Time Sequence Deep Learning Model for Ubiquitous Tabular Data with Unique 3D Tensors Manipulation
Abstract
1. Introduction
2. Background and Related Work
3. Materials and Methods
3.1. Datasets
3.2. Preliminaries
4. Proposed Time Sequence Deep Learning Predictive Model for Tabular Data
4.1. Novelty in Shifting Perspective for Utilization of Time Sequence DL
4.2. Model Architecture
4.3. Development Approach and Code Specifications
Algorithm 1. Time sequence model based on stacked bidirectional LSTM networks for tabular datasets.

Input: Tabular datasets with x features.
Output: Binary target variable Y.

# Step 1: Load data
1.1 Loading of the tabular datasets
# Step 2: Dataset preprocessing
2.1 Feature extraction from raw data
2.2 Missing-value handling
2.3 Vectorization
2.4 Parameter scaling (normalization with MinMaxScaler; feature range from 0 to 1)
2.5 Encoding
2.6 Outlier detection
2.7 Optional: dimensionality reduction (feature selection)
2.8 Shuffling of the preprocessed data to introduce randomness and prevent model bias
# Step 3: Reshaping into 3D tensors
3.1 Transformation of the input data into 3D tensors, the input format required by the LSTM-based model
  - The 3D tensor is shaped (number_of_samples, 1, number_of_features)
  - The target vector Y is not reshaped and remains a 1D vector of shape (number_of_samples,)
3.2 Definition of the input shape from the number of time steps and the number of features
3.3 Separation of the data into a training set and a test set
3.4 Definition of the ratio of data allocated to each set
# Step 4: Design, training, and hyperparameter tuning of the stacked bidirectional LSTM classifier
4.1 Initialization of the deep learning stacked bidirectional LSTM architecture with its hidden layers
4.2 Setting of the hyperparameters
4.3 Compilation of the model with the appropriate loss function, optimizer, and hyperparameters
4.4 Training of the model on the training data
  - Setting of the number of epochs and the batch size
  - Monitoring of RMSE, loss, and accuracy convergence
4.5 Model performance evaluation: testing and thresholding
# Step 5: Visualization of the results
5.1 Graphical comparison of actual values against predicted values to visualize model performance
5.2 Analysis of the obtained values to assess prediction accuracy
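As a concrete illustration of Steps 2–4, the following minimal sketch (an assumption-laden reconstruction, not the authors' published code) uses scikit-learn and TensorFlow/Keras. The synthetic `X` and `y` are placeholders for a real tabular dataset, and the layer widths (60, 80, and 120 units, doubled by the bidirectional wrappers) follow the hyperparameter table reported below; compilation and training with the reported hyperparameters are sketched after that table.

```python
# Hedged sketch of Steps 2-4: preprocessing, 3D tensor reshaping, and a
# stacked bidirectional LSTM. Assumes scikit-learn and TensorFlow/Keras;
# X and y are synthetic placeholders for a real tabular dataset.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.layers import LSTM, Bidirectional, Dense, Dropout
from tensorflow.keras.models import Sequential

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 24))        # placeholder feature matrix
y = rng.integers(0, 2, size=1000)      # placeholder binary target

# Step 2.4: scale every feature into the [0, 1] range.
X = MinMaxScaler(feature_range=(0, 1)).fit_transform(X)

# Step 3.1: reshape the 2D table (samples, features) into a 3D tensor
# (samples, time_steps=1, features); y stays a 1D vector (samples,).
X = X.reshape((X.shape[0], 1, X.shape[1]))

# Steps 2.8 and 3.3-3.4: shuffle and split into training and test sets.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, shuffle=True)

# Step 4.1: stacked bidirectional LSTM. Intermediate recurrent layers
# return sequences so the next layer still receives a 3D input; each
# bidirectional wrapper doubles the effective number of nodes.
model = Sequential([
    Bidirectional(LSTM(60, return_sequences=True), input_shape=(1, X.shape[2])),
    Dropout(0.2),
    Bidirectional(LSTM(80, return_sequences=True)),
    Dropout(0.2),
    Bidirectional(LSTM(120)),
    Dense(1, activation="sigmoid"),    # single node for the binary label
])
```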
5. Results and Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Conflicts of Interest
| | German | Australian | Taiwan | Give Me Some Credit | HELOC | Adult Income |
|---|---|---|---|---|---|---|
| Samples | 1000 | 690 | 30,000 | 150,000 | 10,459 | 48,842 |
| Features | 20 (categorical version) / 24 (numerical version) | 14 | 24 | 11 | 23 | 14 |
| Classes | 2 | 2 | 2 | 2 | 2 | 2 |
| Positive samples | 700 | 307 | 6636 | 139,974 | 5000 | 11,687 |
| Negative samples | 300 | 383 | 23,364 | 10,026 | 5459 | 37,155 |
| Imbalance ratio | 2.33 | 1.25 | 3.52 | 13.96 | 1.09 | 3.18 |
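As a quick sanity check, the imbalance ratio in the last row is simply the majority-class count divided by the minority-class count; the snippet below (illustrative only, with counts copied from the table) reproduces all six values.

```python
# Recompute the table's imbalance ratios: majority count / minority count.
counts = {  # (positive, negative) samples per dataset, from the table
    "German": (700, 300),
    "Australian": (307, 383),
    "Taiwan": (6636, 23364),
    "Give Me Some Credit": (139974, 10026),
    "HELOC": (5000, 5459),
    "Adult Income": (11687, 37155),
}
for name, (pos, neg) in counts.items():
    print(f"{name}: {max(pos, neg) / min(pos, neg):.2f}")
# German: 2.33, Australian: 1.25, Taiwan: 3.52,
# Give Me Some Credit: 13.96, HELOC: 1.09, Adult Income: 3.18
```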
| Hyperparameter | Value | Note |
|---|---|---|
| Number of hidden layers | 3 | |
| Number of nodes in the input layer | 60 | Input layer |
| Number of nodes in the 1st hidden layer | 60 × 2 = 120 | First hidden layer |
| Number of nodes in the 2nd hidden layer | 80 × 2 = 160 | Second hidden layer |
| Number of nodes in the 3rd hidden layer | 120 × 2 = 240 | Third hidden layer |
| Number of nodes in the dense layer | 1 | Output layer; predicts the binary label (0 or 1) |
| Dropout | 0.2 | |
| Decay rate | 0.97 | Default value |
| Activation function | ReLU | |
| Learning rate | 0.01 | Default value |
| Momentum | 0.9 | |
| Number of epochs | 50 | |
| Batch size | 32 | |
| Decision threshold | 0.57 | |
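Continuing the sketch given after Algorithm 1, the block below wires these values into compilation, training, and thresholding (Steps 4.2–4.5). The SGD optimizer with an exponential learning-rate schedule is an assumption: the table lists a learning rate, momentum, and decay rate but does not name the optimizer or the decay interval.

```python
# Hedged continuation of the earlier sketch: Steps 4.2-4.5 with the values
# from the table above. SGD + ExponentialDecay is an assumption; the table
# does not name the optimizer, and decay_steps is illustrative.
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.optimizers.schedules import ExponentialDecay

lr_schedule = ExponentialDecay(
    initial_learning_rate=0.01,   # "Learning rate" row
    decay_steps=1000,             # assumed decay interval (not in the table)
    decay_rate=0.97,              # "Decay rate" row
)
model.compile(
    optimizer=SGD(learning_rate=lr_schedule, momentum=0.9),  # "Momentum" row
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
model.fit(X_tr, y_tr, epochs=50, batch_size=32,
          validation_data=(X_te, y_te))

# Step 4.5: apply the table's decision threshold of 0.57 rather than the
# default 0.5 when converting predicted probabilities to class labels.
y_prob = model.predict(X_te).ravel()
y_pred = (y_prob >= 0.57).astype(int)
```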
Measure | German (Categorical) | German (Numerical) | Australian | Taiwan | Give Me Some Credit | HELOC | Adult Income |
---|---|---|---|---|---|---|---|
AUC | 0.9232 (±0.0880) | 0.8844 (±0.0932) | 0.9519 (±0.0201) | 0.7855 (±0.0066) | 0.8316 (±0.0036) | 0.8063 (±0.0040) | 0.9120 (±0.0033) |
Accuracy | 0.8770 (±0.0676) | 0.8720 (±0.0778) | 0.8884 (±0.0337) | 0.8238 (±0.0060) | 0.9369 (±0.0008) | 0.7345 (±0.0075) | 0.8566 (±0.0058) |
RMSE | 0.2991 (±0.0846) | 0.3036 (±0.0914) | 0.2848 (±0.0457) | 0.3642 (±0.0046) | 0.2249 (±0.0008) | 0.4242 (±0.0030) | 0.3136 (±0.0040) |
Precision | 0.8642 (±0.1093) | 0.8260 (±0.1185) | 0.8758 (±0.0226) | 0.7017 (±0.0124) | 0.6272 (±0.0402) | 0.7296 (±0.0182) | 0.7446 (±0.0136) |
Recall | 0.7077 (±0.1246) | 0.7321 (±0.1561) | 0.8743 (±0.0514) | 0.3557 (±0.0318) | 0.1520 (±0.0464) | 0.7823 (±0.0210) | 0.6115 (±0.0383) |
F-Measure | 0.7769 (±0.1179) | 0.7737 (±0.1370) | 0.8745 (±0.0325) | 0.4710 (±0.0261) | 0.2395 (±0.0565) | 0.7546 (±0.0065) | 0.6705 (±0.0207) |
Kappa | 0.6938 (±0.1629) | 0.6859 (±0.1891) | 0.7723 (±0.0672) | 0.3789 (±0.0240) | 0.2196 (±0.0517) | 0.4663 (±0.0152) | 0.5802 (±0.0228) |
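For reproducibility, the seven measures above can be computed with standard scikit-learn utilities; a sketch follows. Treating RMSE as the root mean squared error between labels and predicted probabilities, and applying the 0.57 decision threshold to all threshold-dependent measures, are assumptions rather than details stated in the table.

```python
# Hedged sketch: computing the reported measures with scikit-learn.
# y_true: test-set labels; y_prob: predicted probabilities from the model.
import numpy as np
from sklearn.metrics import (accuracy_score, cohen_kappa_score, f1_score,
                             mean_squared_error, precision_score,
                             recall_score, roc_auc_score)

def evaluate(y_true, y_prob, threshold=0.57):
    y_hat = (np.asarray(y_prob) >= threshold).astype(int)
    return {
        "AUC": roc_auc_score(y_true, y_prob),   # threshold-free measure
        "Accuracy": accuracy_score(y_true, y_hat),
        "RMSE": float(np.sqrt(mean_squared_error(y_true, y_prob))),
        "Precision": precision_score(y_true, y_hat),
        "Recall": recall_score(y_true, y_hat),
        "F-Measure": f1_score(y_true, y_hat),
        "Kappa": cohen_kappa_score(y_true, y_hat),
    }
```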
| Method | Accuracy (%) | RMSE | AUC | Precision | Recall | F-Measure | Kappa |
|---|---|---|---|---|---|---|---|
| DT | 86.09 | 0.3386 | 0.867 | 0.861 | 0.861 | 0.861 | 0.7183 |
| SVM | 84.64 | 0.3919 | 0.853 | 0.857 | 0.846 | 0.847 | 0.6941 |
| MLP | 83.77 | 0.3763 | 0.888 | 0.861 | 0.843 | 0.852 | 0.6722 |
| DT-boosting | 84.35 | 0.3696 | 0.902 | 0.853 | 0.867 | 0.860 | 0.6825 |
| SVM-boosting | 82.61 | 0.3459 | 0.902 | 0.854 | 0.828 | 0.841 | 0.6493 |
| MLP-boosting | 83.77 | 0.3868 | 0.856 | 0.861 | 0.843 | 0.852 | 0.6722 |
| DT-bagging | 86.38 | 0.3215 | 0.918 | 0.881 | 0.872 | 0.877 | 0.7245 |
| SVM-bagging | 85.22 | 0.3575 | 0.893 | 0.863 | 0.852 | 0.853 | 0.7059 |
| MLP-bagging | 85.65 | 0.3363 | 0.909 | 0.876 | 0.864 | 0.870 | 0.7100 |
| LSTM | 87.22 | 0.3366 | 0.914 | 0.854 | 0.885 | 0.869 | 0.7998 |
| LSTM + GA | 89.27 | - | - | - | - | - | - |
| Method | Accuracy (%) | AUC |
|---|---|---|
| IGDFS + GBT classifier | 98.66 | - |
| Feature selection + ML classifier selection | 93.12 | - |
| NRS + ML ensemble | 86.47 | - |
| Hybrid binary particle swarm optimization and gravitational search algorithm (BPSOGSA) | 85.78 | - |
| Artificial bee colony-based SVM | 84.0 | - |
| Bolasso-based feature selection | 84.0 | 0.713 |
| Fuzzy group decision making (GDM) | 82.0 | 0.824 |
| Heterogeneous ensemble | - | 0.684 |
| Ensemble classifiers | - | 0.770 |
| Multi-stage ensemble | 79.5 | 0.831 |
| Method | Accuracy (%) | AUC |
|---|---|---|
| MCDM-based evaluation approach | - | 0.961 |
| DNN (time delay neural network) | 88.24 | - |
| GFSS | 87.6 | 0.813 |
| ELM + novel activation function | 80.57 | 0.862 |
| Hybrid approach based on a filter approach and a multiple-population GA | 78.53 | - |
| ML + expert knowledge with GA | - | 0.789 |
| Step-wise multi-grained augmented boosting DT | 77.15 | 0.792 |
| Category | Method | HELOC Acc (%) | HELOC AUC (%) | Adult Acc (%) | Adult AUC (%) |
|---|---|---|---|---|---|
| Machine Learning | Linear Model | 73.0 ± 0.0 | 80.1 ± 0.1 | 82.5 ± 0.2 | 85.4 ± 0.2 |
| | KNN | 72.2 ± 0.0 | 79.0 ± 0.1 | 83.2 ± 0.2 | 87.5 ± 0.2 |
| | Decision Trees | 80.3 ± 0.0 | 89.3 ± 0.1 | 85.3 ± 0.2 | 89.8 ± 0.1 |
| | Random Forest | 82.1 ± 0.2 | 90.0 ± 0.2 | 86.1 ± 0.2 | 91.7 ± 0.2 |
| | XGBoost | 83.5 ± 0.2 | 92.2 ± 0.0 | 87.3 ± 0.2 | 92.8 ± 0.1 |
| | LightGBM | 83.5 ± 0.1 | 92.3 ± 0.0 | 87.4 ± 0.2 | 92.9 ± 0.1 |
| | CatBoost | 83.6 ± 0.3 | 92.4 ± 0.1 | 87.2 ± 0.2 | 92.8 ± 0.1 |
| | Model Trees | 82.6 ± 0.2 | 91.5 ± 0.0 | 85.0 ± 0.2 | 90.4 ± 0.1 |
| Deep Learning | MLP | 73.2 ± 0.3 | 80.3 ± 0.1 | 84.8 ± 0.1 | 90.3 ± 0.2 |
| | VIME | 72.7 ± 0.0 | 79.2 ± 0.0 | 84.8 ± 0.2 | 90.5 ± 0.2 |
| | DeepFM | 73.6 ± 0.2 | 80.4 ± 0.1 | 86.1 ± 0.2 | 91.7 ± 0.1 |
| | DeepGBM | 78.0 ± 0.4 | 84.1 ± 0.1 | 84.6 ± 0.3 | 90.8 ± 0.1 |
| | NODE | 79.8 ± 0.2 | 87.5 ± 0.2 | 85.6 ± 0.3 | 91.1 ± 0.2 |
| | NAM | 73.3 ± 0.1 | 80.7 ± 0.3 | 83.4 ± 0.1 | 86.6 ± 0.1 |
| | Net-DNF | 82.6 ± 0.4 | 91.5 ± 0.2 | 85.7 ± 0.2 | 91.3 ± 0.1 |
| | TabNet | 81.0 ± 0.1 | 90.0 ± 0.1 | 85.4 ± 0.2 | 91.1 ± 0.1 |
| | TabTransformer | 73.3 ± 0.1 | 80.1 ± 0.2 | 85.2 ± 0.2 | 90.6 ± 0.2 |
| | SAINT | 82.1 ± 0.3 | 90.7 ± 0.2 | 86.1 ± 0.3 | 91.6 ± 0.2 |
| | RLN | 73.2 ± 0.4 | 80.1 ± 0.4 | 81.0 ± 1.6 | 75.9 ± 8.2 |
| | STG | 73.1 ± 0.1 | 80.0 ± 0.1 | 85.4 ± 0.1 | 90.9 ± 0.1 |