Mixed Learning- and Model-Based Mass Estimation of Heavy Vehicles
Abstract
1. Introduction
2. Mixed-Model- and Learning-Based Mass Estimator
2.1. Recursive Least Squares Mass Estimator
2.2. Network Architecture and Training Procedure
- NN type—The architecture is determined by first choosing an NN type (feedforward, recurrent, convolutional, etc.) suited to the problem. A convolutional NN (CNN) is typically preferred for spatial data, an RNN for sequential data, and a shallow NN or DNN for tabular data, depending on the complexity of the problem. In the context of time series modeling, DNNs can also be employed for dynamical systems, but a key distinction arises in how they handle sequential data compared with RNNs. Because a DNN has no memory, past values of the same feature must be fed in as separate inputs, each with its own set of weights, so the overall number of parameters grows with the length of the input window and the model complexity increases accordingly. An RNN, in contrast, handles temporal dependencies through cyclic connections and reuses the same weights across the time steps of a feature; it can therefore capture temporal dependencies without enormously escalating the number of parameters. Nevertheless, since a vanilla RNN is vulnerable to the vanishing-gradient problem, a special type of RNN named LSTM was preferred in this study.
- Number of hidden layers—After the NN type is settled on, the number of hidden layers to be used needs to be determined. Although deeper architectures are better at handling complex dependencies, they are computationally more expensive, and they may result in overfitting. Deep networks may need more data to generalize better. On the other hand, shallow networks may lead to underfitting, where the model is unable to capture the dynamics.
- Number of units—After deciding how many hidden layers to use, the quantity of units in the hidden layers is determined. As the number of neurons or cells increases, the computational complexity increases, and the network is prone to overfitting, whereas fewer neurons may cause the network to be unable to capture the complexity.
- Activation function—Activation functions are determined based on the use case (a regression or classification problem) and on the NN type. Sigmoid is preferred for binary classification problems, whereas softmax is used for multi-class classification. ReLU is the most commonly used activation function for regression problems with a DNN, and the hyperbolic tangent is used for LSTMs.
- Training-validation-testing ratio—The dataset is separated into three parts: training data, validation data, and test data.
- Loss function—The loss function is chosen based on the problem. Loss functions, such as the mean square error (MSE) or mean absolute error (MAE), are used for regression problems, whereas cross-entropy can be used for classification problems.
- Batch size—If processing the whole training dataset at once is not possible due to hardware constraints, the dataset size, or the complexity of the NN, the dataset can be divided into smaller subsets named mini-batches. The batch size is thus the number of training instances processed in one iteration, and it should ideally divide the number of training samples evenly. The network parameters are updated once per mini-batch, i.e., once per iteration.
- Epoch number—The epoch number represents how many times the learning algorithm is applied to the entire training dataset. It is crucial to choose a sufficiently large maximum epoch number to ensure thorough training of the network. The number of iterations in one epoch is obtained by dividing the number of training samples by the batch size.
- Early stopping criteria—If the cost function does not decrease for a predefined number of consecutive evaluations, training is terminated before the maximum number of epochs is reached.
- Optimizer—Stochastic gradient descent (SGD), root-mean-square propagation (RMSprop), and adaptive moment estimation (ADAM) are commonly used optimizers to minimize the loss function during the training.
- Learning rate—If the learning rate is chosen too high, the loss function may diverge; if it is high but not enough to diverge, the loss function may still fail to converge to the optimal solution. A smaller learning rate leads to a longer training time, requiring more epochs. The choice of learning rate should depend on the optimization technique employed for training the NN, and the learning rate can either remain constant or decrease after a specified number of iterations.
- Dropout—Dropout is a regularization method used to mitigate overfitting by randomly eliminating selected units throughout each iteration of training [38].
- Sequence length—While longer sequences can capture dynamics better, they make the training process more challenging. The sequence length should be chosen based on the specific dataset. Although the sequence length may vary from simulation to simulation, a constant simulation time may be preferred, when possible, to avoid sequence padding or truncation.
- Design—The network configuration is established by selecting the type of NN, specifying the number of layers, choosing the number of units for each layer, and determining the activation functions. Units can be neurons, kernels, or cells, depending on the NN type. The activation functions are chosen based on the NN type and on the use case, such as classification or regression. Different NN types can also be used in combination, depending on the problem. The parameters, i.e., the weights and biases, are defined by the choice of network architecture. If the architecture defines too few weights and biases, or if the amount of training data is insufficient, the training accuracy will be low and underfitting occurs because the network cannot fully learn; in this case, the amount of high-quality data or the number of layers or units can be increased. On the other hand, if more than the required parameters are defined in a network, the network also unintentionally learns the noise during training. This phenomenon is called overfitting.
- Learning—Parameters are obtained by minimizing the difference between the predicted output and the actual measurements according to a cost function. Hyperparameters, such as the training-validation-test ratio, dropout, epoch number, weight initialization, batch size, and early stopping criteria, are crucial for ensuring a swift and accurate learning process. The validation dataset is used to detect and prevent overfitting: the learned model is also assessed on the validation data during training, and overfitting is revealed when the error measured on the validation dataset is significantly greater than the error observed on the training dataset. To address overfitting, a network architecture with fewer parameters can be considered during the initial design phase, or the dropout technique can be employed.
- Testing—The loss function is mainly used to determine the weights and biases for training data, whereas performance metrics are used to evaluate the performance after the model is trained. MAE, MSE, root MSE (RMSE), or R2 can be used as regression metrics, whereas accuracy, precision, recall, or F1 score can be used as classification metrics. Consequently, the acquired model is evaluated using novel data that are distinct from the training or validation datasets. The predicted output from the trained model is compared with the true system output to evaluate the model.
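The parameter-count argument above (a DNN assigns separate weights to each lagged input, while an LSTM reuses the same weights across time steps) can be made concrete with a quick calculation. The layer sizes below are arbitrary illustrative choices, not the architecture used in this study:

```python
def dense_params(n_inputs, n_units):
    # Fully connected layer: one weight per input-unit pair, plus one bias per unit.
    return n_inputs * n_units + n_units

def lstm_params(n_features, n_units):
    # An LSTM cell has 4 gates; each gate sees the current input and the
    # previous hidden state: 4 * (n_units * (n_features + n_units) + n_units).
    return 4 * (n_units * (n_features + n_units) + n_units)

n_features, n_units = 3, 32

# Feeding a window of T past samples to a dense layer multiplies its input
# dimension, and hence its weight count, by T.
for T in (10, 50, 100):
    print(f"dense, window {T}: {dense_params(T * n_features, n_units)} params")

# The LSTM's parameter count is independent of the sequence length.
print(f"LSTM, any length: {lstm_params(n_features, n_units)} params")
```

With these sizes, the dense layer grows from 992 to 9632 parameters as the window lengthens from 10 to 100 steps, while the LSTM stays at 4608 regardless of sequence length.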
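The regression metrics listed above follow directly from their definitions; a minimal sketch, with made-up predictions (the mass values are illustrative and not taken from the paper's experiments):

```python
import math

def regression_metrics(y_true, y_pred):
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n            # mean absolute error
    mse = sum(e * e for e in errors) / n             # mean square error
    rmse = math.sqrt(mse)                            # root mean square error
    mean_t = sum(y_true) / n
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - sum(e * e for e in errors) / ss_tot   # coefficient of determination
    return mae, mse, rmse, r2

# Illustrative vehicle masses in tonnes.
y_true = [20.0, 25.0, 30.0, 35.0]
y_pred = [21.0, 24.5, 30.5, 34.0]
mae, mse, rmse, r2 = regression_metrics(y_true, y_pred)
```

For these values, MAE = 0.75, MSE = 0.625, and R2 = 0.98; unlike the loss function, which drives training, these metrics only grade the trained model.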
2.3. LSTM Supervisor
3. Results
3.1. Data Generation
- Without braking—The RLS mass estimator was not used during braking because braking dynamics are much more complex and less accurately modeled than traction dynamics.
- Engaged clutch—Data were generated only while the clutch was fully engaged, so that the engine was always connected to the transmission. This criterion was used to minimize the effect of different drivers and of transients; without it, the RLS mass estimator would be available over larger intervals, but its estimation accuracy would diminish.
- Fixed gear—Data samples were collected only for the highest gear in order to reduce the amount of data and the number of necessary parameters; collecting data for all gears would not increase the mass estimation accuracy, although it would make RLS available for longer periods.
- Limited speed—Data samples were generated within a specified vehicle speed interval. RLS was assumed to be used on highways; therefore, the speed interval was chosen based on the highest gear, with a minimum speed of 60 km/h and a maximum speed of 95 km/h.
- Without steering—Data were generated while driving on a straight road because only longitudinal vehicle dynamics were considered.
- Flat roads—Downhill or uphill scenarios were not included in order to decrease the complexity.
- Without wind—Wind effects were not included: although they could be added for simulation purposes, in reality it is not known how to obtain wind information at the road level.
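The selection criteria above amount to a predicate applied to each logged sample. A hypothetical sketch follows: the field names, the number of gears, and the tolerance thresholds are assumptions made for illustration, while the 60–95 km/h window and the highest-gear condition come from the text:

```python
HIGHEST_GEAR = 12                    # assumed gear count; the text only says "highest gear"
V_MIN_KMH, V_MAX_KMH = 60.0, 95.0    # speed window stated in the text

def rls_sample_valid(sample):
    """Return True if a logged sample satisfies all data-generation criteria."""
    return (
        not sample["braking"]                               # without braking
        and sample["clutch_engaged"]                        # fully engaged clutch
        and sample["gear"] == HIGHEST_GEAR                  # fixed (highest) gear
        and V_MIN_KMH <= sample["speed_kmh"] <= V_MAX_KMH   # limited speed
        and abs(sample["steering_angle_deg"]) < 1e-3        # straight road
        and abs(sample["road_grade_pct"]) < 1e-3            # flat road
    )

highway_cruise = {
    "braking": False, "clutch_engaged": True, "gear": 12,
    "speed_kmh": 80.0, "steering_angle_deg": 0.0, "road_grade_pct": 0.0,
}
print(rls_sample_valid(highway_cruise))
```

A sample violating any single criterion, e.g., the same record at 50 km/h, would be rejected, which is how the estimator's availability gets restricted to highway cruising.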
3.2. Reliability Analysis
4. Conclusions and Future Study
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Xu, C.; Geyer, S.; Fathy, H.K. Formulation and comparison of two real-time predictive gear shift algorithms for connected/automated heavy-duty vehicles. IEEE Trans. Veh. Technol. 2019, 68, 7498–7510.
- Chen, Y.L.; Shen, K.Y.; Wang, S.C. Forward collision warning system considering both time-to-collision and safety braking distance. In Proceedings of the 2013 IEEE 8th Conference on Industrial Electronics and Applications (ICIEA), Melbourne, Australia, 19–21 June 2013; IEEE: Piscataway, NJ, USA, 2013; pp. 972–977.
- Kober, W.; Hirschberg, W. On-board payload identification for commercial vehicles. In Proceedings of the 2006 IEEE International Conference on Mechatronics, Budapest, Hungary, 3–5 July 2006; IEEE: Piscataway, NJ, USA, 2006; pp. 144–149.
- Switkes, J.P.; Erlien, S.M.; Schuh, A.B. Applications for Using Mass Estimations for Vehicles. U.S. Patent 20220229446-A1, 21 July 2022.
- Ritter, A. Optimal Control of Battery-Assisted Trolley Buses. Ph.D. Thesis, ETH Zurich, Zurich, Switzerland, 2021.
- Torabi, S.; Wahde, M.; Hartono, P. Road grade and vehicle mass estimation for heavy-duty vehicles using feedforward neural networks. In Proceedings of the 2019 4th International Conference on Intelligent Transportation Engineering (ICITE), Singapore, 5–7 September 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 316–321.
- Korayem, A.H.; Khajepour, A.; Fidan, B. Trailer mass estimation using system model-based and machine learning approaches. IEEE Trans. Veh. Technol. 2020, 69, 12536–12546.
- Leoni, J.; Strada, S.; Tanelli, M.; Savaresi, S.M. Real Time Passenger Mass Estimation for e-scooters. In Proceedings of the 2023 American Control Conference (ACC), San Diego, CA, USA, 31 May–2 June 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 1741–1746.
- İşbitirici, A.; Giarré, L.; Xu, W.; Falcone, P. LSTM-Based Virtual Load Sensor for Heavy-Duty Vehicles. Sensors 2024, 24, 226.
- Zhang, H.; Yang, Z.; Shen, J.; Long, Z.; Xiong, H. Dynamic mass estimation framework for autonomous vehicle system via bidirectional gated recurrent unit. IET Control Theory Appl. 2023; Early View.
- Mittal, A.; Fairgrieve, A. Vehicle Mass Estimation. U.S. Patent 20180245966-A1, 30 August 2018.
- Rezaeian, A.; Li, D. Vehicle Center of Gravity Height Detection and Vehicle Mass Detection Using Light Detection and Ranging Point Cloud Data. U.S. Patent 20220144289-A1, 12 May 2022.
- Huang, X. Method for Real-Time Mass Estimation of a Vehicle System. U.S. Patent 20190186985-A1, 20 June 2019.
- Jundt, O.; Juhasz, G.; Weis, R.; Skrabak, A. System and Method for Identifying a Change in Load of a Commercial Vehicle. U.S. Patent 20220041172-A1, 10 February 2022.
- Bae, H.S.; Ryu, J.; Gerdes, J.C. Road grade and vehicle parameter estimation for longitudinal control using GPS. In Proceedings of the IEEE Conference on Intelligent Transportation Systems, Oakland, CA, USA, 25–29 August 2001; pp. 25–29.
- Fathy, H.K.; Kang, D.; Stein, J.L. Online vehicle mass estimation using recursive least squares and supervisory data extraction. In Proceedings of the 2008 American Control Conference, Seattle, WA, USA, 11–13 June 2008; IEEE: Piscataway, NJ, USA, 2008; pp. 1842–1848.
- Vahidi, A.; Stefanopoulou, A.; Peng, H. Recursive least squares with forgetting for online estimation of vehicle mass and road grade: Theory and experiments. Veh. Syst. Dyn. 2005, 43, 31–55.
- McIntyre, M.L.; Ghotikar, T.J.; Vahidi, A.; Song, X.; Dawson, D.M. A two-stage Lyapunov-based estimator for estimation of vehicle mass and road grade. IEEE Trans. Veh. Technol. 2009, 58, 3177–3185.
- Kim, D.; Choi, S.B.; Oh, J. Integrated vehicle mass estimation using longitudinal and roll dynamics. In Proceedings of the 2012 12th International Conference on Control, Automation and Systems, Jeju Island, Republic of Korea, 17–21 October 2012; IEEE: Piscataway, NJ, USA, 2012; pp. 862–867.
- Paulsson, E.; Åsman, L. Vehicle Mass and Road Grade Estimation using Recursive Least Squares. Master’s Thesis, Lund University, Lund, Sweden, 2016.
- Islam, S.A.U.; Bernstein, D.S. Recursive least squares for real-time implementation [lecture notes]. IEEE Control Syst. Mag. 2019, 39, 82–85.
- Hoagg, J.B.; Ali, A.A.; Mossberg, M.; Bernstein, D.S. Sliding window recursive quadratic optimization with variable regularization. In Proceedings of the 2011 American Control Conference, San Francisco, CA, USA, 29 June–1 July 2011; IEEE: Piscataway, NJ, USA, 2011; pp. 3275–3280.
- Lai, B.; Islam, S.A.U.; Bernstein, D.S. Regularization-induced bias and consistency in recursive least squares. In Proceedings of the 2021 American Control Conference (ACC), 26–28 May 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 3987–3992.
- Bruce, A.L.; Goel, A.; Bernstein, D.S. Convergence and consistency of recursive least squares with variable-rate forgetting. Automatica 2020, 119, 109052.
- Goel, A.; Bruce, A.L.; Bernstein, D.S. Recursive least squares with variable-direction forgetting: Compensating for the loss of persistency [lecture notes]. IEEE Control Syst. Mag. 2020, 40, 80–102.
- Bruce, A.L.; Goel, A.; Bernstein, D.S. Necessary and sufficient regressor conditions for the global asymptotic stability of recursive least squares. Syst. Control Lett. 2021, 157, 105005.
- Lai, B.; Bernstein, D.S. Generalized Forgetting Recursive Least Squares: Stability and Robustness Guarantees. arXiv 2023, arXiv:2308.04259.
- Lai, B.; Bernstein, D.S. Exponential Resetting and Cyclic Resetting Recursive Least Squares. IEEE Control Syst. Lett. 2022, 7, 985–990.
- Yu, Z.; Hou, X.; Leng, B.; Huang, Y. Mass estimation method for intelligent vehicles based on fusion of machine learning and vehicle dynamic model. Auton. Intell. Syst. 2022, 2, 4.
- Wang, Y. A new concept using LSTM Neural Networks for dynamic system identification. In Proceedings of the 2017 American Control Conference (ACC), Seattle, WA, USA, 24–26 May 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 5324–5329.
- Xing, Y.; Lv, C. Dynamic state estimation for the advanced brake system of electric vehicles by using deep recurrent neural networks. IEEE Trans. Ind. Electron. 2019, 67, 9536–9547.
- Lindemann, B.; Maschler, B.; Sahlab, N.; Weyrich, M. A survey on anomaly detection for technical systems using LSTM networks. Comput. Ind. 2021, 131, 103498.
- Torabi, S. Fuel-Efficient Driving Strategies. Ph.D. Thesis, Chalmers University of Technology, Gothenburg, Sweden, 2020.
- Ljung, L. System Identification: Theory for the User; Prentice Hall: Upper Saddle River, NJ, USA, 1999.
- Söderström, T.; Stoica, P. System Identification; Prentice Hall International: Upper Saddle River, NJ, USA, 2001.
- Åström, K.J.; Wittenmark, B. Adaptive Control; Addison-Wesley: Boston, MA, USA, 1995.
- Chowdhury, K. 10 Hyperparameters to Keep an Eye on for Your LSTM Model—and Other Tips; Geek Culture: Singapore, 2023.
- Srivastava, N.; Hinton, G.; Krizhevsky, A.; Sutskever, I.; Salakhutdinov, R. Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 2014, 15, 1929–1958.
- Glorot, X.; Bengio, Y. Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, Sardinia, Italy, 13–15 May 2010; pp. 249–256.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1026–1034.
- Fawcett, T. An introduction to ROC analysis. Pattern Recognit. Lett. 2006, 27, 861–874.
- Scikit-learn. Metrics and Scoring: Quantifying the Quality of Predictions. 2024. Available online: https://scikit-learn.org/stable/modules/model_evaluation.html (accessed on 19 January 2024).
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
- Gers, F.A.; Schmidhuber, J.; Cummins, F. Learning to forget: Continual prediction with LSTM. Neural Comput. 2000, 12, 2451–2471.
- Ng, A. Sequence Models Complete Course. 2021. Available online: https://www.youtube.com/watch?v=S7oA5C43Rbc (accessed on 29 April 2024).
- Ba, J.L.; Kiros, J.R.; Hinton, G.E. Layer normalization. arXiv 2016, arXiv:1607.06450.
| | Predicted Positive | Predicted Negative |
|---|---|---|
| Actual Positive | True positive | False negative |
| Actual Negative | False positive | True negative |
| | Predicted Positive | Predicted Negative |
|---|---|---|
| Actual Positive | 129,057 | 10,509 |
| Actual Negative | 11,833 | 136,601 |
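From the counts in the confusion matrix above, the classification metrics listed in Section 2.2 (accuracy, precision, recall, F1 score) follow directly:

```python
# Counts taken from the confusion matrix above.
tp, fn = 129_057, 10_509   # actual positive row
fp, tn = 11_833, 136_601   # actual negative row

accuracy = (tp + tn) / (tp + tn + fp + fn)   # fraction of correct decisions
precision = tp / (tp + fp)                   # correct among predicted positives
recall = tp / (tp + fn)                      # correct among actual positives
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

These counts give an accuracy of roughly 0.92, with precision and recall in the same range, indicating that the two error types (false positives and false negatives) are fairly balanced.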
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
İşbitirici, A.; Giarré, L.; Falcone, P. Mixed Learning- and Model-Based Mass Estimation of Heavy Vehicles. Vehicles 2024, 6, 765-780. https://doi.org/10.3390/vehicles6020036