Improved Carpooling Experience through Improved GPS Trajectory Classification Using Machine Learning Algorithms
Abstract
1. Introduction
Related Work
2. Dataset and Methods
2.1. Selection of Input Features Vector
2.2. Proposed Methodology
2.3. Feature Ranking and Reduction Protocol
Algorithm 1: Recursive Elimination of Features
2.4. Performance Evaluation Metrics
3. Results
3.1. Model Performance Evaluation
3.2. Preventive Analytics through Feature Ranking
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Pandey, M.K.; Subbiah, K. Performance analysis of time series forecasting of ebola casualties using machine learning algorithm. Proc. ITISE 2017, 2, 885–898.
- Pandey, M.K. Novel Application Oriented Problem Solving Approaches in SMAC. Banaras Hindu University. 2017. Available online: http://hdl.handle.net/10603/268444 (accessed on 18 May 2022).
- Pandey, M.K.; Srivastava, P.K. A Probe into Performance Analysis of Real-Time Forecasting of Endemic Infectious Diseases Using Machine Learning and Deep Learning Algorithms. In Advanced Prognostic Predictive Modelling in Healthcare Data Analytics; Springer: Berlin/Heidelberg, Germany, 2021; Volume 64, pp. 241–265.
- Chan, N.D.; Shaheen, S.A. Ridesharing in North America: Past, Present, and Future. Transp. Rev. 2012, 32, 93–112.
- Cruz, M.; Macedo, H.; Mendonça, E.; Guimarães, A. GO!Caronas: Fostering Ridesharing with Online Social Network, Candidates Clustering and Ride Matching. In Proceedings of the 2016 8th Euro American Conference on Telematics and Information Systems (EATIS), Cartagena, Colombia, 28–29 April 2016.
- Kalanick, T.; Camp, G. Uber. 2015. Available online: https://www.uber.com/ (accessed on 18 May 2022).
- Mazzella, F. Blablacar. Available online: http://www.blablacar.com (accessed on 26 May 2022).
- Ferrero, F.; Perboli, G.; Rosano, M.; Vesco, A. Car-sharing services: An annotated review. Sustain. Cities Soc. 2018, 37, 501–518.
- He, W.; Li, D.; Zhang, T.; An, L.; Guo, M.; Chen, G. Mining regular routes from GPS data for ridesharing recommendations. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Beijing, China, 12 August 2012; p. 79.
- Currie, J.; Walker, R. Traffic Congestion and Infant Health: Evidence from E-ZPass. Am. Econ. J. Appl. Econ. 2014, 3, 65–90.
- Levy, J.I.; Buonocore, J.J.; von Stackelberg, K. The Public Health Costs of Traffic Congestion: A Health Risk Assessment. Environ. Health 2010, 9, 65.
- Hart, J.E.; Laden, F.; Puett, R.C.; Costenbader, K.H.; Karlson, E.W. Exposure to Traffic Pollution and Increased Risk of Rheumatoid Arthritis. Environ. Health Perspect. 2009, 117, 1065–1069.
- Eriksson, H.-E.; Penker, M. Business Modeling With UML: Business Patterns at Work; Wiley: Hoboken, NJ, USA, 2000; p. 12. ISBN 978-0471295518.
- He, W.; Hwang, K.; Li, D. Intelligent Carpool Routing for Urban Ridesharing by Mining GPS Trajectories. IEEE Trans. Intell. Transp. Syst. 2014, 15, 2286–2296.
- Cruz, M.O.; Macedo, H.; Guimaraes, A. Grouping Similar Trajectories for Carpooling Purposes. In Proceedings of the 2015 Brazilian Conference on Intelligent Systems (BRACIS), Natal, Brazil, 4–7 November 2015; pp. 234–239.
- Carma, S.O. 2015. Dynamic Road Pricing. Available online: https://carmacarpool.com (accessed on 21 May 2022).
- Yan, S.; Chen, C.Y.; Chang, S.C. A Car Pooling Model and Solution Method with Stochastic Vehicle Travel Times. IEEE Trans. Intell. Transp. Syst. 2014, 15, 47–61.
- Matos, M.L.; Cruz, M.; Guimaraes, A.; Macedo, H. A social network for carpooling. In Proceedings of the 7th Euro American Conference on Telematics and Information Systems, Valparaiso, Chile, 2–4 April 2014; pp. 1–6.
- Ghoseiri, K.; Haghani, A.; Hamedi, M. Real-Time Rideshare Matching Problem. Ph.D. Thesis, University of Maryland, College Park, MD, USA, 2011.
- Arias-Molinares, D.; García-Palomares, J.C. The Ws of MaaS: Understanding mobility as a service from a literature review. IATSS Res. 2020, 44, 253–263.
- Dingil, A.E.; Rupi, F.; Esztergár-Kiss, D. An Integrative Review of Socio-Technical Factors Influencing Travel Decision-Making and Urban Transport Performance. Sustainability 2021, 13, 10158.
- Cruz, M.O.; Macedo, H.T.; Barreto, R.; Guimarães, A.P. GPS + Trajectories. Available online: https://archive.ics.uci.edu/ml/datasets/ (accessed on 22 May 2022).
- Wang, L.P. Support Vector Machines: Theory and Application; Wang, L.P., Ed.; Springer: Berlin, Germany, 2005.
- Platt, J. Fast training Support Vector Machines using parallel Sequential Minimal Optimization. In Advances in Kernel Methods: Support Vector Learning; MIT Press: Cambridge, MA, USA, 1998; pp. 41–65.
- Aha, D.W.; Kibler, D.; Albert, M.K. Instance-Based Learning Algorithms. Mach. Learn. 1991, 6, 37–66.
- Witten, I.H.; Frank, E.; Hall, M.A. Data Mining, 4th ed.; Elsevier: Amsterdam, The Netherlands, 2017.
- Rodriguez, J.; Kuncheva, L.; Alonso, C. Rotation Forest: A New Classifier Ensemble Method. IEEE Trans. Pattern Anal. Mach. Intell. 2006, 28, 1619–1630.
- Breiman, L. Bagging predictors. Mach. Learn. 1996, 24, 123–140.
- Schumacher, R.S.; Hill, A.J.; Klein, M.; Nelson, J.A.; Erickson, M.J.; Trojniak, S.M.; Herman, G.R. From Random Forests to Flood Forecasts: A Research to Operations Success Story. Bull. Am. Meteorol. Soc. 2021, 102, E1742–E1755.
- Friedman, J.; Hastie, T.; Tibshirani, R. Additive logistic regression: A statistical view of boosting (with discussion and a rejoinder by the authors). Ann. Stat. 2000, 28, 337–407.
- Kira, K.; Rendell, L.A. A Practical Approach to Feature Selection. In Machine Learning Proceedings 1992; Elsevier: Amsterdam, The Netherlands, 1992; pp. 249–256.
- Kononenko, I. Estimating Attributes: Analysis and Extensions of RELIEF. In European Conference on Machine Learning; Springer: Berlin/Heidelberg, Germany, 1994; Volume 784, pp. 171–182.
- Breiman, L. Technical note: Some properties of splitting criteria. Mach. Learn. 1996, 24, 41–47.
- Pandey, M.K.; Mittal, M.; Subbiah, K. Optimal balancing & efficient feature ranking approach to minimize credit risk. Int. J. Inf. Manag. Data Insights 2021, 1, 100037.
Characteristic | Value |
---|---|
Attribute type | Numerical |
Number of attributes | 15 |
Number of instances | 163 |
Number of classes | 2 |
Feature | Description |
---|---|
d_android | Device used to capture the instance |
speed | Speed, in km/h |
distance | Total distance, in km |
rating | User's rating of the experience: good (2), normal (1), or bad (3). |
rating_bus | Rating of bus crowding: crowded (1), a little crowded (2), or not crowded at all (3). |
rating_weather | Rating of the weather: sunny (1) or rainy (2). |
car_or_bus | Overall experience of choosing a car (2) or a bus (1). |
No. | Algorithm | Definition |
---|---|---|
1. | Random Tree | Builds an unpruned tree that considers K randomly chosen attributes at each node. Class probabilities can be estimated from the training and testing sets. |
2. | Multi-Layer Perceptron (MLP) | A neural network classifier that classifies instances using backpropagation. |
3. | Polykernel sequential minimal optimization (SMO) | Trains support vector classifiers [23,24]. |
4. | Instance-based learning with k-parameter (IBK) | A K-nearest neighbors classifier that picks the most suitable K by cross-validation and can weight neighbors by distance [25,26]. |
5. | Rotation Forest (RF) | Classifies instances using a base learner [27]. |
6. | Bagging | Aggregates base learners, mainly to reduce the variance of their classification [28]. |
7. | Random Forest | Constructs a forest of random trees [29]. |
8. | RealADABoost | Improves performance through ensemble learning [30]. |
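The IBK row describes a K-nearest neighbors classifier that chooses K by cross-validation and can weight neighbors by distance. The following is a minimal sketch of that idea in plain Python, not the implementation used in the study: the function names `knn_predict` and `choose_k`, the inverse-distance weighting, the leave-one-out selection loop, and the toy data are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def knn_predict(train, query, k):
    """Distance-weighted k-NN vote; train is a list of (features, label)."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = defaultdict(float)
    for features, label in nearest:
        # Inverse-distance weight (an assumed scheme); epsilon avoids division by zero.
        votes[label] += 1.0 / (math.dist(features, query) + 1e-9)
    return max(votes, key=votes.get)


def choose_k(train, candidates):
    """Return (K, accuracy) for the K with the best leave-one-out accuracy."""
    best_k, best_acc = None, -1.0
    for k in candidates:
        hits = 0
        for i, (features, label) in enumerate(train):
            rest = train[:i] + train[i + 1:]  # hold out instance i
            hits += knn_predict(rest, features, k) == label
        acc = hits / len(train)
        if acc > best_acc:  # on ties, the earlier candidate is kept
            best_k, best_acc = k, acc
    return best_k, best_acc


# Hypothetical toy data: two well-separated clusters labelled 1 and 2.
toy = [
    ((0.0, 0.0), 1), ((0.1, 0.2), 1), ((0.2, 0.1), 1),
    ((1.0, 1.0), 2), ((0.9, 1.1), 2), ((1.1, 0.9), 2),
]
best_k, best_acc = choose_k(toy, [1, 3, 5])
```

On real trajectory features the candidates for K and the distance weighting would need tuning; the sketch only shows how the selection loop fits around the classifier.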
Machine Learning Algorithm | 10-Fold: Sensitivity (%) | 10-Fold: Specificity (%) | 10-Fold: Accuracy (%) | 10-Fold: AUC | 10-Fold: MCC | LOOCV: Sensitivity (%) | LOOCV: Specificity (%) | LOOCV: Accuracy (%) | LOOCV: AUC | LOOCV: MCC |
---|---|---|---|---|---|---|---|---|---|---|
MLP | 89.7 | 80.3 | 85.3 | 0.873 | 0.705 | 89.7 | 78.9 | 84.7 | 0.874 | 0.693 |
SMO-PUK | 94.3 | 69.7 | 82.8 | 0.820 | 0.667 | 94.3 | 68.4 | 82.2 | 0.813 | 0.656 |
IBK | 82.8 | 81.6 | 82.2 | 0.815 | 0.643 | 82.8 | 82.9 | 82.8 | 0.834 | 0.656 |
Rotation Forest | 92.0 | 73.7 | 83.4 | 0.876 | 0.672 | 92.0 | 72.4 | 82.8 | 0.901 | 0.661 |
Bagging | 92.0 | 71.1 | 82.2 | 0.876 | 0.650 | 94.3 | 72.4 | 84.0 | 0.898 | 0.689 |
Random Forest | 87.4 | 73.7 | 81.0 | 0.893 | 0.619 | 90.8 | 77.6 | 84.7 | 0.924 | 0.694 |
Random Tree | 79.3 | 75.0 | 77.3 | 0.772 | 0.544 | 81.6 | 81.6 | 81.6 | 0.816 | 0.631 |
RealADABoost-Decision Stump | 83.9 | 78.9 | 81.6 | 0.881 | 0.630 | 83.9 | 80.3 | 82.2 | 0.869 | 0.642 |
RealADABoost-Random Tree | 85.1 | 73.7 | 79.8 | 0.850 | 0.593 | 86.2 | 80.3 | 83.4 | 0.882 | 0.667 |
RealADABoost-RepTree | 85.1 | 75.0 | 80.4 | 0.887 | 0.605 | 86.2 | 76.3 | 81.6 | 0.919 | 0.630 |
RealADABoost-Random Forest | 86.2 | 75.0 | 81.0 | 0.894 | 0.618 | 86.2 | 82.9 | 84.7 | 0.904 | 0.692 |
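Sensitivity, specificity, accuracy, and MCC in the table above can all be derived from a binary confusion matrix. The sketch below uses the standard definitions; the function name `binary_metrics` is hypothetical, and the confusion counts in the usage example are not taken from the paper. They are illustrative values chosen because they sum to 163 instances and round to the MLP 10-fold row.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal is zero; 0.0 is used by convention here.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Assumed counts (not reported in the source): 87 positives, 76 negatives.
m = binary_metrics(tp=78, fn=9, tn=61, fp=15)
summary = {
    "sensitivity_pct": round(100 * m["sensitivity"], 1),
    "specificity_pct": round(100 * m["specificity"], 1),
    "accuracy_pct": round(100 * m["accuracy"], 1),
    "mcc": round(m["mcc"], 3),
}
```

AUC is not included because it depends on the ranking of predicted scores rather than on a single confusion matrix.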
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Pandey, M.K.; Saini, A.; Subbiah, K.; Chintalapudi, N.; Battineni, G. Improved Carpooling Experience through Improved GPS Trajectory Classification Using Machine Learning Algorithms. Information 2022, 13, 369. https://doi.org/10.3390/info13080369