Real-Time Tracking Data and Machine Learning Approaches for Mapping Pedestrian Walking Behavior: A Case Study at the University of Moratuwa
Abstract
1. Introduction
2. Materials and Methods
2.1. Case Study
2.2. Experimental Design and Workflow
2.3. Data Analysis
Framework of This Study
- Data collection using the sensor logger app;
- Data preparation—Before preprocessing begins, each CSV file is classified by time of travel and travel direction, which adds context to the dataset;
- Data preprocessing—The initial phase of the framework involves manual data preprocessing;
- Data preparation for K-means clustering—The dataset was cleaned and normalized to detect clusters of pedestrian walking behavior. We used a bespoke algorithm to preprocess the dataset;
- Mapping the results—The work centers on cluster analysis, using unsupervised machine learning methods to reveal notable trends and identify homogeneous profiles among pedestrians;
- Data validation—The mapping outputs obtained with K-means clustering are validated against expert subjective assessment.
1. Data collection using sensor logger app
2. Data Preparation
3. Data preprocessing
4. Data preparation for K-means clustering
- Use of numerical input variables only—K-means measures the similarity between data points with distance-based metrics, so only numerical variables can be evaluated. The analysis used geographic longitude and latitude coordinates to identify patterns of walking behavior. In addition, undefined (NaN) values were removed from the raw dataset (CSV output), because clustering results may be distorted if they are included;
- Outlier removal in data classification—The main variable used to cluster the data for studying walking behavior is walking speed, which was collected through a mobile application. The speed categories were taken from the existing literature, which divides walking speeds into the following four categories: “Slow”, “Normal”, “Fast”, and “Very Fast” [23]. Because outliers can distort a K-means clustering analysis, their potential impact was evaluated during this preparatory phase. The IQR-based outlier removal method was applied to each speed category to remove data points that were outside the permitted range. Table 2 shows the average accuracy values in each cluster after removing the outliers and categorizing the data into clusters.
- Data Normalization—Data normalization was performed using the min–max scaler method [24] in Python with the scikit-learn package. The min–max scaler is given as follows: $X_{norm} = (X - X_{min}) / (X_{max} - X_{min})$, where $X_{norm}$ represents the normalized value, $X$ represents the initial value within a particular range, $X_{min}$ represents the minimum value of the attribute within that range, and $X_{max}$ represents the maximum attribute value within that range. This phase ensured that the results were not affected by differences in scale and that all variables had an equal impact on model fitting.
- The optimal number of clusters—This study used the K-means method to identify distinct patterns in the data based on geographical coordinates (latitude and longitude) and walking speeds. K-means depends on a random initialization of the cluster centroids and requires the number of clusters to be chosen in advance. Silhouette analysis (SA) was used to address this issue and determine the ideal number of clusters for each speed category [25]. Introduced by Rousseeuw in 1987, SA calculates the silhouette score, a statistic that ranges from −1 to 1. For each data point, the score compares how close the point is to the other points in its own cluster with how close it is to the points in the nearest neighboring cluster: values near 1 indicate compact, well-separated clusters, values near 0 indicate overlapping clusters, and negative values suggest points assigned to the wrong cluster.
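The preparation steps listed under step 4 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' bespoke algorithm: the sample rows, the helper names (`drop_nan`, `iqr_filter`, `min_max`, `kmeans`, `silhouette`, `best_k`), the 1.5 × IQR fence, and the deterministic farthest-point initialization (used here in place of random initialization so that the result is reproducible) are all hypothetical choices. The source reports using scikit-learn for normalization; that package also provides `MinMaxScaler`, `KMeans`, and `silhouette_score` for the same operations.

```python
import math
from statistics import quantiles


def drop_nan(rows):
    """Keep only rows in which every value is a finite number."""
    return [r for r in rows if all(math.isfinite(v) for v in r)]


def iqr_filter(rows, col, k=1.5):
    """Drop rows whose value in column `col` lies outside the IQR fences."""
    q1, _, q3 = quantiles([r[col] for r in rows], n=4)
    lo, hi = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [r for r in rows if lo <= r[col] <= hi]


def min_max(rows):
    """Rescale each column to [0, 1] via (x - min) / (max - min)."""
    cols = list(zip(*rows))
    mins = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]  # guard constant columns
    return [tuple((v - m) / s for v, m, s in zip(r, mins, spans)) for r in rows]


def kmeans(points, k, iters=100):
    """Lloyd's algorithm with farthest-point initialization (an assumption)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    labels = None
    for _ in range(iters):
        new = [min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points]
        if new == labels:  # assignments stopped changing: converged
            break
        labels = new
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; needs at least two non-empty clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes a score of 0
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        total += (b - a) / max(a, b)
    return total / len(points)


def best_k(points, k_values=range(2, 6)):
    """Return the candidate k with the highest mean silhouette score."""
    return max(k_values, key=lambda k: silhouette(points, kmeans(points, k)))


# Hypothetical rows (longitude, latitude, speed in m/s) around three spots.
rows = [(cx + d * 1e-5, cy + d * 1e-5, 1.10 + 0.01 * d)
        for cx, cy in [(79.8990, 6.7953), (79.9000, 6.7963), (79.9010, 6.7950)]
        for d in range(5)]
rows += [(79.9000, float("nan"), 1.2), (79.9002, 6.7956, 9.0)]  # NaN row, speed outlier

clean = iqr_filter(drop_nan(rows), col=2)            # NaN removal, then IQR on speed
coords = min_max([(lon, lat) for lon, lat, _ in clean])
k = best_k(coords)
print(len(clean), k)
```

In this sketch the IQR filter is applied to the speed column of a single group; applying it separately to each speed category, as the text describes, would mean calling `iqr_filter` once per category.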
5. Mapping the Results
6. Data Validation
3. Results
3.1. Results of Cluster Mapping
3.2. Results of Data Validation
4. Discussion
5. Conclusions and Outlook
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
Cluster No | Machine Evaluation | Subjective Evaluation | Cluster No | Machine Evaluation | Subjective Evaluation |
---|---|---|---|---|---|
1 | Very Fast | Very Fast | 76 | Fast | Fast |
2 | Very Fast | Very Fast | 77 | Normal | Normal |
3 | Very Fast | Fast | 78 | Fast | Fast |
4 | Very Fast | Fast | 79 | Fast | Fast |
5 | Fast | Fast | 80 | Fast | Fast |
6 | Very Fast | Very Fast | 81 | Normal | Fast |
7 | Very Fast | Very Fast | 82 | Very Fast | Fast |
8 | Normal | Normal | 83 | Fast | Normal |
9 | Very Fast | Fast | 84 | Slow | Slow |
10 | Very Fast | Very Fast | 85 | Normal | Normal |
11 | Very Fast | Very Fast | 86 | Fast | Fast |
12 | Slow | Normal | 87 | Very Fast | Very Fast |
13 | Slow | Slow | 88 | Very Fast | Fast |
14 | Fast | Fast | 89 | Fast | Fast |
15 | Slow | Slow | 90 | Very Fast | Very Fast |
16 | Very Fast | Very Fast | 91 | Fast | Fast |
17 | Normal | Normal | 92 | Slow | Slow |
18 | Normal | Normal | 93 | Normal | Fast |
19 | Fast | Fast | 94 | Fast | Fast |
20 | Fast | Very Fast | 95 | Very Fast | Very Fast |
21 | Very Fast | Very Fast | 96 | Slow | Slow |
22 | Fast | Fast | 97 | Fast | Fast |
23 | Normal | Normal | 98 | Normal | Normal |
24 | Fast | Fast | 99 | Slow | Slow |
25 | Slow | Slow | 100 | Slow | Slow |
26 | Fast | Fast | 101 | Normal | Fast |
27 | Slow | Slow | 102 | Normal | Normal |
28 | Slow | Slow | 103 | Normal | Normal |
29 | Fast | Fast | 104 | Fast | Fast |
30 | Normal | Normal | 105 | Slow | Slow |
31 | Normal | Normal | 106 | Normal | Normal |
32 | Normal | Normal | 107 | Normal | Normal |
33 | Normal | Normal | 108 | Slow | Slow |
34 | Normal | Normal | 109 | Very Fast | Very Fast |
35 | Normal | Normal | 110 | Slow | Slow |
36 | Very Fast | Very Fast | 111 | Very Fast | Very Fast |
37 | Very Fast | Fast | 112 | Fast | Fast |
38 | Fast | Fast | 113 | Fast | Fast |
39 | Fast | Normal | 114 | Fast | Fast |
40 | Slow | Slow | 115 | Fast | Fast |
41 | Slow | Slow | 116 | Slow | Slow |
42 | Very Fast | Very Fast | 117 | Slow | Normal |
43 | Slow | Slow | 118 | Fast | Fast |
44 | Very Fast | Very Fast | 119 | Slow | Slow |
45 | Slow | Slow | 120 | Normal | Slow |
46 | Normal | Normal | 121 | Normal | Normal |
47 | Slow | Slow | 122 | Normal | Normal |
48 | Fast | Fast | 123 | Slow | Slow |
49 | Normal | Normal | 124 | Slow | Slow |
50 | Slow | Normal | 125 | Slow | Slow |
51 | Normal | Normal | 126 | Normal | Normal |
52 | Slow | Slow | 127 | Slow | Normal |
53 | Very Fast | Very Fast | 128 | Normal | Normal |
54 | Normal | Normal | 129 | Normal | Normal |
55 | Slow | Slow | 130 | Very Fast | Normal |
56 | Very Fast | Very Fast | 131 | Slow | Fast |
57 | Slow | Slow | 132 | Very Fast | Very Fast |
58 | Normal | Normal | 133 | Very Fast | Very Fast |
59 | Normal | Normal | 134 | Slow | Slow |
60 | Fast | Fast | 135 | Slow | Slow |
61 | Slow | Slow | 136 | Normal | Normal |
62 | Normal | Normal | 137 | Fast | Very Fast |
63 | Normal | Slow | 138 | Normal | Normal |
64 | Slow | Slow | 139 | Fast | Very Fast |
65 | Slow | Slow | 140 | Slow | Slow |
66 | Slow | Slow | 141 | Very Fast | Very Fast |
67 | Very Fast | Very Fast | 142 | Slow | Slow |
68 | Normal | Normal | 143 | Very Fast | Very Fast |
69 | Very Fast | Very Fast | 144 | Very Fast | Very Fast |
70 | Normal | Normal | 145 | Normal | Normal |
71 | Normal | Fast | 146 | Normal | Normal |
72 | Slow | Slow | 147 | Normal | Normal |
73 | Very Fast | Very Fast | 148 | Normal | Normal |
74 | Fast | Fast | 149 | Fast | Normal |
75 | Very Fast | Very Fast | 150 | Fast | Fast |
References
1. Parra-Ovalle, D.; Miralles-Guasch, C.; Marquet, O. Pedestrian street behavior mapping using unmanned aerial vehicles. A case study in Santiago de Chile. PLoS ONE 2023, 18, e0282024.
2. Ma, Y.; Zhang, J.; Yang, X. Effects of Audio-Visual Environmental Factors on Emotion Perception of Campus Walking Spaces in Northeastern China. Sustainability 2023, 15, 15105.
3. Mehta, V. Lively streets: Determining environmental characteristics to support social behavior. J. Plan. Educ. Res. 2007, 27, 165–187.
4. Goličnik, B.M.; Thompson, C.W. Emerging relationships between design and use of urban park spaces. Landsc. Urban Plan. 2010, 94, 38–53.
5. Cosco, N.G.; Moore, R.; Islam, M.Z. Behavior mapping: A method for linking preschool physical activity and outdoor design. Med. Sci. Sports Exerc. 2010, 42, 513–519.
6. Shoval, N.; Schvimer, Y.; Tamir, M. Tracking technologies and urban analysis: Adding the emotional dimension. Cities 2018, 72, 34–42.
7. Zhu, M.; Teng, R.; Wang, C.; Wang, Y.; He, J.; Yu, F. Key environmental factors affecting perceptions of security of night-time walking in neighbourhood streets: A discussion based on fear heat maps. J. Transp. Health 2023, 32, 101636.
8. Xiang, L.; Cai, M.; Ren, C.; Ng, E. Modeling pedestrian emotion in high-density cities using visual exposure and machine learning: Tracking real-time physiology and psychology in Hong Kong. Build. Environ. 2021, 205, 108273.
9. Stanitsa, A.; Hallett, S.H.; Jude, S. Investigating pedestrian behaviour in urban environments: A Wi-Fi tracking and machine learning approach. Multimodal Transp. 2023, 2, 100049.
10. Murgano, E.; Caponetto, R.; Pappalardo, G.; Cafiso, S.D.; Severino, A. A novel acceleration signal processing procedure for cycling safety assessment. Sensors 2021, 21, 4183.
11. Ratti, C.; Frenchman, D.; Pulselli, R.M.; Williams, S. Mobile landscapes: Using location data from cell phones for urban analysis. Environ. Plan. B Plan. Des. 2006, 33, 727–748.
12. Young, F.; Mason, R.; Morris, R.E.; Stuart, S.; Godfrey, A. IoT-enabled gait assessment: The next step for habitual monitoring. Sensors 2023, 23, 4100.
13. Miah, S.; Milonidis, E.; Kaparias, I.; Karcanias, N. An Innovative Multi-Sensor Fusion Algorithm to Enhance Positioning Accuracy of an Instrumented Bicycle. IEEE Trans. Intell. Transp. Syst. 2020, 21, 1145–1153.
14. Feng, Y.; Duives, D.; Daamen, W.; Hoogendoorn, S. Data collection methods for studying pedestrian behaviour: A systematic review. Build. Environ. 2021, 187, 107329.
15. Silitonga, S. Walkability; The relationship of walking distance, walking time and walking speed. J. Rekayasa Konstruksi Mekanika Sipil (JRKMS) 2020, 3, 19–26.
16. De Arruda Campos, M.B.; Chiaraida, A.; Smith, A.; Stonor, T.; Takamatsu, S. Towards a “walkability index”. In Proceedings of the European Transport Conference (ETC), Strasbourg, France, 8–10 October 2003; Available online: https://trid.trb.org/view/771383 (accessed on 1 June 2024).
17. Nishio, T.; Niitsuma, M. Environmental map building to describe walking dynamics for determination of spatial feature of walking activity. In Proceedings of the 2019 IEEE 28th International Symposium on Industrial Electronics (ISIE), Vancouver, BC, Canada, 12–14 June 2019.
18. Wan, T.; Lu, W.; Sun, P. Constructing the quality measurement model of street space and its application in the old town in Wuhan. Front. Public Health 2022, 10, 816317.
19. Ahmed, M.U.; Brickman, S.; Dengg, A.; Fasth, N.; Mihajlovic, M.; Norman, J. A Machine Learning Approach to Classify Pedestrians’ Events based on IMU and GPS. Int. J. Artif. Intell. 2019, 17, 154–167.
20. Gong, L.; Yamamoto, T.; Morikawa, T. Identification of activity stop locations in GPS trajectories by DBSCAN-TE method combined with support vector machines. Transp. Res. Procedia 2018, 32, 146–154.
21. Zhang, Z.; Li, X.; Yuan, H. Best Integer Equivariant Estimation based on Unsupervised Machine Learning for GNSS Precise Positioning and Navigation in Complex Environments. IEEE Trans. Aerosp. Electron. Syst. 2023, 60, 2672–2682.
22. Twomey, N.; Diethe, T.; Fafoutis, X.; Elsts, A.; McConville, R.; Flach, P.; Craddock, I. A comprehensive study of activity recognition using accelerometers. Informatics 2018, 5, 27.
23. Schimpl, M.; Lederer, C.; Daumer, M. Development and Validation of a New Method to Measure Walking Speed in Free-Living Environments Using the Actibelt® Platform. PLoS ONE 2011, 6, e23080.
24. Visalakshi, N.K.; Suguna, J. K-means clustering using Max-MIN distance measure. In Proceedings of the NAFIPS 2009—2009 Annual Meeting of the North American Fuzzy Information Processing Society, Cincinnati, OH, USA, 14–17 June 2009; IEEE: Piscataway, NJ, USA, 2009.
25. Yuan, C.; Yang, H. Research on K-value selection method of K-means clustering algorithm. J 2019, 2, 226–235.
26. Zhang, L.; Ye, Y.; Zeng, W.; Chiaradia, A. A systematic measurement of street quality through multi-sourced urban data: A human-oriented analysis. Int. J. Environ. Res. Public Health 2019, 16, 1782.
27. Rastogi, R.; Thaniarasu, I.; Chandra, S. Design implications of walking speed for pedestrian facilities. J. Transp. Eng. 2011, 137, 687–696.
28. Franěk, M.; Režný, L. Environmental features influence walking speed: The effect of urban greenery. Land 2021, 10, 459.

Seconds Elapsed (s) | Bearing Accuracy (Degrees) | Speed Accuracy (m/s) | Vertical Accuracy (m) | Horizontal Accuracy (m) | Speed (m/s) | Bearing (Degrees) | Altitude (m) | Longitude (Degrees) | Latitude (Degrees) |
---|---|---|---|---|---|---|---|---|---|
4188.1 | 0 | 0.0806 | 34.74 | 9.94 | 0.0001 | 0 | −77.2 | 79.9002 | 6.7956 |
2096.2 | 0 | 0 | 4.18 | 11.70 | 0.0001 | 0 | −76.4 | 79.8990 | 6.7953 |
1699.1 | 0 | 0.15 | 1.13 | 13.32 | 0.0002 | 0 | −73.8 | 79.9000 | 6.7963 |

Table 2. Average accuracy values in each cluster after outlier removal.

Cluster | Bearing Accuracy (Degrees) | Speed Accuracy (m/s) | Vertical Accuracy (m) | Horizontal Accuracy (m) | Speed (m/s) |
---|---|---|---|---|---|
Slow | 0 | 0 | 0.259 | 0.543 | 0.6775 |
Normal | 0 | 0.1485 | 0 | 0.600 | 1.1081 |
Fast | 0 | 0 | 0 | 0.677 | 1.4091 |
Very Fast | 0 | 0 | 0.548 | 0.667 | 1.5447 |
Agreement | Cohen’s Kappa Coefficient | Std. Err. |
---|---|---|
85.3% | 0.8 | 0.013 |
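
Agreement statistics of this kind can be computed from two label lists, one from the machine evaluation and one from the subjective evaluation, as listed in Appendix A. The sketch below is a minimal illustration, assuming the standard unweighted Cohen's kappa, $\kappa = (p_o - p_e) / (1 - p_e)$, where $p_o$ is the observed agreement and $p_e$ is the agreement expected by chance. The function name `cohen_kappa` and the ten sample labels are hypothetical and are not the study's data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Return (observed agreement, Cohen's kappa) for two equal-length label lists."""
    n = len(rater_a)
    # Observed agreement: share of items on which both raters give the same label.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: sum over labels of the product of the raters' label shares.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    return p_o, (p_o - p_e) / (1 - p_e)


# Hypothetical labels for ten clusters (illustration only).
machine = ["Slow", "Slow", "Normal", "Normal", "Fast",
           "Fast", "Very Fast", "Very Fast", "Normal", "Fast"]
subjective = ["Slow", "Normal", "Normal", "Normal", "Fast",
              "Fast", "Very Fast", "Fast", "Normal", "Fast"]

agreement, kappa = cohen_kappa(machine, subjective)
print(f"agreement={agreement:.1%}, kappa={kappa:.3f}")
```

For larger label sets, scikit-learn's `cohen_kappa_score` computes the same coefficient.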
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Sawandi, H.; Jayasinghe, A.; Retscher, G. Real-Time Tracking Data and Machine Learning Approaches for Mapping Pedestrian Walking Behavior: A Case Study at the University of Moratuwa. Sensors 2024, 24, 3822. https://doi.org/10.3390/s24123822