Exploiting Big Data for Experiment Reporting: The Hi-Drive Collaborative Research Project Case
Abstract
1. Introduction
2. Related Work
3. Specifications and Requirements
3.1. Data Specifications
- Duration of the experiments, whose related indicators are the numbers of tests lasting up to 15 min, between 15 and 30 min, between 30 and 60 min, etc. (a computation sketch follows the table below);
- ADF operation times, which include the number of activations with an operation time of less than 1 min, between 1 and 5 min, between 5 and 10 min, etc., and the number of encountered traffic sections (e.g., pedestrian crossings, roundabouts, traffic lights, tunnels);
- Travelled distance and driving times, which are in turn segmented by time of day (e.g., morning, afternoon, evening, night), type of road (motorway, urban, rural, etc.), road conditions, number of lanes, speed limits, and state of the ADF and enabler (activated or not).
Group | Specific Value | Description
---|---|---
Time of tests in the experiment | Between 0 and 6 | Number of test runs performed between hour x and hour y, local time.
 | Between 6 and 12 | 
 | Between 12 and 18 | 
 | Between 18 and 24 | 
Duration of tests in the experiment | Less than 15 min | Number of test runs performed during the indicated length of time.
 | Less than 30 min | 
 | Less than 1 h | 
 | Less than 1 h and 30 min | 
 | Less than 2 h | 
 | Longer than 2 h | 
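As a minimal illustration of how such binned counters can be derived from raw test logs, the following Python sketch counts test runs per duration bin. The bin edges and labels simply mirror the table above; they are not taken from the EMDB implementation.

```python
from bisect import bisect_right

# Bin edges (upper bounds, in minutes) and labels mirror the duration rows of
# the table above; they are illustrative, not the EMDB's internal representation.
BIN_EDGES = [15, 30, 60, 90, 120]
BIN_LABELS = ["<15 min", "<30 min", "<1 h", "<1 h 30 min", "<2 h", ">2 h"]

def bin_durations(durations_min):
    """Count test runs per duration bin (durations given in minutes)."""
    counts = {label: 0 for label in BIN_LABELS}
    for d in durations_min:
        counts[BIN_LABELS[bisect_right(BIN_EDGES, d)]] += 1
    return counts

print(bin_durations([12, 48, 95, 130, 22]))
# {'<15 min': 1, '<30 min': 1, '<1 h': 1, '<1 h 30 min': 0, '<2 h': 1, '>2 h': 1}
```

The same logic applies to the time-of-day counters, with hour boundaries in place of duration bounds.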
3.2. User Requirements
4. System Design and Implementation
4.1. Extensions to Measurify
- A few basic descriptor fields, such as the name of the experiment, location, managing organization, start and end date, etc.;
- Several metadata items for a more in-depth description, such as the type of targeted enablers (e.g., vehicle-to-infrastructure (V2I) communication, localization, positioning, machine learning), technical focus (e.g., technical enablers, users), etc.;
- State-tracking progress/performance indicators (e.g., number of baseline tests/runs performed for the experiment, number of tests performed per time of day and per experiment length, travelled distance with ADF and enabler activated, etc.).
- Very basic information about the nature of the experiment is mapped to the Experiment Descriptor fields. This data structure is common to all the experiments/projects;
- More detailed information about the nature of the experiment is mapped to the Experiment Metadata. This data structure is experiment-specific;
- The indicators periodically reporting the progress of the experiment are mapped to the Experiment History record. This data structure is experiment-specific (the mapping is sketched below).
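To make this mapping concrete, the following sketch shows how an Experiment could be serialized before upload. All identifiers, field names, and values are invented for illustration and do not reproduce Measurify's actual schema.

```python
# Illustrative only: names and values below are assumptions, not Measurify's schema.
experiment = {
    # Descriptor: basic fields, common to all experiments/projects
    "_id": "hi-drive-exp-01",
    "location": "Genova, IT",
    "manager": "UniGe",
    "startDate": "2023-01-01",
    "endDate": "2023-12-31",
    # Metadata: more detailed, experiment-specific static description
    "metadata": [
        {"name": "targeted_enablers", "value": ["V2I", "localization"]},
        {"name": "technical_focus", "value": "technical enablers"},
    ],
    # History: progress/performance indicators, reported periodically
    "history": [
        {"step": "2023-05", "fields": [{"name": "road_dry", "value": 14}]},
    ],
}
```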
4.2. The EMDB Workflow
- Configuration, which sets up the specific EMDB;
- Operation, which fills it as the application progresses (both phases are sketched below).
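The two phases can be pictured as REST interactions with the EMDB server. Measurify exposes a REST API, but the endpoint, routes, and payloads below are assumptions made for illustration only.

```python
import requests

BASE = "https://emdb.example.org/v1"         # hypothetical deployment URL
HEADERS = {"Authorization": "my-api-token"}  # placeholder credentials

# --- Configuration phase: run once, sets up the specific EMDB ---
protocol = {
    "_id": "hi-drive",
    "metadata": [{"name": "targeted_enablers", "type": "vector"}],
    "topics": [{"name": "progress",
                "fields": [{"name": "road_dry", "type": "number"}]}],
}
requests.post(f"{BASE}/protocols", json=protocol, headers=HEADERS)

# --- Operation phase: repeated during the project, appending history steps ---
step = {"step": "2023-06", "fields": [{"name": "road_dry", "value": 14}]}
requests.put(f"{BASE}/experiments/hi-drive-exp-01",
             json={"history": [step]}, headers=HEADERS)
```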
4.3. Graphical User Interface
4.4. Automatic Extraction of the Progress/Performance Indicators
4.5. Lab Testing
5. Results and Discussion
5.1. Deployment
5.2. Instantiating the EMDB in Other Projects
- Define the static information needed to describe each project’s experiment (i.e., the Metadata).
- Define each experiment’s static information values (i.e., the values of the experiment Descriptor and Metadata).
- Define the dynamic information needed to describe each project’s experiment (i.e., the Progress Indicators).
- Define one or more Protocols to specify the data types of Metadata and Indicators.
- Create an empty EMDB installation by using the Admin Dashboard.
- Encode the Protocol(s) in a .json file (Figure 1), or create them through the Admin Dashboard GUI without directly editing the file. Then insert the Protocol(s) into the EMDB.
- Instantiate the project’s Experiments, specifying their Descriptor, Metadata, and referenced Protocol, by uploading the Experiment’s .json file to the server (Figure 2). This step can also be performed through the Admin Dashboard GUI, without directly editing the file.
- Develop the script to automatically extract performance indicator values from the project/experiment’s data files (a sketch follows this list). This step is optional and completely project-specific. It is time-consuming but highly beneficial in reducing the final reporting time.
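A minimal sketch of such an extraction script is given below, assuming Python with h5py. The file layout, the signal names ("speed", "adf_active"), and the upload route are assumptions made for illustration; a real script must match the project’s actual .hdf5 schema.

```python
import h5py
import requests

def extract_indicators(hdf5_paths, sampling_hz=10):
    """Aggregate per-run signals into progress indicators (illustrative schema)."""
    distance_km = 0.0
    adf_seconds = 0
    for path in hdf5_paths:
        with h5py.File(path, "r") as f:
            speed = f["speed"][:]      # m/s, sampled at sampling_hz
            adf = f["adf_active"][:]   # 1 when the ADF is engaged, else 0
            distance_km += speed.sum() / sampling_hz / 1000.0
            adf_seconds += int(adf.sum()) // sampling_hz
    return {"travelled_km": round(distance_km, 1), "adf_time_s": adf_seconds}

# Upload the resulting history step (hypothetical route, as in Section 4.2).
fields = extract_indicators(["run01.hdf5", "run02.hdf5"])
requests.put("https://emdb.example.org/v1/experiments/hi-drive-exp-01",
             json={"history": [{"step": "2023-06", "fields": fields}]},
             headers={"Authorization": "my-api-token"})
```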
5.3. Comparison with Off-the-Shelf Reporting Tools
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
Resource | Description |
---|---|
Thing | A Thing represents the subject (context) of a Measurement. A Thing could be a trip, a vehicle, a house, a driver. |
Feature | A Feature fully describes the different types of Measurements that can be stored in the DB. Every Measurement refers to a single Feature. Examples of Features are trip-level (or scenario-instance-level) performance indicators, which synthesize the performance of an ADF in an experimental vehicle’s trip (or in a segment of a trip). Each Feature specifies several items, which are its constituent fields (e.g., average speed, maximum speed, travelled km). |
Device | A Device is a tool providing measurements regarding a Thing (or an actuator that acts within a thing to modify its status). |
Measurement | A Measurement represents a sample of a Feature measured by a Device. |
Tag | A Tag is an alphanumerical string used to label resources and specify them more precisely (e.g., to support queries and the dynamic generation of the graphical user interface). For instance, a Measurement could be tagged with a rainy weather condition. |
User | A User represents a user of the DB, with different roles (e.g., admin, provider, analyst). |
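As an example of how these resources fit together, the following sketch shows a Measurement referencing a Thing, a Feature, and a Device. All identifiers and values are invented for the example, not taken from the Hi-Drive deployment.

```python
# Illustrative Measurement tying the Measurify resources together; identifiers
# and values are invented for the example.
measurement = {
    "thing": "trip-2023-06-01-vehicle-07",         # subject (context) of the sample
    "feature": "trip-performance",                 # declares the expected fields
    "device": "data-logger-07",                    # tool that produced the sample
    "samples": [{"values": [63.2, 128.0, 41.5]}],  # avg speed, max speed, km
    "tags": ["rainy"],                             # label supporting queries
}
```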
Field Key | Field Value |
---|---|
Road_dry | 14 |
Road_wet | 4 |
Road_icy | 0 |
Road_snowy | 2 |
Item | Value | Notes
---|---|---
# protocols | 1 | The Hi-Drive protocol only
# experiment descriptors | 7 | Static values set at the initialization
# metadata | 88 | Static values set at the initialization
# history step indicators (total) | 334 | 
# history steps to be filled manually | 41 | 
# experiments | 20 | 
# history steps | 18 | 
# .hdf5 files | ~40 | Per experiment and history step
.hdf5 file size | ~55 MB | Average, for each file
Reporting frequency | Monthly | 
Cloud machine | AWS EC2 t2.medium | 2 vCPU, 4 GB RAM
Upload/download speed | 250–300 Mbit/s | Nominal bandwidth
Upload size | ~8 KB | Per experiment and history step
Download size | ~500 KB | All experiments and steps
Time to upload one history step | <1 s | By an Experiment manager
Time to download all steps for one experiment | ~1 s | By a Project manager
Secure protocol | HTTPS | 
Item | Standard | Stress
---|---|---
# Data source files (.hdf5) (per experiment and step) | 40 | 40 |
Signal sampling frequency | 10 Hz | 10 Hz |
# Hours per experimental run per day | 4 | 8 |
# Samples per day | 144,000 | 288,000 |
# Vehicular signals sampled | 21 | 21 |
Single .hdf5 file size | 55 MB | 110 MB |
Total size of files to be processed (per experiment and step) | ~2.2 GB | ~4.4 GB |
Progress indicator extraction time (per experiment and step) | 153 s | 310 s |
Time to fill manual indicators | ~10 min | ~10 min |
Time to upload one history step | <1 s | Same as standard case |
Time to download all steps for one experiment | ~1 s | Same as standard case |