1. Introduction
Meteorology is the study of weather events, with a significant focus on forecasting the principal weather variables. Weather conditions are predicted using historical data or mathematical models [1], or a combination of both. It is important to quantify meteorological variables such as temperature, humidity, air pressure, and wind flow, together with their variations, their interactions, and how they change over time. For this reason, various spatial scales have been proposed in order to describe and predict weather conditions at local, regional, or global levels [2,3].
Weather conditions have always had a significant effect on sea transportation and maritime affairs [4]. Maritime activities, which include maritime commerce and maritime leisure, also affect the environment and, consequently, the atmosphere and weather conditions. These factors are interconnected and form a vicious circle that impacts the quality of the environment and the maritime meteorological conditions [5,6]. The safety of maritime transportation is strictly related to weather information. The International Maritime Organization (IMO) [7] and the World Meteorological Organization (WMO) [8] have defined regulation strategies on how to provide marine weather predictions to minimize accidents and economic losses.
For these reasons, as also reported in [4], there is an essential need for maritime weather prediction systems, which is connected to the sustainable development of sea commerce. The development of advanced, real-time, onboard forecast systems is becoming increasingly important for achieving high forecast quality (especially with regard to storms, heavy precipitation, wind, waves, and extreme temperatures).
In the era of Machine Learning (ML), several methods are available that could be investigated, adapted, expanded, and tested for weather prediction. The main approaches for weather prediction can be classified into the following four categories [9]:
When forecasting meteorological parameters such as wind speed and wind direction, special care must be taken to account for the local environment and the complexity of these parameters [12,13], which can seriously affect the performance of the forecasting methods [14,15]. Hence, it is common to test several machine learning techniques on the weather forecasting problem [16,17]. However, all these algorithms can be classified into two main categories: online and trained. Trained models are ML algorithms that require a data-heavy learning phase before they can produce predictions. These models typically perform well, but their forecasts are strongly linked to the training set, which makes them unsuitable for moving scenarios such as onboard weather prediction. In contrast, online learning models continuously learn and adapt their prediction capabilities to the flow of data. These algorithms are the best candidates for an onboard weather prediction system for maritime applications because changes in the vessel position do not influence them.
Nowadays, the Internet of Things (IoT) initiative has enabled communication and interoperability between machines [18,19,20], and so, in this specific application area, meteorological data collection has become more accessible and efficient [21,22,23]. The IoT is, in fact, one of the disruptive and essential technologies for fields such as meteorology, since it allows data from different machines to converge on servers through technologies such as wireless sensor networks (WSN) and Low Power Wide Area Networks (LPWAN) [24]. Moreover, the increase in the computational capabilities of IoT devices has enabled the local processing of data, pushing the IoT toward the “edge” paradigm [25,26]. In the edge paradigm, sensors are also endowed with local processing capabilities, which makes them able to provide real-time analysis. In these scenarios, the connection with remote servers is used only for data storage and synchronization, so the system can operate without a stable or continuously available network.
In the meteorological field, the IoT-edge paradigm makes it possible to develop low-cost solutions [27] both for the collection and display of data and for forecasting based on machine learning algorithms executed directly on the IoT nodes. The combination of IoT and Artificial Intelligence has been named AIoT and is considered one of the most promising technologies for the development of innovative systems and services of the Industry 4.0 era [28].
The goal of this work is to present a complete system named PortWeather, which forecasts the wind speed and direction at four horizons: 30 min, 60 min, 90 min, and 120 min. The system consists of a commercial weather station that sends the current weather parameters to a local IoT unit based on a 32-bit microcontroller, where the core of the weather prediction algorithm runs without the need for a connection to a remote server.
The PortWeather system uses an online-learning ML weather forecast algorithm that does not require pre-training and is not influenced by changes in the vessel position. The system uses only local data to continuously train the weather predictor that produces the 30, 60, 90, and 120 min forecasts. Among the various online learning algorithms developed for weather forecasting, a linear regression ML model was chosen because it allows a lightweight implementation executable on a tiny microcontroller while still achieving results comparable with more complex ML models.
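To illustrate the online-learning idea, the following minimal Python sketch updates a linear autoregressive model one measurement at a time, with no separate training phase. It is not the PortWeather implementation: the class name `OnlineLinearForecaster`, the window length, the horizon expressed in samples, the learning rate, and the normalized least-mean-squares (NLMS) update rule are all assumptions chosen for brevity.

```python
from collections import deque


class OnlineLinearForecaster:
    """Hypothetical online linear model (illustrative only).

    Predicts the value `horizon` samples ahead from the last `window`
    samples and adapts its weights with a normalized LMS update.
    """

    def __init__(self, window=4, horizon=2, learning_rate=0.5):
        self.window = window
        self.horizon = horizon
        self.learning_rate = learning_rate
        self.weights = [0.0] * window
        self.bias = 0.0
        # Enough history to pair a past window with the value observed now.
        self.history = deque(maxlen=window + horizon)

    def _dot(self, features):
        return self.bias + sum(w * x for w, x in zip(self.weights, features))

    def update(self, value):
        """Ingest one measurement and learn from the window seen `horizon` samples ago."""
        self.history.append(value)
        if len(self.history) == self.window + self.horizon:
            past = list(self.history)[: self.window]
            error = value - self._dot(past)
            # The constant 1.0 accounts for the bias term in the normalization.
            norm = 1.0 + sum(x * x for x in past)
            step = self.learning_rate * error / norm
            self.weights = [w + step * x for w, x in zip(self.weights, past)]
            self.bias += step

    def predict(self):
        """Forecast `horizon` samples ahead, or return None until enough data has arrived."""
        if len(self.history) < self.window:
            return None
        recent = list(self.history)[-self.window:]
        return self._dot(recent)
```

In such a scheme, one forecaster instance would be kept per horizon and per variable, and `update` would be called with every new reading from the weather station. Because the weights follow the most recent data, the model adapts when the local conditions change.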
This integrated AIoT edge solution has been designed for use on small professional and private boats that lack the equipment required for a continuous connection to remote weather forecast services.
The PortWeather system has been tested on different datasets, including static data collected by ground weather stations and data from a real sea trip during which the system was installed on a test boat.
The PortWeather system can be installed on any type of boat without requiring any training or ML/AI configuration. This feature makes the system easy to install, scalable, and usable on long trips where the weather conditions and micro-climate can change widely.
The manuscript is structured as follows: After the Introduction, Section 2 presents the state of the art in weather forecast systems and in ML algorithms currently used for weather prediction. Section 3 describes the PortWeather system, while Section 4 describes the preparation of the testing dataset and the results achieved by the forecast algorithm. Finally, Section 5 concludes the paper with some final remarks, recommendations, and suggestions for future work.
4. PortWeather System Performance Tests
In order to test the developed prediction algorithm (for both wind speed and wind direction), two main types of tests were performed: a static test and a dynamic test. In both cases, the results obtained for the predictions at 30, 60, 90, and 120 min were analyzed by calculating the Mean Absolute Error (MAE) and the Root Mean Square Error (RMSE), which are common measures for time series regression problems [53,54], and the Mean Absolute Percentage Error (MAPE) [55].
The measures are defined as:
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|\hat{y}_i - y_i\right|$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2}$$
$$\mathrm{MAPE} = \frac{100}{N}\sum_{i=1}^{N}\left|\frac{\hat{y}_i - y_i}{y_i}\right|$$
where $N$ is the number of testing samples, $\hat{y}_i$ is the predicted value, and $y_i$ is the true wind speed (or direction) value. However, RMSE may not be a good indicator of the average performance of the model [56]. Therefore, more importance will be given to the results obtained by the MAE.
We also calculated the residual errors of the predictions. The residual error is the difference between the predicted value $\hat{y}_i$ and the true wind speed (or direction) value $y_i$, and it is computed in order to further confirm the validity of the model.
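As a minimal sketch, the error measures and residuals could be computed as follows, assuming that MAE denotes the mean absolute error in its standard form and that MAPE is expressed in percent. The function name `error_measures` and the choice to skip samples whose true value is zero when computing MAPE are assumptions of this example, not details taken from the PortWeather implementation.

```python
import math


def error_measures(y_true, y_pred):
    """Return (MAE, RMSE, MAPE in percent, residuals) for two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    # Residual = predicted value minus true value, record by record.
    residuals = [p - t for t, p in zip(y_true, y_pred)]
    n = len(residuals)
    mae = sum(abs(r) for r in residuals) / n
    rmse = math.sqrt(sum(r * r for r in residuals) / n)
    # Assumption: MAPE is undefined for a true value of zero (e.g., calm wind),
    # so those samples are skipped here.
    nonzero = [(t, r) for t, r in zip(y_true, residuals) if t != 0]
    if nonzero:
        mape = 100.0 * sum(abs(r / t) for t, r in nonzero) / len(nonzero)
    else:
        mape = float("nan")
    return mae, rmse, mape, residuals
```

Note that this sketch treats wind direction like any other scalar; it does not account for the wrap-around between 359° and 0°.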
4.1. Land Weather Station Test
The static test was carried out as a preliminary check of the developed algorithm. In this case, the data on which the algorithm was executed were acquired by static (land) weather stations. The data were collected from 6 November 2019 00:00:00 to 10 November 2019 13:45:00 from the weather stations of Livorno and Pianosa [57].
Figure 3 displays the locations of the weather stations.
Table 5 and Table 6 provide a brief description of the data acquired from the land weather stations of Livorno and Pianosa, respectively.
Figure 4 and Figure 5 present the comparisons between the real values and the respective forecasts for the Livorno weather station dataset (for wind speed and direction, respectively).
Table 7 presents the main results showing the wind speed error rates for the Livorno weather station. In this case, the wind speed predicted by the algorithm differs from the real speed by at most ±1.3 m/s, which occurs for the two-hour forecasts.
The errors have been computed record by record, and Table 8 presents the distribution of the residual values for the Livorno weather station dataset. The results are consistent with the MAE: in the worst case, which is the 2-h prediction, most residual values belong to the range [−2; +2].
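A residual distribution of this kind can be obtained by counting the residuals that fall into each bin. The sketch below is illustrative only: the function names and the bin edges used in the usage note are hypothetical and do not reproduce the bins of the tables in this paper.

```python
def residual_distribution(residuals, edges):
    """Count residuals per bin [edges[i], edges[i+1]); the last bin also includes its upper edge."""
    counts = [0] * (len(edges) - 1)
    last = len(counts) - 1
    for r in residuals:
        for i in range(len(counts)):
            below_upper = r < edges[i + 1] or (i == last and r == edges[i + 1])
            if edges[i] <= r and below_upper:
                counts[i] += 1
                break
    return counts


def share_within(residuals, bound):
    """Fraction of residuals whose absolute value does not exceed `bound`."""
    return sum(1 for r in residuals if abs(r) <= bound) / len(residuals)
```

For example, `residual_distribution(residuals, [-4, -2, 0, 2, 4])` returns four counts, and `share_within(residuals, 2)` gives the fraction of records whose residual lies in [−2; +2].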
For Livorno’s wind direction, the prediction of the algorithm differs from the real direction by at most ±17.91°, in the case of the 2-h forecasts, as described in Table 9.
Table 10 presents the errors for the wind direction at the Livorno station, with the distribution of the residual values. The results are consistent with the MAE: in the worst case, the 2-h prediction, most residual values belong to the range [−18; +18].
Figure 6 and Figure 7 present the comparisons between the real values and the respective forecasts for the Pianosa weather station dataset (for wind speed and direction, respectively).
Table 11 presents the error rates for wind speed prediction for the Pianosa weather station. In this case, the wind speed predicted by the algorithm differs from the real wind speed by at most ±1.7 m/s, which occurs for the 2-h forecasts.
The errors have been computed record by record, and Table 12 presents the distribution of the residual values for the Pianosa weather station. The results are consistent with the MAE: in the worst case, the 2-h prediction, most residual values belong to the range [−2; +2].
For the Pianosa weather station, the worst wind direction prediction differs from the real direction by at most ±22.75°, in the case of the two-hour forecasts, as shown in Table 13.
Table 14 presents the error distribution for the wind direction parameter for the Pianosa weather station. The results are consistent with the MAE: in the worst case, the 2-h prediction, most residual values belong to the range [−25; +25].
Both from the analysis of the errors and the study of the graphs, it is concluded that the implemented algorithm reaches a high accuracy in predicting the wind speed and direction using statically collected data.
4.2. Dynamic Test
After evaluating the accuracy of the prediction algorithm on statically collected data, we tested the algorithm on dynamic data acquired using the PortWeather system installed on a tugboat. For the test, the tugboat made a trip from Chios island (Greece) to Athens (Greece), and the data collection lasted from 22 January 2018 11:24:00 to 9 February 2018 12:50:00.
Figure 8 displays the planned trip of the ship.
Table 15 provides a brief description of the data acquired from the on-board weather station.
Figure 9 and Figure 10 present the comparisons between the real values and the respective forecasts (for the wind speed and direction, respectively) for the onboard dataset.
Table 16 presents the results showing the error rates for the onboard weather station. The wind speed predicted by the algorithm differs from the real wind speed by at most ±1.9 m/s, which occurs for the 2-h forecasts.
The errors have been computed record by record, and Table 17 describes the distribution of the residual values. The results are consistent with the MAE: in the worst case, which is the 2-h prediction, most residual values belong to the range [−2; +2].
The worst wind direction prediction differs from the real wind direction by at most ±20.3°, in the case of the 2-h forecasts, as shown in Table 18.
From the errors computed record by record, we obtained Table 19, which describes the distribution of the residual values. The results are consistent with the MAE: in the worst case, the 2-h prediction, most residual values belong to the range [−21; +21].
4.3. Overall Evaluation
Table 20 and Table 21 present the MAE, MAPE, and RMSE error measures for the three weather stations (Livorno, Pianosa, and onboard) for the wind speed and wind direction, respectively. In terms of the MAE, our method achieves its best accuracy on the Livorno dataset for both the wind speed and the wind direction. All the error measures increase as the prediction horizon grows from 30 to 120 min.
The comparison of real and predicted values shows a time-lag effect in the predicted data. As reported in [58], this effect is typical of simplified models where the prediction is highly correlated with a linear combination of historical data. However, although this effect is present, judging the algorithm by visual analysis alone could be misleading. Indeed, the error measures show that the PortWeather system has relatively low MAE, MAPE, and RMSE, suggesting that the accuracy of the algorithm is compatible with the uses expected for the system.
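One simple way to quantify such a time-lag effect, rather than judging it visually, is to search for the shift at which the predicted series best matches the real one. The following sketch is an assumption of this example and not part of the PortWeather evaluation: the function name `estimate_time_lag` and the use of the mean absolute difference as the matching criterion are hypothetical choices.

```python
def estimate_time_lag(real, predicted, max_lag):
    """Return the shift k (in samples) for which predicted[t] best matches real[t - k]."""
    if len(real) != len(predicted):
        raise ValueError("series must have equal length")
    best_lag, best_err = 0, float("inf")
    for lag in range(max_lag + 1):
        # Align predicted[lag:] with the real values observed `lag` samples earlier.
        pairs = list(zip(real[: len(real) - lag], predicted[lag:]))
        if not pairs:
            break
        err = sum(abs(p - r) for r, p in pairs) / len(pairs)
        if err < best_err:
            best_lag, best_err = lag, err
    return best_lag
```

A returned lag close to the forecast horizon would indicate that the forecast behaves much like a delayed copy of the measurements, whereas a lag of zero would indicate no systematic delay under this criterion.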
5. Conclusions
This study presents an onboard weather prediction IoT system that provides short-term forecasts of wind speed and direction without the need for an Internet connection. The PortWeather system is composed of a commercial weather station interfaced with an industrial IoT acquisition unit. The system is designed to be easy to install, reliable, and low-power, allowing its use on small and medium-sized boats. A forecasting algorithm based on linear regression was designed and implemented to run on a resource-constrained microcontroller for IoT applications. The implemented algorithm uses an online learning strategy that allows the system to work without a training phase or an Internet connection. This training-less approach makes the system resilient to shifts in the geographical location of the boat and lets it recover quickly from a prolonged shutdown or a sensor replacement.
The efficiency of the system has been demonstrated by testing it in real sea conditions, where weather parameters were recorded on a tugboat operating in Greece. The performance of the algorithm has also been evaluated using other datasets, acquired from land weather stations located in different geographical and micro-climatic areas (Livorno and Pianosa, Italy).
As a general evaluation of the system, it is essential to highlight that linear regression algorithms are straightforward methods for forecasting the evolution of a time series. These models are very simple, and sometimes it is possible to guess their output by looking at the time series plots. However, this is true for scientists but not for fishers and sea operators.
Maritime traffic security for small and medium-sized boats is an open challenge that needs to be addressed with solutions that are accessible, scalable, and easy to install and use. Although it is based on an algorithm whose performance could nowadays be surpassed by more complex and computationally intensive neural-network-based algorithms, PortWeather can be considered a reliable and innovative IoT solution for real-time weather forecasting on small and medium-sized boats.
Although the system performs relatively well, there is room for improvement. We plan to install and test new microcontrollers that offer more CPU power and more memory. This will allow us to implement more advanced machine learning algorithms that could help mitigate the time-lag effect introduced by the simplified regressive algorithm used in PortWeather. Regarding the evaluation of the algorithm, it would be of great interest to test the system in heavy weather conditions, where the wind direction and speed vary over a very wide range. Moreover, we plan to extend the cloud connection of the system beyond pure remote data storage for further analysis. In this respect, it would be valuable to allow the system to download historical weather data from publicly available Application Programming Interfaces (APIs) such as Weather Underground [59], so that the system can start instantly without the initial training that is currently required at boot.