1. Introduction
Agricultural datasets are of great importance as they provide information on various variables, such as the occurrence and intensity of rainfall, daily temperature fluctuations, and price variations of commodities. The forecasting of future commodity prices is crucial for all stakeholders in the agricultural supply chain, from farmers/producers to consumers and policymakers. Having proper knowledge about possible future prices can prevent distress sales by farmers and enable them to make better decisions about their farming activities.
Time series modeling is a crucial tool in data analysis and forecasting, as it allows researchers to uncover hidden patterns and trends within the data. Model selection is an important step in time series modeling, as the choice of model can have a significant impact on the accuracy of the forecast. Ultimately, the selection of the appropriate technique depends on the nature of the data and the specific research question being addressed.
The wavelet transformation helps capture minute events in a signal that may not be obvious [1]. It represents localized phenomena at different time scales in the signal. A wavelet representation of a signal can identify frequency content as well as temporal variations [2]. There are two different ways to analyze a signal through wavelet transformation: the continuous wavelet transform (CWT) and the discrete wavelet transform (DWT). CWT operates on a continuous signal, whereas DWT performs decomposition at discrete scales with a dyadic structure. CWT produces a redundant number of subsignals to capture specific details, which makes reconstruction of the original signal quite difficult [3]. Reconstruction is useful for forecasting the original signals [3]. Consequently, DWT is best suited for discrete, multiscale agricultural price datasets, and it produces orthogonal sub-signals that are amenable to further treatment.
Stochastic models, such as the autoregressive integrated moving average (ARIMA) and generalized autoregressive conditional heteroscedastic (GARCH) models, impose several specifications on the data before modeling can be performed [4]. Data-driven machine learning (ML) algorithms, by contrast, need very little human intervention [5]. Several important algorithms are used for forecasting purposes. Multivariate adaptive regression splines (MARS) is a piecewise regression model that can efficiently handle nonlinearity in the data [6]. Principal component regression (PCR) is an artificial intelligence (AI) algorithm based on principal components (PCs), which are linear combinations of the original predictor variables [7,8]. The PCs are mutually orthogonal, which resolves the serious issue of multicollinearity in regression analysis [9,10]. The support vector regression (SVR) algorithm is useful in both classification and regression problems and is based on the risk minimization principle [11,12,13]. Zhang et al. [14] used the linear programming method and proposed a two-phase SVR with multiple kernel functions. This helped them find important features to predict output variables while reducing the computational complexity of solving the underlying convex quadratic programming problem. Random forest (RF) is based on the well-known bagging algorithm, which combines several decision trees to give the final prediction from many input variables [15,16,17,18]. An artificial neural network (ANN) makes use of a three-layered system for classification and regression tasks [19]. Li et al. [20] demonstrated that an induced ordered weighted averaging (IOWA)-optimized neural network (NN) model outperformed other models in predicting a vegetable price series. Zhou et al. [21] discussed tensor principal component analysis (PCA)-based techniques to recover clean data in the presence of noise. Zhao [22] studied a wavelet-based signal processing technique along with an ML model to predict future prices of agricultural products. Paul and Garai [23] predicted tomato prices using wavelet filter-based decomposition in combination with stochastic and ML models, but they did not address which filter-model combination works best or how filters and models relate. Iniyan et al. [24] used ML as a dynamic tool to forecast crop yield. They utilized several ML methods with several variables to help farmers decide which crop to grow and to increase yield.
In the present investigation, onion prices from three major markets in India, namely Bengaluru, Delhi, and Lasalgaon, were used. Onion is the second-most-produced vegetable in India after the potato; as per the 3rd Advance Estimates (2020-21), its production stands at 26.83 million tonnes. The modeling and forecasting of onion price series in various Indian markets has drawn considerable attention from researchers, and many studies on the application of stochastic and ML models are available in the literature [25,26,27,28,29,30]. In this study, the price series were first predicted using stochastic models and ML algorithms. Thereafter, wavelet-decomposed subseries were fed into these models to obtain better results. The efficacy of the prediction results was measured using three commonly used error functions: root mean squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). Several wavelet filters have evolved in the literature for use in different contexts. It is necessary to identify which filter performs particularly well for an individual model so that it can be used efficiently when modeling time series with that model in the future. In this paper, an attempt is made to find the best-performing filter for each method based on several performance metrics.
2. Methodology
2.1. ARIMA
The most popular linear time series model is the autoregressive integrated moving average (ARIMA) model [31]. For a time series $y_t$, the ARMA($p$, $q$) model is presented by Equation (1) or (2):

$y_t = c + \phi_1 y_{t-1} + \dots + \phi_p y_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q}$ (1)

or

$\phi(B) \, y_t = \theta(B) \, \varepsilon_t$ (2)

where $\phi(B) = 1 - \phi_1 B - \dots - \phi_p B^p$ and $\theta(B) = 1 + \theta_1 B + \dots + \theta_q B^q$ are the AR and MA polynomials of lag operator $B$ of order $p$ and $q$, respectively, and $\varepsilon_t$ is a white-noise error term.

For a wide class of nonstationary time series, the ARMA model is generalized by incorporating a differencing term. The ARIMA($p$, $d$, $q$) model is defined as:

$\phi(B) \, (1 - B)^d \, y_t = \theta(B) \, \varepsilon_t$ (3)

where $p$, $d$, and $q$ represent the order of autoregression, integration (differencing), and moving average, respectively.
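As a concrete illustration, the AIC-based order selection used later in this paper can be sketched in Python with statsmodels. This is a minimal sketch on a synthetic series, not the paper's actual implementation; the search bounds and the 1-week forecast horizon are illustrative assumptions.

```python
# Minimal ARIMA order search by AIC on a synthetic price-like series.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(42)
prices = np.cumsum(rng.normal(0, 1, 300)) + 100  # synthetic stand-in for a price series

best_aic, best_order = np.inf, None
for p in range(3):
    for q in range(3):
        fit = ARIMA(prices, order=(p, 1, q)).fit()
        if fit.aic < best_aic:
            best_aic, best_order = fit.aic, (p, 1, q)

print(f"Selected ARIMA{best_order}, AIC = {best_aic:.2f}")
forecast = ARIMA(prices, order=best_order).fit().forecast(steps=7)  # 1-week horizon
```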
2.2. GARCH
The ARIMA model is unable to capture the nonlinear structure of a time series. The generalized autoregressive conditional heteroscedastic (GARCH) model was proposed in [32] to capture the conditional heteroscedasticity present in time series data. For a GARCH process, the conditional distribution of the error, $\varepsilon_t$, given the available information $\psi_{t-1}$ up to the $(t-1)$th time epoch, is assumed to follow a normal distribution, i.e., $\varepsilon_t \mid \psi_{t-1} \sim N(0, h_t)$ and $\varepsilon_t = e_t \sqrt{h_t}$. The values of $e_t$ are identically and independently distributed (i.i.d.) innovations with zero mean and unit variance.

Here, the conditional variance $h_t$ of the GARCH($p$, $q$) process is defined as:

$h_t = \alpha_0 + \sum_{i=1}^{q} \alpha_i \, \varepsilon_{t-i}^2 + \sum_{j=1}^{p} \beta_j \, h_{t-j}$ (4)

provided $\alpha_0 > 0$, $\alpha_i \geq 0$, and $\beta_j \geq 0$.

The GARCH($p$, $q$) process is said to be weakly stationary if and only if:

$\sum_{i=1}^{q} \alpha_i + \sum_{j=1}^{p} \beta_j < 1$ (5)
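A GARCH fit of this kind can be sketched with the arch package; this is an assumption about tooling, as the paper does not name its software, and the heavy-tailed synthetic residuals are illustrative only.

```python
# Minimal GARCH(1, 1) fit on stand-in ARIMA residuals using the arch package.
import numpy as np
from arch import arch_model

rng = np.random.default_rng(0)
residuals = rng.standard_t(df=5, size=500)  # heavy-tailed stand-in for ARIMA residuals

garch = arch_model(residuals, mean="Zero", vol="GARCH", p=1, q=1)
result = garch.fit(disp="off")
print(result.params)                           # omega, alpha[1], beta[1]
h_next = result.forecast(horizon=7).variance   # conditional variance, 1 week ahead
```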
2.3. ANN
ANNs are nonlinear, data-driven, and self-adaptive approaches. Like human brains, neural networks also consist of processing units (artificial neurons) and connections (weights) between them.
The output $y$ from a neuron can be expressed as:

$y = f\left( \sum_{i=1}^{n} w_i x_i + b \right)$ (6)

where $x_i$ are the inputs to the network, $w_i$ are the corresponding weights, $b$ is the bias imposed on the output of the neuron, and $f(\cdot)$ is the activation function. The number of hidden layers, the number of neurons in each hidden layer, the learning rate, the activation function, the regularization technique, and the optimization algorithm are its hyperparameters. Manual tuning, grid search, random search, Bayesian optimization, and evolutionary algorithms are some of the methods employed to tune them. Once the system is optimized, it can produce outputs from the supplied inputs.
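A minimal sketch of such a network on lagged series values, assuming scikit-learn; the hidden-layer size, sigmoid activation, and learning rate are illustrative choices, and the synthetic series stands in for a price series.

```python
# Single-hidden-layer ANN on a lagged feature matrix built from one series.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
series = np.sin(np.linspace(0, 20, 400)) + rng.normal(0, 0.1, 400)

lags = 7
X = np.column_stack([series[i:len(series) - lags + i] for i in range(lags)])
y = series[lags:]

ann = MLPRegressor(hidden_layer_sizes=(4,), activation="logistic",
                   alpha=1e-4,                # L2 regularization strength
                   learning_rate_init=0.05, max_iter=2000, random_state=1)
ann.fit(X[:-30], y[:-30])                     # hold out the last 30 points
print("validation R^2:", ann.score(X[-30:], y[-30:]))
```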
2.4. SVR
SVR is useful in both classification and regression studies [11]. It is formulated as Equations (7) and (8), subject to the constraints represented in Equation (9):

$\min_{w, b, \xi_i, \xi_i^*} \; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*)$ (7)

$\hat{y} = \sum_{i=1}^{n} (\alpha_i - \alpha_i^*) \, K(x_i, x) + b$ (8)

$y_i - w^\top \phi(x_i) - b \leq \epsilon + \xi_i, \quad w^\top \phi(x_i) + b - y_i \leq \epsilon + \xi_i^*, \quad \xi_i, \xi_i^* \geq 0$ (9)

In the above equations, $x_i$ are the input variables; $\hat{y}$ is the predicted value of the output variable; $K(x_i, x) = \phi(x_i)^\top \phi(x)$ is the kernel function; $\alpha_i$ and $\alpha_i^*$ are constants where $0 \leq \alpha_i, \alpha_i^* \leq C$ and $\sum_{i=1}^{n} (\alpha_i - \alpha_i^*) = 0$; $C$ is the cost factor or regularization parameter; $\xi_i$ and $\xi_i^*$ are slack variables; and epsilon ($\epsilon$) is a constant defining the width of the insensitive tube. The kernel function (linear, polynomial, radial basis function), cost factor, and epsilon are tunable parameters for optimizing the SVR algorithm.
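A minimal epsilon-SVR sketch with a radial kernel, assuming scikit-learn; C, epsilon, and gamma correspond to the tunable parameters named above, and the data are synthetic.

```python
# Epsilon-SVR with an RBF kernel; the number of support vectors is reported,
# as in the per-market model summaries later in the paper.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=(200, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(0, 0.1, 200)

svr = SVR(kernel="rbf", C=1.0, epsilon=0.1, gamma=0.33)
svr.fit(X, y)
print("support vectors used:", len(svr.support_))
print("predictions:", svr.predict(X[:3]))
```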
2.5. RF
RF is a supervised learning algorithm used for both classification and regression. A natural forest consists of trees; similarly, the random forest algorithm develops decision trees from the sample information, obtains predictions from each of them, and finally combines them, by majority voting in classification or by averaging in regression. It is based on the bagging (bootstrap aggregation) technique over decision trees. The correlation between trees is reduced by applying randomization in two ways. Firstly, each tree is trained on a bootstrapped subset of the data. Secondly, the feature by which splitting is performed at each node is not selected from all possible features but only from a random subset of them of size $m$. The algorithm generates all $B$ trees independently, constructing a full binary tree of maximum depth for each candidate tree. The random forest prediction can be estimated using the following formula:

$\hat{y} = \frac{1}{B} \sum_{i=1}^{B} \hat{y}_i$ (10)

Here, $B$ is the total number of candidate regression trees, $\hat{y}_i$ is the prediction from the $i$th tree, and $\hat{y}$ is the final random forest prediction.
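A minimal sketch of Equation (10) via scikit-learn's RandomForestRegressor; the tree count of 500 echoes the setting reported later, while the data and remaining settings are illustrative.

```python
# Random forest regression: B bootstrapped trees averaged, with per-split
# feature subsampling (max_features) providing the second source of randomness.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 5))
y = X[:, 0] ** 2 + X[:, 1] + rng.normal(0, 0.1, 300)

rf = RandomForestRegressor(n_estimators=500,     # B trees
                           max_features="sqrt",  # random feature subset of size m
                           oob_score=True, random_state=3)
rf.fit(X, y)
print("out-of-bag R^2 (explained variation):", rf.oob_score_)
```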
2.6. SMLR
MLR is formulated as:

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \varepsilon$ (11)

where $y$ denotes the response; $x_1, x_2, \dots, x_k$ are the explanatory variables (predictors); and $\beta_0, \beta_1, \dots, \beta_k$ are the constants to be estimated, called regression coefficients. In SMLR, the final model is produced by selectively adding or deleting explanatory variables one at a time and iteratively checking their statistical significance. The algorithm proceeds in a stepwise manner, typically using a combination of forward selection, backward elimination, and/or variable updating. Once the stepwise procedure is complete, the algorithm provides a final model that includes a subset of the available predictor variables. The selected variables are considered the most relevant and statistically significant for predicting the response variable.
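A minimal forward-selection sketch of SMLR using OLS p-values from statsmodels; the 0.05 entry threshold is an assumption, not a value taken from the paper, and the data are synthetic.

```python
# Forward stepwise selection: add the most significant candidate predictor
# until no remaining candidate is significant at the chosen level.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 6))
y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(0, 1, 200)

selected, remaining = [], list(range(X.shape[1]))
while remaining:
    # p-value of each candidate when added to the current model
    pvals = {j: sm.OLS(y, sm.add_constant(X[:, selected + [j]])).fit().pvalues[-1]
             for j in remaining}
    best = min(pvals, key=pvals.get)
    if pvals[best] > 0.05:        # stop when no candidate is significant
        break
    selected.append(best)
    remaining.remove(best)

print("selected predictors:", selected)   # expect columns 0 and 2
```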
2.7. MARS
The MARS model uses a series of piecewise linear or nonlinear splines as basis functions (BFs) to mimic the nonlinear relationship between the output and input variables. Each subspace's BF and slope can be altered by moving from one subspace to the adjacent subspace. Knots are the endpoints of each section. The MARS model can be denoted by:

$\hat{y} = c_0 + \sum_{m=1}^{M} c_m \, BF_m(x)$ (12)

where $\hat{y}$ is the expected response, $c_0$ is the bias, and $c_m$ is the unknown coefficient of the weight connecting the $m$th BF to the response. The unknown coefficients are estimated using the least squares method. The spline BF can be expressed as:

$BF_m(x) = \prod_{k=1}^{K_m} \left[ s_{km} \left( x_{v(k,m)} - t_{km} \right) \right]_+$ (13)

where $K_m$ is the number of knots; $s_{km}$ denotes the right or left associated linear step function, which takes values of either +1 or −1; $x_{v(k,m)}$ denotes the input variable $v(k,m)$ at knot $k$; and $t_{km}$ represents the knot location.

An optimal MARS model is created by using a two-stage forward and backward technique. In the forward stage, the data are overfitted by taking into account a large number of BFs. To overcome this, the redundant BFs are eliminated from Equation (12) in the backward stage. The generalized cross-validation (GCV) criterion is used to remove the redundant BFs. The GCV is calculated as:

$GCV = \frac{\frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}{\left( 1 - \frac{C(M)}{N} \right)^2}$ (14)

where $N$ is the total number of points in the data; $y_i$ is the observed response; $\hat{y}_i$ is the expected response; and $C(M)$ is a complexity penalty that increases with the number of BFs in the model, defined as $C(M) = (M + 1) + dM$. Here, $M$ is the number of BFs in the model, and $d$ denotes the penalty per BF.
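A minimal sketch assuming the py-earth package (installable as sklearn-contrib-py-earth), which the paper does not itself mention; `penalty` plays the role of the per-BF penalty $d$ in the GCV criterion of Equation (14), and all other settings are illustrative.

```python
# MARS fit on a synthetic hinge-shaped response; summary() lists the
# surviving hinge basis functions and their knot locations.
import numpy as np
from pyearth import Earth

rng = np.random.default_rng(5)
X = rng.uniform(0, 10, size=(300, 2))
y = np.where(X[:, 0] > 5, 2 * (X[:, 0] - 5), 0) + X[:, 1] + rng.normal(0, 0.2, 300)

mars = Earth(max_terms=20, max_degree=1, penalty=3.0)  # forward-stage cap, additive BFs
mars.fit(X, y)
print(mars.summary())
```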
2.8. PCR
PCA is a data dimension reduction technique in which a set of correlated variables is transformed into a set of uncorrelated PCs that retain as much information as the original variables. The PCs are ordered by explained variance, and the first PC can explain most of the variance in the data.
Let $\mathbf{x} = (x_1, x_2, \dots, x_p)'$ be the vector of variables under study and $\boldsymbol{\Sigma}$ be the variance–covariance matrix of the dataset. Let $\lambda_1 \geq \lambda_2 \geq \dots \geq \lambda_p \geq 0$ be the eigenvalues of $\boldsymbol{\Sigma}$ and $\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_p$ be the corresponding eigenvectors. Then, the PCs are defined as $PC_i = \mathbf{e}_i' \mathbf{x}$, subject to the conditions that $\mathbf{e}_i' \mathbf{e}_i = 1$ and $\mathbf{e}_i' \mathbf{e}_j = 0$ for $i \neq j$. The variances of the PCs are the corresponding eigenvalues, i.e., $Var(PC_i) = \lambda_i$. Out of the $p$ PCs, the first few are selected that can explain around 85% to 90% of the total variability. PCR is a statistical technique that combines PCA and linear regression. It is used for handling multicollinearity in regression models and for dealing with high-dimensional data. In PCR, the selected PCs are used as regressors in a linear regression model, and usually the ordinary least squares (OLS) method is applied for estimation. PCR helps to improve the stability and interpretability of the regression model.
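A minimal PCR sketch with scikit-learn, retaining the PCs that explain about 90% of the variance (`PCA(n_components=0.90)`) before OLS regression; the collinear synthetic predictors are illustrative.

```python
# PCR pipeline: standardize, project onto leading PCs, then regress by OLS.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(6)
base = rng.normal(size=(200, 3))
X = np.hstack([base, base + rng.normal(0, 0.05, (200, 3))])  # highly collinear predictors
y = base @ np.array([1.0, -2.0, 0.5]) + rng.normal(0, 0.1, 200)

pcr = make_pipeline(StandardScaler(), PCA(n_components=0.90), LinearRegression())
pcr.fit(X, y)
print("PCs retained:", pcr.named_steps["pca"].n_components_)
print("R^2:", pcr.score(X, y))
```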
2.9. Wavelet
The wavelet transform (WT) uses particular high- and low-pass filters to decompose a signal into several subseries containing information at different resolutions [33]. The number of levels of decomposition ($J$) is fixed according to the number of observations ($N$) in the series ($J = \log_2 N$ when $N$ is a power of 2). Detailed (Equation (15)) and smooth (Equation (16)) coefficients are generated at the first decomposition. In the subsequent steps, decomposition proceeds on the approximate (smooth) series until all levels of decomposition are completed. The detailed coefficients are given by:

$d_{j,k} = \sum_{t} \psi_{j,k}(t) \, y_t$ (15)

The smooth coefficients are given by:

$s_{J,k} = \sum_{t} \phi_{J,k}(t) \, y_t$ (16)

where $\psi$ is the wavelet function and is associated with $\phi$, the scaling function. The wavelet coefficient, or mother wavelet, is represented as $\psi_{j,k}(t) = 2^{-j/2} \, \psi(2^{-j} t - k)$; the father wavelet, or scaling coefficient, is denoted by $\phi_{J,k}(t) = 2^{-J/2} \, \phi(2^{-J} t - k)$; $t$ represents time; and $j$ and $k$ are the scale and translation parameters, respectively.
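A minimal three-level DWT sketch assuming PyWavelets; the paper's filter names follow R's wavelets package, and PyWavelets' 'db2' (4 taps) corresponds to the D4 filter, while BL14 is not available there. The synthetic signal is illustrative.

```python
# Three-level DWT: one smooth (s3) and three detail (d3, d2, d1) coefficient
# vectors, followed by exact reconstruction via the inverse transform.
import numpy as np
import pywt

rng = np.random.default_rng(7)
signal = np.cumsum(rng.normal(0, 1, 512)) + 100

coeffs = pywt.wavedec(signal, wavelet="db2", level=3)   # [s3, d3, d2, d1]
s3, d3, d2, d1 = coeffs
print("coefficient lengths:", [len(c) for c in coeffs])

reconstructed = pywt.waverec(coeffs, wavelet="db2")     # inverse DWT
print("max reconstruction error:", np.max(np.abs(reconstructed - signal)))
```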
2.10. Proposed Methods
The extent of statistical dependencies in the onion price series was assessed using the autocorrelation function (ACF) and partial autocorrelation function (PACF). Lag series were prepared accordingly for all series. The price series were predicted using a suitable ARIMA model selected on the basis of the Akaike information criterion (AIC). The residual series were obtained, and it was found that they were not white noise. The residual series were fitted with the GARCH model to obtain the prediction. Using the prepared data frame of lagged series, the ANN, SVR, RF, SMLR, MARS, and PCR models were trained. The predictions obtained from these individual models were stored to determine the accuracy measures. Wavelet decomposition of the original price series was carried out using the Haar, D4, C6, LA8, and BL14 filters at three levels; three levels of decomposition have been suggested in many studies [23,34,35]. Thereafter, the wavelet-decomposed subseries were fitted one by one into the ARIMA, GARCH, ANN, SVR, and RF models, respectively. Thus, each of these models yielded five predictions, one per wavelet filter. All of them were stored to calculate their prediction performance. A detailed explanation of the work is given in the following steps.
Step 1: The original series was divided into training and validation sets. Two validation sets were prepared, containing the last 1-week and last 1-month data, respectively.
Step 2: The training set was used to train various models by preparing lag series wherever necessary. The obtained predictions were validated through unseen validation sets, and the performance under different scenarios was recorded.
Step 3: Decomposition of the actual series was performed through five wavelet filters, namely Haar, D4, C6, BL14, and LA8.
Step 4: The decomposed series were divided into training and validation sets.
Step 5: The lag series were prepared for fitting the training set into various models (see the lag-preparation sketch after this list).
Step 6: Predictions from the wavelet-based models were obtained and compared with those from the validation sets.
Step 7: According to the results, the best wavelet filter in combination with the best statistical model was determined.
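As referenced in Step 5, the following is a minimal sketch of building a lagged data frame and holding out the last 30 observations (Step 1). The lag order of 7 is illustrative; in the paper, lags were chosen from the ACF/PACF.

```python
# Build a lagged feature frame (lag_1..lag_n as predictors, y as target)
# and split off the final 30 days as the validation set.
import numpy as np
import pandas as pd

def make_lagged_frame(series: pd.Series, n_lags: int) -> pd.DataFrame:
    frame = pd.DataFrame({f"lag_{k}": series.shift(k) for k in range(1, n_lags + 1)})
    frame["y"] = series
    return frame.dropna()

prices = pd.Series(np.cumsum(np.random.default_rng(8).normal(0, 1, 200)) + 100)
data = make_lagged_frame(prices, n_lags=7)
train, valid = data.iloc[:-30], data.iloc[-30:]
print(train.shape, valid.shape)
```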
In this study, the stochastic ARIMA and GARCH models and the ML models ANN, MARS, PCR, RF, SMLR, and SVR were used along with wavelet-based decomposition (Figure 1).
These models are collectively represented as WML in Equation (23). If $y_t$ is the actual series ($t$ represents time), $W$ represents the wavelet transform, and $d_{1,t}$, $d_{2,t}$, $d_{3,t}$, and $s_{3,t}$ are the decomposed series at three levels of decomposition (Equation (17)), then:

$W(y_t) = \{ d_{1,t}, d_{2,t}, d_{3,t}, s_{3,t} \}$ (17)

WARMA represents the wavelet-based ARMA model and is expressed as:

$\phi(B) \, \left( d_{i,t}/s_{3,t} \right) = \theta(B) \, \left( \varepsilon_{i,t}/\varepsilon_{s,t} \right)$ (18)

Here, $i = 1, 2, 3$; and $\varepsilon_{i,t}$ and $\varepsilon_{s,t}$ are the error terms associated with the $d_{i,t}$'s and $s_{3,t}$, respectively. The '/' sign in Equations (18)–(22) has been used to denote 'or'.
The conditional variance of the error terms in Equation (18) can be modeled using the wavelet-based GARCH (WGARCH) model, as in Equation (19):

$h_{i,t}/h_{s,t} = \alpha_0 + \sum_{l=1}^{q} \alpha_l \left( \varepsilon_{i,t-l}^2/\varepsilon_{s,t-l}^2 \right) + \sum_{j=1}^{p} \beta_j \left( h_{i,t-j}/h_{s,t-j} \right)$ (19)
The wavelet-based ANN (WANN) model is represented below:

$\hat{d}_{i,t}/\hat{s}_{3,t} = f\left( \sum_{l=1}^{n} w_l \left( d_{i,t-l}/s_{3,t-l} \right) + b \right)$ (20)

The term $f(\cdot)$ represents the activation function, the terms $d_{i,t-l}/s_{3,t-l}$ represent lags of $d_{i,t}/s_{3,t}$, and $\hat{d}_{i,t}/\hat{s}_{3,t}$ represent the respective predicted values.
The wavelet-based SVR (WSVR) model is given as:

$\hat{d}_{i,t}/\hat{s}_{3,t} = \sum_{l} (\alpha_l - \alpha_l^*) \, K(\mathbf{x}_l, \mathbf{x}) + b$ (21)

Here, $\mathbf{x}$ represents the matrix of lagged values of $d_{i,t}/s_{3,t}$. For implementation of the wavelet-based RF (WRF) model, each $d_{i,t}/s_{3,t}$ is modeled with the RF algorithm using its lagged series as the predictor variables. If $\hat{y}_{b,t}$ is the prediction from the $b$th tree, then:

$\hat{d}_{i,t}/\hat{s}_{3,t} = \frac{1}{B} \sum_{b=1}^{B} \hat{y}_{b,t}$ (22)
The inverse wavelet transform is represented as $W^{-1}$, $g$ is the function (Equations (18)–(22)) used to predict each subseries individually, and $\hat{y}_t$ is the final prediction of the WML model, so:

$\hat{y}_t = W^{-1}\left( g(d_{1,t}), \, g(d_{2,t}), \, g(d_{3,t}), \, g(s_{3,t}) \right)$ (23)
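The WML idea can be sketched under one simplifying assumption: `pywt.mra` (PyWavelets 1.3+) yields additive multiresolution components, so the inverse transform $W^{-1}$ in Equation (23) reduces to summing the per-component predictions. The filter, lag order, and synthetic series are illustrative.

```python
# WRF-style pipeline: decompose, fit one RF per subseries on its own lags,
# then sum the one-step-ahead predictions to recover the original scale.
import numpy as np
import pywt
from sklearn.ensemble import RandomForestRegressor

def lagged(x, n_lags):
    X = np.column_stack([x[i:len(x) - n_lags + i] for i in range(n_lags)])
    return X, x[n_lags:]

rng = np.random.default_rng(9)
price = np.cumsum(rng.normal(0, 1, 512)) + 100

components = pywt.mra(price, wavelet="haar", level=3)  # analogues of d1, d2, d3, s3

final_pred = 0.0
for comp in components:
    X, y = lagged(comp, n_lags=7)
    rf = RandomForestRegressor(n_estimators=500, random_state=9).fit(X, y)
    final_pred += rf.predict(comp[-7:].reshape(1, -1))[0]  # g(.) per subseries

print("one-step-ahead WRF prediction:", final_pred)
```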
The ranks of the individual models were determined based on RMSE, MAE, and MAPE values for each of the markets. Then, the wavelet filter that performed the best for an individual model was determined using similar metrics. For a particular wavelet filter, the ranks of the wavelet-based combination models were obtained based on the performance measures. Based on the results, the best wavelet-ML combination model was declared, and poorly performing models were pointed out for particular markets.
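For reference, the three error functions used for this ranking can be computed as follows; the example prices are illustrative values, not data from the paper.

```python
# RMSE, MAE, and MAPE (in percent), the metrics used to rank all models.
import numpy as np

def rmse(actual, pred):
    return np.sqrt(np.mean((np.asarray(actual) - np.asarray(pred)) ** 2))

def mae(actual, pred):
    return np.mean(np.abs(np.asarray(actual) - np.asarray(pred)))

def mape(actual, pred):
    actual, pred = np.asarray(actual), np.asarray(pred)
    return 100 * np.mean(np.abs((actual - pred) / actual))

actual = np.array([2500.0, 2600.0, 2450.0])   # illustrative prices (Rs./q)
pred = np.array([2480.0, 2630.0, 2500.0])
print(rmse(actual, pred), mae(actual, pred), mape(actual, pred))
```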
3. Data Description
The daily modal prices (Rs./q) of onions for the Bengaluru, Delhi, and Lasalgaon markets were obtained from the Agricultural Marketing Information System (AGMARKNET) website (https://agmarknet.gov.in/, accessed on 28 January 2023) for the time interval of 1 May 2019 to 31 December 2022. The descriptive statistics of these price series are given in Table 1. From this table, it can be seen that the mean prices were in the order of Bengaluru > Lasalgaon > Delhi. The same ordering held for the median and maximum prices, standard deviation (SD), and coefficient of variation (CV) percentage. The minimum price was the same for the Bengaluru and Lasalgaon markets and lower for the Delhi market. All price series were positively skewed and leptokurtic. For skewness and kurtosis, the ordering was Bengaluru > Delhi > Lasalgaon, with the kurtosis of the Bengaluru price series considerably higher than that of the other two markets. The time plots of the price series are shown in Figure 2; all three series followed broadly similar patterns. The highest spike in price occurred in December 2019, and a price spike of lower intensity was noticeable during the last quarter of 2020.
The other properties of the dataset, such as normality, stationarity, and linearity, were also tested. The normality of the price series was tested using the Shapiro–Wilk test [36]. The null hypothesis of this test is that the series follows a normal distribution, and it was seen (Table 2) that none of the price series displayed normality at the 1% level of significance.
The stationarity of the price series was tested using the Kwiatkowski–Phillips–Schmidt–Shin (KPSS) test [37] and the Phillips–Perron (PP) test [38]. The KPSS test has a null hypothesis of stationarity (absence of a unit root), whereas the PP test assumes under its null hypothesis that the data have a unit root (non-stationarity). The values of the test statistics for these two tests are given in Table 3; all price series were found to be stationary.
The linearity of the price series was tested using the Broock–Dechert–Scheinkman (BDS) test [39]. Its null hypothesis is that the data in a time series are independently and identically distributed (i.i.d.). The values of the test statistics for two and three embedding dimensions at different values of epsilon are given in Table 4; all were significant at the 1% level of significance. Hence, it can be concluded that the price series are nonlinear.
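This battery of tests can be sketched in Python, assuming scipy, statsmodels, and the arch package (the paper does not state which software it used); the synthetic series stands in for an onion price series.

```python
# Shapiro-Wilk (normality), KPSS and Phillips-Perron (stationarity),
# and BDS (i.i.d./linearity) tests on one series.
import numpy as np
from scipy.stats import shapiro
from statsmodels.tsa.stattools import kpss, bds
from arch.unitroot import PhillipsPerron

rng = np.random.default_rng(10)
price = np.cumsum(rng.normal(0, 1, 400)) + 100

w_stat, w_p = shapiro(price)                                  # H0: normality
kpss_stat, kpss_p, _, _ = kpss(price, regression="c", nlags="auto")  # H0: stationary
pp = PhillipsPerron(price)                                    # H0: unit root
bds_stat, bds_p = bds(price, max_dim=3)                       # H0: i.i.d.

print(f"Shapiro-Wilk p = {w_p:.4f}, KPSS p = {kpss_p:.4f}, "
      f"PP p = {pp.pvalue:.4f}, BDS p (dims 2-3) = {bds_p}")
```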
The autocorrelation function (ACF) and partial autocorrelation function (PACF) of the onion price series are illustrated in Figure 3. Significant values of the ACF and PACF indicate statistical dependencies among the lagged realizations of a time series. The ACF values of the price series of all the markets remained significant even at large lags, decaying hyperbolically; this indicates the presence of long-term persistence (long memory). Significant values of the PACF at different lags were also noticeable.
In this research article, wavelet decomposition of the price series helped to address non-normality, nonlinearity, and the presence of long-term persistence.
5. Results and Discussion
In this study, the onion price series of three markets in India were divided into training and validation sets. The validation set contained 30 observations for each market. Here, the short- (1-week) and long- (1-month) term prediction performances of the models were studied. A total of 33 different models, including wavelet-based combination models, were used to model the series. Among the models used for this purpose, two were stochastic models and six were ML models.
The training process of an ANN involved feeding the training data through the network, adjusting the weights based on the error, and updating the model's hyperparameters (number of hidden layers, number of neurons in each hidden layer, learning rate, activation function, regularization technique, and optimization algorithm). The validation set was used to assess the model's performance and choose the best hyperparameters. There are several methods for tuning the hyperparameters of an ANN, including manual tuning, grid search, random search, Bayesian optimization, and evolutionary algorithms. Significant changes were not noticed during the manual tuning phase with different hyperparameter setups. To keep the model simple but efficient, a single hidden layer with four hidden units, a sigmoid activation function, resilient backpropagation with weighted backtracking as the optimization algorithm, L2 regularization, and a learning rate of 0.05 were set. For the RF-based models, 100, 250, 500, and 1000 trees (bootstrapped subsets) were tried; 500 trees were found to be optimal for all three markets, giving the maximum explained variation in the data. To tune the SVR model, the epsilon values were varied from 0.001 to 0.1, the cost factor from 0.01 to 1, and gamma from 0.1 to 0.5. The optimum combination of hyperparameters was selected based on the minimum MSE value. The optimum hyperparameter values are reported in Tables S1–S4 in the Supplementary Material.
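A hedged sketch of the SVR tuning grid described above, using scikit-learn's GridSearchCV with a time-series-aware split; the grid bounds follow the text, while the feature matrix and split count are illustrative assumptions.

```python
# Grid search over epsilon, cost factor C, and gamma, scored by minimum MSE.
import numpy as np
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.svm import SVR

rng = np.random.default_rng(11)
X = rng.normal(size=(300, 7))                 # stand-in for a lagged-price matrix
y = X @ rng.normal(size=7) + rng.normal(0, 0.1, 300)

grid = {"epsilon": [0.001, 0.01, 0.1],
        "C": [0.01, 0.1, 1.0],
        "gamma": [0.1, 0.33, 0.5]}
search = GridSearchCV(SVR(kernel="rbf"), grid,
                      scoring="neg_mean_squared_error",
                      cv=TimeSeriesSplit(n_splits=5))
search.fit(X, y)
print("best hyperparameters:", search.best_params_)
```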
In the case of the Bengaluru market, the order of the best ARIMA model was (4, 1, 4) with an AIC value of −6516.45. The most efficient SVR model had a radial kernel, an epsilon value of 0.1, a cost factor of 1, a gamma of 0.33, and 269 support vectors. The RF model with 500 trees explained 95.71% of the variation in the data. In the case of the Delhi market, the order of the best ARIMA model was (1, 1, 0) with an AIC value of −6331.73. The most efficient SVR model had 209 support vectors, with the other parameters remaining the same. The best RF model explained 97.39% of the variation in the Delhi price. In the case of the Lasalgaon market, the order of the best ARIMA model was (0, 1, 1) with an AIC value of −5140.65. The best SVR model used 325 support vectors, with the other parameters remaining the same as those for the Bengaluru market. The best RF model explained 94.97% of the variation in the Lasalgaon price.
Five wavelet filters were used to decompose the three datasets. Each decomposed set was predicted using the ARMA, GARCH, ANN, SVR, and RF models to obtain the final predictions of the original datasets; these are called the wavelet-based combination models. In total, 25 wavelet-based combination models were formulated.
The validation results of the performance metrics for the 1-month and 1-week data for the above-mentioned models for the three markets are provided in Tables 5 and 6, respectively. The MAPE values in the tables are given in percentages. To denote the different wavelet-based combination models, the term 'Wmodel_filter' is used, where 'W' stands for 'wavelet', 'model' is one of ARMA, GARCH, ANN, SVR, or RF, and 'filter' is one of Haar, D4, C6, BL14, or LA8.
To compare the prediction performance of the different methods, Figure 4 depicts the actual versus predicted values obtained using the best prediction model.
There were eight individual models, comprising the stochastic and ML models. Their performance in predicting the onion prices of the mentioned markets was compared, and the ranks of these models are presented in Table 7. PCR was the best-performing model for the Bengaluru and Delhi markets based on the RMSE values, and SVR was the best model for the Lasalgaon market. The ARIMA model was the worst performer for the Bengaluru and Delhi markets, and GARCH did not perform well for the Lasalgaon market. Based on the MAE values, MARS was the best-performing model for the first two markets, and ANN was the best for the last market; ARIMA was the worst performer for the first market, and GARCH was the worst for the remaining two. Based on the MAPE values, however, ANN was the best-performing model for all three markets, with ARIMA and GARCH the worst-performing models for the first market and the last two markets, respectively.
It is also necessary to identify which filter performs particularly well with an individual model. A thorough representation of the ranks of the models with separate filters is provided in Table 8 for predicting the onion prices of the selected markets. The three performance measures indicated that, with the Haar filter, the RF model was the best performer for all markets. With the D4 filter, the best performance was achieved by RF for the Bengaluru market, while GARCH was the best model for the Delhi and Lasalgaon markets. It may be said that the RF and GARCH models performed consistently when used with the C6 filter for the prediction of onion prices. RF, ARMA, and SVR performed best with the BL14 filter for the Bengaluru, Delhi, and Lasalgaon markets, respectively. GARCH and RF were more or less equally efficient with the LA8 filter for the prediction of onion prices in the three markets.
The previous discussion was confined to which model predicts well with a particular wavelet filter. In this section, an attempt is made to identify the best filter to use with an individual model to predict onion prices. From Table 9, it is noticeable that the Haar and D4 filters could be used interchangeably with the stochastic and ML models mentioned above for the best prediction. However, the BL14 and C6 filters should not be used for the decomposition of datasets in these models.
With the filter-wise and model-wise comparisons complete, the best and worst models for the prediction of onion prices could be identified. It was observed from Table 10 that the Haar filter combined with the RF and SVR models was the best performer for all markets. WGARCH_BL14 and WANN_BL14 were the worst-performing models for the prediction of onion prices in the first two markets; here, GARCH was the most poorly performing model. WANN_C6 and WARMA_C6 were the least desirable models for the Lasalgaon market, where the ANN-based combination was the worst.