1. Introduction
In this dynamic business world, research on risk management has gained a great deal of importance and it will continue to be of interest to many researchers. Price risk in the agricultural commodities market is expected. Stakeholders such as traders, industrial buyers, and growers will experience adverse effects from such risk.
Sekhar (
2004) stated that the economic reforms in India in the year 1991–1992 resulted in a liberalized import policy, which in turn transmitted international price volatility to the domestic market. Cocoa is not exempt from such price volatility; its price is unstable in the global market. Hence, the incomes of farmers and governments in Cocoa-growing countries have become volatile (
Oomes and Tieben 2016).
Brown et al. (
2008) mentioned that the conflict between the larger producing countries, exchange rate fluctuations, and export dumping had exposed commodity Cocoa in the international market to price risk.
Aidenvironment and Sustainable Food Lab (
2018) stated that price volatility is common in the tropical commodities market, where the international Cocoa market saw a 40% price drop in 2016.
Jayasekhar and Ndung (
2018) opined that the domestic Cocoa price in India would follow the price movements in the international market. The price risk in the market makes farmer families suffer. Traders and industrial buyers also see an adverse impact on their financial results (
Chloe Taylor 2021). Commodity price risk may become a bottleneck for exports of such commodities; this is evident in Indonesian Cocoa export performance (
Fauziah Widayat et al. 2019).
Hedging, price predictions, and minimum support prices from the government are the major price risk management strategies available for farmers and traders of commodities. In the international markets, hedging the price risk using forwards, futures, or options contracts is recommended for growers and traders of Cocoa (
Janchum et al. 2017;
Aidenvironment and Sustainable Food Lab 2018). Market-based price risk management strategies such as futures and options were introduced on many commodities in a few countries (
FAO et al. 2011). However, many commodities are still not traded in the Indian commodity exchanges, and Cocoa is one such commodity.
Many firms, traders, and growers are exposed to the volatility of commodity prices; this in turn will influence the price of raw materials (
Gaudenzi et al. 2018). This price risk has a direct bearing on the financial and operational performance of the company. To mitigate such price risks, hedging using financial instruments such as forwards, futures, and options or using an accurate price forecast model is important (
Buhl et al. 2011). The futures of commodity Cocoa are not traded in India. Cocoa futures are traded in the USA and the UK on the Intercontinental Exchange (ICE) platform. Using the vector autoregression (VAR) methodology, this study aims to find the link between the Cocoa market in India and ICE. As an alternative, this study aims to develop a price prediction model using the Box–Jenkins ARIMA methodology.
2. Literature Review
Box and Jenkins ARIMA methodology has been used in many disciplines to predict the demand and price of commodities, weather, and economic indicators. For example, Abdullah (2012), Cortez et al. (2018), and Nochai and Titida (2006) used the Box and Jenkins methodology to develop prediction models for bitcoin, oil, and other commodity prices.
Hossain et al. (
2006) developed an ARIMA model to predict the prices of pulses in Bangladesh. The application of this model in the real estate sector to forecast residential property prices is seen in the works of Tse (1997) and Chin and Fan (2005). In the financial markets, the capital market is the most active and highly volatile; to manage the price risk of this volatile market, ARIMA models were developed by Adebiyi et al. (2014) and Mondal et al. (2014).
Abonazel and Ibrahim (
2019) developed the ARIMA model to predict the Egyptian GDP, and
Farooqi (
2014) used this model to forecast the Imports and Exports of Pakistan. Using eight years of monthly data for the period 2008 to 2016,
Sukiyono et al. (
2018) developed the ARIMA model to predict the Cocoa price in Indonesia. In time series analysis, the ARIMA methodology developed by Box and Jenkins in 1970 has been widely used in various disciplines. Although multivariate time series methods such as VAR and regression are popular in forecasting, the univariate ARIMA model is widely accepted because its explanatory variables are the past values of the dependent variable (
Hossain et al. 2006). The ARIMA model is the most preferred methodology in financial and agricultural forecasts; particularly, it has shown greater efficiency in generating short-term forecasts (
Adebiyi et al. 2014;
Abonazel and Ibrahim 2019). In such a volatile market, an appropriate prediction model will help all stakeholders of the commodity market; this makes applying the ARIMA model for such predictions worthwhile. A good number of studies have appeared with regard to the ARIMA model for agricultural commodities; for example, (
Darekar and Reddy 2017;
Mishra et al. 2019;
Shil et al. 2013;
Sukiyono et al. 2018) have developed an ARIMA model to predict the prices of Arecanut, Cocoa, Potato, and Cotton, respectively.
The commodity Cocoa is used for the production of chocolate, which is consumed by end-users for joy and celebrations. However, the price of this agricultural commodity has been more volatile in the world market since 2014 (
Pipitone 2019).
Maurice and Davis (
2011) stated that the world Cocoa market was highly volatile in the last decade. Various crises in the world market such as financial, economic, energy, and food are the reasons for this volatility. From the Indian perspective, the crop Cocoa was not traditionally grown; however, the world Cocoa market was very lucrative in the 1970s, which attracted farmers in India to bring Cocoa plantations into their agricultural properties (
Jayasekhar and Ndung 2018). However, in later decades, the price became more volatile, which made farmers switch from Cocoa to other plantation crops. The last decade has witnessed decent growth in the income of Cocoa farmers and even area expansion (
Jayasekhar and Ndung 2018).
Qaiser Gillani et al. (
2021) also discussed the linkage between long-term economic growth and public spending in Asian countries. The demand for Cocoa in the international market is increasing and Cocoa exports from India are growing considerably too; the Indian Cocoa market follows global price movements (
Radhika and Amarnath 2008).
Hubballi (
2015) opined that the Cocoa crop is suitable for a mixed crop with Coconut and Arecanut.
Kumar et al. (
2021b) suggested that India is the largest producer and consumer of Areca nut in the world. The state of Karnataka has good potential for this crop because of the decent monsoon it receives every year. To summarize this section, Cocoa cultivation in India is expanding; however, both the domestic and the world markets are very sensitive in terms of pricing.
3. Data and Methodology
The study’s objective is to develop a price forecast for two varieties of Cocoa in India. Data for this time series study were collected from the office of The Central Arecanut and Cocoa Marketing and Processing Co-operative Limited (CAMPCO). The monthly data were collected for the period from April 2009 to March 2020. For the same period, the ICE US and London Cocoa futures prices were collected from the official website of investing.com. Two varieties of Cocoa beans are traded in the Karnataka state of India: Dry Cocoa beans and Wet Cocoa beans. In this study, we analyzed the price series of both varieties of Cocoa. The ARIMA and VAR models were developed for both of these series. In time series analysis, stationarity of the series is a prerequisite. If the series used for econometric analysis is not stationary, such analysis will give a spurious regression (
Gujarati et al. 2009). To tackle such stationarity issues, researchers can use an integrated series, taking d-order differences of the raw series (
Kunst 2011). Many econometric experts (for example,
Mills and Patterson (
2009);
Kunst (
2011);
Gujarati et al. (
2009)) opined that correlograms of the series would give an initial clue on the stationarity of the series. However, formal hypothesis testing such as the Augmented Dickey–Fuller test is advised. Hence, in this study, correlograms and the Augmented Dickey–Fuller test are used to check the stationarity of the series.
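The correlogram check described above can be illustrated with a minimal sketch. It uses a simulated random walk as a hypothetical stand-in for a price series (the CAMPCO data are not reproduced here), and the function names `difference` and `acf` are illustrative assumptions, not part of the study. A formal Augmented Dickey–Fuller test would normally come from an econometrics package and is not implemented in this sketch.

```python
import random

def difference(series, d=1):
    """Apply first differencing d times (the 'I' step of ARIMA)."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out

def acf(series, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag (the correlogram values)."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(v * v for v in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(1, max_lag + 1)]

# Hypothetical data: a random walk of 132 points, one per month
# from April 2009 to March 2020. These are NOT the study's prices.
random.seed(0)
prices = [100.0]
for _ in range(131):
    prices.append(prices[-1] + random.gauss(0, 1))

level_acf = acf(prices, 3)                  # slow decay hints at non-stationarity
diff_acf = acf(difference(prices, 1), 3)    # near zero after differencing
print("ACF of levels:     ", [round(r, 2) for r in level_acf])
print("ACF of differences:", [round(r, 2) for r in diff_acf])
```

A correlogram that decays slowly in levels but is close to zero after differencing is the informal signal that d = 1 may be adequate; the formal unit root test should still decide.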
Linear regression is a common tool used for forecasts. The general form of a bivariate linear regression model is shown in Equation (1).

$$Y = \alpha + \beta X + \varepsilon \tag{1}$$

In this simple form of a linear regression equation, $Y$ is the dependent variable, $X$ is the explanatory or independent variable, $\alpha$ and $\beta$ are the constants, and $\varepsilon$ is the error of the model (Rawlings et al. 1998). The same model has been extended for multivariate predictions; this model is suitable when the researcher has more than one independent variable for the dependent variable of interest. The general form of this multivariate regression model is shown in Equation (2).

$$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n + \varepsilon \tag{2}$$

In the second equation, we can observe multiple independent variables with names $X_1$, $X_2$, and $X_n$ and their relationship factors $\beta_1$, $\beta_2$, and $\beta_n$ with the dependent variable $Y$. The above two regression equations can be used only when the determinants of $Y$ are measurable. In the short run, although the explanatory variables are constant, $Y$ might vary due to market trends. In such situations, the application of univariate models such as Auto-Regressive (AR), Moving Average (MA), Auto-Regressive Moving Average (ARMA), or Auto-Regressive Integrated Moving Average (ARIMA) would be most appropriate (Brooks 2008). The common characteristic of these models is that their explanatory variables are the past values of the dependent series or its error terms (Stanton 2017). This is evident from the following third, fourth, and fifth equations.

$$y_t = \mu + \sum_{i=1}^{p} \phi_i y_{t-i} + u_t \tag{3}$$

$$y_t = \mu + \sum_{i=1}^{q} \theta_i u_{t-i} + u_t \tag{4}$$

$$y_t = \mu + \sum_{i=1}^{p} \phi_i y_{t-i} + \sum_{i=1}^{q} \theta_i u_{t-i} + u_t \tag{5}$$

Equations three, four, and five are the general forms of the AR, MA, and ARMA processes, where $y_t$ is the value of the dependent variable. In equations three, four, and five, $u_t$ denotes the white noise error term, and $p$ and $q$ denote the optimal number of lags for the AR (p) and MA (q) processes. The AR model implies that the present value of the dependent variable, y, depends on its past values. The MA model implies that the current value of the dependent variable is a function of the present and the past values of a white noise error term (Brooks 2014). Equation five shows the characteristics of the ARMA model, where the current value of the dependent variable, y, is a function of its past values plus a blend of the present and past values of a white noise disturbance term (Brooks 2008). If the raw series is stationary at its base level, then the applicable model would be ARMA (p, q); however, if the series is not stationary, it must be integrated, that is, differenced, to avoid spurious regression. The differenced series is then used to identify the AR (p) and MA (q) terms for the ARMA model (Fabozzi et al. 2014). Hence, Box and Jenkins ARIMA modelling differs from ARMA by the additional term ‘integrated’, the ‘I’ in the acronym.
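To make the relationship between ARMA and ARIMA concrete, the following is a minimal sketch of the ARMA recursion and of how a forecast of the differenced series is turned back into a price level. The function names and every coefficient value are hypothetical and chosen only for illustration; they are not the estimates reported in this study.

```python
def arma_value(mu, phi, theta, past_y, past_u, u_t=0.0):
    """One ARMA(p, q) step: mu + sum(phi_i * y_{t-i}) + sum(theta_j * u_{t-j}) + u_t.

    past_y[0] is y_{t-1}, past_y[1] is y_{t-2}, and so on; likewise for past_u.
    Pass empty lists for phi or theta to obtain a pure MA or pure AR model.
    """
    ar_part = sum(p * y for p, y in zip(phi, past_y))
    ma_part = sum(q * u for q, u in zip(theta, past_u))
    return mu + ar_part + ma_part + u_t

def undifference(first_level, diffs):
    """Rebuild the level series from its first differences (reverses d = 1)."""
    levels = [first_level]
    for d in diffs:
        levels.append(levels[-1] + d)
    return levels

# Hypothetical ARIMA(1, 1, 0) one-step forecast: an AR(1) on the first difference.
mu, phi = 0.0, [0.5]                  # assumed coefficients, not estimates
y_prev, y_last = 200.0, 204.0         # assumed last two monthly prices
last_diff = y_last - y_prev
next_diff = arma_value(mu, phi, [], [last_diff], [])  # expected shock is zero
next_price = y_last + next_diff
print(next_price)
```

With d = 1 the model never forecasts the price directly; it forecasts the change and adds it to the last observed level, which is why the differencing order must be settled before p and q are chosen.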
The first half of Section 4 focuses on the identification of the ‘p’, ‘q’, and ‘d’ parameters for the ARIMA model to forecast the monthly price of Cocoa. Parameter ‘d’ is the order of differencing needed to make the series stationary. Once the series becomes stationary, the graphical representation of the Auto Correlation Function (ACF) and Partial Auto Correlation Function (PACF) of the stationary series can be used to identify the ‘p’ and ‘q’ parameters (Meeker 2001). However, Brooks (2008) opined that interpreting ACF and PACF correlograms is not an easy task because these figures rarely produce simple patterns. Hence, it is advised to use information criteria statistics of the estimates to decide on the number of lags or parameters for the model. Three popular information criteria are recommended by many econometricians for optimal lag selection: Akaike’s information criterion (AIC), Schwarz’s Bayesian information criterion (SBIC), and the Hannan–Quinn information criterion (HQIC) (
Agung 2009;
Fabozzi et al. 2014;
Mallikarjuna et al. 2019;
Mills and Patterson 2009;
Shmueli and Litchtendahl 2016). The ACF and PACF correlograms of the residuals of the estimates are used for the diagnostic checking (
Gujarati et al. 2009). The following three hypotheses are developed and tested in this study.
H01: The Dry Cocoa price series has a unit root.
H02: The Wet Cocoa price series has a unit root.
H03: The residuals of the estimated model are not serially correlated.
Further, we develop the VAR model; the endogenous variables in the model are US Cocoa futures, London Cocoa futures, and Cocoa prices from India. Equations (6) and (7) show the general form of the bivariate VAR model.

$$CP_t = \alpha_1 + \sum_{i=1}^{k} \beta_{1i} CP_{t-i} + \sum_{i=1}^{k} \gamma_{1i} CF_{t-i} + \varepsilon_{1t} \tag{6}$$

$$CF_t = \alpha_2 + \sum_{i=1}^{k} \beta_{2i} CF_{t-i} + \sum_{i=1}^{k} \gamma_{2i} CP_{t-i} + \varepsilon_{2t} \tag{7}$$

Here, CP in equation six is the Cocoa price, which depends on its own lagged values and the lagged values of the Cocoa futures price (CF), and $\varepsilon_{1t}$ is the white noise error term. CF in the seventh equation is the Cocoa futures price, which depends on its own lagged values and the lagged values of the Cocoa price (CP), and $\varepsilon_{2t}$ is the white noise error term. In the estimated model, however, we accommodate the above-mentioned three endogenous variables.
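As an illustration of the bivariate system, the sketch below computes a one-step forecast from a VAR with a single lag. The function name `var1_forecast`, the intercepts, and the coefficient matrix are all hypothetical placeholders rather than estimates from this study; the same function works unchanged for three or four endogenous variables.

```python
def var1_forecast(intercepts, coef, last_values):
    """One-step VAR(1) forecast: each variable is its intercept plus a
    weighted sum of the previous-period values of ALL variables.

    coef[i][j] is the effect of variable j at t-1 on variable i at t.
    """
    return [c + sum(a * x for a, x in zip(row, last_values))
            for c, row in zip(intercepts, coef)]

# Hypothetical bivariate example: index 0 = Cocoa price (CP), 1 = Cocoa futures (CF)
intercepts = [0.1, 0.0]
coef = [[0.5, 0.2],   # CP_t depends on CP_{t-1} and CF_{t-1}
        [0.0, 0.3]]   # CF_t depends only on its own lag in this made-up case
last_values = [1.0, 2.0]

cp_next, cf_next = var1_forecast(intercepts, coef, last_values)
print(cp_next, cf_next)
```

In this made-up matrix the zero in the second row means the futures series does not respond to the spot price, which is one way a one-directional linkage would show up in estimated coefficients.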
5. Estimation
Using the EViews 11 student version package, coefficients are estimated for all the identified tentative models. Customarily, the estimation output reports coefficients along with some information criteria and goodness-of-fit statistics. Researchers often use the Akaike information criterion (AIC) and Schwarz criterion (SBIC) along with the regression coefficients, volatility, and adjusted R2 to resolve the ambiguity of optimal AR and MA terms for an ARIMA model. In addition to these parameters, R2 and Durbin–Watson statistics are also considered to measure the impact of a non-stationary series and to avoid spurious regression.
From Table 5, it is evident that the regression coefficients of tentative model (1, 1, 1) are not statistically significant. Hence, model (1, 1, 1) can be filtered out, and the other two models are carried forward for further analysis with other statistical values. Sigma square is the measure of volatility; a model with a lower sigma square is preferred, and likewise, the model with the lowest Akaike information criterion (AIC) and Schwarz criterion (SBIC) is considered the better model (Gujarati et al. 2009). Further, adjusted R2 is the measure of goodness of fit; customarily, the model with the highest adjusted R2 is considered the better model. With these criteria, model (1, 1, 0) is considered an appropriate model because its volatility, AIC, and SBIC values are lower than those of model (0, 1, 1). Even the adjusted R2 favors model (1, 1, 0). There is no symptom of spurious regression in the analyzed series because the R2 value is less than the Durbin–Watson d statistic. Moreover, a Durbin–Watson value close to two indicates that the residuals are free from first-order autocorrelation; hence, the estimated coefficients are not spurious.
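The Durbin–Watson check used above can be sketched in a few lines. The d statistic is the sum of squared changes in the residuals divided by the sum of squared residuals, and the comparison of R2 with d is the usual rule of thumb for suspecting a spurious regression. The function names and the residual values below are illustrative assumptions, not output from the estimated models.

```python
def durbin_watson(residuals):
    """d = sum((e_t - e_{t-1})^2) / sum(e_t^2); values near 2 suggest
    no first-order autocorrelation, near 0 positive, near 4 negative."""
    num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    den = sum(e * e for e in residuals)
    return num / den

def looks_spurious(r_squared, dw):
    """Rule of thumb: an R2 larger than d is a warning sign of spurious regression."""
    return r_squared > dw

# Hypothetical residuals, for illustration only
resid = [0.4, -0.3, 0.2, -0.5, 0.1, 0.3, -0.2]
d = durbin_watson(resid)
print(round(d, 3), looks_spurious(0.35, d))
```

The rule of thumb is a screening device only; it does not replace a formal unit root test on the series.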
The first point of discussion in this section is the contradictory evidence from the identification section about the stationarity of the raw Wet Cocoa beans price series. The estimates for the tentative models of the raw Wet Cocoa beans price series are presented in Table 6. The R2 and the Durbin–Watson statistic are reported in this table to check whether the model has produced a spurious regression because of a serially correlated series. The ACF and PACF correlograms depicted a graph identical to that of a non-stationary series. However, the R2 and Durbin–Watson statistics in Table 6 favor the Augmented Dickey–Fuller test result. The Durbin–Watson statistic values are very close to two and the R2 values are less than the Durbin–Watson d values for all the estimations. Hence, the raw Wet Cocoa price series is a stationary series and the estimates are not spurious, so it is not necessary to estimate the coefficients for the differenced Wet Cocoa price series. From the other statistics in Table 6, models (1, 1, 3) and (1, 1, 4) can be dropped easily because their AR and MA coefficients are not both statistically significant; for the other two models, both AR and MA have statistically significant coefficients. The volatility of the model (1, 1, 1) estimate is greater than that of the model (1, 1, 2) estimate.
Similarly, the AIC and the SBIC values favor the model (1, 1, 2). The adjusted
R2 value of model (1, 1, 2) is greater than that of the competing model. Hence, the optimal ARIMA models based on the information criteria are model (1, 1, 0) for the Dry Cocoa beans price and model (1, 1, 2) for the Wet Cocoa price. Diagnostic checking with the help of the residuals of the estimates is advised by many econometricians (
Gujarati et al. 2009;
Brooks 2008).
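The model comparison in this section can be sketched as follows, using the textbook definitions of the three information criteria in terms of the log-likelihood, the number of estimated parameters k, and the number of observations n. The log-likelihood values below are hypothetical and are not the figures in Tables 5 or 6. Some packages report these criteria divided by n, which changes the scale but not the ranking for a fixed sample.

```python
import math

def aic(log_lik, k):
    """Akaike information criterion: -2 lnL + 2k."""
    return -2.0 * log_lik + 2.0 * k

def sbic(log_lik, k, n):
    """Schwarz Bayesian information criterion: -2 lnL + k ln(n)."""
    return -2.0 * log_lik + k * math.log(n)

def hqic(log_lik, k, n):
    """Hannan-Quinn information criterion: -2 lnL + 2k ln(ln(n))."""
    return -2.0 * log_lik + 2.0 * k * math.log(math.log(n))

# Hypothetical candidates: (log-likelihood, number of estimated parameters)
candidates = {
    "ARIMA(1,1,0)": (-100.0, 2),
    "ARIMA(0,1,1)": (-101.5, 2),
    "ARIMA(1,1,1)": (-99.8, 3),
}
n_obs = 131  # 132 monthly observations less one lost to first differencing

best = {
    "AIC": min(candidates, key=lambda m: aic(*candidates[m])),
    "SBIC": min(candidates, key=lambda m: sbic(*candidates[m], n_obs)),
    "HQIC": min(candidates, key=lambda m: hqic(*candidates[m], n_obs)),
}
print(best)
```

In this made-up example the larger model has the highest likelihood but loses on all three criteria because the penalty for the extra parameter outweighs the gain in fit, which is the parsimony argument in numerical form.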
6. Diagnostic Checking
Whether the estimated model is good enough for prediction is the obvious question that arises after the model estimation. If the model is not suitable for predictions, its parameters may have to be revised by repeating all the procedures, from identification through the diagnostic check. To perform the diagnostic check, the developers of the methodology, Box and Jenkins, advised residual diagnostics and overfitting methods. Moreover, to measure the goodness of fit of the estimated ARIMA model, many econometricians have suggested examining the ACF and PACF correlograms of the residuals (
Brooks 2008;
Meeker 2001;
Gujarati et al. 2009). The aim is to construct a parsimonious model; an overfitted model is not advised because it may result in high standard errors for the coefficients. Parsimony was taken care of in the identification section, and now it is essential to examine the correlograms to check whether the residuals are serially correlated.
The correlograms of ACF and PACF in
Figure 4 confirm that the residuals are not serially correlated, because both the ACF and PACF correlograms are essentially flat. For all the lags, the ACF and PACF are within the 95 percent confidence interval.
Table 7 shows that the autocorrelation (AC) and partial autocorrelation (PAC) values are very small, and the probability values at the 95 percent confidence level indicate that the residual series for models (1, 1, 0) and (1, 1, 2) are not serially correlated. This suggests that the estimated models have captured the available information and there is no need to revise the model parameters. Hence, ARIMA (1, 1, 0) is the optimal model to predict the monthly prices of Dry Cocoa beans; likewise, model (1, 1, 2) is an optimal model to predict the monthly price of Wet Cocoa beans.
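The residual correlogram check described in this section can be sketched as a simple band test: under the white noise hypothesis, sample autocorrelations should lie within roughly plus or minus 1.96 divided by the square root of n. The function names and the two artificial residual series below are assumptions made for illustration; they are not the residuals of the estimated models.

```python
import math

def acf(series, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(v * v for v in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(1, max_lag + 1)]

def lags_outside_band(residuals, max_lag):
    """Return the lags whose autocorrelation falls outside the approximate
    95 percent band of +/- 1.96 / sqrt(n)."""
    bound = 1.96 / math.sqrt(len(residuals))
    return [k for k, r in enumerate(acf(residuals, max_lag), start=1)
            if abs(r) > bound]

# Artificial residuals that alternate in sign: strongly serially correlated
alternating = [1.0, -1.0] * 50
print(lags_outside_band(alternating, 2))   # both lags breach the band

# Artificial residuals with period four: lag 1 is inside the band, lag 2 is not
period_four = [1.0, 1.0, -1.0, -1.0] * 25
print(lags_outside_band(period_four, 2))
```

An empty list for all inspected lags corresponds to the flat correlogram described above; any listed lag would send the analyst back to the identification step.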
7. VAR Model
The vector auto regressive methodology begins with identifying the optimal lag order. The Akaike information criterion (AIC), Schwarz information criterion (SC), and Hannan–Quinn information criterion (HQ) are used to decide on the optimal lag order for the VAR model. The AIC, SC, and HQ values are presented in
Table 8. The AIC identified lag order 2 as the optimal lag, while the SC and HQ criteria (the majority) identified lag order 1 as the optimal lag. Hence, lag order 1 is considered the optimal lag order for the proposed VAR.
This multivariate VAR has four endogenous variables and the optimal lag order is one. Hence, each equation in the model will have 4 × 1 + 1 = 5 regressors (four lagged endogenous terms plus the exogenous intercept c).
Table 9 presents the regression estimate coefficients with standard error, t-statistics, and probability values.
With four endogenous variables, this VAR (1) model has estimated 20 coefficients, 10 of which are presented in Table 9. Our interest is to examine the linkage between the Indian Cocoa market and ICE Cocoa futures. Hence, only the estimates for the Wet and Dry Cocoa series are presented in the above table. The estimates highlighted in bold are statistically significant at the 95% confidence level. The coefficients of the first lag of US Cocoa futures, London Cocoa futures, and Wet Cocoa price are significant for the Dry Cocoa price series. This indicates that the ICE-traded US and London Cocoa futures prices influence the price of Dry Cocoa in India. This finding is similar to the findings of Jumah and Kunst (2001) and Sahay et al. (2002). They found a linkage of the currencies, US Dollar and Pound Sterling, with Cocoa prices in the international market. They argue that the trade of US Cocoa futures and London Cocoa futures on the ICE platform is the primary reason for such a linkage. However, the Wet Cocoa beans price in India is not affected by the ICE Cocoa futures. The first lag of its own series is statistically significant for Wet Cocoa prices. The changes in the international Cocoa price will be corrected in the Indian Dry Cocoa market. The equation given by the VAR system to forecast Cocoa prices in India is shown in Equation (8).
In Equation (8) coefficients C(7), C(8), and C(10) are not statistically significant. Hence, Equation (8) is represented as Equation (9) by substituting the actual significant coefficients.
To understand the goodness of fit of the estimated VAR equations, the fitness summary is presented in
Table 10. The R2 in Table 10 indicates the goodness of fit; this value is 14% for dry Cocoa beans and 84% for wet Cocoa beans. These values give the proportion of variance in dry and wet Cocoa returns that is explained by the estimated coefficients. The Durbin–Watson statistic was used to test whether the residual series of the estimated VAR models were serially correlated, and the test value for dry Cocoa beans was close to two, which indicates that the residuals in the series were free from autocorrelation. Hence, the estimated model for dry Cocoa beans is a good fit; this model can be used for the short-term prediction of dry Cocoa bean prices in India.
The response of dry Cocoa beans returns to a unit shock in the US and London Cocoa futures is shown in
Figure 5. The red lines are the 95% confidence intervals and the blue line indicates the impulse response function. The response of dry Cocoa beans returns to the US Cocoa futures shock is sharp and positive at the beginning, then stabilizes and decreases sharply until the fifth lag, and dies off after the sixth lag. The response to the London Cocoa futures shock is also sharply positive in the beginning, followed by a steady decline until the fifth lag, after which it dies off.
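The shape of an impulse response function can be illustrated with a minimal sketch for a VAR with one lag: the response of variable i at horizon h to a one-time unit shock in variable j is the (i, j) entry of the coefficient matrix raised to the power h. The sketch below uses simple, non-orthogonalized shocks and a hypothetical coefficient matrix; the study's software may have used orthogonalized shocks, so this is an assumption for illustration and not a reproduction of Figure 5.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def impulse_responses(coef, shock_index, response_index, horizons):
    """Responses at horizons 0..horizons of one variable to a unit shock
    in another, for a VAR(1) with coefficient matrix coef."""
    n = len(coef)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # A^0
    out = []
    for _ in range(horizons + 1):
        out.append(power[response_index][shock_index])
        power = mat_mul(coef, power)  # advance to the next power of A
    return out

# Hypothetical matrix: index 0 = spot returns, index 1 = futures returns
coef = [[0.5, 0.25],
        [0.0, 0.5]]
irf = impulse_responses(coef, shock_index=1, response_index=0, horizons=3)
print(irf)
```

Because all eigenvalues of this made-up matrix are below one in absolute value, the responses shrink toward zero as the horizon grows, which is the dying-off pattern described for the estimated system.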
The market-based tools such as futures or options will help in price discovery and price locking. The price of any commodity is the function of demand and supply for that commodity in the market. Price discovery is the primary function of market-based risk management tools such as futures and options (
Chng 2004). In the futures market, as a commodity will be traded for future delivery, and as the trading happens from all parts of the nation, the traders perform all the possible analyses before they lock the price. Hence, the futures quote can become a good price forecast for any underlying circumstance. A study by
Malhotra and Corelli (
2021) found that in the EURO/USD futures market, regular futures contribute 65% to price discovery. The e-auction helps in the price discovery of Cardamom in the Indian market (
Vijayakumar 2021).
In the absence of such predictions and hedging tools, econometric prediction models will help in price risk management. Forecasting the price of a commodity is an integral part of price risk management. The Box–Jenkins ARIMA model assumes that future values are functions of past data; in particular, this methodology uses past AR and MA terms to predict future prices. This is supported by the empirical studies of Fattah et al. (2018) and Mishra et al. (2019). Many studies have appeared so far concerning the application of time series models to develop price prediction models for different agricultural commodities. The same is discussed in the literature review section of this study. If the price determinant variables are available, multivariate models can be applied to understand the linkage between two or more markets and to develop price prediction models. This study has developed a univariate ARIMA model and a multivariate VAR model to forecast the price of Wet and Dry Cocoa beans in India. This study can be extended using multivariate time series models, and alternative price risk management strategies can be developed. As direct futures are not available for commodity Cocoa in India, cross hedges with other related futures can be examined.
8. Conclusions
The volatile business and agricultural market demand proper price risk management strategies. In this context, this study has developed two ARIMA models and one VAR model to predict the monthly prices for two varieties of Cocoa using Box and Jenkins methodology. Based on the Akaike information criterion (AIC) and Schwarz criterion (SBIC), volatility, and adjusted
R2, the optimal AR and MA terms are identified for the models. Based on the above criteria, the models selected are ARIMA (1, 1, 0) and ARIMA (1, 1, 2) for Dry Cocoa price and Wet Cocoa price, respectively. The correlograms of the residuals proved that the model is fit to use for prediction. The VAR (1) system produced significant coefficients only for Dry Cocoa beans. The VAR system proved that the US and London Cocoa Futures traded on the ICE platform are the price determinants of Dry Cocoa beans in the Indian market.
Jayasekhar and Ndung (
2018) stated that the Cocoa price in India is the function of price in the world market; this is evident from the developed VAR model.
This study will contribute to the existing literature on price risk management for commodity Cocoa in India. The models developed can be used by Cocoa trading entities, growers, industrial users, and government authorities for timely planning and managerial decisions. As Cocoa futures are not traded in India, the cooperatives, traders, and manufacturers can trade ICE futures. The cooperatives and authorities can suggest the use of ICE futures to hedge the price risk of Cocoa in India. The developed models help to discover the price of commodity Cocoa, which helps farmers decide when to sell and helps traders and industrial buyers decide when to buy. If one can anticipate the price, then one can make the buy or sell decision based on the predicted future price. This, in turn, helps the stakeholders to manage the price risk in the absence of derivative products such as futures and options.