Forecasts of the Amount Purchase Pork Meat by Using Structured and Unstructured Big Data
Abstract
1. Introduction
2. Materials and Methods
2.1. Data
2.2. Forecasting Methodology
2.2.1. Time Series: Autoregressive Exogenous Model and Vector Error Correction Model
2.2.2. Machine Learning: Gradient Boosting and Random Forest
2.2.3. Long Short-Term Memory
3. Results
3.1. Forecasted Daily Amounts Required to Purchase Pork Belly Meat
3.2. Forecasted Weekly Amounts Required to Purchase Pork Belly Meat
3.3. Forecast Errors in LSTM When Structured and Unstructured Data Were Used Versus Structured Data Alone
4. Discussion
Supplementary Materials
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
| Data type | Data name | Feature name | Description |
|---|---|---|---|
| Structured data | Agri-food consumers panel data | Panel_purchase_amount | Daily average amount required to purchase pork belly meat per consumer panel |
| | Sales of pork meat | Retail_price_meat | Daily retail prices of pork belly meat |
| | | Wholesale_price_carcass | Daily wholesale prices of pork carcass |
| | | Wholesale_price_carcass_quarter_before | Daily wholesale prices of pork carcass in previous quarter |
| | | Monthly_sales_trend_ton_meat | Monthly sales trend of pork meat (ton) |
| | Production of pork meat | Pig_bred_number_quarter_before | Number of pigs bred in previous quarter |
| | | Pig_slaughtered_number_quarter_before | Number of pigs slaughtered in previous quarter |
| | | Output_ton_year_before_carcass | Pork meat production in previous year (ton) |
| | | Import_ton_year_before | Imported pork meat in previous year (ton) |
| Data Type | Data Name | Feature Name | Description | Data Counts |
|---|---|---|---|---|
| Unstructured data | Broadcast news | News_freq | Daily frequency that keyword term was mentioned in broadcast news | 6655 |
| | | Emotions_Number_Angries | Daily frequency of comments with angry emoticon of broadcast news in which keyword term was mentioned | 3979 |
| | | Emotions_Number_likes | Daily frequency of comments with like emoticon of broadcast news in which keyword term was mentioned | 14,811 |
| | | Emotions_Number_sads | Daily frequency of comments with sad emoticon of broadcast news in which keyword term was mentioned | 395 |
| | | Emotions_Number_wants | Daily frequency of comments with want more reports emoticon of broadcast news in which keyword term was mentioned | 438 |
| | | Emotions_Number_warms | Daily frequency of comments with moved emoticon of broadcast news in which keyword term was mentioned | 153 |
| | | News_comment_freq | Daily frequency of comment of broadcast news in which keyword term was mentioned | 44,342 |
| | | News_positive_term_freq | Daily frequency of positive term of broadcast news in which keyword term was mentioned | 35,319 |
| | | News_negative_term_freq | Daily frequency of negative term of broadcast news in which keyword term was mentioned | 4429 |
| | Television program/shows | Video_freq | Daily frequency that keyword term was mentioned in television program/shows other than broadcast news | 1529 |
| | | Video_total_ranking_ave_p | Average television view rate of television program/shows in which keyword term was mentioned | 1529 |
| | | Video_freq_times_viewrate | Video_freq times Video_total_ranking_ave_p | 1529 |
| | | Video_positive_term_freq | Daily frequency of positive term of television program/shows in which keyword term was mentioned | 119,396 |
| | | Video_negative_term_freq | Daily frequency of negative term of television program/shows in which keyword term was mentioned | 4745 |
| | Blogs | Blog_freq | Daily frequency that keyword term was mentioned in blog | 75,035 |
| | | Blog_comments | Daily frequency of comment of blog in which keyword term was mentioned | 109,950 |
| | | Blog_likes | Daily frequency of comments with like emoticon of blog in which keyword term was mentioned | 70,025 |
| | | Blog_positive_term_freq | Daily frequency of positive term of blog in which keyword term was mentioned | 1,666,492 |
| | | Blog_negative_term_freq | Daily frequency of negative term of blog in which keyword term was mentioned | 56,870 |
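The one derived feature in the table above, Video_freq_times_viewrate, is defined as the product of Video_freq and Video_total_ranking_ave_p. A minimal sketch of that computation follows. The row values are hypothetical placeholders, not data from the study, and the list-of-dicts layout is an assumption made only for illustration.

```python
from typing import Dict, List

# Hypothetical daily rows; the numbers are illustrative, not taken from the study.
daily_rows: List[Dict[str, float]] = [
    {"Video_freq": 3, "Video_total_ranking_ave_p": 4.5},
    {"Video_freq": 0, "Video_total_ranking_ave_p": 0.0},
    {"Video_freq": 2, "Video_total_ranking_ave_p": 2.5},
]


def add_interaction_feature(rows: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Return copies of the rows with Video_freq_times_viewrate added."""
    enriched = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        new_row["Video_freq_times_viewrate"] = (
            row["Video_freq"] * row["Video_total_ranking_ave_p"]
        )
        enriched.append(new_row)
    return enriched


enriched_rows = add_interaction_feature(daily_rows)
for row in enriched_rows:
    print(row["Video_freq_times_viewrate"])
```

A day with no television mentions yields a product of zero, so the feature stays defined on every day of the series.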
| Hyperparameter | Case18 | Case27 | Case33 | Case36 |
|---|---|---|---|---|
| seq_length | 14 | 21 | 21 | 21 |
| hidden_dim | 15 | 20 | 10 | 5 |
| forget_bias | 0.5 | 0.5 | 0.5 | 0.5 |
| stacked_layers | 12 | 12 | 12 | 12 |
| keep_prob | 0.5 | 0.5 | 0.5 | 0.5 |
| epoch | 1000 | 1000 | 1000 | 1000 |
| learning_rate | 0.01 | 0.01 | 0.01 | 0.01 |
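In LSTM forecasting setups, a setting such as seq_length usually denotes the number of consecutive past time steps fed to the network for each prediction. The source does not describe how its input windows were built, so the following is only a sketch under that assumption. The helper `make_windows` and the toy series are hypothetical.

```python
from typing import List, Sequence, Tuple


def make_windows(
    series: Sequence[float], seq_length: int
) -> Tuple[List[List[float]], List[float]]:
    """Pair each run of seq_length past values with the value that follows it."""
    if seq_length <= 0:
        raise ValueError("seq_length must be positive")
    inputs: List[List[float]] = []
    targets: List[float] = []
    for start in range(len(series) - seq_length):
        inputs.append(list(series[start:start + seq_length]))
        targets.append(series[start + seq_length])
    return inputs, targets


# Toy series of 30 time steps (assumed data, for illustration only).
toy_series = [float(v) for v in range(30)]
windows, next_values = make_windows(toy_series, seq_length=21)
print(len(windows), len(windows[0]), next_values[0])
```

With 30 observations and seq_length = 21, only 9 training pairs are available, which illustrates how longer windows reduce the number of usable samples.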
| Forecasting interval | Data | Algorithm | Mean Absolute Percentage Error (MAPE) | Mean Absolute Error (MAE) | Remark |
|---|---|---|---|---|---|
| Daily | Structured | Autoregressive exogenous | 15.18 | 2547.5 | lag = 1 |
| | | Vector error correction model | 14.62 | 2601.4 | lag = 1 |
| | | Random forest | 15.95 | 2690.3 | ntree = 1000, mtry = 5 |
| | | Gradient boosting | 18.31 | 3141.6 | max.depth = 8, eta = 0.03 |
| | | Long short-term memory | 14.51 | 2570.5 | seq_length = 21, hidden_dim = 5, forget_bias = 0.5, stacked_layers = 12, keep_prob = 0.5, epoch = 1000, learning_rate = 0.01 |
| | Structured and Unstructured | Autoregressive exogenous | 15.05 | 2522.7 | lag = 1 |
| | | Vector error correction model | 17.7 | 2823.2 | lag = 1 |
| | | Random forest | 17.74 | 2859.6 | ntree = 1500, mtry = 13 |
| | | Gradient boosting | 16.9 | 2863.5 | max.depth = 15, eta = 0.03 |
| | | Long short-term memory | 14.59 | 2629.0 | seq_length = 21, hidden_dim = 20, forget_bias = 0.5, stacked_layers = 12, keep_prob = 0.5, epoch = 1000, learning_rate = 0.01 |
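The two error measures reported above can be computed as sketched below, assuming the standard definitions: MAE as the mean of the absolute errors, and MAPE as the mean of the absolute errors relative to the actual values, expressed in percent. The source does not spell out its formulas, and the function names and sample numbers here are hypothetical.

```python
from typing import Sequence


def mae(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute error, in the units of the series."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    ratios = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


# Illustrative numbers only, not values from the study.
sample_actual = [100.0, 200.0, 400.0]
sample_forecast = [110.0, 180.0, 400.0]
print(mae(sample_actual, sample_forecast))
print(mape(sample_actual, sample_forecast))
```

Because MAPE scales each error by the actual value while MAE does not, the two measures can rank models differently, as they do for some rows of the table.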
| Forecasting interval | Data | Algorithm | Mean Absolute Percentage Error (MAPE) | Mean Absolute Error (MAE) | Remark |
|---|---|---|---|---|---|
| Weekly | Structured | Autoregressive exogenous | 6.33 | 1123.5 | lag = 1 |
| | | Vector error correction model | 7.98 | 1460.8 | lag = 1 |
| | | Random forest | 6.84 | 1212.0 | ntree = 1000, mtry = 5 |
| | | Gradient boosting | 9.78 | 1731.1 | max.depth = 8, eta = 0.03 |
| | | Long short-term memory | 7.25 | 1335.8 | seq_length = 14, hidden_dim = 15, forget_bias = 0.5, stacked_layers = 12, keep_prob = 0.5, epoch = 1000, learning_rate = 0.01 |
| | Structured and Unstructured | Autoregressive exogenous | 6.15 | 1091.7 | lag = 1 |
| | | Vector error correction model | 9.18 | 1538.6 | lag = 1 |
| | | Random forest | 9.04 | 1535.2 | ntree = 1500, mtry = 13 |
| | | Gradient boosting | 7.75 | 1359.6 | max.depth = 15, eta = 0.03 |
| | | Long short-term memory | 6.5 | 1163.8 | seq_length = 21, hidden_dim = 10, forget_bias = 0.5, stacked_layers = 12, keep_prob = 0.5, epoch = 1000, learning_rate = 0.01 |
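To make the effect of adding unstructured data easier to read off the weekly table, the sketch below subtracts the structured-only MAPE from the combined-data MAPE for each algorithm. The MAPE values are copied from the table above; the variable names are illustrative.

```python
# Weekly MAPE per algorithm: (structured only, structured and unstructured).
weekly_mape = {
    "Autoregressive exogenous": (6.33, 6.15),
    "Vector error correction model": (7.98, 9.18),
    "Random forest": (6.84, 9.04),
    "Gradient boosting": (9.78, 7.75),
    "Long short-term memory": (7.25, 6.5),
}

# Negative change means the combined data gave a lower error.
mape_change = {
    name: combined - structured
    for name, (structured, combined) in weekly_mape.items()
}

improved = [name for name, delta in mape_change.items() if delta < 0]

for name, delta in mape_change.items():
    direction = "lower" if delta < 0 else "higher"
    print(f"{name}: {delta:+.2f} ({direction} MAPE with unstructured data)")
```

On these weekly figures, the autoregressive exogenous, gradient boosting, and long short-term memory models show lower MAPE with unstructured data added, while the vector error correction model and random forest show higher MAPE.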
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Ryu, G.-A.; Nasridinov, A.; Rah, H.; Yoo, K.-H. Forecasts of the Amount Purchase Pork Meat by Using Structured and Unstructured Big Data. Agriculture 2020, 10, 21. https://doi.org/10.3390/agriculture10010021