Data Analysis for Financial Markets

A special issue of Data (ISSN 2306-5729). This special issue belongs to the section "Information Systems and Data Management".

Deadline for manuscript submissions: closed (31 October 2019) | Viewed by 115289

Special Issue Editor

Prof. Dr. Francisco Guijarro (Guest Editor)

Special Issue Information

Dear Colleagues,

Data analysis plays a key role in the decisions made by participants in financial markets. Rapid advances in computing, together with the vast amounts of data generated by stock exchanges, have enabled the design of trading algorithms that currently account for a considerable proportion of volume in international stock markets. Recent research has addressed different approaches to take advantage not only of intraday data but also of the information provided by countless websites, posts on Twitter, corporate reports, and daily news announcements. Extracting insight from unstructured data is also part of ongoing research.

This Special Issue aims to bring together original research in the field of data analysis for financial markets. Suitable topics include, but are not limited to, the following: Big Data, business intelligence, sentiment analysis, text mining, financial volatility, real-time analytics, machine learning, fraud detection, operational efficiency, financial trading, high-frequency data, trading rules, stock markets, bankruptcy, and financial shocks.

Prof. Dr. Francisco Guijarro
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Data is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • algorithmic trading
  • Big Data analysis
  • stock markets
  • financial risks

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (12 papers)


Research


13 pages, 659 KiB  
Article
Cryptocurrency Price Prediction with Convolutional Neural Network and Stacked Gated Recurrent Unit
by Chuen Yik Kang, Chin Poo Lee and Kian Ming Lim
Data 2022, 7(11), 149; https://doi.org/10.3390/data7110149 - 31 Oct 2022
Cited by 24 | Viewed by 14092
Abstract
Virtual currencies have become widely recognized as exchange currencies and financial assets. Cryptocurrency trading has caught the attention of investors, as cryptocurrencies can be highly profitable investments. To optimize the profit of cryptocurrency investments, accurate price prediction is essential. Since price prediction is a time series task, a hybrid deep learning model is proposed to predict the future price of the cryptocurrency. The hybrid model integrates a 1-dimensional convolutional neural network and stacked gated recurrent unit (1DCNN-GRU). Given the cryptocurrency price data over time, the 1-dimensional convolutional neural network encodes the data into a high-level discriminative representation. Subsequently, the stacked gated recurrent unit captures the long-range dependencies of the representation. The proposed hybrid model was evaluated on three different cryptocurrency datasets, namely Bitcoin, Ethereum, and Ripple. Experimental results demonstrated that the proposed 1DCNN-GRU model outperformed the existing methods with the lowest RMSE values of 43.933 on the Bitcoin dataset, 3.511 on the Ethereum dataset, and 0.00128 on the Ripple dataset. Full article
(This article belongs to the Special Issue Data Analysis for Financial Markets)
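
The hybrid architecture summarized in the abstract, a one-dimensional convolutional encoder followed by stacked GRU layers, can be sketched in a few lines of Keras. This is a minimal illustration under assumed settings (window length, layer sizes, optimizer), not the authors' implementation.

```python
# Minimal 1DCNN-GRU sketch for next-step price prediction (illustrative only;
# the window length, layer sizes and training settings are assumptions).
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, GRU, Dense

WINDOW = 30                                   # past closes fed to the model
model = Sequential([
    Conv1D(64, kernel_size=3, activation="relu", input_shape=(WINDOW, 1)),
    MaxPooling1D(pool_size=2),
    GRU(64, return_sequences=True),           # stacked GRU: first layer returns sequences
    GRU(32),                                  # second layer summarises them
    Dense(1),                                 # next-day price
])
model.compile(optimizer="adam", loss="mse")

# Toy data: sliding windows over a synthetic price series.
prices = np.cumsum(np.random.randn(500)) + 100
X = np.stack([prices[i:i + WINDOW] for i in range(len(prices) - WINDOW)])[..., None]
y = prices[WINDOW:]
model.fit(X, y, epochs=2, batch_size=32, verbose=0)
```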

17 pages, 377 KiB  
Article
A Comparative Analysis of Machine Learning Models for the Prediction of Insurance Uptake in Kenya
by Nelson Kemboi Yego, Juma Kasozi and Joseph Nkurunziza
Data 2021, 6(11), 116; https://doi.org/10.3390/data6110116 - 15 Nov 2021
Cited by 8 | Viewed by 4956
Abstract
The role of insurance in financial inclusion and economic growth in general is immense and is increasingly being recognized. However, low uptake impedes the growth of the sector, hence the need for a model that robustly predicts insurance uptake among potential clients. This study undertook a two-phase comparison of machine learning classifiers. In Phase I, eight machine learning models were compared on their performance in predicting insurance uptake using 2016 Kenya FinAccess Household Survey data. Building on Phase I, in Phase II random forest and XGBoost were compared with four deep learning classifiers using 2019 Kenya FinAccess Household Survey data. The random forest model trained on oversampled data showed the highest F1-score, accuracy, and precision. The area under the receiver operating characteristic curve was furthermore highest for random forest; hence, it could be construed as the most robust model for predicting insurance uptake. Finally, the most important features in predicting insurance uptake, as extracted from the random forest model, were income, bank usage, and ability and willingness to support others. Hence, there is a need for the design and distribution of products aimed at low-income clients, and bancassurance could be a plausible channel for the distribution of insurance products. Full article
(This article belongs to the Special Issue Data Analysis for Financial Markets)
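
For readers who want to reproduce the general setup, the sketch below trains a random forest on an oversampled training split and reports F1 and AUC, as in the Phase I comparison. The data are synthetic stand-ins for the FinAccess survey; the features, class balance, and hyperparameters are assumptions, not the study's values.

```python
# Random forest on oversampled data for a binary "insurance uptake" target
# (synthetic stand-in data; settings are assumptions, not the study's).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.utils import resample

X, y = make_classification(n_samples=5000, n_features=12, weights=[0.85, 0.15],
                           random_state=0)                # imbalanced uptake rate
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Oversample the minority class in the training split only.
minority = X_tr[y_tr == 1]
extra = resample(minority, replace=True,
                 n_samples=(y_tr == 0).sum() - (y_tr == 1).sum(), random_state=0)
X_bal = np.vstack([X_tr, extra])
y_bal = np.concatenate([y_tr, np.ones(len(extra), dtype=int)])

clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_bal, y_bal)
proba = clf.predict_proba(X_te)[:, 1]
print("F1:", f1_score(y_te, (proba > 0.5).astype(int)),
      "AUC:", roc_auc_score(y_te, proba))
```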

11 pages, 3235 KiB  
Article
A Novel Ensemble Neuro-Fuzzy Model for Financial Time Series Forecasting
by Alexander Vlasenko, Nataliia Vlasenko, Olena Vynokurova, Yevgeniy Bodyanskiy and Dmytro Peleshko
Data 2019, 4(3), 126; https://doi.org/10.3390/data4030126 - 23 Aug 2019
Cited by 19 | Viewed by 3820
Abstract
Neuro-fuzzy models have a proven record of successful application in finance. Forecasting future values is a crucial element of successful decision making in trading. In this paper, a novel ensemble neuro-fuzzy model is proposed to overcome the limitations of, and improve upon, the previously applied five-layer multidimensional Gaussian neuro-fuzzy model and its learning procedure. The proposed solution allows the error-prone hyperparameter selection process to be skipped and shows better accuracy on real-life financial data. Full article
(This article belongs to the Special Issue Data Analysis for Financial Markets)
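
As a rough illustration of the model family involved, the sketch below fits a zero-order Takagi-Sugeno system with multidimensional Gaussian membership functions and averages an ensemble of such members. It is a generic neuro-fuzzy example under assumed settings, not the authors' five-layer model or its learning algorithm.

```python
# Generic Gaussian neuro-fuzzy ensemble sketch (not the authors' model or learning rule).
import numpy as np

def activations(X, centres, width=2.0):
    # Multidimensional Gaussian membership of each sample with respect to each rule centre.
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    A = np.exp(-d2 / (2.0 * width ** 2))
    return A / (A.sum(axis=1, keepdims=True) + 1e-12)   # normalised firing strengths

def member_forecast(X_tr, y_tr, X_te, n_rules=8, seed=0):
    rng = np.random.default_rng(seed)
    centres = X_tr[rng.choice(len(X_tr), n_rules, replace=False)]
    w = np.linalg.lstsq(activations(X_tr, centres), y_tr, rcond=None)[0]  # rule consequents
    return activations(X_te, centres) @ w

# Lagged synthetic prices; the ensemble forecast is the average of the members.
prices = np.cumsum(np.random.default_rng(1).normal(size=400)) + 100
lags, split = 5, 300
X = np.stack([prices[i:i + lags] for i in range(len(prices) - lags)])
y = prices[lags:]
mu, sd = X[:split].mean(0), X[:split].std(0)
Xs = (X - mu) / sd                                       # standardise inputs
pred = np.mean([member_forecast(Xs[:split], y[:split], Xs[split:], seed=s)
                for s in range(5)], axis=0)
print("test RMSE:", np.sqrt(np.mean((pred - y[split:]) ** 2)))
```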

28 pages, 3950 KiB  
Article
A Novel Hybrid Model for Stock Price Forecasting Based on Metaheuristics and Support Vector Machine
by Mojtaba Sedighi, Hossein Jahangirnia, Mohsen Gharakhani and Saeed Farahani Fard
Data 2019, 4(2), 75; https://doi.org/10.3390/data4020075 - 22 May 2019
Cited by 50 | Viewed by 9913
Abstract
This paper presents a new model for the accurate forecasting of future stock prices. Stock price forecasting is one of the most complicated issues in view of the high fluctuation of the stock exchange, and it is a key issue for traders and investors. Many prediction models have been developed by academic researchers to predict stock prices. Nevertheless, a review of past research reveals several shortcomings in previous approaches: (1) stringent statistical assumptions are required; (2) human intervention is part of the prediction process; and (3) an appropriate parameter range is difficult to determine. To address these problems, we provide a new integrated approach based on the Artificial Bee Colony (ABC), the Adaptive Neuro-Fuzzy Inference System (ANFIS), and the Support Vector Machine (SVM). ABC is employed to optimize the technical indicators used for forecasting. To achieve a more precise approach, ANFIS is applied to predict long-run price fluctuations of the stocks. SVM is applied to model the relationship between the stock price and the technical indicators and to further decrease the forecasting errors of the presented model, whose performance is examined by five criteria. The comparative outcomes, obtained on datasets covering the 50 largest companies of the U.S. Stock Exchange from 2008 to 2018, clearly demonstrate that the suggested approach outperforms the other methods in accuracy and quality. The findings show that our model is a useful instrument for stock price forecasting and will assist traders and investors in identifying stock price trends, as well as an innovation in algorithmic trading. Full article
(This article belongs to the Special Issue Data Analysis for Financial Markets)
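
The sketch below illustrates only the SVM component of such a hybrid: a support vector regression that maps simple technical indicators to the next closing price. The ABC optimization and ANFIS stages are not reproduced, and the indicators, data, and hyperparameters are illustrative assumptions.

```python
# SVM (support vector regression) component only: technical indicators -> next close.
# The indicators and hyperparameters are assumptions; ABC and ANFIS are omitted.
import numpy as np
import pandas as pd
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

close = pd.Series(np.cumsum(np.random.randn(600)) + 100)   # synthetic closing prices
feat = pd.DataFrame({
    "sma_10": close.rolling(10).mean(),                     # simple moving average
    "mom_5": close.diff(5),                                 # 5-day momentum
    "vol_10": close.pct_change().rolling(10).std(),         # rolling volatility
})
target = close.shift(-1)                                    # next-day close

data = pd.concat([feat, target.rename("y")], axis=1).dropna()
X, y = data[feat.columns].to_numpy(), data["y"].to_numpy()
split = int(0.8 * len(X))

model = make_pipeline(StandardScaler(), SVR(C=10.0, epsilon=0.1))
model.fit(X[:split], y[:split])
rmse = np.sqrt(np.mean((model.predict(X[split:]) - y[split:]) ** 2))
print("out-of-sample RMSE:", round(rmse, 3))
```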

8 pages, 3366 KiB  
Article
A Business Rules Management System for Fixed Assets
by Sabina-Cristiana Necula
Data 2019, 4(2), 70; https://doi.org/10.3390/data4020070 - 17 May 2019
Viewed by 4527
Abstract
The goal of this paper is to discuss the necessity of separating decision rules from the domain model implementation. (1) Background: Can rules help to discover hidden connections between data? We propose a separate implementation of decision rules on data about fixed assets for decision support, which will enhance search results. (2) Methods and technical workflow: We used DROOLS (Decision Rules Object Oriented System) to implement decision rules on the subject of accounting decisions on fixed assets. (3) Results: Building the model involves the existence of a domain ontology and an ontology for the developed application; the possibility of executing specified inferences; the possibility of extracting information from a database; the possibility of simulations and predictions; and the possibility of addressing fuzzy questions. (4) Conclusions: The rules, the plans, and the business models must be implemented in a way that allows control to be specified over concepts. The editing of meta-models must be directed to the user to ensure adaptation, rather than implemented at the level of data control. Full article
(This article belongs to the Special Issue Data Analysis for Financial Markets)
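
To illustrate the principle of keeping decision rules separate from the domain model, the sketch below expresses a pair of hypothetical fixed-asset rules in plain Python. The paper itself uses DROOLS (a Java rules engine); this is only a language-neutral illustration of the separation idea, with made-up fields and thresholds.

```python
# Illustration of the core idea only: decision rules kept outside the domain model.
# Written in plain Python, not the paper's DROOLS/Java implementation;
# the asset fields and rule conditions are hypothetical.
from dataclasses import dataclass

@dataclass
class FixedAsset:                      # domain model: knows nothing about the rules
    name: str
    cost: float
    accumulated_depreciation: float
    in_use: bool

# Decision rules live outside the domain model and can be edited independently.
RULES = [
    ("fully depreciated -> consider disposal",
     lambda a: a.accumulated_depreciation >= a.cost),
    ("idle asset -> review for impairment",
     lambda a: not a.in_use),
]

def evaluate(asset: FixedAsset) -> list[str]:
    """Return the conclusion of every rule whose condition fires for this asset."""
    return [conclusion for conclusion, condition in RULES if condition(asset)]

print(evaluate(FixedAsset("lathe", cost=12000.0,
                          accumulated_depreciation=12000.0, in_use=False)))
```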

22 pages, 3472 KiB  
Article
Data Preprocessing for Evaluation of Recommendation Models in E-Commerce
by Namrata Chaudhary and Drimik Roy Chowdhury
Data 2019, 4(1), 23; https://doi.org/10.3390/data4010023 - 31 Jan 2019
Cited by 6 | Viewed by 6100
Abstract
E-commerce businesses employ recommender models to assist in identifying a personalized set of products for each visitor. To accurately assess the recommendations' influence on customer clicks and buys, three target areas (customer behavior, data collection, and the user interface) are explored for possible sources of erroneous data. Varied customer behavior misrepresents the recommendations' true influence on a customer due to the presence of B2B interactions and outlier customers. Non-parametric statistical procedures for outlier removal are delineated, and other strategies are investigated to account for the effect of a large percentage of new customers or high bounce rates. Subsequently, for data collection, we identify probable misleading interactions in the raw data, propose a robust method of tracking unique visitors, and accurately attribute the buy influence for combo products. Lastly, the user-interface discussion addresses possible problems caused by the recommendation widget's positioning on the e-commerce website and the stringent conditions that should be imposed when utilizing data from the product listing page. This collective methodology results in an accurate and valid estimation of the customer interactions influenced by the recommendation model in the context of standard industry metrics, such as click-through rate, buy-through rate, and conversion revenue. Full article
(This article belongs to the Special Issue Data Analysis for Financial Markets)
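
One of the preprocessing steps described above, non-parametric removal of outlier customers, can be sketched with a simple Tukey (IQR) fence on per-customer interaction counts. The data and threshold below are hypothetical and do not reproduce the paper's exact procedure.

```python
# Non-parametric (IQR/Tukey-fence) detection of outlier customers before
# attributing clicks to the recommender. Hypothetical data and threshold.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Per-customer click counts: mostly ordinary shoppers plus a few heavy (B2B-like) accounts.
per_customer = pd.Series(
    np.concatenate([rng.poisson(8, size=480), rng.poisson(120, size=20)]),
    name="n_clicks",
)

q1, q3 = per_customer.quantile([0.25, 0.75])
upper_fence = q3 + 1.5 * (q3 - q1)                 # Tukey fence for heavy clickers
outliers = per_customer[per_customer > upper_fence]
kept = per_customer[per_customer <= upper_fence]   # data used for attribution analysis

print(f"fence at {upper_fence:.1f} clicks; "
      f"{len(outliers)} of {len(per_customer)} customers flagged and excluded")
```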

11 pages, 988 KiB  
Article
Gaussian Mixture and Kernel Density-Based Hybrid Model for Volatility Behavior Extraction From Public Financial Data
by Smail Tigani, Hasna Chaibi and Rachid Saadane
Data 2019, 4(1), 19; https://doi.org/10.3390/data4010019 - 24 Jan 2019
Cited by 3 | Viewed by 4805
Abstract
This paper presents a hybrid model for clustering foreign exchange market volatility. The proposed model is built using a Gaussian Mixture Model, and the inference is done using an Expectation Maximization algorithm. A mono-dimensional kernel density estimator is used to build a probability density based on all historical observations, which allows us to evaluate the probability of each behavior for every symbol of interest. The computational results show that the approach is able to pinpoint risky and safe hours for trading a given currency pair. Full article
(This article belongs to the Special Issue Data Analysis for Financial Markets)
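
A minimal sketch of the two ingredients named in the abstract follows: a Gaussian mixture fitted by EM to cluster hourly volatility into regimes, plus a one-dimensional kernel density estimate over historical returns. The synthetic FX-style data and the two-regime setup are assumptions, not the paper's dataset or configuration.

```python
# Gaussian mixture (EM) over hourly volatility plus a 1-D KDE over returns.
# Synthetic FX-style data; the two-regime setup is an illustrative assumption.
import numpy as np
from scipy.stats import gaussian_kde
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
hours = rng.integers(0, 24, size=5000)
# Returns whose volatility depends on the hour (higher around session overlaps).
sigma = np.where((hours >= 13) & (hours <= 17), 0.0012, 0.0004)
returns = rng.normal(0.0, sigma)

# Hourly realised volatility, clustered into "safe" vs "risky" regimes by a GMM (EM).
hourly_vol = np.array([returns[hours == h].std() for h in range(24)])
gmm = GaussianMixture(n_components=2, random_state=0).fit(hourly_vol.reshape(-1, 1))
labels = gmm.predict(hourly_vol.reshape(-1, 1))
risky = np.where(labels == np.argmax(gmm.means_.ravel()))[0]
print("risky hours:", sorted(int(h) for h in risky))

# Mono-dimensional KDE over all historical returns.
kde = gaussian_kde(returns)
print("density at a 0.2% move:", float(kde(0.002)))
```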

16 pages, 3492 KiB  
Article
Performance Analysis of Statistical and Supervised Learning Techniques in Stock Data Mining
by Manik Sharma, Samriti Sharma and Gurvinder Singh
Data 2018, 3(4), 54; https://doi.org/10.3390/data3040054 - 24 Nov 2018
Cited by 40 | Viewed by 6897
Abstract
Nowadays, overwhelming amounts of stock data are available, which are only of use if properly examined and mined. In this paper, the last twelve years of ICICI Bank's stock data have been extensively examined using statistical and supervised learning techniques. This study may be of great interest for those who wish to mine or study the stock data of banks or any financial organization. Different statistical measures have been computed to explore the nature, range, distribution, and deviation of the data. The descriptive statistical measures assist in finding valuable metrics such as the mean, variance, skewness, kurtosis, p-value, A-squared, and 95% confidence interval of the mean for ICICI Bank's stock data. Moreover, daily percentage changes occurring over the last 12 years have also been recorded and examined. Additionally, the intraday stock status has been mined using ten different classifiers. The performance of the different classifiers has been evaluated on the basis of various parameters such as accuracy, misclassification rate, precision, recall, specificity, and sensitivity. Based upon these parameters, the predictive results obtained using logistic regression are more acceptable than the outcomes of the other classifiers, whereas naïve Bayes, C4.5, random forest, linear discriminant, and cubic support vector machine (SVM) merely act as random guessing machines. The outstanding performance of logistic regression has been validated using TOPSIS (technique for order preference by similarity to ideal solution) and WSA (weighted sum approach). Full article
(This article belongs to the Special Issue Data Analysis for Financial Markets)
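
The two analysis stages described above, descriptive statistics of daily percentage changes and a classifier for intraday status, can be sketched as follows. Synthetic open/close data stand in for the ICICI Bank series, and the features and classifier settings are assumptions rather than the paper's setup.

```python
# Descriptive statistics of daily percentage changes plus a logistic-regression
# classifier for intraday status (close above or below open). Synthetic data.
import numpy as np
import pandas as pd
from scipy import stats
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 3000                                              # roughly 12 years of sessions
open_ = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, n)))
close = open_ * (1 + rng.normal(0.0003, 0.012, n))
df = pd.DataFrame({"open": open_, "close": close})
df["pct_change"] = df["close"].pct_change() * 100

# Descriptive measures of the daily percentage change.
chg = df["pct_change"].dropna()
print(chg.describe())
print("skew:", chg.skew(), "kurtosis:", chg.kurt(),
      "normality p-value:", stats.normaltest(chg).pvalue)

# Intraday status: did the session close above its open?
df["up_day"] = (df["close"] > df["open"]).astype(int)
X = df[["open", "pct_change"]].shift(1).dropna()      # yesterday's information only
y = df["up_day"].loc[X.index]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, shuffle=False, test_size=0.2)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```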

Review


17 pages, 617 KiB  
Review
Reinforcement Learning in Financial Markets
by Terry Lingze Meng and Matloob Khushi
Data 2019, 4(3), 110; https://doi.org/10.3390/data4030110 - 28 Jul 2019
Cited by 83 | Viewed by 18051
Abstract
Recently there has been an exponential increase in the use of artificial intelligence for trading in financial markets such as stocks and forex. Reinforcement learning has become of particular interest to financial traders ever since the program AlphaGo defeated the strongest contemporary human Go player, Lee Sedol, in 2016. We systematically reviewed all recent stock/forex prediction or trading articles that used reinforcement learning as their primary machine learning method. All reviewed articles made some unrealistic assumptions, such as no transaction costs, no liquidity issues, and no bid or ask spread issues. Transaction costs had significant impacts on the profitability of the reinforcement learning algorithms compared with the baseline algorithms tested. Despite showing statistically significant profitability when reinforcement learning was used in comparison with baseline models in many studies, some showed no meaningful level of profitability, in particular with large changes in the price pattern between the system training and testing data. Furthermore, few performance comparisons between reinforcement learning and other sophisticated machine/deep learning models were provided. The impact of transaction costs, including the bid/ask spread, on profitability has also been assessed. In conclusion, reinforcement learning in stock/forex trading is still in its early development, and further research is needed to make it a reliable method in this domain. Full article
(This article belongs to the Special Issue Data Analysis for Financial Markets)
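
The review's central caveat about transaction costs can be illustrated with a toy calculation: the same sequence of long/flat signals is evaluated with and without a per-trade cost. The signals, return series, and cost level below are arbitrary assumptions, not results from any reviewed study.

```python
# Toy illustration of how per-trade costs erode the gross return of a trading policy.
# The signals stand in for an RL policy's output; all numbers are arbitrary.
import numpy as np

rng = np.random.default_rng(0)
returns = rng.normal(0.0002, 0.01, 1000)       # daily asset returns
signals = rng.integers(0, 2, 1000)             # 1 = long, 0 = flat
cost_per_trade = 0.001                         # 10 basis points per position change

gross = signals * returns
trades = np.abs(np.diff(signals, prepend=0))   # 1 whenever the position flips
net = gross - trades * cost_per_trade

print("gross cumulative return:", round(gross.sum(), 4))
print("net cumulative return:  ", round(net.sum(), 4))
print("number of position changes:", int(trades.sum()))
```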

Other


10 pages, 365 KiB  
Data Descriptor
Google Web and Image Search Visibility Data for Online Store
by Artur Strzelecki
Data 2019, 4(3), 125; https://doi.org/10.3390/data4030125 - 22 Aug 2019
Cited by 13 | Viewed by 7297
Abstract
This data descriptor describes Google search engine visibility data. The visibility of a domain name in a search engine results from search engine optimization and can be evaluated based on four data metrics and five data dimensions. The data metrics are the following: clicks volume (1), impressions volume (2), click-through ratio (3), and ranking position (4). The data dimensions are as follows: the queries entered into the search engine that trigger results containing the researched domain name (1), the page URLs from the researched domain that appear on the search engine results page (2), the country of origin of search engine visitors (3), the type of device used for the search (4), and the date of the search (5). Search engine visibility data were obtained from the Google Search Console for an international online store, which is visible in 240 countries and territories, over a period of 15 months. The data contain 123 K clicks and 4.86 M impressions for the web search and 22 K clicks and 9.07 M impressions for the image search. The proposed method for obtaining data can be applied in any other area, not only the e-commerce industry. Full article
(This article belongs to the Special Issue Data Analysis for Financial Markets)
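
The relationship between the four metrics can be illustrated with a small pandas aggregation: click-through ratio is clicks divided by impressions, grouped here by query. The rows are made-up examples; only the column names mirror the descriptor's metrics and dimensions.

```python
# Click-through ratio computed from clicks and impressions, grouped by query.
# Made-up rows; column names mirror the descriptor's metrics and dimensions.
import pandas as pd

visibility = pd.DataFrame({
    "query":       ["running shoes", "running shoes", "trail shoes"],
    "device":      ["mobile", "desktop", "mobile"],
    "clicks":      [120, 45, 8],
    "impressions": [4300, 1900, 600],
    "position":    [3.2, 4.1, 9.8],
})

by_query = visibility.groupby("query").agg(
    clicks=("clicks", "sum"),
    impressions=("impressions", "sum"),
    avg_position=("position", "mean"),
)
by_query["ctr"] = by_query["clicks"] / by_query["impressions"]
print(by_query)
```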

6 pages, 636 KiB  
Data Descriptor
Treasury Bond Return Data Starting in 1962
by Laurens Swinkels
Data 2019, 4(3), 91; https://doi.org/10.3390/data4030091 - 28 Jun 2019
Cited by 4 | Viewed by 12473
Abstract
Academics and research analysts in financial economics frequently use returns on government bonds for their empirical analyses. In the United States, government bonds are also called Treasury bonds. The Federal Reserve publishes the yield-to-maturity of Treasury bonds. However, the Treasury bond returns earned by investors are not publicly available. The purpose of this study is to provide these return series, which are not currently publicly available, together with formulas so that the series can easily be updated by researchers. We use standard textbook formulas to convert the yield-to-maturity data into investor returns. The starting date of our series is January 1962, when end-of-month data on the yield-to-maturity became publicly available. We compare our newly created total return series with alternative series that can be purchased. Our return series are very close to these, suggesting that they are a high-quality public alternative to commercially available data. Full article
(This article belongs to the Special Issue Data Analysis for Financial Markets)
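
A generic, textbook-style version of the yield-to-return conversion is sketched below: a par coupon bond bought at last month's yield is repriced at this month's yield, and one month of coupon accrual is added. This illustrates the kind of calculation the descriptor refers to; it is not necessarily the exact formula behind the published series.

```python
# Generic textbook-style conversion from yield-to-maturity to a monthly holding-period
# return under a constant-maturity convention (illustrative, not the paper's exact formula).
def bond_price(yield_annual, coupon_annual, years, face=100.0, freq=2):
    """Price of a bond with semi-annual coupons at the given annual yield."""
    y, c = yield_annual / freq, coupon_annual * face / freq
    n = int(round(years * freq))
    return sum(c / (1 + y) ** t for t in range(1, n + 1)) + face / (1 + y) ** n

def monthly_return(y_prev, y_new, maturity_years=10.0):
    # Buy a par bond (coupon = previous yield), reprice a bond of the same maturity
    # and coupon at the new yield, and add one month of coupon accrual.
    p0 = bond_price(y_prev, y_prev, maturity_years)      # = 100 at issue (par)
    p1 = bond_price(y_new, y_prev, maturity_years)
    accrued = y_prev * 100.0 / 12                        # one month of coupon income
    return (p1 + accrued - p0) / p0

# Example: the 10-year yield falls from 4.0% to 3.8% over the month -> positive return.
print(f"{monthly_return(0.040, 0.038):.4%}")
```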

4 pages, 1324 KiB  
Data Descriptor
Point of Sale (POS) Data from a Supermarket: Transactions and Cashier Operations
by Tomasz Antczak and Rafał Weron
Data 2019, 4(2), 67; https://doi.org/10.3390/data4020067 - 11 May 2019
Cited by 7 | Viewed by 18065
Abstract
As queues in supermarkets seem to be inevitable, researchers try to find solutions that can improve and speed up the checkout process. This, however, requires access to real-world data for developing and validating models. With this objective in mind, we have prepared and made publicly available high-frequency datasets containing nearly six weeks of actual transactions and cashier operations from a grocery supermarket belonging to one of the major European retail chains. This dataset can provide insights into how the intensity and duration of checkout operations change throughout the day and week. Full article
(This article belongs to the Special Issue Data Analysis for Financial Markets)
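
The kind of summary the dataset supports can be sketched with a simple aggregation of checkout intensity and mean handling time by hour of day. The column names and the synthetic data below are hypothetical, used only to illustrate the aggregation.

```python
# Checkout intensity and mean transaction duration by hour of day.
# Synthetic data with hypothetical column names, for illustration only.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 20000
transactions = pd.DataFrame({
    "start_time": pd.Timestamp("2019-01-07 08:00") +
                  pd.to_timedelta(rng.integers(0, 14 * 3600, n), unit="s"),
    "duration_s": rng.gamma(shape=2.0, scale=45.0, size=n),   # checkout handling time
})
transactions["hour"] = transactions["start_time"].dt.hour

summary = transactions.groupby("hour").agg(
    n_transactions=("duration_s", "size"),
    mean_duration_s=("duration_s", "mean"),
)
print(summary)
```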
