Data Analysis for Financial Markets

A special issue of Data (ISSN 2306-5729). This special issue belongs to the section "Information Systems and Data Management".

Deadline for manuscript submissions: closed (31 October 2019) | Viewed by 115289

Special Issue Editor

Prof. Dr. Francisco Guijarro (Guest Editor)

Special Issue Information

Dear Colleagues,

Data analysis plays a key role in the decisions made by participants in financial markets. Rapid advances in computing, together with the vast amounts of data generated by stock exchanges, have enabled the design of trading algorithms that currently account for a considerable proportion of volume in international stock markets. Recent research has addressed different approaches to take advantage not only of intraday data but also of the information provided by countless websites, posts on Twitter, corporate reports, and daily news announcements. Extracting insight from unstructured data is also part of ongoing research.

This Special Issue aims to bring together original research in the field of data analysis for financial markets. Suitable topics include, but are not limited to, the following: Big Data, business intelligence, sentiment analysis, text mining, financial volatility, real-time analytics, machine learning, fraud detection, operational efficiency, financial trading, high-frequency data, trading rules, stock markets, bankruptcy, and financial shocks.

Prof. Dr. Francisco Guijarro
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Data is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • algorithmic trading
  • Big Data analysis
  • stock markets
  • financial risks

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (12 papers)


Research


13 pages, 659 KiB  
Article
Cryptocurrency Price Prediction with Convolutional Neural Network and Stacked Gated Recurrent Unit
by Chuen Yik Kang, Chin Poo Lee and Kian Ming Lim
Data 2022, 7(11), 149; https://doi.org/10.3390/data7110149 - 31 Oct 2022
Cited by 24 | Viewed by 14092
Abstract
Virtual currencies have become widely recognized as exchange currencies and financial assets. Cryptocurrency trading has caught the attention of investors, as cryptocurrencies can be highly profitable investments. To optimize the profit of cryptocurrency investments, accurate price prediction is essential. Since price prediction is a time series task, a hybrid deep learning model is proposed to predict the future price of the cryptocurrency. The hybrid model integrates a 1-dimensional convolutional neural network and stacked gated recurrent unit (1DCNN-GRU). Given the cryptocurrency price data over time, the 1-dimensional convolutional neural network encodes the data into a high-level discriminative representation. Subsequently, the stacked gated recurrent unit captures the long-range dependencies of the representation. The proposed hybrid model was evaluated on three different cryptocurrency datasets, namely Bitcoin, Ethereum, and Ripple. Experimental results demonstrated that the proposed 1DCNN-GRU model outperformed the existing methods with the lowest RMSE values of 43.933 on the Bitcoin dataset, 3.511 on the Ethereum dataset, and 0.00128 on the Ripple dataset. Full article
(This article belongs to the Special Issue Data Analysis for Financial Markets)
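
The hybrid architecture summarized in the abstract, a one-dimensional convolutional encoder followed by stacked GRU layers, can be sketched in a few lines of Keras. This is a minimal illustration under assumed settings (window length, layer sizes, optimizer), not the authors' implementation.

```python
# Minimal 1DCNN-GRU sketch for next-step price prediction (illustrative only;
# the window length, layer sizes and training settings are assumptions).
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, GRU, Dense

WINDOW = 30                                   # past closes fed to the model
model = Sequential([
    Conv1D(64, kernel_size=3, activation="relu", input_shape=(WINDOW, 1)),
    MaxPooling1D(pool_size=2),
    GRU(64, return_sequences=True),           # stacked GRU: first layer returns sequences
    GRU(32),                                  # second layer summarises them
    Dense(1),                                 # next-day price
])
model.compile(optimizer="adam", loss="mse")

# Toy data: sliding windows over a synthetic price series.
prices = np.cumsum(np.random.randn(500)) + 100
X = np.stack([prices[i:i + WINDOW] for i in range(len(prices) - WINDOW)])[..., None]
y = prices[WINDOW:]
model.fit(X, y, epochs=2, batch_size=32, verbose=0)
```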

17 pages, 377 KiB  
Article
A Comparative Analysis of Machine Learning Models for the Prediction of Insurance Uptake in Kenya
by Nelson Kemboi Yego, Juma Kasozi and Joseph Nkurunziza
Data 2021, 6(11), 116; https://doi.org/10.3390/data6110116 - 15 Nov 2021
Cited by 8 | Viewed by 4956
Abstract
The role of insurance in financial inclusion and economic growth in general is immense and is increasingly being recognized. However, low uptake impedes the growth of the sector, hence the need for a model that robustly predicts insurance uptake among potential clients. This study undertook a two-phase comparison of machine learning classifiers. In Phase I, eight machine learning models were compared on their performance in predicting insurance uptake using 2016 Kenya FinAccess Household Survey data. Building on Phase I, in Phase II random forest and XGBoost were compared with four deep learning classifiers using 2019 Kenya FinAccess Household Survey data. The random forest model trained on oversampled data showed the highest F1-score, accuracy, and precision. The area under the receiver operating characteristic curve was furthermore highest for random forest; hence, it could be construed as the most robust model for predicting insurance uptake. Finally, the most important features in predicting insurance uptake, as extracted from the random forest model, were income, bank usage, and ability and willingness to support others. Hence, there is a need for the design and distribution of products aimed at low-income clients, and bancassurance could be a plausible channel for the distribution of insurance products. Full article
(This article belongs to the Special Issue Data Analysis for Financial Markets)
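
For readers who want to reproduce the general setup, the sketch below trains a random forest on an oversampled training split and reports F1 and AUC, as in the Phase I comparison. The data are synthetic stand-ins for the FinAccess survey; the features, class balance, and hyperparameters are assumptions, not the study's values.

```python
# Random forest on oversampled data for a binary "insurance uptake" target
# (synthetic stand-in data; settings are assumptions, not the study's).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.utils import resample

X, y = make_classification(n_samples=5000, n_features=12, weights=[0.85, 0.15],
                           random_state=0)                # imbalanced uptake rate
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Oversample the minority class in the training split only.
minority = X_tr[y_tr == 1]
extra = resample(minority, replace=True,
                 n_samples=(y_tr == 0).sum() - (y_tr == 1).sum(), random_state=0)
X_bal = np.vstack([X_tr, extra])
y_bal = np.concatenate([y_tr, np.ones(len(extra), dtype=int)])

clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_bal, y_bal)
proba = clf.predict_proba(X_te)[:, 1]
print("F1:", f1_score(y_te, (proba > 0.5).astype(int)),
      "AUC:", roc_auc_score(y_te, proba))
```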

11 pages, 3235 KiB  
Article
A Novel Ensemble Neuro-Fuzzy Model for Financial Time Series Forecasting
by Alexander Vlasenko, Nataliia Vlasenko, Olena Vynokurova, Yevgeniy Bodyanskiy and Dmytro Peleshko
Data 2019, 4(3), 126; https://doi.org/10.3390/data4030126 - 23 Aug 2019
Cited by 19 | Viewed by 3820
Abstract
Neuro-fuzzy models have a proven record of successful application in finance. Forecasting future values is a crucial element of successful decision making in trading. In this paper, a novel ensemble neuro-fuzzy model is proposed to overcome the limitations of, and improve upon, the previously applied five-layer multidimensional Gaussian neuro-fuzzy model and its learning procedure. The proposed solution allows the error-prone hyperparameter selection process to be skipped and shows better accuracy on real-life financial data. Full article
(This article belongs to the Special Issue Data Analysis for Financial Markets)
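
As a rough illustration of the model family involved, the sketch below fits a zero-order Takagi-Sugeno system with multidimensional Gaussian membership functions and averages an ensemble of such members. It is a generic neuro-fuzzy example under assumed settings, not the authors' five-layer model or its learning algorithm.

```python
# Generic Gaussian neuro-fuzzy ensemble sketch (not the authors' model or learning rule).
import numpy as np

def activations(X, centres, width=2.0):
    # Multidimensional Gaussian membership of each sample with respect to each rule centre.
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    A = np.exp(-d2 / (2.0 * width ** 2))
    return A / (A.sum(axis=1, keepdims=True) + 1e-12)   # normalised firing strengths

def member_forecast(X_tr, y_tr, X_te, n_rules=8, seed=0):
    rng = np.random.default_rng(seed)
    centres = X_tr[rng.choice(len(X_tr), n_rules, replace=False)]
    w = np.linalg.lstsq(activations(X_tr, centres), y_tr, rcond=None)[0]  # rule consequents
    return activations(X_te, centres) @ w

# Lagged synthetic prices; the ensemble forecast is the average of the members.
prices = np.cumsum(np.random.default_rng(1).normal(size=400)) + 100
lags, split = 5, 300
X = np.stack([prices[i:i + lags] for i in range(len(prices) - lags)])
y = prices[lags:]
mu, sd = X[:split].mean(0), X[:split].std(0)
Xs = (X - mu) / sd                                       # standardise inputs
pred = np.mean([member_forecast(Xs[:split], y[:split], Xs[split:], seed=s)
                for s in range(5)], axis=0)
print("test RMSE:", np.sqrt(np.mean((pred - y[split:]) ** 2)))
```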

28 pages, 3950 KiB  
Article
A Novel Hybrid Model for Stock Price Forecasting Based on Metaheuristics and Support Vector Machine
by Mojtaba Sedighi, Hossein Jahangirnia, Mohsen Gharakhani and Saeed Farahani Fard
Data 2019, 4(2), 75; https://doi.org/10.3390/data4020075 - 22 May 2019
Cited by 50 | Viewed by 9913
Abstract
This paper presents a new model for the accurate forecasting of future stock prices. Stock price forecasting is one of the most complicated issues in view of the high fluctuation of the stock exchange, and it is a key issue for traders and investors. Many prediction models have been developed by academic researchers to predict stock prices. Nevertheless, a review of past research reveals several shortcomings in previous approaches: (1) stringent statistical assumptions are required; (2) human intervention is part of the prediction process; and (3) an appropriate parameter range is difficult to determine. To address these problems, we provide a new integrated approach based on the Artificial Bee Colony (ABC), the Adaptive Neuro-Fuzzy Inference System (ANFIS), and the Support Vector Machine (SVM). ABC is employed to optimize the technical indicators used for forecasting. To achieve a more precise approach, ANFIS is applied to predict long-run price fluctuations of the stocks. SVM is applied to model the relationship between the stock price and the technical indicators and to further decrease the forecasting errors of the presented model, whose performance is examined by five criteria. The comparative outcomes, obtained on datasets covering the 50 largest companies of the U.S. Stock Exchange from 2008 to 2018, clearly demonstrate that the suggested approach outperforms the other methods in accuracy and quality. The findings show that our model is a useful instrument for stock price forecasting and will assist traders and investors in identifying stock price trends, as well as an innovation in algorithmic trading. Full article
(This article belongs to the Special Issue Data Analysis for Financial Markets)
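
The sketch below illustrates only the SVM component of such a hybrid: a support vector regression that maps simple technical indicators to the next closing price. The ABC optimization and ANFIS stages are not reproduced, and the indicators, data, and hyperparameters are illustrative assumptions.

```python
# SVM (support vector regression) component only: technical indicators -> next close.
# The indicators and hyperparameters are assumptions; ABC and ANFIS are omitted.
import numpy as np
import pandas as pd
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

close = pd.Series(np.cumsum(np.random.randn(600)) + 100)   # synthetic closing prices
feat = pd.DataFrame({
    "sma_10": close.rolling(10).mean(),                     # simple moving average
    "mom_5": close.diff(5),                                 # 5-day momentum
    "vol_10": close.pct_change().rolling(10).std(),         # rolling volatility
})
target = close.shift(-1)                                    # next-day close

data = pd.concat([feat, target.rename("y")], axis=1).dropna()
X, y = data[feat.columns].to_numpy(), data["y"].to_numpy()
split = int(0.8 * len(X))

model = make_pipeline(StandardScaler(), SVR(C=10.0, epsilon=0.1))
model.fit(X[:split], y[:split])
rmse = np.sqrt(np.mean((model.predict(X[split:]) - y[split:]) ** 2))
print("out-of-sample RMSE:", round(rmse, 3))
```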

8 pages, 3366 KiB  
Article
A Business Rules Management System for Fixed Assets
by Sabina-Cristiana Necula
Data 2019, 4(2), 70; https://doi.org/10.3390/data4020070 - 17 May 2019
Viewed by 4527
Abstract
The goal of this paper is to discuss the necessity of separating decision rules from the domain model implementation. (1) Background: Can rules help to discover hidden connections between data? We propose a separate implementation of decision rules on data about fixed assets for decision support, which will enhance search results. (2) Methods and technical workflow: We used DROOLS (Decision Rules Object Oriented System) to implement decision rules on the subject of accounting decisions on fixed assets. (3) Results: Building the model involves the existence of a domain ontology and an ontology for the developed application; the possibility of executing specified inferences; the possibility of extracting information from a database; the possibility of simulations and predictions; and the possibility of addressing fuzzy questions. (4) Conclusions: The rules, the plans, and the business models must be implemented in a way that allows control to be specified over concepts. The editing of meta-models must be directed to the user to ensure adaptation, rather than implemented at the level of data control. Full article
(This article belongs to the Special Issue Data Analysis for Financial Markets)
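
To illustrate the principle of keeping decision rules separate from the domain model, the sketch below expresses a pair of hypothetical fixed-asset rules in plain Python. The paper itself uses DROOLS (a Java rules engine); this is only a language-neutral illustration of the separation idea, with made-up fields and thresholds.

```python
# Illustration of the core idea only: decision rules kept outside the domain model.
# Written in plain Python, not the paper's DROOLS/Java implementation;
# the asset fields and rule conditions are hypothetical.
from dataclasses import dataclass

@dataclass
class FixedAsset:                      # domain model: knows nothing about the rules
    name: str
    cost: float
    accumulated_depreciation: float
    in_use: bool

# Decision rules live outside the domain model and can be edited independently.
RULES = [
    ("fully depreciated -> consider disposal",
     lambda a: a.accumulated_depreciation >= a.cost),
    ("idle asset -> review for impairment",
     lambda a: not a.in_use),
]

def evaluate(asset: FixedAsset) -> list[str]:
    """Return the conclusion of every rule whose condition fires for this asset."""
    return [conclusion for conclusion, condition in RULES if condition(asset)]

print(evaluate(FixedAsset("lathe", cost=12000.0,
                          accumulated_depreciation=12000.0, in_use=False)))
```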

22 pages, 3472 KiB  
Article
Data Preprocessing for Evaluation of Recommendation Models in E-Commerce
by Namrata Chaudhary and Drimik Roy Chowdhury
Data 2019, 4(1), 23; https://doi.org/10.3390/data4010023 - 31 Jan 2019
Cited by 6 | Viewed by 6100
Abstract
E-commerce businesses employ recommender models to assist in identifying a personalized set of products for each visitor. To accurately assess the recommendations' influence on customer clicks and buys, three target areas (customer behavior, data collection, and the user interface) are explored for possible sources of erroneous data. Varied customer behavior misrepresents the recommendations' true influence on a customer due to the presence of B2B interactions and outlier customers. Non-parametric statistical procedures for outlier removal are delineated, and other strategies are investigated to account for the effect of a large percentage of new customers or high bounce rates. Subsequently, for data collection, we identify probable misleading interactions in the raw data, propose a robust method of tracking unique visitors, and accurately attribute the buy influence for combo products. Lastly, the user-interface discussion addresses possible problems caused by the recommendation widget's positioning on the e-commerce website and the stringent conditions that should be imposed when utilizing data from the product listing page. This collective methodology results in an accurate and valid estimation of the customer interactions influenced by the recommendation model in the context of standard industry metrics, such as click-through rate, buy-through rate, and conversion revenue. Full article
(This article belongs to the Special Issue Data Analysis for Financial Markets)
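
One of the preprocessing steps described above, non-parametric removal of outlier customers, can be sketched with a simple Tukey (IQR) fence on per-customer interaction counts. The data and threshold below are hypothetical and do not reproduce the paper's exact procedure.

```python
# Non-parametric (IQR/Tukey-fence) detection of outlier customers before
# attributing clicks to the recommender. Hypothetical data and threshold.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Per-customer click counts: mostly ordinary shoppers plus a few heavy (B2B-like) accounts.
per_customer = pd.Series(
    np.concatenate([rng.poisson(8, size=480), rng.poisson(120, size=20)]),
    name="n_clicks",
)

q1, q3 = per_customer.quantile([0.25, 0.75])
upper_fence = q3 + 1.5 * (q3 - q1)                 # Tukey fence for heavy clickers
outliers = per_customer[per_customer > upper_fence]
kept = per_customer[per_customer <= upper_fence]   # data used for attribution analysis

print(f"fence at {upper_fence:.1f} clicks; "
      f"{len(outliers)} of {len(per_customer)} customers flagged and excluded")
```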

11 pages, 988 KiB  
Article
Gaussian Mixture and Kernel Density-Based Hybrid Model for Volatility Behavior Extraction From Public Financial Data
by Smail Tigani, Hasna Chaibi and Rachid Saadane
Data 2019, 4(1), 19; https://doi.org/10.3390/data4010019 - 24 Jan 2019
Cited by 3 | Viewed by 4805
Abstract
This paper presents a hybrid model for clustering foreign exchange market volatility. The proposed model is built using a Gaussian Mixture Model, and the inference is done using an Expectation Maximization algorithm. A mono-dimensional kernel density estimator is used to build a probability density based on all historical observations, which allows us to evaluate the probability of each behavior for every symbol of interest. The computational results show that the approach is able to pinpoint risky and safe hours for trading a given currency pair. Full article
(This article belongs to the Special Issue Data Analysis for Financial Markets)
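
A minimal sketch of the two ingredients named in the abstract follows: a Gaussian mixture fitted by EM to cluster hourly volatility into regimes, plus a one-dimensional kernel density estimate over historical returns. The synthetic FX-style data and the two-regime setup are assumptions, not the paper's dataset or configuration.

```python
# Gaussian mixture (EM) over hourly volatility plus a 1-D KDE over returns.
# Synthetic FX-style data; the two-regime setup is an illustrative assumption.
import numpy as np
from scipy.stats import gaussian_kde
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
hours = rng.integers(0, 24, size=5000)
# Returns whose volatility depends on the hour (higher around session overlaps).
sigma = np.where((hours >= 13) & (hours <= 17), 0.0012, 0.0004)
returns = rng.normal(0.0, sigma)

# Hourly realised volatility, clustered into "safe" vs "risky" regimes by a GMM (EM).
hourly_vol = np.array([returns[hours == h].std() for h in range(24)])
gmm = GaussianMixture(n_components=2, random_state=0).fit(hourly_vol.reshape(-1, 1))
labels = gmm.predict(hourly_vol.reshape(-1, 1))
risky = np.where(labels == np.argmax(gmm.means_.ravel()))[0]
print("risky hours:", sorted(int(h) for h in risky))

# Mono-dimensional KDE over all historical returns.
kde = gaussian_kde(returns)
print("density at a 0.2% move:", float(kde(0.002)))
```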

16 pages, 3492 KiB  
Article
Performance Analysis of Statistical and Supervised Learning Techniques in Stock Data Mining
by Manik Sharma, Samriti Sharma and Gurvinder Singh
Data 2018, 3(4), 54; https://doi.org/10.3390/data3040054 - 24 Nov 2018
Cited by 40 | Viewed by 6897
Abstract
Nowadays, overwhelming amounts of stock data are available, which are only of use if properly examined and mined. In this paper, the last twelve years of ICICI Bank's stock data have been extensively examined using statistical and supervised learning techniques. This study may be of great interest for those who wish to mine or study the stock data of banks or any financial organization. Different statistical measures have been computed to explore the nature, range, distribution, and deviation of the data. The descriptive statistical measures assist in finding valuable metrics such as the mean, variance, skewness, kurtosis, p-value, A-squared, and 95% confidence interval of the mean for ICICI Bank's stock data. Moreover, daily percentage changes occurring over the last 12 years have also been recorded and examined. Additionally, the intraday stock status has been mined using ten different classifiers. The performance of the different classifiers has been evaluated on the basis of various parameters such as accuracy, misclassification rate, precision, recall, specificity, and sensitivity. Based upon these parameters, the predictive results obtained using logistic regression are more acceptable than the outcomes of the other classifiers, whereas naïve Bayes, C4.5, random forest, linear discriminant, and cubic support vector machine (SVM) merely act as random guessing machines. The outstanding performance of logistic regression has been validated using TOPSIS (technique for order preference by similarity to ideal solution) and WSA (weighted sum approach). Full article
(This article belongs to the Special Issue Data Analysis for Financial Markets)
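
The two analysis stages described above, descriptive statistics of daily percentage changes and a classifier for intraday status, can be sketched as follows. Synthetic open/close data stand in for the ICICI Bank series, and the features and classifier settings are assumptions rather than the paper's setup.

```python
# Descriptive statistics of daily percentage changes plus a logistic-regression
# classifier for intraday status (close above or below open). Synthetic data.
import numpy as np
import pandas as pd
from scipy import stats
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 3000                                              # roughly 12 years of sessions
open_ = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, n)))
close = open_ * (1 + rng.normal(0.0003, 0.012, n))
df = pd.DataFrame({"open": open_, "close": close})
df["pct_change"] = df["close"].pct_change() * 100

# Descriptive measures of the daily percentage change.
chg = df["pct_change"].dropna()
print(chg.describe())
print("skew:", chg.skew(), "kurtosis:", chg.kurt(),
      "normality p-value:", stats.normaltest(chg).pvalue)

# Intraday status: did the session close above its open?
df["up_day"] = (df["close"] > df["open"]).astype(int)
X = df[["open", "pct_change"]].shift(1).dropna()      # yesterday's information only
y = df["up_day"].loc[X.index]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, shuffle=False, test_size=0.2)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```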

Review


17 pages, 617 KiB  
Review
Reinforcement Learning in Financial Markets
by Terry Lingze Meng and Matloob Khushi
Data 2019, 4(3), 110; https://doi.org/10.3390/data4030110 - 28 Jul 2019
Cited by 83 | Viewed by 18051
Abstract
Recently there has been an exponential increase in the use of artificial intelligence for trading in financial markets such as stocks and forex. Reinforcement learning has become of particular interest to financial traders ever since the program AlphaGo defeated the strongest contemporary human Go player, Lee Sedol, in 2016. We systematically reviewed all recent stock/forex prediction or trading articles that used reinforcement learning as their primary machine learning method. All reviewed articles made some unrealistic assumptions, such as no transaction costs, no liquidity issues, and no bid or ask spread issues. Transaction costs had significant impacts on the profitability of the reinforcement learning algorithms compared with the baseline algorithms tested. Despite showing statistically significant profitability when reinforcement learning was used in comparison with baseline models in many studies, some showed no meaningful level of profitability, in particular with large changes in the price pattern between the system training and testing data. Furthermore, few performance comparisons between reinforcement learning and other sophisticated machine/deep learning models were provided. The impact of transaction costs, including the bid/ask spread, on profitability has also been assessed. In conclusion, reinforcement learning in stock/forex trading is still in its early development, and further research is needed to make it a reliable method in this domain. Full article
(This article belongs to the Special Issue Data Analysis for Financial Markets)
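
The review's central caveat about transaction costs can be illustrated with a toy calculation: the same sequence of long/flat signals is evaluated with and without a per-trade cost. The signals, return series, and cost level below are arbitrary assumptions, not results from any reviewed study.

```python
# Toy illustration of how per-trade costs erode the gross return of a trading policy.
# The signals stand in for an RL policy's output; all numbers are arbitrary.
import numpy as np

rng = np.random.default_rng(0)
returns = rng.normal(0.0002, 0.01, 1000)       # daily asset returns
signals = rng.integers(0, 2, 1000)             # 1 = long, 0 = flat
cost_per_trade = 0.001                         # 10 basis points per position change

gross = signals * returns
trades = np.abs(np.diff(signals, prepend=0))   # 1 whenever the position flips
net = gross - trades * cost_per_trade

print("gross cumulative return:", round(gross.sum(), 4))
print("net cumulative return:  ", round(net.sum(), 4))
print("number of position changes:", int(trades.sum()))
```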

Other


10 pages, 365 KiB  
Data Descriptor
Google Web and Image Search Visibility Data for Online Store
by Artur Strzelecki
Data 2019, 4(3), 125; https://doi.org/10.3390/data4030125 - 22 Aug 2019
Cited by 13 | Viewed by 7297
Abstract
This data descriptor describes Google search engine visibility data. The visibility of a domain name in a search engine results from search engine optimization and can be evaluated based on four data metrics and five data dimensions. The data metrics are the following: clicks volume (1), impressions volume (2), click-through ratio (3), and ranking position (4). The data dimensions are as follows: the queries entered into the search engine that trigger results containing the researched domain name (1), the page URLs from the researched domain that appear on the search engine results page (2), the country of origin of search engine visitors (3), the type of device used for the search (4), and the date of the search (5). Search engine visibility data were obtained from the Google Search Console for an international online store, which is visible in 240 countries and territories, over a period of 15 months. The data contain 123 K clicks and 4.86 M impressions for the web search and 22 K clicks and 9.07 M impressions for the image search. The proposed method for obtaining data can be applied in any other area, not only the e-commerce industry. Full article
(This article belongs to the Special Issue Data Analysis for Financial Markets)
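
The relationship between the four metrics can be illustrated with a small pandas aggregation: click-through ratio is clicks divided by impressions, grouped here by query. The rows are made-up examples; only the column names mirror the descriptor's metrics and dimensions.

```python
# Click-through ratio computed from clicks and impressions, grouped by query.
# Made-up rows; column names mirror the descriptor's metrics and dimensions.
import pandas as pd

visibility = pd.DataFrame({
    "query":       ["running shoes", "running shoes", "trail shoes"],
    "device":      ["mobile", "desktop", "mobile"],
    "clicks":      [120, 45, 8],
    "impressions": [4300, 1900, 600],
    "position":    [3.2, 4.1, 9.8],
})

by_query = visibility.groupby("query").agg(
    clicks=("clicks", "sum"),
    impressions=("impressions", "sum"),
    avg_position=("position", "mean"),
)
by_query["ctr"] = by_query["clicks"] / by_query["impressions"]
print(by_query)
```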

6 pages, 636 KiB  
Data Descriptor
Treasury Bond Return Data Starting in 1962
by Laurens Swinkels
Data 2019, 4(3), 91; https://doi.org/10.3390/data4030091 - 28 Jun 2019
Cited by 4 | Viewed by 12473
Abstract
Academics and research analysts in financial economics frequently use returns on government bonds for their empirical analyses. In the United States, government bonds are also called Treasury bonds. The Federal Reserve publishes the yield-to-maturity of Treasury bonds. However, the Treasury bond returns earned by investors are not publicly available. The purpose of this study is to provide these return series, which are not currently publicly available, together with formulas so that the series can easily be updated by researchers. We use standard textbook formulas to convert the yield-to-maturity data into investor returns. The starting date of our series is January 1962, when end-of-month data on the yield-to-maturity became publicly available. We compare our newly created total return series with alternative series that can be purchased. Our return series are very close to these, suggesting that they are a high-quality public alternative to commercially available data. Full article
(This article belongs to the Special Issue Data Analysis for Financial Markets)
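
A generic, textbook-style version of the yield-to-return conversion is sketched below: a par coupon bond bought at last month's yield is repriced at this month's yield, and one month of coupon accrual is added. This illustrates the kind of calculation the descriptor refers to; it is not necessarily the exact formula behind the published series.

```python
# Generic textbook-style conversion from yield-to-maturity to a monthly holding-period
# return under a constant-maturity convention (illustrative, not the paper's exact formula).
def bond_price(yield_annual, coupon_annual, years, face=100.0, freq=2):
    """Price of a bond with semi-annual coupons at the given annual yield."""
    y, c = yield_annual / freq, coupon_annual * face / freq
    n = int(round(years * freq))
    return sum(c / (1 + y) ** t for t in range(1, n + 1)) + face / (1 + y) ** n

def monthly_return(y_prev, y_new, maturity_years=10.0):
    # Buy a par bond (coupon = previous yield), reprice a bond of the same maturity
    # and coupon at the new yield, and add one month of coupon accrual.
    p0 = bond_price(y_prev, y_prev, maturity_years)      # = 100 at issue (par)
    p1 = bond_price(y_new, y_prev, maturity_years)
    accrued = y_prev * 100.0 / 12                        # one month of coupon income
    return (p1 + accrued - p0) / p0

# Example: the 10-year yield falls from 4.0% to 3.8% over the month -> positive return.
print(f"{monthly_return(0.040, 0.038):.4%}")
```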

4 pages, 1324 KiB  
Data Descriptor
Point of Sale (POS) Data from a Supermarket: Transactions and Cashier Operations
by Tomasz Antczak and Rafał Weron
Data 2019, 4(2), 67; https://doi.org/10.3390/data4020067 - 11 May 2019
Cited by 7 | Viewed by 18065
Abstract
As queues in supermarkets seem to be inevitable, researchers try to find solutions that can improve and speed up the checkout process. This, however, requires access to real-world data for developing and validating models. With this objective in mind, we have prepared and made publicly available high-frequency datasets containing nearly six weeks of actual transactions and cashier operations from a grocery supermarket belonging to one of the major European retail chains. This dataset can provide insights into how the intensity and duration of checkout operations change throughout the day and week. Full article
(This article belongs to the Special Issue Data Analysis for Financial Markets)
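
The kind of summary the dataset supports can be sketched with a simple aggregation of checkout intensity and mean handling time by hour of day. The column names and the synthetic data below are hypothetical, used only to illustrate the aggregation.

```python
# Checkout intensity and mean transaction duration by hour of day.
# Synthetic data with hypothetical column names, for illustration only.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 20000
transactions = pd.DataFrame({
    "start_time": pd.Timestamp("2019-01-07 08:00") +
                  pd.to_timedelta(rng.integers(0, 14 * 3600, n), unit="s"),
    "duration_s": rng.gamma(shape=2.0, scale=45.0, size=n),   # checkout handling time
})
transactions["hour"] = transactions["start_time"].dt.hour

summary = transactions.groupby("hour").agg(
    n_transactions=("duration_s", "size"),
    mean_duration_s=("duration_s", "mean"),
)
print(summary)
```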
