1. Introduction
The prevention of money laundering (ML), financial crime (FC) and terrorism financing (TF) has received increasing attention in recent years. For example, Canadian authors Hilal et al. [1] reviewed research on anomaly detection techniques for financial fraud and stated that financial fraud has become far more prevalent than ever before. Economic growth and the continued development of technology allow fraudsters to evolve their approaches and exploit the vulnerabilities of current preventative measures; this, in turn, creates the need to improve the prevention and detection of financial fraud. In recent years, efforts to suppress tax evasion and illegal economic activity have promoted the idea of eliminating cash [2], as cash (perceived as “anonymous money”) can facilitate such activities. The socio-economic system is currently very dynamic; the trend towards the dominance of cashless payment systems and convergence to cryptocurrencies such as bitcoin is evident [3,4]. Alongside its benefits, this trend also has drawbacks, which causes heated debate among experts [5,6]. Cohen et al. [2] even point to the unwanted consequences of eliminating cash. In their study, they construct a simple general equilibrium model to demonstrate how the elimination of cash payments can lead to a misallocation of resources in a naturally segmented economy.
In 2020, the United States enacted the Anti-Money Laundering Act, expanding the powers of FinCEN [7]. This increasing attention is also evident in Europe, where a series of banking scandals in several countries (Cyprus, Estonia, Latvia, Denmark, Finland, the Netherlands and the United Kingdom) created the requirement to strengthen the anti-money laundering (AML) institutional framework [8].
In the past, the fight against money laundering and terrorism financing was pursued mainly by financial and state institutions [9,10]. Financial institutions still spend a significant share of their resources on automated information systems aimed at tracking illicit transactions [1,10]. Efforts to identify money laundering and to detect non-standard financial flows in a sophisticated way (especially through data mining) have accelerated the development of automated information systems and special software [1,10,11,12], which are associated with huge financial costs. Some of these systems are implemented quickly, so-called “out of the box”, to satisfy regulators and are only later calibrated to detect serious suspicious activity [13]. However, experience shows that institutions remain dissatisfied with current automated tracking efforts, i.e., they are still looking for software that can reduce the burden on regulators and compliance functions. Following the tightening of legislation and the intensification of the fight against money laundering and terrorism in recent years, universities are gradually taking the initiative through research projects.
The applied literature on ML/FC/TF now makes far greater use of mathematical modelling to estimate the risk of illicit money flows or the differences in ML-related risk perception across countries [1,14,15,16]. More generally, mathematical modelling is often used for risk and uncertainty management [17,18,19,20,21,22,23,24,25,26,27]. As the modus operandi of financial fraudsters and money launderers becomes more sophisticated, regulators and financial institutions have to fight back by applying innovative countermeasures. Among the most effective weapons currently available are advanced risk-rating models that use statistical analysis and machine learning to provide better-quality data analysis [10,23,24,25,26,27]. However, we perceive that this research field (ML/FC/TF modelling) prefers linear models of economic dependences [9,28,29,30,31] and sometimes fails to accept the nonlinearity of the investigated economic systems; this situation has stimulated our interest in the research topic.
Contemporary approaches to the risk assessment of ML and TF are related to the Basel AML Index, by which scholars characterize the risk of legalization of proceeds from criminal activities [32,33]. In line with the authors’ effort to contribute to the solution of this issue and to optimize (reduce) the risk of legalization of illicit income in EU countries, the first and main research goal of this study is to create a new computational nonlinear mathematical model that describes the behavior of the Basel AML Index. The AML index is the investigated dependent variable, which is affected by the values of the independent variables, in this case the global indices WCI, CPI, EFI, GII, SEDA, DBI, GSCI, HDI, VAT GAP and GDP per capita, all defined for EU countries. To create the models, the authors worked with real data obtained between 2012 and 2019 for the EU member states. According to available sources, researchers have so far mainly used panel regression analysis when creating such models [28,29,30,31,34,35,36,37,38]. The authors extend this previous knowledge and, in addition to panel regression analysis, apply nonlinear regression analysis to model the behavior of the investigated system, which shows nonlinearity. As part of the first research task, the authors show how the new model behaves and point out the differences between the descriptions of the system obtained by panel regression analysis and by nonlinear regression analysis. The second research question concerns the use of nonlinear optimization methods to find the values of the input variables at which the risk of legalization of income from criminal activity is minimal; it yields interesting results that are interpreted and discussed in this study.
The remainder of the study is organized as follows. In Section 2, we provide the contextual setting and the empirical methodology. The empirical findings and the linear model (obtained by panel regression), with the corresponding descriptive statistics, analysis and interpretation, are discussed in Section 3. Section 4 deals with the development of the nonlinear computational mathematical model and its estimation results, and with the comparison of the two suitable models. In addition, the optimization process of determining the minimum level of the ML/TF risk (symbolized by the AML index) is conducted and its results are reported at the end of Section 4. Finally, the paper is concluded in Section 5.
4. Nonlinear Modelling and Optimization—Results and Discussion
Nonlinear Regression Model Development
It is worth first introducing the reasons for taking a nonlinear modelling approach to the relationship between the ML/TF risk (symbolized by the AML index) and the global indices mentioned above. Panel data analysis provides only linear dependencies between the investigated variables, according to the type of model used. However, real economic situations are often nonlinear [74,75,76]. The industrial age (characterized by vertical integration, economies of mass production and hierarchical organization based on command and control) is on the wane. It is being replaced by cooperation with external suppliers, minimization of seriality, profit centers and network structures that are related to the globalization of markets. In the management of the industrial age, the modelling of economic relationships was typically based on the assumptions of linearity, equilibrium and a high degree of quantification; this was the dominant paradigm. Nowadays, by contrast, the wealth creation system of the third wave is dominant (characterized by hyper-competition, a series of technological revolutions, social displacements and conflicts), which results in considerable unpredictability and the nonlinear nature of economies. These are further reasons why we proceeded to create a nonlinear model for estimating the value of the Basel AML index from the previously mentioned regressors. The established regression models (PRM, FEM and REM) give very unsatisfactory results, and it was our intention to highlight this, although these models are widely applied in econometrics. Multiple linear regression provides useful models for many applications, but it is sometimes necessary to apply nonlinear regression techniques to develop models that cannot be transformed into a linear format. Assuming linearity between related variables when modelling economic phenomena often leads to a distorted explanation of the real relationships within applied models. In this study we provide an example of such a situation.
Our previous analyses led us to the assumed form of the regression model (a multivariate polynomial model with cross-product, or interaction, terms). Modelling skills are developed primarily through experience, because every problem has its own unique characteristics due to the types of variables considered and the relationships between them. It is generally known that, when developing multiple linear or nonlinear regression models, the key task is to identify which of the input variables are needed to provide the best fit (i.e., which subset of the k input variables is required to model the dependent variable y in the best and most useful manner). The model fitting was carried out with computer packages implementing a stepwise procedure that combines backward elimination and forward selection. On this basis, the predictors were determined: eight global indices, the size of the tax gap and the value of gross domestic product per capita.
Table 4 presents a summary of the results of the statistical analysis aimed at creating an appropriate statistical-mathematical model that expresses the influence of individual factors on the change in the value of the Basel AML index. Table 4 reports that regression model (21) explains 70.2151% of the total variability of the dependent variable Basel AML (RSquare). The adjusted coefficient of determination (RSquare Adj), which accounts for the number of terms in the model, reaches 66.1374%. Given this relatively high adjusted value, it can be stated that there is a strong relationship between the explanatory variables and the explained variable (the Basel AML index). The average error of the model is 0.4419 and the average value of the Basel AML index is 4.5058. The analysis of variance (ANOVA) of the observed data is reported in Table 5. Based on Table 5, it can be concluded that the variability caused by random errors is significantly smaller than the variability of the measured values explained by the model, and the achieved significance level (Prob > F) indicates the adequacy of the model according to the Fisher–Snedecor test criterion. This follows from the nature of the test: the null hypothesis states that none of the effects (terms) in the model has an impact on the value of the variable under investigation. Since we work with a significance level of 5% and the achieved value of Prob > F is smaller than this level, we can say that the model contains at least one non-zero term that has an impact on the value of the investigated variable.
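The quantities reported in Tables 4 and 5 are linked by standard identities. As a minimal sketch (not the authors' computation), the following Python snippet derives the adjusted coefficient of determination and the overall F statistic from R², the number of observations n and the number of model terms p excluding the intercept. The function names and the values of n and p below are hypothetical illustrations; the sample size and the number of terms of model (21) are not restated here.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 for n observations and p terms (intercept excluded)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def overall_f_statistic(r2, n, p):
    """F statistic for the null hypothesis that all p non-intercept terms are zero."""
    return (r2 / p) / ((1.0 - r2) / (n - p - 1))


# Hypothetical illustration values, not taken from the study.
r2_example, n_example, p_example = 0.75, 21, 4
adj_example = adjusted_r_squared(r2_example, n_example, p_example)
f_example = overall_f_statistic(r2_example, n_example, p_example)
print(adj_example, f_example)
```

The F value obtained this way would then be compared with the Fisher–Snedecor distribution with p and n − p − 1 degrees of freedom, which is what the Prob > F entry summarizes.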
Having fulfilled the basic requirement for creating the model, i.e., that the model is adequate, we can compile the estimates of the model parameters, which are presented in Table 6. For clarity and for the subsequent optimization, we use the following labels for the input independent variables: x1 (WCI), x2 (CPI), x3 (EFI), x4 (GII), x5 (SEDA), x6 (DBI), x7 (GSCI), x8 (HDI), x9 (VAT GAP) and x10 (GDP per capita). To eliminate the influence of the scale of the factors, especially for the size of the tax gap (VAT GAP) and gross domestic product per capita (GDP per capita), we standardized the input data using the arithmetic mean and standard deviation.
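Standardization by the arithmetic mean and standard deviation can be sketched as follows. This is only an illustration under stated assumptions: the sample standard deviation is used (the text does not say whether the sample or population form was applied), and the input values and function names are hypothetical.

```python
import statistics


def standardize(values):
    """Return the z-scores together with the mean and standard deviation used."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # assumption: sample standard deviation
    return [(v - mean) / std for v in values], mean, std


def to_natural_scale(z_scores, mean, std):
    """Invert the standardization to recover values on the natural scale."""
    return [z * std + mean for z in z_scores]


# Hypothetical predictor values, not data from the study.
predictor_example = [2.0, 4.0, 6.0]
z_example, mean_example, std_example = standardize(predictor_example)
restored_example = to_natural_scale(z_example, mean_example, std_example)
print(z_example, restored_example)
```

The inverse transformation matters later, because optimization results obtained on the standardized scale have to be converted back to the natural scale before they can be interpreted.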
In selecting the specific form of the regression model, several variants based on the general Equation (9) were analyzed. The main criteria for choosing the form of the model describing the dependence of the Basel AML index on the selected predictors were the adequacy of the regression model, the maximum values of the coefficient of determination and of the adjusted coefficient of determination, and the minimum values of RMSE, AICc and BIC. The “most suitable” model was chosen on the basis of these criteria. The estimates of the regression coefficients, their standard errors, confidence intervals and statistical significance are shown in Table 6, and the final form of the regression model is expressed by Equation (21). Although the coefficient of determination of model (21) reaches only 0.702151, this is acceptable for research in economics and the humanities according to the conclusions of Meloun et al. [77]. At the same time, among the analyzed variants, model (21) reached the highest value of this indicator. It is natural to consider additional regressors that would increase the accuracy of the model, but this represents the next stage of planned research in the subject area.
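A comparison of candidate models by RMSE, AICc and BIC could look like the sketch below. It assumes the common least-squares forms of the criteria (up to an additive constant), with k counting all estimated coefficients including the intercept; statistical packages may use slightly different conventions, so absolute values need not match those in Table 4. The candidate names, residual sums of squares and sample size are hypothetical.

```python
import math


def rmse(rss, n, k):
    """Root mean square error from the residual sum of squares and n - k error degrees of freedom."""
    return math.sqrt(rss / (n - k))


def aicc(rss, n, k):
    """Small-sample corrected Akaike criterion (least-squares form, constant omitted)."""
    aic = n * math.log(rss / n) + 2 * k
    return aic + 2 * k * (k + 1) / (n - k - 1)


def bic(rss, n, k):
    """Bayesian information criterion (least-squares form, constant omitted)."""
    return n * math.log(rss / n) + k * math.log(n)


# Hypothetical candidates: name -> (residual sum of squares, number of coefficients).
n_obs = 30
candidates = {
    "linear": (12.0, 3),
    "quadratic with interaction": (6.0, 5),
}
scores = {name: aicc(rss, n_obs, k) for name, (rss, k) in candidates.items()}
best_by_aicc = min(scores, key=scores.get)
print(best_by_aicc, scores)
```

Lower values are better for all three criteria; the correction term in AICc penalizes additional terms more strongly when the sample is small relative to the number of coefficients.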
Based on the results of the analysis, we can regard prediction model (21) as statistically and numerically correct. From the results shown in Table 6, it is possible to compile nonlinear regression model (21) in a standardized form expressing the dependence of the Basel AML index on the considered independent variables. Specifically, the input factors are the global indices, the size of the tax gap (VAT GAP) and the value of gross domestic product per capita (GDP per capita) for the EU countries in the evaluated period from 2012 to 2019.
Although relatively complicated, model (21) yields several interesting findings compared to model (16) obtained through panel regression analysis, primarily the nonlinear effect of some predictors and the existence of significant interactions between independent variables.
First of all, model (21) shows that the intercept (absolute term) is statistically significant at the selected significance level α = 0.05 (p = 0.0001), with a share of 38.846%. Next, we consider the influence of the individual terms of the model without the intercept. In this case, the most significant term of model (21), with a 9.808% share, is the regressor x2 (CPI), whose effect, unlike in model (16), is nonlinear. However, the quadratic effect of this index occurs in interaction with the predictor x10 (GDP per capita), with an influence of 2.889% on the investigated variable. The trend of the influence of predictor x2 is similar to that in model (16): as the value of the CPI decreases, the conditional value of the dependent variable (the Basel AML index) also decreases.
The first significant regressor with a nonlinear effect on the change in the Basel AML value is the DBI index (x6), with a 5.192% share. Its effect is parabolic. The analysis of the impact of predictor x6 shows that when the DBI index increases from the minimum value of 71.37, the conditional value of the Basel AML index decreases to 4.335. As the DBI index increases further beyond this point, the conditional value of the Basel AML index begins to increase parabolically.
The second significantly nonlinear member of model (21) is the GSCI index (x7). In the range of values of the GSCI index, from the minimum achieved value to the value of approximately 45.33, there is a nonlinear growth, where the value of the studied Basel AML index reaches its maximum value over the entire studied interval, namely, 4.586. When the value of the GSCI index increases further in the interval from 45.33 to the value of 57.48, there is a nonlinear decrease in the dependent variable (Basel AML), while at this limit value the Basel AML index reaches the value of 4.085. After exceeding this value (GSCI = 57.48), the value of the Basel AML index begins to rise again.
The effect of the size of the tax gap VAT GAP (x9) is interesting. Its influence in the model amounts to 5.369% as a separate linear effect, 3.002% as a quadratic effect and 3.736% as a cubic effect. As the amount of tax evasion increases from its minimum value to 0.9308%, the conditional value of the Basel AML index decreases and reaches 4.069 at the boundary of this first interval. On the interval of VAT GAP values from 0.9308% to 3.306%, the conditional Basel AML value grows to a maximum of 4.514. Once the amount of tax evasion exceeds 3.306%, the Basel AML value decreases sharply to 3.037.
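As a hedged illustration of why a cubic term can produce the sequence of decrease, growth and renewed decrease described for the tax gap, consider the contribution of a single predictor with the remaining predictors held fixed and interactions ignored. The coefficients b_1, b_2 and b_3 are generic symbols introduced here for illustration; they are not the estimates from Table 6.

```latex
g(x_9) = b_1 x_9 + b_2 x_9^{2} + b_3 x_9^{3}, \qquad
g'(x_9) = b_1 + 2 b_2 x_9 + 3 b_3 x_9^{2} = 0
\;\Longrightarrow\;
x_9^{\pm} = \frac{-b_2 \pm \sqrt{b_2^{2} - 3 b_1 b_3}}{3 b_3}
```

Two real turning points exist when b_2^2 > 3 b_1 b_3. With b_3 < 0, the curve falls to a local minimum, rises to a local maximum and then falls again, which is the qualitative pattern reported above.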
From the point of view of interactions, which are omitted in model (16), the most significant interaction occurs between DBI (x6), SEDA (x5) and GII (x4), with a 6.411% share in the change in the value of the Basel AML variable. It is followed by the interaction of the CPI index (x2) and the square of the HDI index (x8), with a 5.519% share in model (21), and by the interaction of the DBI index (x6) and VAT GAP (x9) as a measure of tax evasion, with 5.430%. If we return to model (16) as an acceptable result of the panel regression analysis, predictor x8 (HDI index) does not have a significant impact on the change in the Basel AML value. However, in the nonlinear model (21), the same predictor (x8), as a separate linear effect, has a significant influence on the change in the value of the Basel AML index (p = 0.0003) with a 4.197% share, and it also appears as a quadratic term in a significant interaction with the CPI index (x2).
Although linear models can provide relatively satisfactory results and answers in some cases, their application to real-world problems and questions can lead to considerable distortion. Nonlinear models obtained by regression analysis, on the other hand, are more complicated but allow a better and more detailed understanding of the interrelationships of the investigated processes and systems, expressed in the form of regression equations. They make it possible to find even subtle variations and differences, which can nevertheless be crucial for understanding the investigated phenomena. As an example, Figure 3 presents the dependence of the change in the Basel AML index value on the change in two significantly nonlinear regressors of model (21), namely the input variable VAT GAP (x9) and the DBI index (x6). The other predictors of model (21) are held at their arithmetic means.
Figure 3 shows the markedly nonlinear behavior of the examined dependence. It is interesting to note that at the minimum value (62.431) of the Doing Business Index (DBI) and the minimum value of VAT GAP (0.051%), the Basel AML index reaches its maximum value (6.295). At the minimum modelled level of VAT GAP, increasing the Doing Business Index to 68.089 lowers the conditional value of the investigated variable to 5.028. With a further increase in DBI to 73.746 and 79.043, a further decrease in the conditional value of the Basel AML index to 4.442 is observed. However, increasing DBI to its maximum modelled value of 85.059 raises the conditional value of the Basel AML index to 5.025. We observe these interesting changes across the entire interval of tax gap (VAT GAP) values for the individual modelled values of the DBI.
In the next part of the study, we compare the two suitable models, namely model (16), the result of panel regression analysis, and model (21), the result of nonlinear regression analysis. Because of the common European space and the common legislation adopted in individual EU member states at approximately the same time, we analyze the “accuracy” of models (16) and (21) within individual years. For clarity, Figure 4 displays the relative residuals of the two models for the selected years 2012, 2015, 2016 and 2019. The analysis of the relative residuals for the entire research sample (all EU countries under analysis between 2012 and 2019) shows that model (16), created by panel regression analysis, reaches an average value of −4.430%, with a maximum negative value of −69.684% and a maximum positive value of 28.279%. The range is therefore 97.964% and the interquartile range is 15.509%. Model (21), created by nonlinear regression analysis, achieves an average relative residual of −1.029%, with a maximum negative value of −58.222% and a maximum positive value of 25.196%. The range of the relative residuals of model (21) is therefore 83.419% and the interquartile range is 10.700%.
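The summary statistics of the relative residuals quoted above can be reproduced with a few lines of Python. The sketch assumes that a relative residual is the difference between the actual and calculated value divided by the actual value, expressed in percent, and it uses the default quartile method of the standard library, which may differ from the one used in the study. The data and function names are hypothetical.

```python
import statistics


def relative_residuals(actual, predicted):
    """Relative residuals in percent: 100 * (actual - predicted) / actual."""
    return [100.0 * (a - p) / a for a, p in zip(actual, predicted)]


def summarize(residuals):
    """Mean, extremes, range and interquartile range of a list of residuals."""
    q1, _, q3 = statistics.quantiles(residuals, n=4)
    return {
        "mean": statistics.mean(residuals),
        "min": min(residuals),
        "max": max(residuals),
        "range": max(residuals) - min(residuals),
        "iqr": q3 - q1,
    }


# Hypothetical index values, not data from the study.
actual_example = [4.0, 5.0, 5.0, 4.0]
predicted_example = [5.0, 4.0, 5.0, 3.0]
summary_example = summarize(relative_residuals(actual_example, predicted_example))
print(summary_example)
```

Applying the same function separately to the residuals of each year would give the per-year figures of the kind summarized for both models.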
In the first analyzed year, 2012, the average relative difference between the actual value of the Basel AML index and the value calculated by model (16) is −0.590%, with a maximum negative value of −38.693% and a maximum positive deviation of 27.917%. In contrast, the average relative difference between the actual values and the values calculated by model (21) is 0.999%. In absolute value, this deviation of model (21) is 0.409 percentage points higher than that of model (16), but the maximum negative deviation calculated by model (21) is −26.057% and the maximum positive deviation is 25.196%. The minimum average deviation is observed in 2014 for model (21), where the value reached is −0.011%. By contrast, the maximum average differences between the actual and calculated values are reached in 2018 for both models, with a value of −11.448% for model (16) and −6.436% for model (21). The summarized statistical properties of the analyzed models for individual years are presented in Table 7.
A problem for both models, PRM model (16) and model (21), is posed by countries with a very low value of tax evasion, a high value of gross domestic product per capita (GDP per capita) and extreme values of the other monitored indices. A graphic representation of these four “problematic or critical” countries (Finland, Estonia, Slovenia and Sweden) in terms of the deviation between the actual and calculated values of the Basel AML index is presented in Figure 5, and a summary of the statistical values is given in Table 8.
In the third stage of the analytical part of the study, we attempt to determine the values of the independent variables (x1 to x10) at which the investigated variable (the Basel AML index representing the ML/TF risk in the context of EU member countries) reaches its minimum. In other words, we seek the combination of input factors (x1 to x10) affecting the Basel AML index at which the risk of legalization of income from criminal activity and financing of terrorism is minimal.
According to Section 2.6, a mathematical programming (MP) problem in the broader sense, expressed by Equation (15), is used for our purpose; hence, the presence of constraints and nonlinearities has to be taken into account in our optimization problem (OP). Nonlinear regression model (21) is used as the objective function, and the optimization constraints are given by the minimum and maximum values of the individual predictors (x1 to x10), as seen in Table 9 (standardized values) and Table 10 (natural scale). The objective function in the general form is as follows:
To run the optimization procedure for the defined OP, i.e., the minimization problem with objective function (21) subject to the optimization constraints (Table 9 and Table 10), the criterion function and the optimization constraints were rewritten into a form suitable for optimization in the MATLAB R2019b software environment. We used the fmincon solver for constrained nonlinear minimization with the interior-point method (IPM) algorithm.
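The study itself relies on the fmincon solver with the interior-point algorithm. As a rough, self-contained sketch of the same kind of box-constrained minimization, the Python snippet below uses projected gradient descent, which is a simpler stand-in technique and not the interior-point method. The two-variable objective `toy_objective`, its bounds and all function names are hypothetical; in the study the objective is model (21) and the bounds are those in Table 9.

```python
def project(x, lower, upper):
    """Clip each coordinate to its [lower, upper] interval (box constraints)."""
    return [min(max(v, lo), hi) for v, lo, hi in zip(x, lower, upper)]


def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2.0 * h))
    return grad


def minimize_box(f, x0, lower, upper, step=0.1, iterations=500):
    """Projected gradient descent: take a gradient step, then project back into the box."""
    x = project(x0, lower, upper)
    for _ in range(iterations):
        g = numerical_gradient(f, x)
        x = project([v - step * gi for v, gi in zip(x, g)], lower, upper)
    return x, f(x)


def toy_objective(x):
    """Hypothetical polynomial with an interaction term (NOT model (21))."""
    x1, x2 = x
    return x1 ** 2 + x2 ** 2 + x1 * x2 - 3.0 * x1


# Bounds play the role of the minimum and maximum standardized predictor values.
x_opt, f_opt = minimize_box(toy_objective, [0.0, 0.0], [-1.0, -1.0], [1.0, 1.0])
print(x_opt, f_opt)
```

For this toy problem the unconstrained minimum lies outside the box, so the solution sits on the upper bound of the first variable. This mirrors the situation in which an optimal predictor value coincides with one of its limits.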
A graphical representation of the optimization process for the modelled value of the ML/TF risk, symbolized by the AML index, is given in Figure 6.
From the results of data analysis and outputs of the optimization process, it is clear that the optimum (minimum) value of the ML/TF risk symbolized by the Basel AML index is Basel AML opt = 4.568. This value is achieved at the following values of individual predictors: WCI (x1) = 75.203, CPI (x2) = 65.208, EFI (x3) = 71.772, GII (x4) = 50.208, SEDA (x5) = 76.647, DBI (x6) = 76.775, GSCI (x7) = 50.185, HDI (x8) = 90.621, VATGAP (x9) = 0.0168% and GDP per capita (x10) = EUR 6 973.52.
Based on the above, it can be concluded that the optimal level of the risk of legalization of income from criminal activity and financing of terrorism (expressed by Basel AML) can be achieved as follows. Firstly, a country needs to achieve the minimum level of tax evasion, expressed as the size of the tax gap (VAT GAP), and simultaneously to hold GDP per capita at the lower limit. This combination leads to the logical conclusion that if there are legislative means that minimize tax evasion and, in parallel, the performance of the country’s economy is not high, there is little room for legalizing income from criminal activity.
At the same time, however, we must also consider the influence of the Corruption Perceptions Index (CPI), which at the optimal value of Basel AML is 65.207, just below its average value (65.604). A further conclusion follows from the analysis: it is not important that the Corruption Perceptions Index be as small as possible, but that it lie at the level of the average of the studied EU member countries. On the other hand, to achieve the optimal value of the Basel AML index, the predictors World Competitiveness Index (WCI), Economic Freedom Index (EFI), Sustainable Economic Development Index (SEDA), Doing Business Index (DBI) and Human Development Index (HDI) need to be higher than their average values within the studied EU member countries, as is evident from the results for the monitored period.