1. Introduction
Progressing climate change is one of the main problems in international relations. Rising temperatures around the world (approx. 0.2 °C per decade) lead to a shortage of natural resources and the intensification of natural disasters such as droughts, floods, fires, and hurricanes [1]. These phenomena can also cause so-called climate conflicts resulting from competition for natural resources, increased population migration, or irreversible geographic changes. Climate change also generates economic losses that result from casualties or the devastation of infrastructure [2]. It is estimated that in 2017 alone, weather disasters around the world caused costs of EUR 283 billion. In addition, research conducted by the National Bureau of Economic Research in the US indicated that an average annual temperature increase of 0.04 °C could reduce global GDP by 7.22% by 2100. In the context of advancing climate change, it is very important to undertake actions that reduce the negative human impact on the environment. This is possible by promoting ecological safety and striving to achieve climate neutrality, understood as a state of balance between the systems emitting CO2 and its absorbers.
Ecological security is defined as the state of social relations, as well as the content, forms, and ways of organizing international relations, which lead to the reduction or even elimination of environmental threats and promotion of positive actions, enabling the implementation of values essential for the existence and development of the state [3]. In other words, it is a desired state of the natural environment that is free from threats to the balance of ecosystems and the biosphere. One of the main threats to ecological safety is the increased emission of greenhouse gases. Currently, there are five main sectors of high carbon dioxide emissions; i.e., transport, heating, mining, agriculture, and industry [2], where immediate action is needed to reduce CO2 emissions. One example of such action is promoting the purchase of electric vehicles and increasing their share in the vehicle fleet of a given country.
Achieving climate neutrality is possible by reducing greenhouse gas emissions, investing in renewable energy and energy efficiency, and implementing clean, low-emission technology. The measure of the effectiveness of a country’s climate policy is the Environmental Performance Index (EPI), which enables the assessment of the state of sustainability of the country. It is calculated based on data related to parameters such as gross domestic product (GDP) per capita, voice, accountability, political stability, absence of violence, government effectiveness, control of corruption, regulatory quality, rule of law, services (% of GDP), export (% of GDP), manufacturing (% of GDP), the ease of doing business index, and the index of economic freedom [
4]. The conducted analysis indicated that the climate neutrality of a given country was significantly influenced by the level of economic development expressed in GDP, which enabled investing in a climate protection policy.
Current reports indicate that Denmark is the most climate-friendly country, while Poland is in 37th place among the countries of the world and 30th in Europe. The above result was influenced by the high greenhouse gas emissions compared to the European Union (Figure 1). During the last decade, Poland emitted an average of 308 million tons of CO2 into the atmosphere each year. This level was three times higher than the average annual greenhouse gas emissions of all European Union countries, and nine times higher than Denmark’s emissions of these gases.
Another factor that influences the low value of the eco-friendliness index for Poland is the low level of investment in low-emission technologies, including electric vehicles (Figure 2). Currently, electric vehicles account for only 0.1% of all motor vehicles registered in Poland, and for 1.5% of all electric vehicles registered in the European Union. Moreover, Poland currently has 83 electric vehicles per 100,000 inhabitants on average, compared with an average of 640 in the European Union.
The above factors make it important to undertake all activities leading to the promotion of both ecological safety and the concept of climate neutrality in Poland. One possible action is to promote the purchase of electric vehicles as environmentally friendly. In previous years, one of the factors that discouraged the purchase of this type of vehicle was its high price. Currently, the purchase prices of electric vehicles are decreasing. In addition, it is estimated that the prices of electric and internal combustion vehicles may reach parity in 2026 [5].
Some nontechnical factors also affect the prices of electric cars. Governments promote the development of electric cars by grants, subsidies, and changing the applicable law. The Polish government began to strongly support electromobility only in 2019. Due to the relatively short duration of the subsidy program for electric cars (from mid-2019), the authors decided to omit this factor in the analysis. If a similar analysis is made (regarding the prices of electric cars) over the next few years, it will be necessary to take into account the impact of the subsidies on the price of electric cars. In the near future, this factor could be one of the decisive factors for the development of electromobility in Poland. However, at the moment, the interest of Poles in the subsidy program “My Electric Car” is quite low, mainly due to the high price of electricity in Poland, which is a huge barrier to the development of electromobility.
The aim of the article is to model the prices of electric vehicles as one of the elements of promoting climate security in Poland. The construction of a model for forecasting prices of electric vehicles will increase the interest in purchasing this type of vehicles among undecided consumers. The ranking of predictors and the analysis of their impact on the dependent variable may provide support during the decision-making process, indicating which variant of the electric vehicle should be selected depending on the available financial resources. Additionally, increasing the number of registered electric vehicles in Poland will contribute to reducing the level of greenhouse gas emissions, and will also contribute to promoting ecological neutrality. For the purposes of the study, an analysis of data from electric vehicle sales advertisements on one of the Polish automotive services was carried out. On this basis, an analysis of important factors influencing the value of the vehicle was made. For this purpose, mathematical models were built based on neural networks and chosen models of decision trees: CART algorithm, boosted trees, and random forest. In the final part of the study, the developed models were assessed and their prognostic abilities were compared.
2. Literature Review
Forecasting is one of the important elements of supporting business decision making, including in the automotive industry [6]. Models of vehicle purchase price prediction, depending on the adopted parameters, may be a tool that consumers can use at the stages of variant assessment and purchase decision making [5]. In addition, they enable price comparisons for vehicles with different types of fuel used, greenhouse gas emissions, or available additional equipment, so they can be one of the elements of promoting ecological safety, and encourage the achievement of climate neutrality. The conducted literature review showed a large number of studies aimed at predicting vehicle prices. Most of the models were used to predict the selling prices of used vehicles, mainly with internal combustion engines. However, there are no studies aimed at forecasting the prices of electric vehicles (new and used), in particular those relating to the automotive market in Poland.
The analysis of the literature was carried out in two main aspects: the most common mathematical methods used to model vehicle prices, and the factors that statistically affect their values. The most common methods of modeling the studied phenomenon were linear regression models, generalized linear models, and time series analysis [7,8]. Studies are also available that were based on more advanced mathematical models to forecast vehicle prices. Lessmann et al. [9] and Lian et al. [10] used integrated methods based on neural networks and support vector regression with an evolutionary algorithm (genetic algorithm and evolution strategy, respectively). Additionally, Jian Da-Wu et al., in their study, presented the possibility of using an expert system based on an adaptive neuro-fuzzy inference system (ANFIS) [11]. The conducted literature review also showed the popularity of machine learning methods for predicting the prices of used internal combustion vehicles. Pudaruth presented a comparative analysis of supervised machine learning models for forecasting vehicle prices, using the following models: K-nearest neighbors (kNN), decision trees, and naive Bayes [12]. Among the explanatory variables, the most frequently used were: production year, engine displacement, mileage, and vehicle model [13,14].
With regard to the accuracy of the prognostic models available in the literature, it should be noted that the most common difficulty in conducting research was the nonlinearity of the influence of explanatory variables on the dependent variable. This nonlinearity was most often due to:
High vehicle depreciation in the early stage of their use [15];
Emerging announcements regarding new vehicle models [16];
Customer loyalty to specific vehicle brands, which translated into their price [17].
Linear regression methods, widely used in earlier works, required additional analyses to capture nonlinear relationships accurately. For this purpose, the dependent variable was replaced with its natural logarithm or subjected to other nonlinear transformations, which could affect the accuracy of these methods [18]. Conversely, advanced forecasting methods, such as neural networks and decision trees, take into account nonlinear relationships in a data-driven fashion, which reduces the number of data transformation operations and can make them more accurate.
It should be emphasized that the literature does not contain any detailed analyses of electric vehicle price modeling, in particular concerning the country of Poland. This fact became the basis for the following analysis. The legitimacy of the study was emphasized by the fact that Poland is in the group of three European Union countries with the highest greenhouse gas emissions. Therefore, it is extremely important to undertake all activities leading to the promotion of ecological safety and the use of alternative technologies. One example of such activities is the following publication, which aims to promote climate neutrality by building models to predict electric vehicle prices. The structure of the article is as follows: the first part presents an analysis of the environmental safety problem and a review of the literature on vehicle price forecasting; the second part of the study presents the principles of construction and the results of models using decision trees and artificial neural networks; and at the end of the article, a comparison of the developed models is made and conclusions are presented.
3. Materials and Methods
The aim of this article was to model the prices of electric vehicles as an important element of promoting ecological safety and climate neutrality. The study was based on a set of data on vehicle sales announcements that were posted on one of the largest websites in Poland from July 2020 to June 2021. A total of 1500 observations were analyzed, divided into training and test samples at a proportion of 80% to 20%. The training sample was used to build the predictive models: a decision tree based on the CART algorithm, boosted trees, random forest, and artificial neural networks. The test sample was used to assess the quality of the developed models and to compare their predictive abilities.
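The 80%/20% division described above can be illustrated with a minimal sketch. The function name, the fixed random seed, and the use of row identifiers in place of real advertisements are assumptions made for illustration; the article does not state how the random split was drawn.

```python
import random


def train_test_split(records, test_fraction=0.2, seed=0):
    """Shuffle a copy of the records and split off the test share."""
    rng = random.Random(seed)          # fixed seed only for reproducibility
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Stand-in for the 1500 advertisements (only row ids are simulated here).
records = list(range(1500))
train, test = train_test_split(records)
print(len(train), len(test))
```

With 1500 observations, this yields 1200 training and 300 test observations, and no observation appears in both samples.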
3.1. Decision Trees
Decision trees are a method of multivariate analysis that enables the study of the relationship between the dependent variable and independent variables measured on the weak scale; i.e., nominal or ordinal, and the strong scale; i.e., interval and ratio [19]. Classification trees are used when the dependent variable is expressed on a nominal or ordinal scale, while regression trees are used when the examined variable is quantitative, or at least interval [20].
A decision tree is a graphical model that is created as a consequence of the recursive division of the set of observations $A$ into $n$ disjoint subsets $A_1, A_2, \ldots, A_n$. This division yields subsets characterized by maximum homogeneity from the point of view of the dependent variable value. The model building process is multistage, and in each subsequent step, a different explanatory variable can be used to obtain the optimal division. In each step, the predictor that guarantees the best division of the node, separating the most homogeneous subsets, is selected [21].
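A single step of the recursive division described above can be sketched as follows. The sketch assumes a least-squares homogeneity criterion (the sum of squared deviations from the node mean), which is a common choice for regression trees; the article does not give the criterion in this passage, and the toy data are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold, total_sse) of the best binary split x <= threshold."""
    best = (None, sse(y))              # baseline: the node is left unsplit
    candidates = sorted(set(x))
    for lo, hi in zip(candidates, candidates[1:]):
        threshold = (lo + hi) / 2      # midpoint between neighbouring values
        left = [yi for xi, yi in zip(x, y) if xi <= threshold]
        right = [yi for xi, yi in zip(x, y) if xi > threshold]
        total = sse(left) + sse(right)
        if total < best[1]:
            best = (threshold, total)
    return best


# Hypothetical toy data (power in hp, price); not taken from the study.
power = [80, 90, 110, 200, 220, 250]
price = [100, 110, 105, 300, 310, 290]
threshold, total = best_split(power, price)
```

In a full tree, the same search would be repeated over every predictor and then recursively within each resulting subset until a stopping rule is met.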
Thus, decision trees are a visual representation of a model of the form [22]:

$$Y = \sum_{k=1}^{K} \alpha_k \, I(\mathbf{x}_i \in R_k)$$

where $Y$ is the dependent variable, $X_l$ ($l = 1, \ldots, L$) is the independent variable, $L$ is the number of independent variables, $K$ is the number of segments, $R_k$ ($k = 1, \ldots, K$) is the segment of the space of independent variables, $\mathbf{x}_i$ is the observations from the analyzed set, $\alpha_k$ is the model parameters, and $I$ is the indicator function.

The method of defining the indicator function is selected depending on the type and nature of the explanatory variables. In the case of metric variables, the $R_k$ segments are defined by their limits in the $L$-dimensional space of independent variables according to the function below:

$$I(\mathbf{x}_i \in R_k) = \prod_{l=1}^{L} I\left(v_{kl}^{(d)} \le x_{il} \le v_{kl}^{(g)}\right)$$

where the values $v_{kl}^{(g)}$ and $v_{kl}^{(d)}$ represent the upper and lower limits, respectively, of the segment in the $l$-th dimension of the space.

In the case of the nonmetric form of the explanatory variables, the subspace $R_k$ should be described using a functional dependence:

$$I(\mathbf{x}_i \in R_k) = \prod_{l=1}^{L} I\left(x_{il} \in B_{kl}\right)$$

where $B_{kl}$ is a subset of the $X_l$ variable categories.

If, on the other hand, the dependent variable is measured on numerical scales, from the group of strong scales, then a regression model should be used, the visualization of which will be a regression tree. The model parameters are calculated according to the dependence:

$$\alpha_k = \frac{1}{N(k)} \sum_{\mathbf{x}_i \in R_k} y_i$$

where $N(k)$ is the number of observations in the $R_k$ segment, and $y_i$ is the values of the dependent variable in the $R_k$ segment.
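The regression-tree model described above assigns to each segment a constant equal to the mean of the dependent variable over the observations in that segment. A minimal sketch of this idea is given below; the segmentation rule, the segment labels, and the toy data are hypothetical and serve only to illustrate the calculation.

```python
from collections import defaultdict


def fit_segment_means(segments, y):
    """Parameter of each segment: the mean of y over its observations."""
    sums, counts = defaultdict(float), defaultdict(int)
    for seg, value in zip(segments, y):
        sums[seg] += value
        counts[seg] += 1
    return {seg: sums[seg] / counts[seg] for seg in sums}


def predict(segment_of, alphas, observation):
    """Sum of parameter * indicator over segments; one indicator equals 1."""
    return sum(a * (segment_of(observation) == k) for k, a in alphas.items())


def segment_of(power_hp):
    """Hypothetical segmentation rule on one metric variable (power in hp)."""
    return "low" if power_hp <= 155 else "high"


# Hypothetical toy data; not taken from the study.
power = [80, 90, 110, 200, 220, 250]
price = [100, 110, 105, 300, 310, 290]
alphas = fit_segment_means([segment_of(p) for p in power], price)
forecast = predict(segment_of, alphas, 100)
```

Here the forecast for a vehicle with 100 hp is the mean price of the "low" segment, because only the indicator of that segment is non-zero.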
Decision tree models are assessed by analyzing the quality of the division of the space of explanatory variables by applying two main assessment measures:
3.2. Neural Networks
Artificial neural networks (ANNs) are organized as coherent structures resembling a simplified model of how the human brain works. Networks are made up of a large number of elements called neurons, which process information. The name of these elements is a reference to the actual nerve cells existing in the human brain. This article considered the multilayer perceptron (MLP), which is one of the most commonly used unidirectional neural networks. The considered neural networks consisted of layers of connected neurons. These layers were divided into the input layer (x), one or more hidden layers, and the output layer (y). Figure 3 presents a schematic diagram of an MLP with one hidden layer [24].
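The basic mathematical relationships of the described MLP neural network are described below. The output signal y of the neural network can be described by the following equation:

$$y = f\left(w_{0}^{(2)} + \sum_{i=1}^{k} w_{i}^{(2)} \, g\left(\sum_{j=0}^{N} w_{ij}^{(1)} x_j\right)\right)$$

where $\mathbf{x} = [x_0, x_1, \ldots, x_N]^T$ is the input vector of the neural network ($x_0 = 1$ is the unit signal of polarization, and $N$ is the number of input variables), $y$ is the output of the neural network, $\mathbf{W}$ is the matrix with weights of the neural network (with elements $w_{ij}^{(1)}$ and $w_{i}^{(2)}$), $k$ is the number of neurons in the hidden layer, $f(v)$ is the output layer activation function, and $g(u)$ is the activation function applied to the weighted inputs in the hidden layer [24].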
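The forward pass of an MLP with one hidden layer and a single output, as described above, can be sketched in a few lines. The weight layout (bias weight stored at index 0), the default activation functions, and the numerical weights are assumptions chosen for illustration; they are not the trained networks from the study.

```python
import math


def mlp_forward(x, w_hidden, w_output, g=math.tanh, f=lambda v: v):
    """One-hidden-layer MLP with a single output.

    x        -- inputs without the bias term
    w_hidden -- one weight row per hidden neuron; element 0 is the bias weight
    w_output -- output weights; element 0 is the bias weight
    g, f     -- hidden and output activation functions
    """
    x_ext = [1.0] + list(x)            # prepend the unit polarization signal
    hidden = [g(sum(w * xi for w, xi in zip(row, x_ext))) for row in w_hidden]
    h_ext = [1.0] + hidden             # bias input for the output neuron
    return f(sum(w * h for w, h in zip(w_output, h_ext)))


# Hypothetical weights for 2 inputs and 2 hidden neurons.
w_hidden = [[0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]
w_output = [0.5, 1.0, -1.0]
y = mlp_forward([0.0, 0.0], w_hidden, w_output)
```

For a zero input vector, both hidden neurons return tanh(0) = 0, so the output reduces to the output bias weight of 0.5.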
The number of layers, together with parameters such as the number of neurons and the activation functions, constitutes the architecture of a specific MLP. Hidden layers also have parameters called weights, in which all the knowledge of the neural network is stored. These weights are modified during the process of learning neural networks.
Artificial neural networks have several applications in various technical fields, which include, among others, sales forecasting, optimization of commercial activities, sensitivity analysis, localization, and control of industrial processes. The discussed structures are most often used in solving two characteristic problems:
Prediction: based on certain input data, specific output data are predicted [25,26,27];
Classification: based on the input data, a division into classes is made according to a specific rule [28,29].
In addition, neural networks, thanks to the ability to learn, adapt, and generalize, allow for the automation of inference processes based on the collected data, and enable the detection of the most significant connections in the case of large data sets, which makes them a very frequently used method of data analysis in the context of Big Data.
4. Results of Forecasting the Price of Electric Vehicles
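This section presents the results obtained using decision trees (Section 4.1) and neural networks (Section 4.2), followed by a comparison of the obtained results (Section 4.3).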
4.1. Decision Trees
The first step of the analysis was the initial selection of dependent and independent variables, which were further analyzed. Price was indicated as the dependent variable. The independent variables comprised quantitative variables (mileage, power, and number of amenities) and qualitative variables: color, condition (used or new), number of doors (from 2 to 5), type (small cars, city cars, compact, SUV, coupe, sedan, minivan, convertible), drive (front- and rear-wheel, 4 × 4), vehicle brand, vehicle model, and production year (from 2010 to 2021). In the first step, basic statistics for the quantitative variables were calculated, and the statistical significance of their influence on the modeled phenomenon was checked (Table 1 and Table 2). The correlation coefficients calculated for individual variables indicated the significance of their influence on the examined variable. Both power (hp) and number of amenities were stimulants, which meant that an increase in their value increased the price of the vehicles, while mileage was a destimulant, which meant that the price of a vehicle decreased as the mileage increased.
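The distinction between stimulants and destimulants can be illustrated by the sign of a correlation coefficient. The sketch below assumes the Pearson coefficient (the article does not name the coefficient used) and relies on hypothetical toy values rather than the data from the study.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient of two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


# Hypothetical toy values; not taken from the study.
mileage = [5, 20, 40, 60, 80]          # thousand km
power = [300, 250, 200, 150, 110]      # hp
price = [250, 200, 160, 120, 90]       # thousand PLN

r_mileage = pearson(mileage, price)    # negative sign: destimulant
r_power = pearson(power, price)        # positive sign: stimulant
```

A positive coefficient marks a stimulant and a negative one a destimulant; whether the coefficient is statistically significant requires a separate test.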
In the next step, the significance of the influence of the selected qualitative variables on the dependent variable was assessed. This was possible through using analysis of variance (ANOVA) [30]. The condition for its application was that the dependent variable in individual groups had a normal distribution. Therefore, the compliance of the price variable distribution with the normal distribution in the groups related to color, condition, number of doors, type, drive, vehicle brand, vehicle model, and production year was checked first. For this purpose, the Shapiro–Wilk test was used, in which the null hypothesis assumed that the distribution of the examined feature was consistent with the normal distribution, while the alternative hypothesis assumed that it was different from the normal distribution. The obtained results indicated that the studied distributions were not consistent with the normal distribution; therefore, there were no grounds for applying the analysis of variance. In such a case, it was possible to use its nonparametric counterpart; i.e., the Kruskal–Wallis test, the results of which for each group are presented in Table 3.
The results of the Kruskal–Wallis test indicated a significant influence of the studied variables on the modeled phenomenon. The p-value in each case was lower than the adopted significance level α = 0.05. Additionally, the conducted analysis of multiple comparisons of mean ranks for all samples showed that statistically significant differences existed between individual groups of observations of a given variable.
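For readers unfamiliar with the Kruskal–Wallis test, the sketch below computes its H statistic from pooled ranks. It is a simplified illustration: the correction factor for tied observations is omitted, and the three groups are hypothetical values rather than prices from the study.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """H statistic of the Kruskal-Wallis test (no tie correction)."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical prices in three categories of a qualitative variable.
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h = kruskal_wallis_h(groups)
```

For these groups, H = 7.2, which exceeds the chi-square critical value of 5.991 for two degrees of freedom at α = 0.05, so the null hypothesis of identical distributions would be rejected. For samples this small the chi-square approximation is rough, so the comparison is illustrative only.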
The next step was to build decision tree models based on the CART algorithm, random forest, and boosted trees. Before starting the analysis, the boundary settings of the presented solutions were defined:
Split rule for a decision tree (CART)—Gini measure;
A priori probability—estimated from the learning sample;
Misclassification costs—equal;
Learning stop—with the incorrect classification of CART decision trees and a percentage decrease in error 5% for boosted trees and random forest;
The proportion of the random test sample for boosted trees and random forest—20%;
Proportion for subsamples for boosted trees and random forest—50%;
Number of trees—100 for the random forest, 200 for boosted trees.
Based on the obtained results, the predictor importance ranking was constructed, which enabled the assessment of the influence of individual explanatory variables on the studied phenomenon (
Figure 4). According to the obtained results, it should be stated that the vehicle model, power (hp), and vehicle brand had the greatest impact on the price of an electric vehicle. On the other hand, the variables that were the least influential were color, number of amenities, and condition.
Then, the discussed models were implemented for modeling the prices of electric vehicles based on a test sample that was not included in the model construction stage. On this basis, the model was validated and the quality of the prediction was assessed.
Figure 5, Figure 6 and Figure 7 show the plots of the value of the observed price dependent variable and the values predicted by the decision trees, boosted trees, and random forest models. From the charts (predicted and observed values), it was concluded that the predicted price values followed the observed values of the price variable. A detailed assessment of the estimated forecast errors is presented in Section 4.3, on the comparison of decision tree models and neural network models.
The models described above were created in Statistica software and saved in the form of PMML code, which allowed for the subsequent implementation of models to forecast new data.
4.2. Neural Networks
The models of neural networks were created in Statistica software. Because neural networks have universal approximation abilities, they can be used for modeling the prices of electric vehicles. Neural network learning is a long-term experimental process in which the impact of many architectural parameters on the quality results of individual variants of neural networks is tested. First, the data set was divided into two subsets: training data and test data. In the analyzed case, it was assumed that the training subset constituted 80% of randomly selected records of the data set, while the testing set constituted the remaining 20%. Before starting the learning of the neural network, the following parameters were adopted:
Number of input neurons: 140;
Number of output neurons: 1;
Number of hidden layers: 1;
Number of neurons in the hidden layer: from 2 to 50;
Activation functions of input neurons: Hyperbolic tangent (Tanh); Exponential (Exp.); Logistic (Logist.); Linear (Lin.);
Activation functions of hidden neurons: Hyperbolic tangent (Tanh); Exponential (Exp.); Logistic (Logist.); Linear (Lin.);
Learning algorithm: Broyden–Fletcher–Goldfarb–Shanno (BFGS);
Error function: Sum of Squares (SOS).
Then, 3000 neural networks with various combinations of the above parameters were created and trained, from which the best 100 networks characterized by the maximum values of the quality of learning and testing were selected. These results are presented in Table 4.
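The size of the search space implied by the parameters listed above can be checked with a short enumeration. The dictionary keys and the helper function are hypothetical names introduced for illustration; the actual search was carried out in Statistica, whose internal procedure is not reproduced here.

```python
from itertools import product

ACTIVATIONS = ["tanh", "exponential", "logistic", "linear"]
HIDDEN_SIZES = range(2, 51)            # from 2 to 50 neurons, inclusive

# One entry per combination of hidden-layer size and two activation choices.
search_space = [
    {"inputs": 140, "hidden": h, "outputs": 1, "act_1": a1, "act_2": a2}
    for h, a1, a2 in product(HIDDEN_SIZES, ACTIVATIONS, ACTIVATIONS)
]


def network_name(cfg):
    """Architecture label in the 'MLP inputs-hidden-outputs' convention."""
    return f"MLP {cfg['inputs']}-{cfg['hidden']}-{cfg['outputs']}"


print(len(search_space))               # 49 sizes x 4 x 4 activation pairs
print(network_name(search_space[0]))
```

The enumeration gives 784 distinct architectures, fewer than the 3000 trained networks. This suggests that some architectures were trained more than once, for example from different initial weights, although the article does not state this explicitly.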
The network Id 44 (MLP 140-25-1), characterized by 25 neurons in the hidden layer, and the activation functions of exponential and logistic, obtained the highest value of the learning quality, while the network Id 88 (MLP 140-3-1), characterized by 3 neurons in hidden layer, and the activation functions of logistic and hyperbolic tangent, obtained the highest value of testing quality. Detailed characteristics of neural networks are presented in
Appendix A.
Figure 8 shows the results of data prediction with the use of the neural network Id 44 (MLP 140-25-1) for the data from the test set.
On the other hand, Figure 9 shows prediction results analogous to Figure 8 using the network Id 88 (MLP 140-3-1) for the data from the test set.
Figure 8 and Figure 9 show the high concordance of the predictive capabilities of neural network models in the context of electric car price modeling.
Then, based on the created neural networks, a global sensitivity analysis was performed in Statistica software [
31]. The aforementioned analysis enabled creating a ranking of the importance of variables in the analyzed data set [
32]. In addition, it allowed identifying which of them brought essential information to the network, and which could be permanently removed without any loss of quality in the artificial neural network. The described method worked as follows: after creating and learning the assumed number of neural networks, the prediction error of selected networks (the best in terms of quality) was tested on the arithmetic mean values of individual variables, except for one tested variable. In this way, the network reacted only to changes in a given parameter (it was characterized by a certain sensitivity to a given variable), while the process was repeated analogously for all of them. The coefficient describing the increase in the aforementioned error was calculated (the quotient of the network error without a given variable to the error with the set of inputs). In case the value was 1 or less than 1, the analyzed input variable could be permanently deleted without losing network quality. The results of the global sensitivity analysis are presented for the two best variants of neural networks: Id 44 (MLP 140-25-1) and Id 88 (MLP 140-3-1).
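The error-ratio idea behind the sensitivity analysis described above can be sketched as follows. This is one possible reading of the procedure, in which each input in turn is replaced by its arithmetic mean and the resulting error is divided by the error obtained with the complete set of inputs; the toy model and data are hypothetical, and the Statistica implementation may differ in detail.

```python
def mse(model, rows, targets):
    """Mean squared error of a model over a set of observations."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def sensitivity_ratios(model, rows, targets):
    """Error with one input fixed at its mean, divided by the baseline error."""
    baseline = mse(model, rows, targets)
    n_vars = len(rows[0])
    means = [sum(r[j] for r in rows) / len(rows) for j in range(n_vars)]
    ratios = []
    for j in range(n_vars):
        # Remove the information carried by variable j, keep the others.
        masked = [r[:j] + [means[j]] + r[j + 1:] for r in rows]
        ratios.append(mse(model, masked, targets) / baseline)
    return ratios


def toy_model(row):
    """Hypothetical model that uses only the first input."""
    return 2 * row[0]


rows = [[1, 5], [2, 3], [3, 8], [4, 1]]
targets = [2.1, 3.9, 6.1, 7.9]
ratios = sensitivity_ratios(toy_model, rows, targets)
```

In this toy case, the ratio for the second input equals 1 because the model ignores it, so the input could be removed without loss of quality, whereas the ratio for the first input is far greater than 1.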
Figure 10 and
Figure 11 show the results of the global sensitivity analysis for network Id 44 (MLP 140-25-1) and Id 88 (MLP 140-3-1).
The obtained results showed that in the case of both networks, the variables that had the greatest impact on the price of electric cars were, successively: production year, power, model, brand, and type of the electric car. The remaining variables brought much less information to the neural networks, and therefore their further order differed. The analysis showed that the condition of the vehicle had the least impact on the price in both cases.
Due to the discrepancy of the obtained results of the global sensitivity analysis for the networks with Ids 44 and 88, the arithmetic mean of the results of the above-mentioned analysis was determined for the previously mentioned 100 neural networks characterized by the highest values of the quality of learning and testing. The obtained results are shown in
Figure 12.
Based on the results from the model, the following ranking was created in the order of decreasing importance of the variables on the price of an electric car:
Power;
Year of production;
Brand;
Model;
Type;
Type of drive;
Number of doors;
Color;
Mileage;
Number of amenities;
Condition.
The variables in places 1–5 were identical for the Id 44 and Id 88 networks, which confirmed their high values for quality of learning and testing. The analysis showed that the vehicle condition variable (global sensitivity analysis result equal to 1) could be completely removed in further analyses. This variable had no impact on the price of the electric cars under consideration. In addition, the number of amenities (1.054) and mileage (1.226) had a negligible impact on the price. On the other hand, the color, number of doors, and the type of drive had a minimal effect on the price, but they could not be rejected (values from about 2 to about 5) because they provided a significant amount of information for the analysis.
4.3. Comparison of the Obtained Results
In order to summarize the obtained results, four characteristic measures of accuracy of the obtained forecasts were determined using decision trees (boosted trees, tree model, random forest) and neural networks:
Mean square error (MSE);
Mean absolute error (MAE);
Mean relative squared error (MRSE);
Mean relative absolute error (MRAE).
The values of these measures are listed in
Table 5.
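The four accuracy measures listed above can be computed as in the sketch below. The definitions of the two relative measures (errors divided by the observed value before squaring or taking the absolute value) are an assumption, since the article does not give the formulas; the observed and predicted values are hypothetical.

```python
def forecast_errors(observed, predicted):
    """MSE, MAE, MRSE, and MRAE for paired observed and predicted values."""
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    # Relative errors assume that no observed value equals zero.
    relative = [r / o for r, o in zip(residuals, observed)]
    return {
        "MSE": sum(r * r for r in residuals) / n,
        "MAE": sum(abs(r) for r in residuals) / n,
        "MRSE": sum(q * q for q in relative) / n,
        "MRAE": sum(abs(q) for q in relative) / n,
    }


# Hypothetical observed and predicted prices; not taken from the study.
observed = [100.0, 200.0]
predicted = [110.0, 180.0]
errors = forecast_errors(observed, predicted)
```

For this example, MSE = 250, MAE = 15, MRSE = 0.01, and MRAE = 0.1. MSE and MAE are expressed in the units of the price (squared for MSE), while the relative measures are unitless, which is why their magnitudes differ so strongly.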
The presented results showed that neural networks were the best predictive method among all those analyzed in terms of the minimum value of error coefficients. They obtained the minimum values for MSE (2.81 × 10^8), MAE (9995.3), MRSE (0.03), and MRAE (0.11). Among the decision trees, the lowest value of MSE (2.14 × 10^9) was obtained by the tree model, while the minimum value of the remaining coefficients was obtained by random forest: MAE (24,658.0), MRSE (0.04), and MRAE (0.14).
However, it should be noted that the presented mathematical models of decision trees and artificial neural networks also had disadvantages that may have affected the obtained results. In the case of decision trees, the following should be specified:
The credibility of the decision-making process based on the decision tree depends on the quality of the data (training set) used in the tree construction process;
Decision trees can be sensitive to changes in the structure of the training set; i.e., the structure of the decision tree shows a high sensitivity to changes in the input data;
Decision tree algorithms are not immune to the so-called curse of dimensionality. As a result, making a decision based on a decision tree with many branches and splits can be significantly slowed down.
Moreover, with regard to artificial neural networks, these are:
The problem with choosing the right structure of the neural model (type of network, activation functions, number of neurons, way of their connection);
The necessity to select the appropriate network learning algorithm;
Time consumption of the neural model estimation process;
Most often, it is not possible to directly interpret the coefficients of the neural model.
The knowledge of the above drawbacks is important in the context of the correct interpretation of the developed models.
5. Conclusions
This article presented electric vehicle price forecasting with the use of decision trees and neural networks. The presented study was conducted in order to promote environmental safety and climate neutrality using the example of Poland. In Poland, consumers are still hesitant to purchase electric vehicles due to the prevailing belief about their high price. The developed models showed the ranking and the impact of the selected explanatory variables on the studied phenomenon (i.e., the price of the vehicle). Therefore, they can be an important tool supporting the decision-making process, encouraging consumers to buy electric vehicles, and indicating those vehicle parameters for which their price is adequate to the available financial resources.
The obtained results indicated that neural networks obtained the minimum values of the error measures MSE, MAE, MRSE, and MRAE, which allowed them to be considered the best forecasting method among the considered ones. In the case of decision trees, the least accurate forecasts were made by boosted decision trees. In turn, the decision tree and random forest models obtained very similar results.
The implementation of forecasting models with the use of the discussed methods also allowed us to analyze the impact of the chosen variables on the price of an electric vehicle. In the case of decision trees, the vehicle model, power (hp), and vehicle brand had the greatest impact on the price of an electric vehicle, while color, number of amenities, and condition had the least impact. On the other hand, in the case of neural networks, power, year of production, vehicle brand, and model had the greatest impact, while the number of amenities, mileage, and condition of the vehicle had the lowest impact.
The key factors common to all forecasting models were vehicle model, power, and vehicle brand. In addition, neural networks indicated the year of production as an important variable influencing the price of electric vehicles. Conversely, the least significant factors common to all models were the condition and the number of amenities. The differences between the rankings resulted from the different mathematical foundations of the individual methods used.
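One way to obtain such a ranking for a fitted model is a mean-substitution sensitivity analysis: each input is replaced in turn by its sample mean, and the resulting error is divided by the baseline error, so a ratio near 1 marks a variable with little influence. The sketch below illustrates that idea only. Whether the software used in the study computes sensitivity in exactly this way is an assumption, and the function names, the linear `toy_price_model`, and the data are hypothetical.

```python
from statistics import fmean


def sum_of_squares(model, rows, targets):
    # Sum-of-squares error of the model over all observations.
    return sum((model(row) - t) ** 2 for row, t in zip(rows, targets))


def sensitivity_ratios(model, rows, targets):
    # Error with one input replaced by its mean, divided by the baseline error.
    baseline = sum_of_squares(model, rows, targets)
    ratios = []
    for j in range(len(rows[0])):
        mean_j = fmean([row[j] for row in rows])
        perturbed = [row[:j] + [mean_j] + row[j + 1:] for row in rows]
        ratios.append(sum_of_squares(model, perturbed, targets) / baseline)
    return ratios


def toy_price_model(row):
    # Hypothetical stand-in for a trained model: price from power and amenities.
    power_hp, amenities = row
    return 900 * power_hp + 100 * amenities


# Hypothetical observations: [power (hp), number of amenities] and prices in PLN.
rows = [[100, 10], [200, 30], [300, 20]]
targets = [91_500, 182_500, 272_500]

ratios = sensitivity_ratios(toy_price_model, rows, targets)
print(dict(zip(["power", "amenities"], ratios)))
```

In this toy setting, power receives a very large ratio and the number of amenities a ratio of 1, mirroring the kind of contrast reported above without reproducing the study's values.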
Based on these considerations, future analyses of electric car prices can be extended to take into account other parameters of electric cars, e.g., maximum speed, average energy consumption, range, battery capacity, luggage compartment capacity, and maximum torque. Moreover, some nontechnical factors should also be considered. Therefore, future research should also check the impact of government programs on electric vehicle prices, including information on the amount of subsidies allocated for this purpose. Additionally, as part of future research, we plan to compare more mathematical methods in terms of their ability to predict vehicle prices, mainly by comparing simple time series models with machine learning models.
Author Contributions
Conceptualization, M.G. and M.R.; methodology, M.G.; software, M.R.; validation, M.G. and M.R.; formal analysis, M.G. and M.R.; investigation, M.G. and M.R.; resources, M.G. and M.R.; data curation, M.G.; writing—original draft preparation, M.G. and M.R.; writing—review and editing, M.R.; visualization, M.R.; supervision, M.R. and M.G.; project administration, M.R.; funding acquisition, M.G. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the Military University of Technology, grant number 869.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
The data presented in this study are openly available in a publicly accessible Mendeley Data repository: Grzelak, Małgorzata; Rykała, Magdalena (2021), “Modeling the Price of Electric Vehicles”, Mendeley Data, V1, doi:10.17632/bsfkmdjjzb.1 (accessed on 11 November 2021).
Conflicts of Interest
The authors declare no conflict of interest.
Appendix A. Influence of the Number of Neurons on the Quality of Neural Networks
Figure A1 and Figure A2 show how the quality of learning and testing depends on the number of neurons in the hidden layer for the best 100 variants of the created neural networks.
Figure A1. Influence of the number of neurons in the hidden layer on the quality of learning neural networks.
Figure A2. Influence of the number of neurons in the hidden layer on the quality of testing neural networks.
Moreover, a linear approximation of the collected data was made in Figure A1 and Figure A2. It was concluded that as the number of neurons in the hidden layer increased, the quality of learning slightly increased, while the quality of testing slightly decreased.
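The linear approximation described above can be sketched with an ordinary least-squares line. The example uses only the ten networks listed in Table 4, not the 100 variants behind Figures A1 and A2, so the fitted slopes are illustrative and need not match the figures. The helper name `fit_line` is an assumption.

```python
def fit_line(xs, ys):
    # Ordinary least-squares fit y = slope * x + intercept.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hidden-layer sizes and quality values of the ten networks in Table 4.
hidden_neurons = [19, 50, 29, 46, 25, 31, 44, 4, 3, 12]
quality_learning = [0.9926, 0.9938, 0.9931, 0.9914, 0.9966,
                    0.9947, 0.9914, 0.9911, 0.9892, 0.9925]
quality_testing = [0.9926, 0.9832, 0.9845, 0.9839, 0.9842,
                   0.9846, 0.9851, 0.9835, 0.9869, 0.9836]

slope_learning, _ = fit_line(hidden_neurons, quality_learning)
slope_testing, _ = fit_line(hidden_neurons, quality_testing)
print(f"learning slope per neuron: {slope_learning:.2e}")
print(f"testing slope per neuron:  {slope_testing:.2e}")
```

On this ten-network subset the learning slope is positive and the testing slope is negative, in line with the direction of the trends reported for the full set.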
References
- Intergovernmental Panel on Climate Change. Climate Change 2021: The Physical Science Basis; United Nations. Available online: https://www.ipcc.ch/report/ar6/wg1/downloads/report/IPCC_AR6_WGI_Full_Report.pdf (accessed on 5 October 2021).
- Wendling, Z.A.; Emerson, J.W.; de Sherbinin, A.; Esty, D.C. 2020 Environmental Performance Index; Yale Center for Environmental Law & Policy: New Haven, CT, USA, 2020; pp. 31–33.
- Pietraś, M. Bezpieczeństwo Ekologiczne w Europie; Wydawnictwo Uniwersytetu Marii Curie-Skłodowskiej: Lublin, Poland, 2000; pp. 42–50. (In Polish)
- Kotler, P.; Keller, K.L. Marketing Management, 14th ed.; Prentice Hall: Hoboken, NJ, USA, 2015; pp. 40–85.
- Bloomberg NEF, Hitting the EV Inflection Point, Transport & Environment. 2021. Available online: https://www.transportenvironment.org/wp-content/uploads/2021/08/2021_05_05_Electric_vehicle_price_parity_and_adoption_in_Europe_Final.pdf (accessed on 5 October 2021).
- Wolf, S.; Teitge, J.; Mielke, J.; Schütze, F.; Jaeger, C. The European Green Deal—More Than Climate Neutrality. Intereconomics 2021, 56, 99–107.
- Jerenz, A. Revenue Management and Survival Analysis in the Automobile Industry; Gabler: Wiesbaden, Germany, 2008; pp. 20–31.
- Lessmann, S.; Voß, S. Car resale price forecasting: The impact of regression method, private information, and heterogeneity on forecast accuracy. Int. J. Forecast. 2017, 33, 864–877.
- Lessmann, S.; Listiani, M.; Voß, S. Decision Support in Car Leasing: A Forecasting Model for Residual Value Estimation. In Proceedings of the International Conference on Information Systems (ICIS’2010), Saint Louis, MO, USA, 12–15 December 2010; p. 17.
- Lian, C.; Zhao, D.; Cheng, J. A Fuzzy Logic Based Evolutionary Neural Network for Automotive Residual Value Forecast. In Proceedings of the International Conference on Information Technology: Research and Education, Newark, NJ, USA, 7–8 November 2013.
- Wu, J.D.; Hsu, C.C.; Chen, H.C. An expert system of price forecasting for used cars using adaptive neuro-fuzzy inference. Expert Syst. Appl. 2009, 36, 7809–7817.
- Pudaruth, S. Predicting the price of used cars using machine learning techniques. Int. J. Inf. Comput. Technol. 2014, 4, 753–764.
- Gegic, E.; Isakovic, B.; Keco, D.; Masetic, Z.; Kevric, J. Car price prediction using machine learning techniques. TEM J. 2019, 8, 113.
- Chen, C.; Hao, L.; Xu, C. Comparative analysis of used car price evaluation models. In Proceedings of the AIP Conference Proceedings, Haungzhou, China, 15–16 April 2017; Volume 1839, p. 20165.
- Desai, S.; Buetow, M. How electrical test requirements will affect tomorrow’s designs. Electron. Eng. 1998, 70, 77–80.
- Purohit, D. Exploring the relationship between the markets for new and used durable goods: The case of automobiles. Mark. Sci. 1992, 11, 154–167.
- Kooreman, P.; Haan, M.A. Price anomalies in the used car market. De Econ. 2006, 154, 41–62.
- Du, J.; Xie, L.; Schroeder, S. Practice Prize Paper—PIN Optimal Distribution of Auction Vehicles System: Applying Price Forecasting, Elasticity Estimation, and Genetic Algorithms to Used-Vehicle Distribution. Mark. Sci. 2009, 28, 637–644.
- Payne, M. Modern Social Work Theory, 3rd ed.; Palgrave: London, UK, 2015; pp. 12–15.
- Łapczyński, M. Drzewa Klasyfikacyjne i Regresyjne w Badaniach Marketingowych; Wydawnictwo Uniwersytetu Ekonomicznego: Kraków, Poland, 2010; Volume 14, pp. 20–35. (In Polish)
- Kozłowski, E.; Borucka, A.; Swiderski, A.; Skoczyński, P. Classification Trees in the Assessment of the Road–Railway Accidents Mortality. Energies 2021, 14, 3462.
- Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
- Breiman, L. Bagging predictors. Mach. Learn. 1996, 26, 123–140.
- Osowski, S. Sieci Neuronowe do Przetwarzania Informacji; Oficyna Wydawnicza Politechniki Warszawskiej: Warsaw, Poland, 2020. (In Polish)
- Typiak, R.; Rykała, Ł.; Typiak, A. Configuring a UWB Based Location System for a UGV Operating in a Follow-Me Scenario. Energies 2021, 14, 5517.
- Rykała, Ł.; Typiak, A.; Typiak, R. Research on developing an outdoor location system based on the ultra-wideband technology. Sensors 2020, 20, 6171.
- Rykała, M.; Rykała, Ł. Economic Analysis of a Transport Company in the Aspect of Car Vehicle Operation. Sustainability 2021, 13, 427.
- Rykała, Ł. Application of Hybrid Neural Network System in Image Processing. Syst. Logistyczne Wojsk 2019, 51, 141–153.
- Kijek, M.; Rykała, Ł.; Zelkowski, J.; Brzezinski, M. Neural Algorithm of Driver Selection for Transport Tasks. In Proceedings of the 22nd International Conference, Transport Means, Kaunas, Lithuania, 3–5 October 2018; pp. 489–494.
- Grzelak, M. Health Security Survey of the Population of Warsaw Using the Logistic Regression Model. In Proceedings of the 35th International-Business-Information-Management-Association Conference (IBIMA) Proceedings, Seville, Spain, 1–2 April 2020.
- Statsoft Electronic Statistics Textbook. Available online: https://www.statsoft.pl/textbook/stathome.html (accessed on 11 October 2021).
- Rabiej, M. Statistical Analysis with Statistica and Excel; Helion: Warsaw, Poland, 2018; pp. 355–372. (In Polish)
Figure 1. Greenhouse gas emissions in the European Union in 2011–2020.
Figure 2. Number of registered electric cars in Poland in 2019–2021.
Figure 3. A simplified diagram of a unidirectional neural network with one hidden layer.
Figure 4. Predictor importance ranking based on the CART algorithm, random forest, and boosted trees.
Figure 5. Results of forecasting the price of electric cars using decision tree model.
Figure 6. Results of forecasting the price of electric cars using boosted tree model.
Figure 7. Results of forecasting the price of electric cars using random forest model.
Figure 8. Prediction results for the test set for the Id 44 network (MLP 140-25-1).
Figure 9. Prediction results for the test set for the Id 88 network (MLP 140-3-1).
Figure 10. The results of the sensitivity analysis for the Id 44 network (MLP 140-25-1).
Figure 11. The results of the sensitivity analysis of the Id 88 network (MLP 140-3-1).
Figure 12. Arithmetic mean values of the importance of variables in the global sensitivity analysis determined based on the best 100 network variants.
Table 1. Descriptive statistics of analyzed variables based on EV sales advertisements.
| Variable | Mean | Minimum | Maximum | Std. Dev. |
|---|---|---|---|---|
| Price (PLN) | 185,452.4 | 6990.0 | 969,831.0 | 152,008.8 |
| Mileage (km) | 16,036.3 | 0.0 | 229,000.0 | 28,787.6 |
| Power (hp) | 208.6 | 4.0 | 1398.0 | 140.8 |
| No. of amenities | 31.1 | 1.0 | 70.0 | 15.7 |
Table 2. Correlation between the dependent variable and independent, quantitative variables.
| Variable | Price | Mileage | Power | No. of Amenities |
|---|---|---|---|---|
| Price | 1.000 | −0.315 | 0.828 | 0.135 |
| Mileage | −0.315 | 1.000 | −0.040 | −0.034 |
| Power | 0.828 | −0.040 | 1.000 | 0.190 |
| No. of amenities | 0.135 | −0.034 | 0.190 | 1.000 |
Table 3. Kruskal–Wallis test results.
| Variable | K–W Test Statistic Value | p-Value |
|---|---|---|
| Condition | 35,911 | 0.00 |
| Vehicle brand | 1033.19 | 0.00 |
| Vehicle model | 1033.69 | 0.00 |
| Production year | 738.79 | 0.00 |
| Drive | 518.88 | 0.00 |
| Type | 710.97 | 0.00 |
| Doors number | 240.29 | 0.00 |
| Color | 121.39 | 0.00 |
Table 4. Summary of learning of selected neural networks.
| Network Id | Network Name | Quality (Learning) | Quality (Testing) | Error (Learning) | Error (Testing) | Learning Algorithm | Error Function | Activation (Hidden) | Activation (Output) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | MLP 140-19-1 | 0.9926 | 0.9926 | 164,307,905 | 449,297,269 | BFGS 52 | SOS | Tanh | Logistic |
| 12 | MLP 140-50-1 | 0.9938 | 0.9832 | 137,598,704 | 490,868,864 | BFGS 110 | SOS | Tanh | Linear |
| 22 | MLP 140-29-1 | 0.9931 | 0.9845 | 153,496,765 | 464,471,490 | BFGS 55 | SOS | Exponential | Logistic |
| 35 | MLP 140-46-1 | 0.9914 | 0.9839 | 191,273,306 | 491,413,641 | BFGS 77 | SOS | Logistic | Linear |
| 44 | MLP 140-25-1 | 0.9966 | 0.9842 | 74,716,212 | 431,955,380 | BFGS 110 | SOS | Exponential | Logistic |
| 54 | MLP 140-31-1 | 0.9947 | 0.9846 | 116,964,887 | 456,267,802 | BFGS 116 | SOS | Logistic | Tanh |
| 63 | MLP 140-44-1 | 0.9914 | 0.9851 | 190,678,341 | 417,735,228 | BFGS 43 | SOS | Exponential | Logistic |
| 76 | MLP 140-4-1 | 0.9911 | 0.9835 | 196,903,052 | 493,361,835 | BFGS 45 | SOS | Exponential | Exponential |
| 88 | MLP 140-3-1 | 0.9892 | 0.9869 | 238,772,960 | 393,472,290 | BFGS 111 | SOS | Logistic | Tanh |
| 100 | MLP 140-12-1 | 0.9925 | 0.9836 | 166,265,419 | 486,413,789 | BFGS 68 | SOS | Tanh | Linear |
Table 5. Comparison of the prognostic capabilities of the models.
| Error Measure | Boosted Trees | Tree Model | Random Forest | Neural Networks |
|---|---|---|---|---|
| Mean square error | 3.20 × 10^9 | 2.14 × 10^9 | 2.62 × 10^9 | 2.81 × 10^8 |
| Mean absolute error | 28,558.1 | 24,956.1 | 24,658.0 | 9995.3 |
| Mean relative squared error | 1.28 | 0.06 | 0.04 | 0.03 |
| Mean relative absolute error | 0.29 | 0.15 | 0.14 | 0.11 |
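Because the mean square error is expressed in squared PLN, taking its square root gives a root-mean-square error in PLN that is easier to compare with vehicle prices. The sketch below does this for the values in Table 5, reading the exponents as 10^9 and 10^8 (an assumption about the flattened superscripts). The dictionary and variable names are illustrative.

```python
import math

# Mean square errors from Table 5, with exponents read as 10^9 and 10^8.
mse_by_model = {
    "Boosted trees": 3.20e9,
    "Tree model": 2.14e9,
    "Random forest": 2.62e9,
    "Neural networks": 2.81e8,
}

# Root-mean-square error in PLN for each model.
rmse_by_model = {name: math.sqrt(mse) for name, mse in mse_by_model.items()}
best_model = min(rmse_by_model, key=rmse_by_model.get)

for name, rmse in rmse_by_model.items():
    print(f"{name}: RMSE of about {rmse:,.0f} PLN")
print("Lowest RMSE:", best_model)
```

Under that reading, the neural network's typical error is roughly 16.8 thousand PLN, against about 46 to 57 thousand PLN for the tree-based models.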
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).