1. Introduction
Investing in young people and supporting their education from an early age benefits both individuals and society. At the national level, education has acquired an unprecedented role in fostering economic growth, reducing poverty, and increasing overall social well-being. Faster economic development requires a better-educated population, and this in turn requires that sufficient budget resources be allocated to effective training and schooling. From this perspective, investing in human capital should be a priority for government spending. It is a driver for creating more sustainable jobs and thus for reducing social inequality and poverty, as suggested by Abdulah, Harun and Jali [1].
Similarly, Pirim, Ownings and Kaplan [2] found that, in the long term, investing in education can improve the quality of human capital and affect employment positively. Hence, creating better conditions for equality in education should be among the top priorities of government development and education policy worldwide. Such conditions make education accessible to every child and enhance the quality, flexibility and, consequently, the efficiency of the education system.
In developed countries, all children are obliged to attend school and receive an education. Compulsory education is an important milestone that every child must pass to become an active adult member of society, prepared to face and deal with reality. The number of years that children are legally obliged to attend school varies by country. Compulsory education usually covers two phases of basic education: the primary and the lower secondary phase. On completing lower secondary education, most young people aged 15–16 find themselves at one of the most important crossroads of their lives: they must choose what to study or which career path to pursue.
According to the International Standard Classification of Education (ISCED), developed by UNESCO, lower secondary education (ISCED level 2) refers to junior secondary education and is considered the second stage of compulsory education. In general, students finish their primary education and enter lower secondary education at the age of 11–12. They complete the lower secondary phase around the age of 15–16, depending on the individual education programme, as mentioned in Iwamoto et al. [3].
The study has two aims. First, we investigate the efficiency of lower secondary education and examine the impact of education spending on educational outcomes, represented by students’ achievement in PISA tests. Second, we explore the sources of inefficiency in education by carrying out a regression analysis, which provides evidence for a discussion about the contribution of environmental variables to the improvement of education outcomes.
2. Literature Review
The “Programme for International Student Assessment” (PISA) started in 2000 and is coordinated by the OECD. Its main goal is to measure the skills of students at the end of lower secondary education (15-year-olds), particularly in reading, mathematics and science. Since 2012, students have also been assessed in collaborative problem solving and financial literacy. PISA questionnaires are not constructed to evaluate specific teaching plans or fixed school syllabi. Instead, the questions are based on real-life situations and measure skills important for effective learning. The social or cultural status of students and the type of school are not considered. In general terms, PISA measurements give a clear picture of how successful the education systems of participating countries are in giving fair and equal education opportunities to young people, regardless of their social, cultural or economic background. PISA thus allows the quality, equity and efficiency of education systems to be monitored and compared across OECD and partner countries [4].
In recent years, a substantial body of literature has provided international comparisons of education outcomes at different levels. Researchers such as Bessent and Bessent [5] and Charnes, Cooper, and Rhodes [6] were among the first to measure the efficiency of education systems using the “data envelopment analysis” (DEA) technique. They argue that the DEA methodology is well suited to measuring the efficiency of “decision-making units” (DMUs) in different sectors. Much of the current literature treats achievement in PISA tests as the international metric of education outcomes at the lower secondary level [7,8,9,10,11,12,13]. When examining efficiency in education, various aspects, including pupils’ socioeconomic background, school equipment, and education funding, are considered relevant inputs to the education process, as in Santin and Sicilia [14] or Lurcu and Bolat [15].
Research to date has tended to estimate efficiency in education using a two-stage DEA approach, usually selecting inputs related to financial resources, school environment and family background. The semiparametric two-stage model identifies best practices among peer decision-making units (DMUs), providing efficiency scores and explaining the sources of inefficiency while taking contextual variables into account. In general, the two-stage approach estimates the efficiency scores of DMUs with a DEA model in the first stage and regresses them against a set of environmental variables in the second stage. The main difference between two-stage models lies in the regression methodology applied in the second stage. Censored regression (the Tobit model) or truncated regression is most often used to explore the sources of inefficiency in DEA. For instance, Ramzi, Afonso and Ayadi [16] investigated the factors enhancing the efficiency of basic and secondary education in Tunisia, running a two-stage data envelopment analysis with multiple inputs and outputs. They considered three main types of resources employed in the education process: physical resources (the number of classes per hundred students, or the number of schools), human resources (the teacher–student ratio per hundred students), and financial resources (the amount spent on education per student). They found that none of these school resources significantly affected efficiency scores. Nevertheless, the Tobit regression in the second stage of their research showed a significant relationship between efficiency scores and two non-discretionary variables: employment and poverty rates. Likewise, Alexander, Haug and Jaforullah [17] analysed the efficiency of 394 schools in New Zealand with a double-bootstrap two-stage DEA, using input variables related to spending on education and learning processes and output variables related to secondary school students’ achievement in the national examination. The second step of their research, a truncated regression analysis, included variables associated with school type, institution location, and teachers’ experience and qualifications. The authors demonstrated that socioeconomic deprivation is negatively related to efficiency, whereas teachers’ experience and qualifications positively affect efficiency.
Similarly, Afonso and Aubyn [18] examined the role of non-discretionary variables in explaining the inefficiency of secondary education in OECD countries. Running a censored Tobit regression and using a bootstrap algorithm, they identified GDP per capita and parental education attainment as major factors contributing to improved efficiency in education. Furthermore, they demonstrated that the richer and the more educated a country is, the more efficient it tends to be in terms of student achievement in PISA tests. Analogously, Agasisti and Zoido [19] used data from the PISA 2012 edition to estimate the cross-country efficiency of school systems in 30 countries. They ran a two-stage DEA with the following inputs: the index of economic, social and cultural status (ESCS), the human resources available in school (student–teacher ratio), and the material resources employed in the education process (number of computers per student in school). The outputs were the test scores achieved in mathematics and reading in the PISA tests. In the second stage, the factors associated with efficiency were examined using a bootstrapping procedure and Tobit regression. This stage revealed that higher school efficiency may be positively associated with less diversity in the student population and a lower proportion of low-performing students.
In most cases, when applying data envelopment analysis, the bootstrapping procedure proposed by Simar and Wilson [20,21,22] has been followed as a preparatory step for the second-stage regression analysis. Initially developed by Simar and Wilson [20], the bootstrapping approach has been widely used to reduce sample bias and to overcome drawbacks of conventional estimation models related to serial correlation among residuals or efficiency scores, thereby improving statistical efficiency in the second-stage regression.
Some studies suggest that carrying out quantile regression in the second stage of DEA is a valuable alternative to conventional regression models. For example, two-stage DEA models with quantile regression have been widely used in studies on environmental or ecological efficiency [23,24], bank sector efficiency [25,26,27,28], and in other economics-related papers [29,30,31,32]. In our study, we propose using this alternative two-stage approach to assess efficiency in secondary education, combining the output-oriented DEA model with variable returns to scale and quantile regression estimation, following the methodology and discussion of quantile regression in Koenker and Hallock [33] and Koenker [34].
3. Materials and Methods
Our data sample was drawn from the OECD Online Education Database and initially consisted of the 36 OECD countries that participated in PISA 2018. The mean values of the selected variables are shown in Table 1. However, because of missing values, we reduced the originally selected dataset to the 24 countries listed in Table 2.
Our research methodology follows numerous studies that have investigated the efficiency of educational systems at different levels. Many researchers have used data envelopment analysis (DEA) to measure public sector efficiency, mainly in health and education. The methodology used in our paper, as well as the specification of inputs and outputs, is based on several studies, including those by Sopek [35], Agasisti [36], and Aristovnik and Obadić [37]. Together, these studies show that the two-stage DEA method is a popular way of explaining variations in DEA scores in terms of exogenous effects on efficiency.
Our analysis proceeds in several steps. First, we estimate the efficiency of compulsory secondary education, considering three selected input variables that may significantly affect students’ performance at the national level. Second, we carry out a quantile regression analysis to measure the impact of contextual variables on the technical efficiency scores computed in the first step. Third, we explore the relationships between the variables used in this study and student achievement by performing a correlation analysis and by examining the associations between output and input variables graphically using scatterplots. We assume that student scores achieved in PISA testing are significantly determined by the amount of funding spent on education, class size (students per teaching staff), and the annual number of hours spent in school. To explain inefficiency in education, we then perform a quantile regression analysis considering the country’s wealth and parental background.
Student educational achievement in PISA testing was taken as the output variable. Thus, the output in our study corresponds to the performance of 15-year-olds in PISA 2018. Afonso and Aubyn [18] pointed out that student performance is likely to depend on the resources employed both in the year of testing and in previous years. Following their study, we have taken multi-year averages of the following three input variables:
The time of schooling spent in lower-secondary education in hours per year for the 12 to 14 year olds, on average for 2014–2017;
The average class size in school based on the student–teacher ratio and considering the full-time equivalents, on average for 2014–2017;
Annual expenditure per student, in equivalent USD, converted using purchasing power parities for GDP, based on full-time equivalents, on average for 2014–2017.
In the next step of our analysis, we investigated the possible effects of non-discretionary environmental variables representing the wealth of the country and parental background in a broad sense:
The GDP per capita, representing the country’s wealth, in PPPs USD, on average for 2014–2017;
Parental education attainment, measured as the percentage of the population aged 35–44 years that has attained at least upper secondary education, on average for 2014–2017.
The following table summarises the key characteristics of the final data sample.
The data envelopment model used in this study is based on the methodology proposed by Afonso and Aubyn [18], who in turn refer to the original work of Charnes, Cooper and Rhodes [6]. Data envelopment analysis makes it possible to measure performance by evaluating the relative efficiency of decision-making units (DMUs). In our calculation of technical efficiency, we adopted an output-oriented DEA model with variable returns to scale (VRS), which we also used in our conference paper, Dancaková and Glova [13], described with the formula shown below:
The output-oriented DEA model evaluates by how much the output measures can be proportionally increased while the input quantities remain unchanged. The Greek letter $\varphi$ stands for the output efficiency. We assume the hypothesis of variable returns to scale (VRS) in our model, based on the formula above. This allows us to estimate efficiencies, i.e., whether an increase or decrease in selected output or input variables is determined by a proportional change in the output or input units correspondingly, as discussed in Cooper et al. [38]. Similar to our previous conference paper in Dancaková and Glova [13], in Formula (1), the efficiency $\varphi$ for a group of peer decision-making units ($j = 1, \dots, n$) is estimated for the specific output variables ($y_r$, $r = 1, \dots, s$) and input variables ($x_i$, $i = 1, \dots, m$). The peer’s weight is denoted by the Greek letter $\lambda$. When a DMU is efficient, the $\varphi$ value would be equal to 1. The signs $s_i^-$ and $s_r^+$ represent input and output slacks. The negative sign indicates reduction, while the positive sign on output slacks requires enlargement of outputs.
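For reference, a standard textbook statement of an output-oriented DEA model with variable returns to scale and slacks is sketched below for an evaluated unit $o$. The notation is an assumption chosen for illustration ($\varphi$ for output efficiency, $\lambda_j$ for peer weights, $s_i^-$ and $s_r^+$ for input and output slacks, $\varepsilon$ for a small non-Archimedean constant); it is not necessarily the exact form of Formula (1).

```latex
\begin{aligned}
\max_{\varphi,\,\lambda,\,s^-,\,s^+} \quad
  & \varphi + \varepsilon \Bigl( \sum_{i=1}^{m} s_i^- + \sum_{r=1}^{s} s_r^+ \Bigr) \\
\text{s.t.} \quad
  & \sum_{j=1}^{n} \lambda_j x_{ij} + s_i^- = x_{io}, && i = 1, \dots, m, \\
  & \sum_{j=1}^{n} \lambda_j y_{rj} - s_r^+ = \varphi\, y_{ro}, && r = 1, \dots, s, \\
  & \sum_{j=1}^{n} \lambda_j = 1, \\
  & \lambda_j \ge 0, \quad s_i^- \ge 0, \quad s_r^+ \ge 0 .
\end{aligned}
```

In this formulation $\varphi \ge 1$, and the convexity constraint $\sum_j \lambda_j = 1$ is what imposes variable returns to scale. Under this convention, a score reported on the interval $(0, 1]$ corresponds to $1/\varphi$.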
Many researchers regress non-parametric values of technical efficiency against non-discretionary factors in two-step procedures for investigating the effect of environmental factors on DMU performance. However, Simar and Wilson [20,21,22] pointed out that it is inappropriate to use conventional approaches to inference due to unknown serial correlation among the estimated efficiency scores. Furthermore, running several experiments with Monte Carlo resampling, the authors have demonstrated that applying a bootstrapping procedure can improve the accuracy of DEA efficiency analysis in the second-stage regression. Thus, we have followed their second algorithm for bias-correction of technical efficiency scores in input- or output-oriented DEA models to compute the bias-corrected DEA efficiency scores using R software.
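The bias-correction step at the core of such a bootstrap can be sketched as follows. This is a minimal illustration, assuming that bootstrap replicates of a unit’s efficiency score are already available; the function name and all numbers are hypothetical, and the full Simar and Wilson algorithm (including how the replicates are generated) is not reproduced here.

```python
from statistics import mean

def bias_corrected_score(original, bootstrap_replicates):
    """Return the bootstrap bias-corrected efficiency estimate.

    bias      = mean(bootstrap replicates) - original estimate
    corrected = original estimate - bias
    """
    bias = mean(bootstrap_replicates) - original
    return original - bias

# Hypothetical values, for illustration only (not taken from the study).
original_score = 1.05                    # output-oriented DEA estimate for one country
replicates = [1.02, 1.03, 1.01, 1.04]    # hypothetical bootstrap replicates
corrected = bias_corrected_score(original_score, replicates)
print(round(corrected, 3))
```

Because the replicates here lie below the original estimate, the estimated bias is negative and the corrected score moves away from the frontier value of 1.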
In the next step of our analysis, we examined how the environmental variables affect the bias-corrected technical efficiency scores by running quantile regression. The quantile regression represents a valuable alternative to conventional least-squares estimation models, as discussed by Koenker and Bassett [39]. As the second stage of DEA analysis, quantile regression has been used in several studies on environmental efficiency [23,24] and the bank sector efficiency analysis [25,26,27,28]. This method advances the traditional regression models by making no specific assumptions about the distribution of the residuals. Hence, quantile regression is more robust to outliers than a least-squares regression. For explaining inefficiency in education, cross-sectional data for $N$ countries has been assumed, using $K$ different inputs indexed by $k$ and expressed as a vector $x_i$ to produce a single output $y_i$. Dependent variable $y$ is characterised by its distribution function $F(y) = \Pr(Y \le y)$, for any $0 < \theta < 1$, where $\theta$ is a parameter that represents quantile level. The basic model of quantile regression used in this paper is shown below, following the methodology as described by Koenker and Bassett [39] and can be expressed as follows:
The objective function for efficient estimation of $\beta_\theta$ corresponding to the $\theta$-th sample quantile of the dependent variable $y$ is defined as any solution to the minimisation of the following problem solved via linear programming:
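In the standard notation of Koenker and Bassett [39], the quantile regression model and the associated minimisation problem can be sketched as below. The symbols are assumptions made for illustration: $y_i$ is the efficiency score of country $i$, $x_i$ its vector of environmental variables, $\beta_\theta$ the coefficient vector at quantile level $\theta$, and $u_{\theta i}$ the error term.

```latex
y_i = x_i'\beta_\theta + u_{\theta i}, \qquad
\operatorname{Quant}_\theta(y_i \mid x_i) = x_i'\beta_\theta, \qquad 0 < \theta < 1

\hat{\beta}_\theta = \arg\min_{\beta}
\Biggl[ \sum_{i:\, y_i \ge x_i'\beta} \theta \,\lvert y_i - x_i'\beta \rvert
      + \sum_{i:\, y_i < x_i'\beta} (1 - \theta) \,\lvert y_i - x_i'\beta \rvert \Biggr]
```

Positive residuals are weighted by $\theta$ and negative residuals by $1 - \theta$, so varying $\theta$ shifts the fitted line towards lower or higher parts of the conditional distribution.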
According to Mamatzakis et al. [40], the median estimator, i.e., the quantile regression estimator for θ = 0.5, is similar to the least-squares estimator for Gaussian linear models, except that it minimises the sum of absolute residuals rather than the sum of squared residuals. Moutinho et al. [23] argue that standard linear regression techniques such as ordinary least squares provide only a partial view of the relationship between the variables examined. In contrast, the quantile regression method allows one to look at the full conditional distribution of the dependent variable at different quantiles and to analyse the relationship from different perspectives.
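To make the contrast with least squares concrete, the sketch below evaluates the quantile (check) loss on a small hypothetical sample containing one outlier. It is an illustration only: the data and function names are assumptions, and a constant-only model is used so that no optimisation library is needed. For θ = 0.5 the loss is minimised at the sample median, whereas the squared loss is minimised at the mean, which the outlier pulls downward.

```python
from statistics import mean, median

def check_loss(residual, theta):
    """Check function: theta*|u| if u >= 0, else (1 - theta)*|u|."""
    return theta * residual if residual >= 0 else (theta - 1) * residual

def total_check_loss(values, candidate, theta):
    """Sum of check losses when `candidate` is used as the fitted constant."""
    return sum(check_loss(v - candidate, theta) for v in values)

# Hypothetical sample with one outlier (not data from the study).
sample = [0.90, 0.93, 0.95, 0.97, 0.40]

# For a constant-only model, a minimiser of the check loss can always be
# found among the observed values, so a search over the sample suffices.
best_median_fit = min(sample, key=lambda c: total_check_loss(sample, c, 0.5))
best_lower_fit = min(sample, key=lambda c: total_check_loss(sample, c, 0.25))

print(best_median_fit, median(sample))   # both are the sample median
print(best_lower_fit)                    # fit at the lower quantile level
print(round(mean(sample), 3))            # least-squares fit, affected by the outlier
```

The same idea carries over to regression with covariates, where the constant is replaced by a linear function of the explanatory variables and the minimisation is solved by linear programming.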
4. Results
In this section, we present the results of the model obtained by standard DEA analysis with variable returns to scale and output orientation, which are shown in Table 2.
In Table 2, we see that a peer group of seven countries (those with rank 1; for instance, Estonia and Hungary) is the most efficient, sharing the highest efficiency score. Conversely, the least efficient country, according to educational achievement and considering the selected inputs, is Greece. The mean value of the efficiency score is 0.96.
When technical efficiency is computed with the traditional DEA method, a problem often arises in ranking the DMUs. Different DMUs employing different amounts of inputs can obtain the same efficiency score and consequently the same rank position, which prevents a fully meaningful comparison of DMUs and could bias the interpretation of the results. Another serious drawback of the traditional two-stage DEA model is that the DEA efficiency estimates are serially correlated, as argued by Simar and Wilson [22]. A possible solution is a bootstrapping procedure, which can help overcome the sampling error problem [41] and enable a more precise cross-country comparison, as shown in Table 3.
Following the goals of our study, we investigated what relationships may exist between the variables and students’ achievement in PISA tests; the results are shown in Table 4 below.
In order to explain inefficiencies in education, the quantile regression against bootstrapped technical efficiency scores was run. The table below shows the correlation strength between the variables used in our study. The value of 0.6463 indicates a positive, linear relationship of moderate strength between students’ achievement (PISA) and parental education (Parents). Students’ achievement also moves in the same direction as the financial resources employed in the education process (Spending), with a correlation coefficient of 0.4928, and, to a lesser extent, as the country’s economic output per head (GDP), with a correlation coefficient of 0.2564. The time spent in the classroom (Time) and the number of students in a class (Class) show weak, negative correlations with students’ achievement in PISA tests. Given the low absolute values of these correlation coefficients, there may be no linear relationship between these variables and achievement.
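A minimal sketch of how such a correlation coefficient is computed is given below, assuming the reported values are Pearson coefficients calculated across countries. The function name and the country-level numbers are hypothetical and serve only to show the calculation.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical country-level values (not the study's data):
# share of 35-44 year olds with at least upper secondary education (%),
# and mean PISA score.
parents = [70.0, 80.0, 85.0, 90.0, 95.0]
pisa = [470.0, 500.0, 480.0, 510.0, 520.0]

r = pearson_r(parents, pisa)
print(round(r, 4))
```

A coefficient near 1 or -1 indicates a strong linear association, while values near 0, such as those reported for Time and Class, indicate that a straight line describes the data poorly.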
Based on the OLS analysis, only GDP per capita, representing the economic performance and wealth of the country, appears to be positively and significantly related to technical efficiency in education under variable returns to scale and output orientation. However, because the residuals exhibited heteroscedasticity and non-normality, we decided to apply a non-parametric approach to robust estimation of the regression model by running a quantile regression. Regression estimates in both models (OLS and quantile regression) showed a positive effect of GDP per capita on students’ achievement across countries, as shown in Table 5. However, the positive impact of GDP per capita is significant only for the least efficient countries (Q (0.10)). As for parental education, its effect on efficiency is stronger and positive for the inefficient countries (Q (0.10; 0.25)) and turns negative for more efficient countries.
5. Discussion
We might assume that a higher technical efficiency score from standard DEA would indicate better student achievement in PISA testing. In Figure 1, we visualise the relationship between the DEA score and mean PISA results in a scatter plot.
As the figure above shows, most countries follow a clear trend of a positive, linear relationship between the average score in PISA testing and the DEA efficiency score. However, the countries that proved to be the most efficient do not follow this pattern.
How money spent on education affects education outcomes is a widely debated topic. Spending on education is a critical issue, since the money employed in the education process is a key determinant of success for students. However, a large volume of empirical research has recognised a need to improve the efficiency of educational spending. Hanushek [42] argues that expenditures on education and school resources are not good measures of educational quality and that an increase in education funding will not automatically result in a significant improvement in student performance.
Figure 2 shows the relationship between students’ achievement in PISA 2018 and spending on education per student. At first sight, it might seem that students from developed countries tended to score better because of higher spending on education. The most striking result is Luxembourg. Students there scored only 483 points in PISA testing, below the overall average, although annual expenditure on education reached USD 20,588.32 per student, the maximum among the observed countries. Our data visualisation shows that higher expenditure on education does not guarantee better student performance, as witnessed in Luxembourg and other countries.
Moreover, Hanushek [43] suggested that expenditure on education does not explain cross-country differences in learning outcomes well. Similarly, Mandl et al. [44] reported that there is no clear relationship between spending on education and student achievement in PISA tests. The authors analysed an international dataset and pointed out that countries that spent approximately the same amount of money on education achieved different results in PISA tests, meaning that non-monetary aspects of young people’s education should be considered alongside traditional financial and physical resources.
The importance of the class size factor in the average PISA test result for each country is shown in Figure 3. As we can see, this effect is not significant.
Another important factor in students’ success is the school environment, including school equipment, class size, curriculum, and teachers’ qualifications. It is generally believed that smaller classes are better for students’ learning and performance. Attending a smaller class could be beneficial in many ways: students receive more attention from the teacher, since they are treated individually in the learning process, and consequently they could perform better in exams and achieve better grades. However, the real impact of smaller classes on education outcomes remains widely discussed. Hanushek et al. [45] provided an extensive analysis of the possible effects of class size on students’ achievement using an international dataset. The most striking result of their study was that a general reduction in class size brings little gain. In other words, large differences in students’ achievement exist regardless of differences in class size between countries. However, this conclusion does not rule out a beneficial effect of smaller classes; rather, any such effect would operate at the level of individuals and schools. Kirjavainen and Loikkanen [46] pointed out that, although class size was among the factors significantly affecting efficiency according to their Tobit analysis, schools with smaller classes proved to be less effective than those with larger ones, regardless of school size.
The average class size at the lower secondary level is about 22–23 students. However, there are substantial differences between countries (compare 32 students per class in Japan and Korea with 10 students per class in Sweden), with no indication of a linear relationship between achievement in PISA tests and class size. On average, across the selected countries, there are five teachers per hundred students. Japan reached the best student achievement, with an average of 529 points in PISA 2018 and approximately thirty-three pupils per class; by contrast, Sweden, with the smallest number of students per classroom, achieved 496 points, slightly above the average score of 491 points. Class sizes range from 10 to 35 pupils for most countries. Studies such as those by Fuchs and Wößmann [47] and Alhabri and Stoet [48] have also shown that smaller classes do not guarantee higher student achievement, whereas better school equipment, a more sophisticated curriculum, and better-qualified teachers do. The authors raised the open question of whether increasing the number of students per class would, on the contrary, have a positive effect on the results achieved in PISA.
When looking at the possible link between education achievement and instruction time, an important topic in school reform discussions, it is necessary to mention the study by Rivkin and Schiman [49]. They provided an empirical analysis of whether additional time spent in class raises student achievement. The authors found that the causal relationship between instruction time and student achievement depends on the quality of the school curriculum, the classroom environment, and teacher quality. This means that schools with low-quality classroom environments are likely to gain little or no benefit from additional instruction time.
In countries such as Australia and Germany, students spend more than nine hundred hours annually in public and private educational institutions, as shown in Figure 4. Conversely, in countries such as Poland and Sweden, students spend less than six hundred hours per year studying in institutions. From our point of view, time spent in the classroom is crucial for successful learning outcomes. However, contrary to this view, the correlation analysis does not show a significant dependency between instruction time and performance in the PISA test.
Family background plays an important role in an individual’s personal development and educational success. According to Bourdieu’s cultural capital reproduction theory (1977, 1984), socioeconomic inequities in education are caused by existing differences in the social class structure. This theory assumes that parents with a higher level of education are more likely than less educated parents to support their children in acquiring skills and knowledge through different types of education, as mentioned by Dumais [50].
Figure 5 shows a positive, linear association between students’ achievement and parental education in most countries, with a few potential outliers, meaning that the PISA score tends to increase with the level of parents’ education attainment. In addition, the value of R-squared is 0.418, indicating a relationship of moderate strength between these two variables.
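For a linear regression with a single explanatory variable, R-squared equals the square of the Pearson correlation coefficient. The short check below, a sketch that assumes the fitted line in the scatterplot is such a one-variable regression, shows that the R-squared of 0.418 is consistent with the correlation of 0.6463 between PISA and parental education reported in the Results section.

```python
# Correlation between PISA score and parental education, as reported in the text.
r_parents = 0.6463

# In a one-regressor linear model, R-squared is the squared correlation.
r_squared = r_parents ** 2
print(round(r_squared, 3))  # 0.418
```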
A large volume of published studies on students’ background and educational achievement has shown that children’s educational performance is strongly associated with their parents’ education, literacy, and other family socioeconomic factors. Kirjavainen and Loikkanen [46] were among the researchers who investigated the possible effect of parental education on efficiency in education, using data on 291 Finnish senior secondary schools and performing data envelopment analysis as well as Tobit regression analysis to explain inefficiencies in education. They demonstrated that treating “parental education” as an additional input in DEA may increase the average efficiency score by at least 7–9 percentage points. The authors concluded that a higher level of parental education affects efficiency scores positively. Similarly, studies such as Hanushek and Kimko [51], McEwan and Marshall [52], Kassim et al. [53], and Li and Qui [54] have reported a positive association between parental education and student achievement.
6. Conclusions
The basic question of economics is the efficient use of scarce resources, and education is confronted with this issue like other areas. In this paper, we look at the efficiency of lower secondary education and the resources spent on it, and we extend the analysis to further variables that also affect educational success. We work with the average values of the OECD’s standardised PISA 2018 tests and with inputs in the form of expenditure on education, class size, and the time that the evaluated pupils spend at school. After applying the output-oriented DEA model, the results show that the technical efficiency of the selected countries does not vary greatly. The average efficiency value of 0.96 means that countries with lower scores (below the frontier) could, on average, improve their success in PISA by approximately 4.17% with the currently available resources. However, our findings are based on a limited number of countries, so the results of our analysis should be treated with considerable caution. Based on the OLS analysis, only GDP per capita, representing the economic performance and wealth of the country, appears to be positively and significantly related to technical efficiency in education under variable returns to scale and output orientation.
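The 4.17% figure follows from the output orientation of the model: a score of 0.96 means that observed output is 96% of the frontier output, so output could be scaled by the factor 1/0.96 with unchanged inputs. A minimal check of this arithmetic (variable names are illustrative):

```python
mean_efficiency = 0.96  # average output-oriented efficiency score from the text

# Output could be expanded by the factor 1 / score to reach the frontier.
expansion_factor = 1 / mean_efficiency
potential_gain_pct = (expansion_factor - 1) * 100
print(round(potential_gain_pct, 2))  # 4.17
```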
We believe that running a quantile regression analysis rather than a conventional regression model gives a more detailed view of the relationship between variables from several perspectives. Running a correlation analysis, we found a linear relationship of moderate strength between the variable “parental education” and students’ performance in the PISA test. Nevertheless, this variable appeared to be significant only in countries that proved less efficient in education. We also expected that students in smaller classes would tend to score higher than those in larger classes, and that higher spending on education and more instruction time would be positively associated with students’ achievement in PISA. However, we found no evidence of a positive or negative relationship, or of any association, in any of these cases. Our findings raise many questions in need of further investigation. What is now required is to examine how the absence of a strong dependency between input and output variables might bias the results of a cross-national efficiency analysis. It is also appropriate to consider the role of input-side variables that may not directly affect the output considered in the DEA. These questions remain unanswered at present and are a topic for later research. The standard DEA may also need to be altered to account for qualitative rather than only quantitative input variables.
There are also limitations of the research that we are obliged to mention, specifically in the processed data. We decided to convert annual expenditure per student using purchasing power parity, which in our view enables a fairer comparison between the selected countries but might be seen as controversial. We also measure parental education by the percentage of the population aged 35 to 44 with at least upper secondary education, because we think there is a significant difference between students whose parents have and have not attained at least upper secondary education. This is documented by several studies that found significant differences between students whose mothers graduated from at least upper secondary school and those whose mothers completed only lower secondary or primary school. The 35–44 age range is relevant given the mean figures for the EU, according to which the parents (at least the mothers) of pupils aged 12 to 14 are on average 41–43 years old. In the US, the parents have an average age of 38–41 years. Last but not least, we reduced the original dataset because of missing values; thus, our results do not specifically cover low-income countries.