The utility standard is the most important of the meta-evaluation standards, for two reasons. First, the evaluation team assigned it the maximum weight (see Table 3). Second, researchers have argued that the utility standard should be ensured before the feasibility, propriety, and accuracy standards [18]. Regardless of how the other standards perform, an EEM scheme that lacks utility cannot bring about any change in environmental management, so poor performance on the utility standard leads directly to the failure of the scheme. According to previous studies, the utility standard is considered unmet when more than half of the utility criteria are not met [2,18]. However, this measure does not take into account the different weights of the criteria. In this study, the utility standard is instead considered to have failed if its weighted score is less than half of its maximum, and the relevant EEM scheme is then deemed to have serious defects.
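The difference between the two rules above can be illustrated with a minimal sketch. The criterion names are taken from this section, but the weights and the met/unmet values are hypothetical placeholders, not the values in Table 3, and the function names are assumptions introduced only for illustration.

```python
from typing import Dict


def count_rule_fails(met: Dict[str, bool]) -> bool:
    """Earlier rule: fail when more than half of the criteria are unmet."""
    unmet = sum(1 for ok in met.values() if not ok)
    return unmet > len(met) / 2


def weighted_rule_fails(met: Dict[str, bool], weights: Dict[str, float]) -> bool:
    """Weighted rule: fail when the weighted score is below half of the maximum."""
    max_score = sum(weights.values())
    score = sum(w for name, w in weights.items() if met[name])
    return score < max_score / 2


# Hypothetical weights and results (NOT the values from Table 3).
weights = {
    "stakeholder identification": 10.0,
    "information scope": 3.0,
    "values identification": 3.0,
}
met = {
    "stakeholder identification": False,
    "information scope": True,
    "values identification": True,
}

print("count rule fails:", count_rule_fails(met))
print("weighted rule fails:", weighted_rule_fails(met, weights))
```

With these placeholder numbers, only one of three criteria is unmet, so the count-based rule passes the scheme, while the weighted rule fails it because the unmet criterion carries most of the weight.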
4.1. The Meta-Evaluation for Urban Sewage Treatment Management Evaluation Scheme
The meta-evaluation results for the USTM evaluation schemes of the Guangxi, Shaanxi, and Inner Mongolia provinces are shown in Table 4.
Overall, all three provinces neglected several meta-evaluation criteria: stakeholder identification, information scope, political viability, cost effectiveness, conflict of interest, valid and reliable information, justified conclusions, respect to the environment, and collaboration mechanism. Some of these are important, such as stakeholder identification. The public is undoubtedly an important stakeholder in urban wastewater management [49]. A lack of public participation leads to several problems: (i) the conclusion drawn from the evaluation may be unfair; (ii) the monitoring of the evaluation may be questionable; and (iii) the proposed strategy for improving urban sewage treatment may not meet the actual needs of society. Sewage treatment problems in these three provinces have persisted for a long time [50,51,52,53]. The value of extensive public participation in promoting environmental conservation is now widely recognized [54]. In this study, meta-evaluation provides a further scientific basis for supporting public participation in environmental conservation. Therefore, an evaluation scheme that does not emphasize the criterion “stakeholder identification” should be considered to have serious shortcomings. The government should encourage public participation in EEM.
Some criteria were taken into account in all three provinces, including values identification, practical procedures, formal agreements, complete and fair evaluation, fiscal responsibility, program documentation, defensible information sources, analysis of quantitative information, and human responsibility. However, these criteria made up no more than one third of the total, and most criteria were not considered (see Table 4). The evaluation schemes of USTM in Guangxi, Shaanxi, and Inner Mongolia are internal, administrative, and bottom-up. The Department of Housing and Urban–Rural Development in each of the three provinces required its subordinate units to evaluate the USTM under administrative orders and then submit the evaluation results to it [42,43,44]. As a result, the evaluation schemes focus on operational steps, responsibilities, completed conclusions, and information records. Although such an approach can ensure that the EEM activities are completed, it has several weak points: (i) internal evaluation lacks public participation and sufficient information dissemination; (ii) evaluation by administrative order may make it difficult to ensure the personal dignity of the lower-level evaluators during the evaluation process; (iii) bottom-up evaluation may reduce the evaluator to a mere information transmitter; and (iv) the continuity of evaluation is limited. In addition, the final evaluation results, being based on the self-evaluation of the lower-level administrative departments, may be unreliable.
To better understand these findings, we interviewed some evaluators. It was found that (i) most evaluators treated the evaluation information as internal information prescribed by the higher-level authorities, which cannot be released; (ii) most evaluators cared more about whether they had completed the assigned tasks than about the reliability of the evaluation information and the environmental concerns; and (iii) a lack of cooperation among the evaluators made most of them unwilling to conduct the next round of evaluation. These problems are consistent with our meta-evaluation, particularly the neglect of the evaluation criteria “respect to the environment” and “collaboration mechanism”. This suggests that the EEM meta-evaluation standards we designed are reliable for problem identification. To solve these problems, the integration of “service orientation” and “evaluation continuity” should be strengthened, and the administrative management approach should be improved. Introducing multiple evaluation subjects, peer review, and random sampling evaluation are feasible administrative solutions, which together can improve the quality of EEM.
The problems mentioned above are common to the three provinces. To observe the scheme quality more intuitively and to facilitate a comparison among provinces, we calculated the scores of the three provinces’ meta-evaluation standards based on the weights of the criteria, as shown in Figure 4.
According to the meta-evaluation, the utility standard of Guangxi scored less than half (score < 23.34), which indicates that Guangxi’s evaluation scheme of USTM has serious defects. The evaluation scheme of Shaanxi qualified (score of the utility standard = 25.85), and the evaluation scheme of Inner Mongolia performed better still (score of the utility standard = 37.76). One finding worth noting is that, among these three provinces, the more abundant a province’s water resources, the worse its evaluation scheme of the USTM. Guangxi has abundant water resources, but its meta-evaluation scores are the lowest of the three. This suggests that an abundance of natural resources may lead local people to lack awareness of sustainable development. Researchers have stated that a lack of natural replenishment could threaten the sustainability of water resources [55]. Studies have found that sustainable development is heavily undervalued, or even overlooked, in areas richly endowed with natural resources [56]. Without awareness of sustainable development, even abundant resources will eventually be exhausted [40]. Therefore, the efficiency of natural resource management should be improved in all regions, especially those with rich natural resources, in order to maintain sustainable development.
In summary, the meta-evaluation found many problems with the existing evaluation schemes of USTM in the Guangxi, Shaanxi, and Inner Mongolia provinces: each province obtained a low score (the full score is 100), and Guangxi had the lowest. According to the backward mechanism of meta-evaluation [2], the performance of evaluation directly determines the effect of USTM, which in turn determines the performance of urban sewage treatment. In other words, the results of the meta-evaluation indirectly reflect the performance of urban sewage treatment. Meta-evaluation has played an important role in improving the accuracy of EEM. To examine this effect, we present the situation of untreated urban sewage in the three provinces [57], as shown in Figure 5.
Since 2010, when the Ministry of Housing and Urban–Rural Development of China launched the urban sewage management evaluation [38], the proportion of untreated urban sewage in the three provinces has declined. However, the effect of urban sewage treatment is still not ideal. For example, although the untreated rate of urban sewage in Inner Mongolia dropped to about 5% in 2018, over 17,200,000 cubic meters of urban sewage were still left untreated [57]. This shows that the quality of USTM in the three provinces needs to be improved, which is one of the conclusions drawn from our meta-evaluation. Moreover, the sewage treatment effect differed among the provinces. Guangxi had the largest number of sewage treatment plants among the three provinces [57], but its efficiency of urban sewage treatment was the lowest. This indicates that urban sewage management was least effective in Guangxi, which is consistent with our meta-evaluation results.
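As a rough consistency check on the Inner Mongolia figures above, the untreated rate and the untreated volume together imply a total urban sewage volume. The sketch below treats "about 5%" as exactly 0.05 and "over 17,200,000" as exactly 17,200,000, so the result is only an order-of-magnitude estimate, not a figure reported in the source; the variable names are illustrative.

```python
# Figures quoted in the text for Inner Mongolia, 2018 (both approximate).
untreated_m3 = 17_200_000   # "over 17,200,000 cubic meters" left untreated
untreated_rate = 0.05       # "about 5%" untreated

# Implied total volume of urban sewage: untreated volume / untreated share.
implied_total_m3 = untreated_m3 / untreated_rate
implied_treated_m3 = implied_total_m3 - untreated_m3

print(f"implied total:   {implied_total_m3:,.0f} m3")
print(f"implied treated: {implied_treated_m3:,.0f} m3")
```

Under these assumptions the implied total is roughly 344 million cubic meters, which illustrates why a low untreated rate can still correspond to a large absolute volume of untreated sewage.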
4.2. The Meta-Evaluation for National Key Ecological Function Areas Management Evaluation Scheme
In this case, meta-evaluation was used to evaluate EEM from the perspective of the evaluation process. According to the JCSEE guideline [18], the evaluation process contains eight sections: defining evaluation issues, collecting information, analyzing information, proposing an evaluation report, making an evaluation budget, making an evaluation agreement, managing an evaluation, and hiring an evaluator. Each section corresponds to specific meta-evaluation criteria, which means that these criteria can be used to evaluate that part of the evaluation process. On this basis, a matrix of the meta-evaluation criteria and the evaluation process was constructed, and a meta-evaluation of the evaluation scheme of NKEFAM in Shaanxi was carried out. The results are shown in Table 5.
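The idea of a criteria-by-process matrix can be sketched as a mapping from each process section to the criteria that apply to it, from which the share of criteria met per section follows. The section and criterion names below appear in this paper, but which criteria belong to which section, and which are met, are hypothetical placeholders rather than the contents of Table 5; `stage_coverage` is an assumed helper name.

```python
from typing import Dict, List, Set


def stage_coverage(matrix: Dict[str, List[str]], met: Set[str]) -> Dict[str, float]:
    """Return, for each process section, the fraction of its criteria that are met."""
    return {
        stage: sum(1 for c in criteria if c in met) / len(criteria)
        for stage, criteria in matrix.items()
    }


# Hypothetical assignment of criteria to process sections (NOT Table 5).
matrix = {
    "collecting information": [
        "defensible information sources",
        "valid and reliable information",
    ],
    "analyzing information": [
        "analysis of quantitative information",
        "justified conclusions",
    ],
    "making an evaluation budget": [
        "fiscal responsibility",
        "cost effectiveness",
    ],
}

# Hypothetical set of criteria judged as met.
met = {
    "defensible information sources",
    "analysis of quantitative information",
    "justified conclusions",
    "fiscal responsibility",
}

coverage = stage_coverage(matrix, met)
for stage, share in coverage.items():
    print(f"{stage}: {share:.0%} of criteria met")
```

Reading the matrix by section in this way makes it possible to locate weaknesses in a specific part of the evaluation process rather than only in the scheme as a whole.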
A review of the evaluation process showed that the parts of the evaluation scheme covering the collection and analysis of information were relatively good. However, there were obvious deficiencies in managing the evaluation and hiring an evaluator, since most of the corresponding criteria were not met. Without effective management, evaluation becomes passive and aimless in practice. This is evidenced by the unmet criteria and poor performance in defining evaluation issues. One of the problems of the Shaanxi NKEFAM evaluation program came from the evaluators, who worked in government departments. Aware of their stakeholders, the evaluators tended to demonstrate the positive effects of public services [58]. This led to unreliable evaluation data and reports (the criteria “valid and reliable information” and “impartial reporting” were not met). This is consistent with the findings of Ding et al. [14]. In addition, the evaluation budget should be improved, since insufficient investment can also limit the effectiveness of the evaluation. Faced with these challenges, the first steps should be to strengthen the management of the evaluation system and to raise the qualification requirements for evaluators.
In summary, the two cases above support the reliability and practicability of the meta-evaluation. Since meta-evaluation can discover important problems in environmental management and put forward targeted improvement measures, it can be widely used in the EEM field. In particular, carrying out a meta-evaluation before implementing an evaluation scheme can provide diagnostic feedback to improve the EEM activities.