2.1. Literature Review and Research Motivation
Logistics efficiency, an important indicator for evaluating the logistics industry, has been studied by many scholars across different areas of logistics. It is most commonly measured with DEA models based on input–output ratios, alongside other synthetic methods. For example, Zhang and Cui measured the logistics efficiency of 17 cities in Shandong Province with a DEA model, analyzed the synergy of the urban logistics industry based on the results, and constructed a spatial network model to analyze the development of the inter-city logistics industry [
3]. However, the traditional DEA model yields a single measure and does not meet the needs of other aspects of research, so many scholars have used “DEA+” models. For example, Cao argues that the logistics efficiency values measured by traditional DEA models cannot predict the pollution caused by the logistics industry in advance, and that this problem can be effectively solved by combining DEA models with Bayesian methods [
4]. On the other hand, Tao et al. used a combination of principal component analysis (PCA) and DEA models to empirically analyze a variety of key factors in 18 smart logistics parks in the Yangtze River Economic Zone and determined the impact of collaborative innovation on logistics efficiency [
5]. Quan et al. used a combination of a DEA model and Malmquist model to calculate the TFP production index to analyze the inputs and outputs of listed logistics enterprises in China [
6]. With recent developments in the field of machine learning, a new risk management approach consisting of the DEA model and machine learning was used as a case study to improve the identification of decision units and contribute to the sustainable development of a company’s logistics business [
7]. On the other hand, certain scholars have focused on output indicators and proposed the SBM-DEA model, which incorporates undesirable outputs as indicators. For example, Choi used the SBM-DEA model to measure logistics efficiency with undesirable outputs, such as ineffective operations in Chinese logistics parks, as output indicators, showing the importance of undesirable output factors for improving logistics efficiency and drawing on the development of improved logistics parks in Korea [
8]. Similarly, Deng et al. used the carbon emissions of the logistics industry as an undesirable output indicator to measure and analyze 30 Chinese provinces using both PCA and SBM-DEA [
9,
10]. Meanwhile, a number of scholars have used multi-stage DEA to address specific issues. For example, Wohlgemuth et al. conducted a technical efficiency analysis of logistics service operators in Brazil based on a two-stage DEA model, leading to the conclusion that there are different effects associated with logistics service package provision and logistics technical efficiency [
11]. Cavaignac analyzed 3PLs in France through a two-stage DEA analysis and concluded that only a portion of the 3PLs in this market improved their efficiency [
12]. As the DEA model has been studied in greater depth, the three-stage DEA model has been found to effectively remove the influence of environmental variables on efficiency and yield a purer efficiency value. Accordingly, Gan measured the green logistics efficiency of 11 cities in Jiangxi Province, China, and analyzed its evolutionary characteristics using a three-stage DEA model [
13]. While different “DEA+” and different stage DEA models have been used for different specific problems, the three-stage DEA model can yield pure logistics efficiency values, which is why this study used this model for research purposes.
However, the logistics efficiency values measured using the three-stage DEA model reflect the combined influence of several factors, so the policy factors underlying changes in efficiency cannot be isolated. The difference-in-differences (DID) method is an effective tool for measuring the impact of a policy on the subject of study before and after its implementation, and the logistics industry, as the lifeblood of the global economy, is naturally affected by policy shocks. According to our analysis of the domestic and international literature, one of the first studies to apply the DID method in this field empirically analyzed whether the tax burden on logistics enterprises was significantly reduced following the enactment and implementation of the “National Article 9” policy on the logistics industry in China [
14]. Additionally, in the context of the global concern surrounding carbon emissions, scholars applied the DID method to empirically analyze whether the implementation of smart logistics policies can effectively curb carbon emissions [
15]. Li et al. used the National Distribution Node City Layout Plan (2015–2020) issued by the Ministry of Commerce and other departments as a policy evaluation node and used the DID model to analyze the impact of distribution node cities on urban logistics production efficiency and its mechanism of action [
16]. He used the DID method to empirically test the positive impact of the establishment of innovative cities on the efficiency of the urban logistics industry [
17]. Zhou and Zhang used the DID model to empirically analyze the rapid growth of China’s product imports and exports after the opening of the China–Europe Class Train (CEB) [
18].
The above analyses were based on the traditional DID method. However, to make the DID design more sound, certain scholars have combined the DID model with the PSM method, which matches treatment and control groups more effectively. For example, Sun et al. used the PSM–DID method to match treatment and control groups and found empirically that the implementation of China’s Belt and Road Initiative increased the GDP of participating countries [
19]. Dong used the “Logistics Industry Adjustment and Revitalization Plan” promulgated by the State Council in March 2009 as the policy evaluation node, and a double-difference propensity score matching model (PSM–DID) was constructed to test the effectiveness of the policy and its mechanism of action [
20]. Zhang used the new urbanization policy as the evaluation node, and a difference-in-differences and propensity score matching method (PSM–DID) was constructed to analyze and study whether the new urbanization had an impact on the development of the logistics industry [
21]. Based on the above, to make the DID experiment more effective, this study used the PSM–DID model to analyze the Greater Bay Area.
Since the establishment of the Guangdong–Hong Kong–Macao Greater Bay Area in 2017, research on it has been extensive and fruitful, but high-quality research in the field of logistics is lacking. An earlier article took the national strategy for the Guangdong–Hong Kong–Macao Greater Bay Area and the high-quality development of the logistics industry as its focus and analyzed in depth the main problems facing the high-quality development of the logistics industry in the region [
22,
23]. In the same year, Xiao proposed that, to improve the quality of logistics development in the Guangdong–Hong Kong–Macao Greater Bay Area, it was necessary to pursue reform of the ecosystem, technological innovation, the deepening of industrial development, and the acceleration of the agglomeration of the logistics industry [
24]. Liu and Liu used the improved AHP algorithm and the expert scoring method to analyze various indicators and improve the calculation of the service level of port logistics in the Greater Bay Area [
25]. The only recent study on logistics efficiency in the Greater Bay Area is a study by Qin, which used a three-stage DEA model to empirically study the efficiency of the logistics industry in the Guangdong–Hong Kong–Macao Greater Bay Area city cluster and analyzed it in terms of both time and space dimensions [
26].
According to the above review, although research on logistics efficiency and “DEA+” models, on PSM–DID applications in logistics, and on the logistics industry of the Greater Bay Area has achieved fruitful results, there is still room for further research, mainly in the following three areas. First, although various “DEA+” models are available, few studies use the three-stage DEA model for measurement, and applications of the three-stage DEA model to the Greater Bay Area are rarer still. Second, DID methods have been widely used in academic research in recent years, but the matching of treatment and control groups, which is the basis of DID studies, has not been subjected to rigorous analysis. Propensity score matching (PSM) can be an effective solution to this problem, but there is no literature that uses PSM for matching analysis in the Greater Bay Area. Third, there is also no literature that combines the PSM and DID methods to analyze the logistics efficiency of the Greater Bay Area. To address these gaps, this study makes the following marginal contributions. First, through an analysis of the literature and a PSM balance test, it shows that the Guangdong–Hong Kong–Macao Greater Bay Area and the Yangtze River Delta urban agglomeration are highly similar and satisfy the basic conditions for a DID experiment, providing a research basis for other studies that need to compare urban agglomerations. Second, using the PSM–DID method, it empirically analyzes the effect of the Greater Bay Area policy on the logistics efficiency of the region, filling a policy-based research gap concerning the logistics efficiency of the Greater Bay Area.
2.2. Research Hypothesis
This paper uses the three-stage DEA model and the propensity score matching and difference-in-differences method (PSM–DID) to empirically study the policy impact of the establishment of the Greater Bay Area on regional logistics efficiency. According to detailed guidance on applying the difference-in-differences method [
27], a DID design rests on two basic conditions. Condition 1 is the parallel trend assumption, also known as the common trend assumption: had the individuals in the treatment group not received the intervention, their outcomes would have followed the same trend as those of the individuals in the control group, so that any divergence in trends after the policy shock can be attributed to the policy; ideally, assignment to the treatment and control groups also follows the principle of “randomization”. Condition 2 is the stable unit treatment value assumption (SUTVA), which requires that the individuals affected by the policy are independent of each other and that the treatment status of one individual does not affect the outcome of any other individual; in other words, the treatment group and the control group must be strictly separated and must not interfere with each other. Both conditions involve a core problem, namely the “randomized grouping” problem. In reality, it is difficult to achieve strict non-interference between the two groups as required by Condition 2, especially in today’s tightly linked global economic landscape. However, the idea of a “quasi-natural experiment” is implicit in the difference-in-differences method, which does not strictly require that the randomization conditions between the treatment and control groups be met.
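The two-group, two-period logic behind Condition 1 can be illustrated with a minimal sketch. It is not the estimation used in this paper: the function name `did_estimate` and the efficiency scores are hypothetical and serve only to show how the change in the control group is subtracted from the change in the treatment group.

```python
from statistics import mean


def did_estimate(rows):
    """Two-group, two-period DID from (treated, post, outcome) records."""
    # Collect outcomes into the four cells: (treated?, post-policy?).
    cells = {(t, p): [] for t in (0, 1) for p in (0, 1)}
    for treated, post, outcome in rows:
        cells[(int(treated), int(post))].append(outcome)
    m = {key: mean(values) for key, values in cells.items()}
    # Under parallel trends, the control-group change stands in for the
    # change the treatment group would have shown without the policy.
    change_treated = m[(1, 1)] - m[(1, 0)]
    change_control = m[(0, 1)] - m[(0, 0)]
    return change_treated - change_control


# Hypothetical efficiency scores, for illustration only.
toy_rows = [
    (1, 0, 0.60), (1, 0, 0.62), (1, 1, 0.70), (1, 1, 0.74),
    (0, 0, 0.58), (0, 0, 0.60), (0, 1, 0.62), (0, 1, 0.64),
]
effect = did_estimate(toy_rows)
```

In this toy data the treatment group's mean rises by 0.11 and the control group's by 0.04, so the DID estimate is about 0.07. In applied work the same quantity is usually obtained as the coefficient on the interaction of the treatment and post-policy dummies in a regression with controls.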
Based on the above analysis, this study took the establishment of the Greater Bay Area as the research object. Ideally, the nine cities in the Greater Bay Area would form the treatment group, and a randomly chosen urban agglomeration not affected by the Greater Bay Area policy would form the control group. However, this grouping would be unrealistic: if all of the 265 cities currently in China were considered, the control group would be too large, and cities near the Greater Bay Area can hardly avoid being affected by it, so the analysis would be biased. Therefore, to make the two groups more comparable, this study adopted the PSM–DID method proposed by Heckman et al. [
28]. The basic logic of this method is to find the control group with similar characteristics to the treatment group through the PSM method and then use the DID method under the requirement of a balance test, which can effectively avoid endogenous interference and isolate the policy effect as purely as possible. The PSM idea stems from the matching estimator proposed by Rosenbaum and Rubin [
29]. The basic idea of this paper was to use Logit regression to calculate the propensity score of the experimental group and the control group, to use the kernel matching method according to the propensity score, and finally to conduct the propensity matching balance test. If the balance test is passed, the grouping of the treatment and control groups can be proved credible, so that the DID analysis can be continued, and, finally, the robustness test can be conducted to verify its reliability again.
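The sequence described above (Logit propensity scores, kernel matching, balance test) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the single covariate, its values, the Epanechnikov kernel, the bandwidth of 0.2, and all function names are hypothetical, and the Logit model is fitted by plain gradient ascent rather than by a statistics package.

```python
import math
from statistics import mean, variance


def propensity(beta, xi):
    """Logit probability of treatment for covariate vector xi."""
    z = beta[0] + sum(b * v for b, v in zip(beta[1:], xi))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(X, d, lr=0.5, steps=2000):
    """Fit P(d=1 | x) by gradient ascent; returns [intercept, b1, ...]."""
    beta = [0.0] * (len(X[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for xi, di in zip(X, d):
            err = di - propensity(beta, xi)
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        beta = [b + lr * g / len(X) for b, g in zip(beta, grad)]
    return beta


def kernel_weights(p_treated, p_controls, bandwidth=0.2):
    """Epanechnikov weights of every control unit for one treated unit."""
    raw = []
    for pc in p_controls:
        u = (pc - p_treated) / bandwidth
        raw.append(0.75 * (1 - u * u) if abs(u) < 1 else 0.0)
    total = sum(raw)
    return [w / total for w in raw] if total > 0 else raw


def standardized_bias(x_treated, x_control, weights=None):
    """Standardized mean difference in percent; weights reweight controls."""
    if weights is None:
        mean_c = mean(x_control)
    else:
        mean_c = sum(w * x for w, x in zip(weights, x_control)) / sum(weights)
    pooled_sd = math.sqrt((variance(x_treated) + variance(x_control)) / 2)
    return 100 * (mean(x_treated) - mean_c) / pooled_sd


# Hypothetical standardized covariate (for example, a city-level indicator).
x_t = [0.9, 0.5, 0.7, 0.2]               # treated cities
x_c = [0.1, 0.4, 0.6, -0.3, 0.8, -0.5]   # candidate control cities
X = [[v] for v in x_t + x_c]
d = [1] * len(x_t) + [0] * len(x_c)

beta = fit_logit(X, d)
p_t = [propensity(beta, [v]) for v in x_t]
p_c = [propensity(beta, [v]) for v in x_c]

# One weight vector per treated unit, then total weight per control unit.
per_treated = [kernel_weights(p, p_c) for p in p_t]
control_weights = [sum(col) for col in zip(*per_treated)]

bias_before = standardized_bias(x_t, x_c)
bias_after = standardized_bias(x_t, x_c, control_weights)
```

A balance test then compares `bias_before` with `bias_after` for each covariate, often alongside t-tests of the matched means; only if the matched groups are judged balanced would the DID step be run on the weighted sample.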
Based on the above analysis, we searched for the keyword “urban agglomerations comparison” in several Chinese academic databases based on various indicators of Chinese urban agglomerations (population, GDP, land area, development system model, scientific research, etc.), and found 8920 relevant papers which contain the keywords. When we further selected the keywords “Yangtze River Delta, Pearl River Delta, population, economy, urbanization”, the literature search reached 1151 articles. However, a search of the Web of Science for this keyword revealed only a dozen or so relevant articles, suggesting that comparative analysis of the two regions in China is still restricted to Chinese scholarship. Among the relevant scholars, Li undertook a comparative analysis of the Yangtze River Delta and the Pearl River Delta urban agglomeration from the perspective of environmental technology efficiency, green productivity, and sustainable development [
30]. Wei and Wang conducted a comparative study on the technical isomorphism of the Yangtze River Delta and Pearl River Delta regions [
31]. Ma and Zhu provided a comparative analysis of the Yangtze River Delta and the Pearl River Delta clusters from the perspective of the business environment and R&D behavior of enterprises [
32]. Zhang and Ma made a comparative analysis of the Yangtze River Delta and the Pearl River Delta urban agglomeration from the perspective of regional innovation model research [
33,
34]. Pi and Yang analyzed the comparison system of the development model of the Yangtze River Delta and Pearl River Delta [
35]. Zhang and Sun undertook a comparative study on the relationship between R&D investment and investor return in enterprises in the Yangtze River Delta and Pearl River Delta [
36]. Xie et al. undertook a comparative study on the spatial and temporal changes in population aging in the Yangtze River Delta and Pearl River Delta regions [
37]. Zhong and Qin conducted comparative research on the service industry collaborative agglomeration in China’s urban agglomeration [
38], among others. These numerous studies, which draw on a variety of data and cover almost every dimension of comparison, show that although there are differences between the two core regions in China, they are highly comparable. If the PSM balance test in the following analysis is passed, the choice of the Yangtze River Delta as the control group will be further supported. Therefore, we proposed the following hypothesis:
Hypothesis 1. Through PSM matching and its balance test, experimental analysis with the Yangtze River Delta city cluster as the control group and the Guangdong–Hong Kong–Macao Greater Bay Area as the treatment group is scientifically comparable, which is in line with the experimental basis of DID.
With the establishment of the Guangdong–Hong Kong–Macao Greater Bay Area, the governments in the region will inevitably introduce a series of complementary policies to promote the development of the logistics industry in the Greater Bay Area under the general policy of the Outline of the Plan, which has been analyzed by researchers at different levels. Analyzed from the perspective of talent strategy, the talent policy of the Greater Bay Area helps to attract more service talents, and the logistics industry, which is a production service industry, will naturally also be attractive to logistics talents, thus stimulating the vitality of logistics innovation and driving the development of other related industries in the region, illustrating the importance of the talent policy in enhancing the level of the logistics industry [
39]. In terms of industrial coordination policy, using MATLAB for simulation analysis, Wan et al. found that the cities in the Greater Bay Area with the highest degree of synergy have higher logistics efficiency than cities with low degrees of coordination [
40], and through an analysis of the political system, economic system, and internal links of the Greater Bay Area, Yang found that the coordinated construction of the circulation system of the Greater Bay Area through a top-level design would greatly improve the logistics efficiency in the region, indicating that regional coordination policies can significantly affect improvements in logistics efficiency [
41]. Concerning the analysis of industrial policy, through an in-depth analysis of the comparative advantages of the high-quality development of the logistics industry in the Greater Bay Area, Huang found that the rapid transformation and upgrading of the logistics industry is conducive to the effective improvement of logistics efficiency [
22]. Meanwhile, using basic data for the nine Greater Bay Area cities from 2007 to 2019, Shi and Hu analyzed the evolution of the logistics network structure of the Greater Bay Area urban agglomeration with a modified gravity model and social network analysis and found that an urban agglomeration logistics network can promote regional industrial structural upgrades and improve the efficiency of regional comprehensive logistics [
42]. From the perspective of logistics trade and technology policy, by building a trade logistics competitiveness evaluation system and using factor analysis of the nine core cities in Guangdong province, Li found that the trade logistics integration degree will greatly improve logistics efficiency [
43], while by exploring the application of blockchain technology in Guangdong, Li et al. found that the technology helps to greatly improve logistics efficiency [
44]. To sum up, under the guidance of the Outline of the Plan, governments at all levels in the Greater Bay Area will develop policies and measures for the logistics industry covering talent, industrial structure, government coordination, trade, and science and technology. These policies and measures will greatly integrate and optimize logistics resources and further promote the optimization of logistics enterprise operations, which directly or indirectly improves logistics efficiency. Based on the above analysis, we proposed Hypothesis 2:
Hypothesis 2. The establishment of the Guangdong–Hong Kong–Macao Greater Bay Area has had significant policy effects in terms of improving the logistics efficiency of the logistics industry in the Bay Area.