1. Introduction
Sanctioning institutions play fundamental roles in societies, such as safeguarding people’s daily lives [1,2], protecting natural resources [3,4] and implementing foreign policy [5,6]. These institutions punish wrongdoers to promote cooperation in everyday life [1,7,8,9], but their frequently poor performance [10,11,12,13] shows that sanctions do not always work. For example, the U.S. tariffs imposed on imports from China since 2018 have resulted in U.S. consumers paying higher prices for goods and services imported from China. Judging from the effect so far, however, it is uncertain whether the tariffs will resolve the trade dispute between China and the United States. Additionally, the well-known U.S. sanctions on Cuba have been in place for decades, causing economic losses of more than trillion. Moreover, a series of rigorous analyses indicates that sanctioning senders achieved their objectives “only” around of the time [14]. In general, although sanctioning institutions have their own intentions on how to sanction, the target results should not be taken for granted.
Existing research on sanctioning institutions is mainly conducted from two perspectives. On the one hand, evolutionary game studies of social dilemmas treat sanctioning institutions as enforcers of pool punishment [15,16,17,18,19]. Since pool punishment can enhance cooperation [15,20,21,22], sanctioning institutions are studied to uncover the mechanisms by which pool punishment works [16,17,18,19,23,24,25,26,27,28,29,30]. On the other hand, empirical research on sanctioning institutions has been devoted to revealing the impacts of their operation [3,5,31,32,33,34,35,36,37], which informs the application of sanctioning institutions in real life. Despite the richness of these theoretical and practical studies, there are few studies on “how sanctioning institutions execute punishment in order to achieve group cooperation at a small cost.” The cases above show that significant time loss, substantial economic costs and uncertainty about outcomes all contribute to the poor performance of sanctions. Thus, in this paper, we (1) abstract the operation of sanctioning institutions into a basic model; (2) explore a time-saving, low-cost and stable sanctioning approach that helps sanctioning institutions to perform well.
We extended the standard public goods game (PGG) with pool punishment as the basic model to explore the operation of sanctioning institutions. The public goods game, which abstracts the trade-off between public and individual interests, embodies the cooperative social affairs of daily life. Pool punishment, in turn, corresponds to the sanctions imposed by sanctioning institutions. Integrating the PGG with pool punishment, we can observe how people’s behavior evolves under the dual pressure of the social dilemma and the sanctioning institution. The operation of a sanctioning institution is like guiding people along a path. The goal of the institution indicates where and how people go, and sanctions are the tool by which the institution guides them. Good performance then requires guiding people to their destination efficiently and at a small cost. Here, we focus on helping sanctioning institutions enforce sanctions so as to achieve good performance.
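To make the trade-off concrete, the following is a minimal sketch of one round of a public goods game in which defectors are fined. It is an illustration under stated assumptions, not the model specified in Methods: the function name `pgg_payoffs`, the multiplication factor `r`, the contribution `c` and the flat `fine` are all hypothetical, and the cost of running the institution is not modeled.

```python
def pgg_payoffs(strategies, r=3.0, c=1.0, fine=0.5):
    """Payoffs for one round of a public goods game with a fine on defectors.

    strategies: list of 1 (cooperate) or 0 (defect).
    r, c, fine: hypothetical multiplication factor, contribution, and fine.
    """
    n = len(strategies)
    n_coop = sum(strategies)
    # Contributions are multiplied by r and shared equally by all players.
    share = r * c * n_coop / n
    payoffs = []
    for s in strategies:
        if s:
            payoffs.append(share - c)      # cooperators pay the contribution
        else:
            payoffs.append(share - fine)   # defectors free-ride but are fined
    return payoffs


if __name__ == "__main__":
    # A small fine leaves defection more profitable; a larger one reverses it.
    print(pgg_payoffs([1, 1, 0, 0], r=2.0, fine=0.5))
    print(pgg_payoffs([1, 1, 0, 0], r=2.0, fine=1.5))
```

In this toy setting, defection pays more than cooperation whenever the fine is smaller than the contribution, which is the dilemma the sanctioning institution is meant to resolve.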
Many punishment methods are used in real life to transform punishment intentions into punishment intensity. The constant punishment intensity approach, where the punishment intensity is entirely determined by the punishment intention, is the most commonly employed method; examples include data breach penalties in the EU General Data Protection Regulation and anti-dumping duties in trade wars. Punishment within a certain percentage range is an extension of the constant punishment intensity approach. For instance, in China, tax evaders are subject to penalties of one to five times the amount of tax evaded. Additionally, penalties within a certain amount range are widely used in enforcement; for example, drunk drivers in Japan are fined up to 1 million yen. To limit the abuse of discretion by law enforcement officers, punishing with a constant amount is also very common, such as the fixed penalty of HK$5000 for violations of the Prevention of Disease Regulation in Hong Kong in December 2020. In addition to these common methods, we propose a negative feedback sanctioning method that we hope will give sanctioning institutions a time-saving, low-cost and stable alternative.
We propose the negative feedback punishment approach for sanctioning institutions by combining the feedback control principle and the negative correlation principle. The feedback control principle [38] refers to making the next move based on a comparison of the current state with the goal. Inspired by this principle, a sanctioning institution can implement performance-triggered sanctions to ensure people behave correctly while avoiding unnecessary consumption. Specifically, the sanctioning institution does not have to punish people when they are going in the right direction; only when they go off course does the institution enforce sanctions. Compared with institutions that sanction at regular time intervals, a performance-triggered institution is more cost effective. Alongside the feedback control principle, the negative correlation principle is used to deliver sanctions at a small cost. Under this principle, the group’s cooperation proportion is negatively correlated with the punishment intensity, which determines the amount of punishment. As people become better behaved under the guidance of the sanctioning institution, the cooperation proportion becomes larger. The negative correlation principle then implies that the punishment intensity becomes smaller, so people suffer less monetary loss. In addition, this negative correlation imposes a constraint between punishment intensity and group performance, resulting in a one-to-one mapping between the two. Thus, the sanctioning outcomes at the same punishment intensity are expected to vary less and accordingly be more stable. In this paper, we explore whether and why the negative feedback punishment approach is a time-saving, low-cost and stable sanctioning method.
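The two principles can be sketched as a single rule. The snippet below is a hedged illustration only: the linear form `k * (1 - x)`, the `target` threshold and the function name are assumptions chosen for simplicity, since the punishment intensity function actually used in the paper is not reproduced in this text.

```python
def negative_feedback_intensity(k, x, target=1.0):
    """Punishment intensity under an assumed linear negative feedback rule.

    k: punishment intention set by the institution.
    x: current cooperation proportion of the group, in [0, 1].
    target: hypothetical cooperation level at which sanctions stop.
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("cooperation proportion must lie in [0, 1]")
    # Feedback control: sanction only when the group is off course.
    if x >= target:
        return 0.0
    # Negative correlation: the better the group behaves, the milder the sanction.
    return k * (1.0 - x)


if __name__ == "__main__":
    for x in (0.0, 0.5, 1.0):
        print(x, negative_feedback_intensity(0.8, x))
```

Any decreasing function of the cooperation proportion would serve the same purpose; the linear form is used here only because it is the simplest one to read.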
Through evolutionary simulation and theoretical analysis, we show that our proposed negative feedback punishment approach can help sanctioning institutions achieve more stable, time-saving and low-cost performance. The operation of sanctioning institutions is modeled by the PGG with pool punishment to analyze different sanctioning rules. Simulation results show that people’s performance under the negative feedback punishment approach varies less and is hence more stable than under a constant punishment intensity. In addition, the operation of sanctioning institutions based on the negative feedback punishment approach costs less in both time and money than other punishment methods. Moreover, theoretical analysis suggests that group performance is stable under the negative feedback punishment approach because the negative correlation between punishment intensity and group cooperation proportion constrains group behavior. Further comparisons illustrate the generality of the negative feedback punishment approach by showing group performance under different forms of negative correlation. Overall, our proposed negative feedback punishment approach provides a more feasible and effective punishment method for real-life sanctions, and may be instructive for the operation of government departments and the management of various programs.
3. Results
3.1. Punishment Intention Affects Group Performance
The constant punishment intensity approach and the negative feedback punishment approach are compared by analyzing the evolutionarily stable strategies of the group under different punishment intentions. This comparison helps us sort out the characteristics of these two punishment methods. We found that the negative feedback punishment approach can give groups more stable performance than the constant punishment intensity approach.
Figure 2a shows the results under the constant punishment intensity approach. The red line represents , and the std value of t (the shaded part) is 0. Red and gray dots represent the punishment intensity and cooperation proportion at the end of each evolution, respectively. The solid lines and shading indicate the mean and std value over 10 repetitions of each punishment intention. The optimal point is marked by a red cross. The group performance is measured as the percentage of cooperators in the group. In terms of the mean, the cooperation proportion steps from 0 to 1 around the punishment intention of . Additionally, the std value around the intention value is particularly large compared to other values. Taken together, the group performance exhibits great instability in the range of intention around . With this range as the separation, we divided the whole range of k into three regions from small to large: the completely uncooperative region ( ), the extremely unstable region ( ) and the completely cooperative region ( ). The blue dotted line in Figure 2a is the dividing line between the three regions. The completely uncooperative region and the completely cooperative region correspond to stable group performance with cooperation proportions of 0 and 1, respectively. The extremely unstable region exhibits two characteristics. First, the group is either completely cooperative or completely defective, so the std value is extremely large. Second, the region is extremely narrow, so the group performance changes abruptly over a very small range of intentions.
In contrast to the constant punishment intensity approach, group performance under the negative feedback punishment approach can be more stable and less costly. The punishment intensity and group performance under the negative feedback punishment approach are depicted in Figure 2b. The points and lines in Figure 2b have the same meaning as in Figure 2a. As the punishment intention k increases from 0 to 1, the punishment intensity first increases and then presents a U-shape around ; the proportion of cooperation remains at 0 at the beginning and then increases continuously until 1. Importantly, the small std values of the punishment intensity and the cooperation proportion suggest that both the punishment from sanctioning institutions and the performance of groups are more stable under the negative feedback punishment approach than under the constant punishment intensity approach. Another salient feature is that the maximum punishment intensity only goes up to about as k varies from 0 to 1. The punishment intensity stays low because the negative feedback punishment approach specifies a negative correlation between the cooperation proportion of the group and the punishment intensity. Due to this negative correlation, when the group performs well, the punishment intensity t does not rise as high as under other punishment methods, and the group bears lower monetary costs.
3.2. Operation of the Sanctioning Institution
Sanctioning institutions are designed to lead groups to a high level of cooperation at a low cost. The higher the punishment intensity, the greater the cost to the group. Thus, the punishment intention that achieves a high level of cooperation with a small punishment intensity is the optimal strategy that enables the best group performance, as marked in Figure 2a,b with red crosses. The role of the sanctioning institution is then to repeatedly input the punishment intention k into the punishment method until it locates the optimal point over the entire k range. The binary search in computer science helps the institution successively determine k (as introduced in Methods), and on this basis, different punishment methods are analyzed through the comparison of corresponding group performances.
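The search for the punishment intention can be pictured with a generic binary search. This is a sketch under stated assumptions rather than the procedure described in Methods: `respond` is a hypothetical stand-in for running the evolution at a given k and observing the final cooperation proportion, `target` and `tol` are illustrative, and the response is assumed to be non-decreasing in k and to reach `target` at the upper bound.

```python
def search_intention(respond, lo=0.0, hi=1.0, target=0.99, tol=1e-3):
    """Binary search for the smallest intention k whose response meets target.

    respond: hypothetical callable mapping k to a cooperation proportion;
             assumed non-decreasing in k, with respond(hi) >= target.
    Returns the located intention and the number of evaluations used.
    """
    steps = 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if respond(mid) >= target:
            hi = mid   # cooperation is high enough: try a milder intention
        else:
            lo = mid   # cooperation is too low: a stronger intention is needed
        steps += 1
    return hi, steps


if __name__ == "__main__":
    # Toy response: the group cooperates fully once k reaches 0.3.
    k, steps = search_intention(lambda k: 1.0 if k >= 0.3 else 0.0)
    print(k, steps)
```

The monotonicity assumption is exactly what fails when the group's response at a given k is erratic: a single misleading observation sends the search into the wrong half of the interval.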
We explored the constant punishment intensity approach and the negative feedback punishment approach by repeating the institution’s operation 50 times, as shown in Figure 2c,d. Each colored line depicts one operation, and the black line shows the average over 50 repetitions. The maximum value of the x-axis is determined by the maximum number of evolutionary rounds among all these operations. The constant punishment intensity approach shows significant fluctuations in Figure 2c, which can be attributed to the extremely unstable region where . Specifically, the group exhibits either all cooperation or all defection when k is in this region. Once the sanctioning institution gets feedback from the group that they are currently all cooperative or all defective, the institution narrows the searching interval accordingly. Since the group is in neither the completely uncooperative region nor the completely cooperative region, the updated searching interval is likely to incorrectly exclude the optimal point, and the search then needs to be restarted. This is why the whole process fluctuates so much. These reversals and restarts mean that “actions” from the sanctioning institution have no clear direction and often change dramatically.
For the negative feedback punishment approach, the punishment intentions rapidly converge to about and are then fine-tuned. Despite the long process of fine-tuning, the proportion of cooperation around the intention of is stable and close to 1, as can be seen from Figure 2b. Compared to the searching processes in Figure 2c, sanctioning institutions operate with less volatility, and the group shows a higher percentage of cooperation under the negative feedback punishment approach in Figure 2d. Thus, people under the negative feedback punishment approach can perceive a purposeful and reliable sanctioning institution, and the group performance remains harmonious and stable over time.
The comparison of the sanctioning institution’s operation is performed in two dimensions: time and cost. Time refers to the mean number of evolutionary rounds from the beginning of the search to the end. Cost means the money or resources people lose during the search. Since people’s penalties depend on the punishment intensity, we use the average of the cumulative punishment intensity over 50 repetitions as a proxy for cost. The time and money losses during operations are presented in Figure 2e,f. Dark gray bars correspond to the constant punishment intensity approach, and light gray bars represent the negative feedback punishment approach. Groups under the negative feedback punishment approach spend significantly less time and money than groups under the constant punishment intensity approach. This demonstrates that the negative feedback punishment approach is a time-saving and low-cost method for sanctioning institutions compared to the constant punishment intensity approach.
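The two measures defined above can be computed as follows. This is a minimal sketch that assumes each repetition is recorded as a list holding one punishment intensity per evolutionary round; the function name and data layout are hypothetical.

```python
from statistics import mean


def time_and_cost(runs):
    """Summarise repeated operations of the institution.

    runs: list of repetitions; each repetition is a list with the punishment
          intensity applied in every evolutionary round (assumed layout).
    Returns (mean number of rounds, mean cumulative punishment intensity).
    """
    time = mean([len(run) for run in runs])   # rounds from start to end of search
    cost = mean([sum(run) for run in runs])   # cumulative intensity as cost proxy
    return time, cost


if __name__ == "__main__":
    runs = [[0.5, 0.25], [0.25, 0.25, 0.25, 0.25]]
    print(time_and_cost(runs))
```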
In addition to the constant punishment intensity approach, we also compare the time losses and monetary losses of several other common punishment approaches, including punishing within a certain percentage range, penalizing within a certain amount range and punishing with a constant amount. Punishing within a certain percentage range means that the punishment intensity varies within 98–102% of the punishment intention k. Punishing by amount means that the punishment intensity solely determines the amount of the fine, rather than the punishment rate. When the sanctioning institution penalizes within a certain amount range, the punishment intensity . When punishing with a constant amount, . The operations of the sanctioning institution under these punishment methods are compared in Figure 3. Error bars indicate standard errors. From left to right, the punishment methods are: punishing within a certain percentage range, penalizing within a certain amount range, punishing with a constant amount and the negative feedback punishment approach. We can see that the negative feedback punishment approach shows advantages in terms of both time and money compared to other punishment methods.
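Of the alternatives listed above, only the percentage-range rule is fully specified in this text, so it is the only one sketched here. The snippet assumes the intensity is drawn uniformly from 98–102% of k; the uniform distribution and the function name are assumptions, as the text states only the range.

```python
import random


def percentage_range_intensity(k, rng, low=0.98, high=1.02):
    """Punishment intensity drawn from 98-102% of the intention k.

    rng: a random.Random instance; uniform sampling is an assumption,
         since the source gives only the range.
    """
    return k * rng.uniform(low, high)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the illustration is repeatable
    print([round(percentage_range_intensity(0.5, rng), 4) for _ in range(3)])
```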
3.3. Theoretical Analysis
A theoretical analysis based on replicator dynamics was performed to reveal why the two methods, the negative feedback punishment approach and the constant punishment intensity approach, produce the above results. There are two strategies in the group: cooperation and defection. We analyzed the evolutionarily stable strategies of the group through the expected utility of these two strategies.
The proportion of cooperation in the group is denoted here by x. For a cooperator, the expected utility is , and the expected utility for a defector is . Then, the average utility of an individual is . The replicator dynamics of the group is . The evolutionarily stable strategies correspond to x that satisfies and .
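For orientation, the textbook form of two-strategy replicator dynamics is written out here. The paper's own utility expressions are not reproduced in this text, so the symbols $U_C$, $U_D$, $\bar{U}$ and $F$ are notation assumed for this sketch and are not taken from the authors' formulas.

```latex
\begin{align}
\bar{U} &= x\,U_C + (1 - x)\,U_D, \\
F(x) := \dot{x} &= x\,\bigl(U_C - \bar{U}\bigr) = x\,(1 - x)\,\bigl(U_C - U_D\bigr), \\
x^{\ast} \text{ is evolutionarily stable} &\iff F(x^{\ast}) = 0 \ \text{and}\ F'(x^{\ast}) < 0 .
\end{align}
```

In this form, $x = 0$ and $x = 1$ are always rest points, and any interior rest point must satisfy $U_C = U_D$; whether a rest point is stable depends on the sign of $F'$ there.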
For the method with constant punishment intensity, the punishment intensity t is a constant and is independent of x. There are three possible equilibrium points that satisfy : , and . For all these three points, let us analyze the conditions for being evolutionarily stable strategies:
For , when , so is a stable equilibrium point when .
For , when , so is a stable equilibrium point when .
For , it is impossible to satisfy both and in any simulation settings. Accordingly, is an unstable equilibrium point in the constant punishment intensity approach.
and correspond to the completely uncooperative and completely cooperative regions in Figure 2a, respectively. Conditions satisfying eventually evolve to either or , which reveals the reason for the existence of the extremely unstable region.
For the negative feedback punishment approach, has a direct impact on , and this is the biggest difference between these two punishment methods. With in this case, there are also three possible equilibrium points that satisfy : , and . here refers to all points satisfying , where . All these three points are analyzed below:
For , when , so is a stable equilibrium point when . In our simulation setup, this means that . Thus, corresponds to the part in Figure 2b where the cooperation proportion equals 0.
For , when , so is a stable equilibrium point when . In our simulation corresponds to the part in Figure 2b where the cooperation proportion equals 1.
For , all the x satisfying and are equilibrium points. The conditions of the equilibrium point are related to the punishment intensity function . In our simulation, as shown in Figure 4a, for any given , there always exists one such that holds. For that satisfies , Figure 4b depicts the points meeting . Thus, along with the corresponding k, any can be an equilibrium point. These equilibrium points between 0 and 1 correspond to the part where the group’s cooperation proportion is between 0 and 1 in Figure 2b.
Comparing the two methods, the reason can be the ESS in the negative feedback punishment approach is that adds a constraint between punishment intensity and group performance. Specifically, is a monotonic function. As shown in Figure 4a, there is only one k such that for every x, which means the evolutionarily stable point x corresponds to one-to-one. Thus, we can say that negative correlation makes the negative feedback punishment approach more stable, compared to the constant punishment intensity approach.
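The contrast between the two methods can be illustrated numerically with a toy model. Everything in the snippet is an assumption made for illustration and not the utilities used in the paper: the payoff gap between cooperating and defecting is taken to be the punishment intensity minus a fixed `cost`, the negative feedback rule is taken to be linear, `k * (1 - x)`, and the replicator equation is integrated with a plain Euler step.

```python
def replicator_endpoint(intensity, x0=0.5, cost=0.3, dt=0.01, steps=20000):
    """Integrate dx/dt = x (1 - x) (U_C - U_D) with a simple Euler scheme.

    intensity: callable giving the punishment intensity at cooperation level x.
    cost: hypothetical payoff advantage of defecting when unpunished, so that
          U_C - U_D = intensity(x) - cost in this toy model.
    """
    x = x0
    for _ in range(steps):
        gap = intensity(x) - cost
        x += dt * x * (1.0 - x) * gap
        x = min(max(x, 0.0), 1.0)   # keep the proportion inside [0, 1]
    return x


if __name__ == "__main__":
    # Constant intensity: the outcome jumps from all-defect to all-cooperate.
    for t in (0.2, 0.4):
        print("constant", t, round(replicator_endpoint(lambda x, t=t: t), 3))
    # Linear negative feedback: the outcome moves smoothly with k.
    for k in (0.6, 0.9):
        end = replicator_endpoint(lambda x, k=k: k * (1.0 - x), x0=0.1)
        print("feedback", k, round(end, 3))
```

In this toy model the constant rule has no stable interior point, so the group ends at 0 or 1, whereas the feedback rule settles at x = 1 - cost/k whenever k exceeds cost, giving each such k exactly one stable cooperation level. This mirrors the one-to-one correspondence discussed above without claiming to reproduce the paper's equilibrium values.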
Since the negative feedback punishment approach performs well, it is natural to ask whether a positive feedback punishment approach also works. The positive feedback punishment approach means that the punishment intensity is positively correlated with the group performance. The equilibrium point under the negative feedback punishment approach implies that each x can be a stable point when , so the cooperation proportion changes continuously from 0 to 1 and remains stable. We test whether is an ESS under the positive feedback punishment approach to reveal the performance of this approach. It turns out that there is no stable equilibrium point when the cooperation proportion is between 0 and 1 under the positive feedback punishment approach. Therefore, the theoretical analysis suggests that group performance under the positive feedback punishment approach may also be unstable, just like the group performance under the constant punishment intensity approach. A detailed simulation comparison between the positive and negative feedback punishment approaches is shown in Appendix A.3. Overall, group performance with the negative feedback punishment approach is better than under both the constant punishment intensity approach and the positive feedback punishment approach.
5. Discussion
It is obvious that a time-saving and low-cost punishment method can help people reduce their losses. What, then, are the benefits of stability? For an enforcement agency, unstable group performance can easily lead to misjudgment of the current sanction, which in turn leads to misformulation of the next sanction. Being inaccurately sanctioned from time to time can be a disaster for the community. If people are sanctioned harshly one moment and punished lightly the next, the agency appears unpredictable and fickle, and people lose trust in it. This is how the group feels under the constant punishment intensity approach. In contrast, if the group performance is stable, as it is under the negative feedback punishment approach, the sanctioning institution is able to quickly narrow the range of intentions around a certain value and then fine-tune it. During the long fine-tuning process, the percentage of group cooperation remains high. From people’s point of view, the group becomes increasingly cooperative, and the sanctioning agency is competent and reliable. The society is then positive and harmonious. Thus, the stability of the punishment method is critical for both the sanctioning institution and the group.
In addition to the practical inspiration for sanctioning institutions to achieve stable, time-saving and low-cost performance, the negative feedback punishment approach also has theoretical implications for the further study of punishment methods. Many common punishment methods act as open-loop controls in which group performance affects only the punishment intention and not the punishment method. In contrast, in the negative feedback punishment approach, the cooperation proportion of the group has a direct impact on both the punishment intention and the punishment method. In this case, the sanctioning institution, the punishment method and the group together form a system. Stability, time loss and monetary cost then become the system performance indicators, and real-life limitations such as jurisdiction and law serve as institutional constraints. Good performance by the institution requires a high level of cooperation at low cost during the whole operation, and helping sanctioning institutions achieve good performance can be understood as an optimal control problem. Developing a good punishment method then becomes a matter of finding the control strategy that optimizes the system performance indicators under the given constraints.
Although we designed the negative feedback punishment approach for sanctioning institutions, its deployment is not limited to enforcement agencies that impose sanctions. A variety of management departments can also apply negative feedback methods. For example, HR departments typically use employee performance as the “input” to employee evaluations. On this basis, we recommend using employee performance to guide the development of the evaluation rules themselves, just as the negative feedback punishment approach uses group performance to set the punishment intensity. Furthermore, the “input” of various policies in real life is mainly the performance of individuals or groups, yet many policies still fail to achieve the desired results. Inspired by the negative feedback punishment approach, in addition to using group performance as a policy input, we should also consider applying performance to policy development. Many institutions already use their perceptions of system performance as a basis for intervening in the system. The negative feedback punishment approach, however, offers a new way of using system performance in policy making, so that policies can be more efficient and achieve good results.