1. Introduction
Identifying cycles in games is easy, but it is not easy to analyse the qualitative behaviour of these cycling structures. It is a priori not clear, if these systems lead to convergence to an interior fixed point, to an periodic attractor or to chaotic behaviour. For example even the structure of the simple RSP game can either lead to convergence to a Nash equilibrium or to convergence to a periodic orbit for the best response dynamics. In [
1] the RSP game was analysed—besides some other low dimensional games—for the best response dynamics and the results were compared to results for the replicator equation. A strong connection was found between the limit set of time averages of the orbits for the replicator equation and the
limits of the best response dynamics. A later paper [
2] shows that this connection was not specific for these games but is true in a more general sense. It is shown in this paper that the limit set of the time averages (of orbits starting in the interior) for the replicator equation is a subset of the maximal invariant set for the best response dynamics. Cycles for low dimensional games for the replicator equation are for example thoroughly analysed in [
3,
4]. In these papers conditions are given under which so called heteroclinic cycles are attracting or repelling. Permanence is an important property in evolutionary game theory and the existence of an attracting heteroclinic always excludes the possibility that a system is permanent. Permanent means—biologically speaking—that all species are safe from going extinct. We will show equivalence between permanence for the replicator equation and the non existence of Shapley polygons for the best response for so called monocyclic payoff matrices.
We will also analyse games with embedded RSP cycles for the best response dynamics and give a classification as complete as possible for them. A comparison of the results for the best response dynamics to those for the replicator dynamics is made with respect to [
2] where the strong connection between the time average for the replicator equation and the invariant sets for the best response dynamics is shown. For the analysis of the best response dynamics we construct a two dimensional return map. Surprisingly this return map is very similar to the return map for the analysis of the replicator equation in [
3], to be more precise the transition matrices are identical. These transition matrices play an important role in the analysis and as they are identical we get the equivalence of the existence of Shapley polygons for the best response dynamics and the existence of a relatively asymptotically stable heteroclinic cycle
1. In this paper some counterexamples are given for some conjectures that might be drawn from the analysis of RSP game for the best response dynamics. It is shown that the invariant set
for the RSP does not have to be invariant in
games. It is also shown that
is not always a good Ljapunov function. As a ‘side product’we get that for this class of games no chaotic behaviour can occur.
The paper is structured as follows: It starts with some general assumptions on the payoff matrix. The main results of this paper and their discussion with respect to earlier results for the replicator equation can be found in
Section 3. In
Section 4 examples are provided for different asymptotical behaviour for these kinds of games. The remaining sections contain the construction and analysis of the return map.
2. Preliminaries and Assumptions on the Payoff Matrix A
In this work we will deal with finite, symmetric 2 person normal form games. The player’s payoffs are summarised in the (
) matrix
A where
describes the payoff of strategy
i against strategy
j.
The expression
always describes the payoff matrix for a game restricted to the strategies contained in the index set
I. (If there is no risk of confusion the index set
I will not be written down explicitly.)
We use the following notation in this work
which represents the
-dimensional simplex. Vectors are written in bold letters e.g
. The
i-th vertex of the simplex is denoted with
. The standard scalar product of two vectors
and
is written in the following way
Lastly we define the set
by
If there is no risk of confusion, we will call the vertex
sloppily
i. In this paper we will analyse games for the best response (or best reply—used equivalently) dynamics (see [
5]), which is of the following form
where
stands for the set of best responses to a given state
. As mentioned we will compare the results of the qualitative behaviour to the behaviour of the replicator equation [
6].
We define the set
, which is the set of all
for which
i is the
unique best reply against
. More generally, we define
with
, the set of all
for which all pure strategies in
K are a best response against
and there are no other pure best responses,
Definition 1 In this paper we call a game generic
, ifhold. Following [
6] and [
7] we state the following lemma
Lemma 1 Letand let the payoff matrix be normalised in the way that holds for all then V satisfies for all and hence is strictly decreasing along the piecewise linear solutions to Equation (
1)
inside each .
Moreover, if is in for almost all we get as .
In other words the orbits converge to the set , which is defined as Definition 2 A periodic orbit Γ under the best response dynamic is called a Shapley polygon.
Let us now introduce better and best reply cycles.
Definition 3 If for all and hold, we call this a best response arc
2 from to and write symbolically . This simply means that j is the unique best reply to i and j is a better reply against itself than i. This gives rise to a directed graph for all best response arcs.
Definition 4 We call a cycle in this directed graph a best response cycle.
For example if the graph is , then the best response cycle connects and . Clearly, a game can have more than one best response cycle, but these cycles are disjoint. Note that neither a best response cycle nor a best response arc have to exist.
In analogy we define a better reply arc from to , if and hold and write symbolically . Again we get a directed graph and cycles in this graph are called better reply cycles. Note that better reply cycles do not have to be disjoint and every best reply cycle is also a better reply cycle. If there is no risk of confusion we will write only the best response (or reply) cycle 1234 instead of . We also write the better reply cycle 1234 instead of .
In this paper we are especially interested in games with a full better reply cycle, which means a cycle exists using all pure strategies. This also means that no pure strategy can become a Nash equilibrium. Despite this cycling behaviour, there is no need for a periodic orbit to exist for example because Nash equilibria can attract all orbits (see for example
Section 7). On the one hand it will turn out that the number of Nash equilibria is very hard to predict (see
Section 4); on the other hand it will turn out that for certain classes of payoff matrices only solutions for a set of measure 0 of starting values converge to a Nash equilibrium.
Throughout this paper we consider the following payoff matrix
A
and assume the following for all
i for the payoff matrix
A, unless stated otherwise
This notation is from [
3], in which the stability of so called heteroclinic cycles for the replicator dynamics was analysed. A heteroclinic cycle corresponds to a better reply cycle. The
s stand for the expanding eigenvalues in direction of the heteroclinic cycle, the
s correspond to the contracting eigenvalues along the heteroclinic cycle and the
s are the eigenvalues transverse to the heteroclinic cycle. Note that we can only speak of eigenvalues with respect to the replicator equation
3.
Remark The assumptions (
9) and (10) guarantee the existence of a heteroclinic cycle for the replicator equation. A heteroclinic cycle is a union of orbits
(together with their
α and
ω limits, which are fixed points for the system), for which the
ω-limit of
is the
α-limit of
(with
taken modulo
n). The stability of such heteroclinic cycles has thoroughly been studied in [
3,
4].
Remark (
9) and (10) assure that a better reply cycle
exists for
A. Whereas (11) assures that no restricted two strategy game has a stable interior Nash equilibrium. This is easy to see: If both
and
are greater than zero, we get the following structure in the corresponding matrix of the restricted game
which means that the interior Nash equilibrium of this restricted game is attracting. This is the only way to get a stable interior equilibrium. If we have two opposite signs, there is no interior equilibrium and if there are two minus then the interior equilibrium is repelling.
The assumption in (11) seems to be somehow arbitrary, but we want to exclude a kind of forced movement
4 as example 1 should explain.
Example 1 We take the following payoff matrix ANote that our assumptions (9)-(10) hold for this matrix, but (11) is violated, because all the are positive. This example also shows that is not always a good Ljapunov function. This matrix has a unique Nash equilibriumBut the minimum of is not attained at the equilibrium , but at the point .
Its value there iswhereas its value at the Nash equilibrium isAlong the line segment s, which connects and the pure strategies 2, 3 and 4 are best responses.
Figure 1.
Sufficiently close to the line segment s, the dynamics is similar to the dynamics for the restricted game . Every orbit hits the set , shown by the arrow in the figure. Inside this set there is a forced movement towards the unique Nash equilibrium of the restricted game.
Figure 1.
Sufficiently close to the line segment s, the dynamics is similar to the dynamics for the restricted game . Every orbit hits the set , shown by the arrow in the figure. Inside this set there is a forced movement towards the unique Nash equilibrium of the restricted game.
Therefore if we start in this segment the solution heads towards the unique Nash equilibrium of the restricted game and is reached in finite time (Along such an orbit does not decrease!). Moreover there is a two dimensional manifold, in which 2 and 4 are the best responses, hence every orbit inside this set tends towards the Nash equilibrium of the restricted game. After finite time s is reached and is reached in finite time again. The movement inside is forced in the way that orbits close by want to cross from both sides. Thus the resulting movement is towards . We can describe the dynamics for the game as follows. Some orbits that head for pure strategies cycle towards and reach it in finite time. All other orbits (especially those , with converge via s to in finite time.
3. Main Results
In this section the main results of this paper are summarised. The following lemma is a central part for the analysis. It shows the form the return map which plays a crucial role in the analysis of the best response dynamics. One big advantage is that this map is of lower dimension than the game. Due to the lower dimension the map is easier to analyse, but no information on the global dynamics is lost. This is in contrast to the analysis of the replicator equation in [
3] where the cross sections for the return map are placed near fixed points (the vertices) via a linearisation. These fixed points cannot be Nash equilibria of the game and hence no information on the behaviour close to Nash equilibria can be achieved directly by this method. This is especially true with respect to the interior Nash equilibrium. Whereas for the best response dynamics these cross-sections for the return map are given more naturally and no linearisation is needed. Almost all orbits for the best response dynamics cross a transition face and so this allows us for the best response dynamics to give results for almost all orbits. Moreover for some cases we are also able to give some general results about the existence and stability of Nash equilibria. Especially for the interior Nash equilibrium
, if it exists, we can give very precise results about its stability. Clearly if we can state a result on the existence of a Nash equilibrium for the best response dynamics it automatically applies for the replicator equation. The focus in [
3,
4] was on the study of the heteroclinic cycle and hence no results on the Nash equilibria were given.
Lemma 2 If there is an orbit from to the map from to can be written in the form (after an appropriate parametrisation):where is a matrix. This is a central projection and the center of the projection (see Section 6.3) is the best response vertex. If there is an orbit from to following the full cycle the map from to can be written in the form (after an appropriate parametrisation):where P is a matrix. Surprisingly the transition matrices for the stability analysis of the replicator equation in [
3] are identical to the transition matrices found for the best response dynamics (after a careful choice of variables). Some technical difficulties arise in the analysis of the best response dynamics. This results from the placement of the cross-sections and the fact that the return map Π is a fractional linear map and hence a priori the map is not defined for all
. For the replicator equation there must exist an orbit from one cross-section to the next, if only the neighbourhood is chosen small enough. This follows directly from the form of the differential equation:
. (In fact there are some other technical difficulties for the replicator equation, regarding the mapping from one cross-section to the next. These problems concern the shape of the image of the domain, which plays an important role in the analysis.) For the best response dynamics it is not clear if there is always an orbit from one cross-section to the next. Preliminarily this can only be guaranteed if an interior Nash equilibrium exists (because in that case no strategy is dominated and so each pure strategy is used for some
), but as the examples in
Section 4 below show it is not always true that an interior Nash equilibrium
exists. Thus we have to show that no strategy is dominated to prove that there is in fact always an orbit from one cross-section to the next. The problem with the denominator of the map can be solved by using a compactification of
by introducing the real projective plane
(in short notation
). This is a detour, which was not necessary for the analysis of the replicator equation. But the gain for the best response is that we get more global results. To be more precise in [
3,
4] all results are local results. In contrast all statements found in this work apply at least to all orbits following a specific cycle. This means in a lot cases a result about almost all orbits.
Let us now turn to the concrete results on the best response dynamics. We start with a result on so called
monocyclic matrices. The terminology follows [
7] and it describes a payoff matrix for which all
s are negative and hence the matrix only contains one better reply cycle.
where
and
for
. In [
10] is shown that for games with monocyclic payoff matrix and an interior Nash equilibrium
all orbits converge to the interior Shapley polygon if
holds. We used a different approach for the return map as in [
10] and get more information on the global dynamics. Proposition 16 below gives a complete classification of the attractors for monocyclic matrices and hence a more precise result on the existence of a Shapley polygon. Additionally we get conditions for the stability of the interior Nash equilibrium. In [
6] a list of equivalences for monocyclic games is given. With the help of the return map and some considerations on the index of the Nash equilibria we can completely classify
monocyclic games see
Section 7 below. The only parameters needed for the classification are
,
and
.
A behaviour can be observed similar to a supercritical Hopf bifurcation for monocyclic payoff matrices. In case is smaller than 0 there exists a Shapley polygon. In case holds this polygon shrank to the point . We will call this a degenerated Shapley polygon. In case no Shapley polygon exists.
Theorem 3 Let A be a monocyclic payoff matrix. Then the following statements are equivalent- (i)
There is no (possibly degenerated) Shapley polygon for the best response dynamics.
- (ii)
All orbits converge in finite time to the interior Nash equilibrium .
- (iii)
The system is permanent for the replicator equation.
This result shows the strong connection between the replicator equation and the best response dynamics and it follows immediately from [
2] that at least the time average of every interior orbit for the replicator equation converges to the interior Nash equilibrium. So by giving a classification for the best response dynamics for monocyclic games we automatically improve some results for the replicator equation.
The next two theorems give some results on payoff matrices with one additionally embedded RSP cycle.
Remark There are some recurring terms in the following results and throughout the analysis in the sections below (mainly in Lemma 11). To enhance readability we summarise these terms here.
and
can be interpreted as the stability criteria for the Shapley triangle to exist in the full game. If the restricted game
has a Shapley triangle then
assures that it still exists in the full game. The same holds for
and the 234 restricted game. The expression
is equivalent to the requirement that the return matrix (13) has a real eigenvalue.
Theorem 4 Let A be a payoff matrix as in (8) with and .
Then the game has an interior Shapley polygon, which is locally attracting if or and one of the following conditions hold.- (i)
- (ii)
For (ii) there are two interior Shapley polygons. One is locally attracting and one is a saddle. To be more precise every orbit following the 1234 is either attracted by the attracting Shapley polygon or repelled (except a two dimensional manifold) by the saddle-type Shapley polygon to follow a different cycle after some time. This is exactly the same result as in [
3] for the existence of relatively asymptotically stable heteroclinic cycle following the cycle 1234. Again note that that this result is not simply local. It is a result for all orbits following the 1234 cycle at least once. The next theorem shows that the return map for the best response dynamics is again more ‘powerful’than for the replicator equation and gives a result on the existence of the interior Nash equilibrium and its stability.
Theorem 5 Let A be a matrix as in (8) and let and hold. If holds and- (a)
holds, then an attracting Shapley polygon following the 123 cycle exists.
- (b)
holds and no interior Shapley polygon exists (see theorem 4) then an asymptotically stable interior Nash equilibrium exists.
Theorem 4 together with Theorem 5 also give conditions under which an attracting Shapley triangle, an attracting Shapley polygon in the interior and a second interior Shapley polygon, which is a saddle, exist. The theorem above also provides a connection between the existence of an attracting Shapley polygon and the stability of the interior Nash equilibrium. And hence it draws a connection to the existence of heteroclinic cycles and the interior Nash equilibrium, which was not done in [
3] and [
4].
The next theorem presents some statements on games with two embedded RSP cycles.
Theorem 6 Let A be a matrix as in (8) and let and and let and hold. Then if- (i)
hold then an interior Shapley polygon exists, which attracts almost all orbits.
- (ii)
hold then an interior Nash equilibrium exists, which is globally asymptotically stable.
Note that the results from Theorem 5 can be also applied so with two embedded RSP cycles it also possible to construct examples with two Shapley polygons on the boundary. One following the 123 cycle and one following the 234 cycle. It is important to note that the return map per se only allows a statement that the interior Nash equilibrium
is relatively asymptotically stable (for a definition see [
3]) with respect to the set of orbits that follow the 1234 cycle. But on one hand we know for the best response dynamics (especially when no orbit can follow a cycle different to 1234 infinitely many times) that only a set of measure 0 does not converge to the interior Nash equilibrium and on the other hand in cases mentioned above we were able with some additional considerations to show that no other Nash equilibrium can exist and hence the interior equilibrium is globally asymptotically stable.
5. General Properties of the Dynamics
The following lemma gives more insight in the solutions of the game with payoff matrix A and the assumptions made in the beginning.
Lemma 7 Let A be a payoff matrix of the form (8) with the assumptions (9)–(12). Let with then there exists a such that the solution is unique for and holds for almost all . For the solution is no longer unique or there is an interval such that the orbit for .
If holds, then either an interior Nash equilibrium is reached in finite time or converges for to a Nash equilibrium on the boundary. If holds, then as .
Remark Note that assumption (11) is needed for this lemma, which excludes forced movement. In general we cannot predict the behaviour of the orbit after the interior Nash equilibrium is reached via a forced movement. If is the unique equilibrium the orbit cannot leave , but if the game has several Nash equilibria, then uniqueness is lost and it is possible to continue the path towards each equilibrium at any time. If, for example, exists the orbit can move towards and then can move towards strategy 1 or 3 at any time. In this case the orbit reenters the set .
Proof means that
which means that the orbit has to move towards
and hence can be written as
for
t small enough. If we look at (after a change of time
)
we can calculate, which strategy becomes the next best response.
is simply the
i-th column of the matrix
. We know by (9) and (10) that
,
and
. Now there are two possibilities for
, which corresponds to
. It can be greater or smaller than zero.
:
In this case it is obvious that only the payoff of the
-st strategy is increasing and for some
which means that
or in other words the best response is not unique at this moment. To find out the possible solutions we have to look at the restricted game in which only these strategies are used, which are best replies at the moment. The orbit can move in the direction of any Nash equilibrium of this restricted game
.
This restricted game has only one Nash equilibrium (because of (9) and (10)). Thus the path can only be continued in one way, so enters , which means that the solution stays unique.
:
By (11) we know that in this case
must hold. Now three strategies can become best replies which means that we have three possibilities
- −
as before for .
- −
as before enters for .
- −
We have to look at the restricted 3×3 game and check for all possible Nash equilibria. The matrix has the following sign pattern.
so it is easy to see that this game has also only one Nash equilibrium (the pure strategy ). Hence the solution remains unique and enters .
To show the second part of the lemma we first assume that
then we know from Lemma 1 that
converges to the set
. Now assume that
. Using Lemma 8 (see below) we know that the orbit
ultimately follows only one better reply cycle. We distinguish between the following two cases.
The orbit follows the cycle 1234 for .
We show that in this case the interior Nash equilibrium is reached in finite time. Take the sequence of turning points (note that at these points at least two strategies have the same payoff), then the sequence converges to T. The map is continuous for . As the orbit follows the cycle 1234 along one cycle of the orbit the payoff of each strategy is equal to its successor, so because of continuity as the difference between and can be chosen arbitrarily small, the difference between the different payoffs gets also arbitrarily small. Thus for all payoffs must be equal. This means that all strategies have the same payoff after finite time. It is not possible to reach the boundary (from ) so the orbit must reach an interior Nash equilibrium in finite time. To make this verbal argument more precise take a sequence of such that holds and such that and so on. Each of these sequences converges to T and hence at T the payoffs for all strategies must be equal.
The orbit follows the cycle 123 (which we can choose without loss of generality). Note that this can only happen if the 123 better reply cycle exists, e.g.
and
. We can use the same argument as for the cycle 1234 that for
at least strategies 123 must have the same payoff. This means that at
the following holds
If all strategies have the same payoff a Nash equilibrium is reached, so we assume now that we reach a point , where only 123 are the best replies. To continue the orbit, we have to look at the restricted game . It follows from assumptions (9)-(11) that this is an RSP game. Thus if , then the orbit has reached a Nash equilibrium. This can only happen if the starting point of the orbit was already in the 123 face.
If then the orbit moves towards the unique Nash equilibrium of the RSP game. Along this movement the payoff of strategies 123 change at the same rate, only the payoff of strategy 4 changes differently. Therefore the orbit either hits an interior Nash equilibrium (if it exists) or converges to the unique Nash equilibrium of the RSP game (which is also a Nash equilibrium of the full game). This means we either get or moves towards .
Remark We can formulate the last statement in the proof above a bit stronger. Suppose T is finite and the orbit follows the 123 cycle and reaches at time T a point with payoff with , then the orbit reaches an interior Nash equilibrium if and only if the Nash equilibrium of the restricted RSP game is no Nash equilibrium of the full game. This follows directly from the fact that the payoff of the fourth strategy changes differently to the payoff of the strategies 123.
Lemma 8 An orbit can switch between better reply cycles at most once.
Proof We show this without loss of generality for the 123 and 1234 cycles. First note that for the possibility to switch between these two better reply cycles, there must be some orbits, which follow the 123 cycle and some orbits which follow the 1234 cycle. Now let be an orbit, which follows the 123 (at least once). Hence this orbit must cross the sets , and . Now let be an orbit which follows 1234. This orbit must cross the sets , , and . If we start in then 31 and 34 can become best replies. Thus there is a subset of which is mapped to , which is a one dimensional set by our assumption that A is generic. Hence the preimage of on is a line segment s. The orbits starting in s and moving to span a plane P in , which separates the orbits that follow the 123 cycle and the orbits that follow 1234. Because of the uniqueness of solutions in each this plane P can only be crossed in one way and hence switching between cycles is only possible once.
An immediate consequence of this lemma is
Corollary 9 For a matrix fulfilling (8) no chaotic behaviour can occur.
The results for the RSP game with respect to possibly gives rise to the idea that the behaviour is identical for a payoff matrix in the sense that if is not empty it is already an attractor. The following example 6 shows that the set in general neither has to be attracting nor invariant. It is possible to enter or leave this set in finite time.
Example 6 We take the following payoff matrixThis payoff matrix has a unique Nash equilibrium withbut the minimum of is attained at with .
The results from Theorem 6 above show that almost all orbits converge to the interior Nash equilibrium.
Figure 2.
The set is shown by the green pyramid. The red orbit connects , the top of the pyramid, the Nash equilibrium and the target point for the line segment spanned s by and .
Figure 2.
The set is shown by the green pyramid. The red orbit connects , the top of the pyramid, the Nash equilibrium and the target point for the line segment spanned s by and .
But besides the result from this theorem we can give a complete description of this game. The set is a pyramid with top . At this point the payoff vector is of the form where . This means the trajectory starting at continues towards the unique Nash equilibrium of the restricted 234 game . (Note that is not a Nash equilibrium of the full game!) Inside the pyramid holds and outside the pyramid holds. So every orbit that starts inside the pyramid reaches the line segment s (spanned by and ) in finite time, where 234 are the best replies. (This means the orbit reaches a point, where its payoff vector is of the form , with ). In s the orbit moves towards , passes and reaches in finite time. Close to s the dynamics is the same as for an RSP game with attracting Nash equilibrium.
Every orbit that starts in spirals (following the cycle 234) towards and reaches it in finite time, but after an infinite number of turning points! Then again the orbit reaches in finite time. Orbits outside the pyramid behave similar to the orbits inside, they spiral towards s reach it in finite time and thus reach the Nash equilibrium in finite time. In this example every orbit reaches in finite time. Therefore is globally asymptotically stable.
6. Construction and Properties of the Return Map
The previous example showed that the analysis of the set
is not sufficient to analyse
games. So another tool is needed. We will introduce the concept of Poincaré or –also called– return maps in this section. We just give a rough concept, which should be made intuitively clear by
figure 3 below. For more detailed information and all the technical assumptions needed see for example [
6]. To create a return map we simply take a cross-section and a starting point
in this cross-section. We now follow the orbit
until it returns to the cross section for the first time. We denote this new point in the cross section by
. The map
is called the return map. This map can be used to find periodic orbits, fixed points and to check their stability.
Figure 3.
A simplified drawing of the kind of return map we will use below.
Figure 3.
A simplified drawing of the kind of return map we will use below.
We will use a set
as the cross-section and hence this set is the domain for our return map (which will be defined later). Since solutions are piecewise linear it is not possible to calculate the return map at once. Instead of doing so we have to calculate four transitions maps and glue them together to obtain a return map. First we have to find a proper parametrisation of our four cross-sections. (These cross-sections are simply given by
,
,
and
which correspond to the cycle given by the structure of the matrix.) It turns out that the following parametrisation of the set
is very useful
We get
. Without loss of generality we assume that
(if
then
is an interior Nash equilibrium as all payoffs are equal there. This parametrisation can be interpreted as an incentive function, since it measures the relative success of strategy
i compared to all others.) Hence the domain for our special parametrisation is a convex subset of
or is empty, which means that no orbit follows the full cycle.
We calculate the transition from . We do this by calculating the time it takes a point from the inset to reach the outset .
Remark The terminology in- and outset is chosen since the set can be entered through and can be left through . Note that these need not be the only ways to enter or leave the set .
6.1. Proof of Lemma 2
Proof To improve the readability of the proof we prove this exemplarily for an orbit from
to
. So we start with a point
and parametrise it as in 16. Since 4 and 1 are the best replies the orbit moves towards 1. For
small enough the following holds:
We assume now that 2 becomes the next best reply. We have to solve
for (17). The solution can be easily calculated and is given by
where
represents the transition time from
to
. So we get the following mapping for
:
with
. Clearly we get
. As there are two zeros in this map it is possible to interpret this map as a two dimensional map, if we set
and
and
we obtain the following:
This map (20) is of the form:
It is easy to see that for a arbitrary map
from
to
we get
where
is called the transition matrix. This map is projective map and properties of such maps can be found in
Section 6.3 including that the composition of two projective maps is again a projective map (this follows directly from (31)). Glueing (via a composition) four transition maps together gives us the return map
from
to itself by
where
and
are given by
6.2. A Necessary Condition for Shapley Polygons
The important property of the return map Π is that its fixed points either correspond to an interior fixed point (Nash equilibrium) or to a periodic orbit (Shapley polygon) for the best response dynamics. More precisely, the origin
corresponds to an interior Nash equilibrium and every other fixed point in
to a Shapley polygon. To find a fixed point of Π we have to solve the equation
which is simply an eigenvalue problem
Note that the vector
in (16) is nonnegative. Hence for a full cycle a nonnegative vector must be mapped onto a nonnegative vector under the return map Π. If this is not possible no orbit follows the full cycle. The following corollary follows directly
Corollary 10 A Shapley polygon following the full cycle can only exist if the return matrix P in Equation (13) has a nonnegative eigenvector.
Proof A Shapley polygon corresponds to a fixed point for the return map Π in (13). This fixed point is an eigenvector of P by (25) and only nonnegative vectors can correspond to the set , where the return map initially started.
We state some general results for a certain class of two by two matrices
T from [
3]. In [
3] these are conditions that the replicator equation has a relatively asymptotically stable heteroclinic cycle following the full better reply cycle, whereas for the best response dynamics they are necessary conditions for an attracting interior Nash equilibrium or the existence of an interior Shapley polygon.
Lemma 11 Let ,
,
,
and and let P as in (13) and assume that for all and that .
Then P has a positive eigenvalue with a corresponding positive eigenvector if and only if one of the following statements holdwhere is the determinant of the matrix we obtain by omitting the ith row and the jth column in A. Additionally for the eigenvalue holds iff 6.3. Projective Geometry and Fractional Linear Maps
The next step is to analyse the behaviour if the iteration of the return map. This can lead to technical difficulties because (22) cannot be defined for all
. So before we start with the analysis of the concrete return map as described in
Section 2 we give some general results on fractional linear maps of the form
where
,
with
. (We also implicitly assume that the matrix
P has no entry equal to zero. The results would not differ, but else the proofs would contain a lot of cases.)
Maps of this form are also called central projections or projective maps. We have already mentioned in
Section 6.2 that a nonnegative vector has to mapped onto a nonnegative vector. Consequently we have to find conditions under which an invariant set in
exists. If there is no such set no orbit can follow the full cycle infinitely many often hence no Shapley polygon in the interior can exist. We also know from
Section 6.2 the fixed points of (22) to study their stability we have to analyse
.
To do so we use projective geometry, for more details about projective geometry see [
11]. We use homogenous coordinates and get the following: If
then
. These points are called regular and this method creates a bijective mapping from the regular points in
to
. Points with
are called points at infinity and the set of all
is called the line at infinity. With this concept we can write the two dimensional fractional linear map Π from (30) in the following form
In this map
the real points which satisfy
(with
and
) are mapped to the line at infinity. We denote with
both the matrix and the map, but as the map is only a matrix times a vector, there is no problem in doing so. This map is a block triangular matrix and hence the product of two such matrices is again a block triangular matrix. This gives any easy proof that the composition of two central projections is again a central projection. Any line
with starting point
and direction
in
can be embedded in
in the following form
Here
l from (32) contains only regular points which exactly correspond to the points defined by
. A map through the origin in
is mapped under
to
To calculate which lines in
embedded in
are invariant, we simply have to calculate the eigenvectors of
P (not of
! But note that if we project the eigenvectors of
onto their second and third component, we get the eigenvectors of
P and the origin in
.). These eigenvectors
are given by (note that we assumed in the beginning of this section that the matrix has no zero entry and its determinant is also not zero!)
with
and their corresponding eigenvalues
chosen such that
. We are only interested in real eigenvalues and hence we assume for all following considerations
which means that one (and hence both) eigenvalue is real. Note that due to the block form of the matrix
its eigenvalues are given by 1 and the eigenvalues of
P. We also get from (35) that
P is similar to a diagonal matrix and as a consequence we get that
is similar to (a matrix we will also call
)
It is easy to see (by induction) that
where
and hence
for
, if
. This means if the absolute value of both eigenvalues of
P is smaller than or equal to 1, the iteration of the map Π in (30) leads to convergence to the origin. If we use that
where is the equivalent (or the regular point corresponding to ) of in and and is an arbitrary vector, but not in (this set corresponds to the eigenspace in ) and holds, we get that all vectors , which are not eigenvectors, converge to the eigenspace of the eigenvalue with greatest absolute value. (This can be seen easily by using the fact that is diagonalizable.) It follows from (35) that has 3 different eigenvalues and hence three different eigenvectors. These eigenvectors are regular points in and therefore corresponding to points in and hence the map Π has three fixed points if holds and each line connecting two of these fixed points is invariant under the map Π. More precisely, one eigenvector of corresponds to the origin in and hence two of these three invariant lines correspond to the eigenspaces and of P. The third invariant line connects the fixed points inside these eigenspaces and is generically disjoint from the origin. This third line corresponds to the set for the best response dynamics.
Proposition 12 Let and then all (except the two different eigenvectors of ) converge to the fixed point in in a monotone way. If all converge to the origin.
Proof Convergence properties follow from (38) and from (both eigenvalues are positive) follows and hence is orientation preserving and convergence is monotone.
Corollary 13 It is impossible for an attracting interior Nash equilibrium and an interior Shapley polygon to exist simultaneously.
Proof An interior Shapley polygon only exists if the dominating eigenvalue is greater than 1. Conversely is only attracting (with respect to the orbits that follow the 1234 cycle) if is smaller or equal 1.
6.4. The Domain of the Return Map
Definition 5 A strategy is called dominated
if there exists such thatfor all and inequality is strict for one j. If inequality is strict for all j then is called strictly dominated
. Finally we define the domain of the return map in the following way
where
A crucial point is to show that
D is not empty. To do this we use the proof of a theorem from [
6] (page 90, Theorem 8.3.2). We directly get the following
Lemma 14 Let if is contained in the ω-limit for (2) then k cannot be a dominated strategy.
The lemma above shows that if the replicator equation has a heteroclinic cycle following the cycle 1234,which attracts an interior orbit, then no strategy can be dominated. Hence, whenever
P has a positive eigenvector then
D is not empty! This follows directly by Lemma 11. In [
3] the conditions from Lemma 11 guarantee the existence of a relatively asymptotically heteroclinic cycle following the cycle 1234. Hence at least one interior orbit is attracted by this heteroclinic cycle.
6.5. Shapley Triangles in 4×4 Games
This section does not apply the methods of the return maps. We constructed the return map to analyse orbits that follow the full cycle. But as we are interested in games with embedded RSP cycle we have to find conditions under which these RSP cycles form have a Shapley triangle. It turns out that a direct approach is easier than an approach via return maps to find this conditions.
Definition 6 If the payoffs of all unused strategies along a Shapley polygon are negative, the Shapley polygon is called regular.
Proposition 15 Given a payoff matrix A as in (8) thenThe 123 cycle has an asymptotically stable regular Shapley polygon, iff all of the following conditions are satisfied- (i)
,
- (ii)
and
- (iii)
.
The 234 cycle has an asymptotically stable regular Shapley polygon, iff all of the following conditions are satisfied- (i)
,
- (ii)
and
- (iii)
.
Proof We take the following payoff matrix
This is a rock-scissors-paper game, if we take
. So we simply added a fourth arbitrary strategy to a typical RSP game. It is well known [
6] that a Shapley polygon exists for the RSP game iff
holds. We will call the
s the transversal incentives since they are the positive or negative incentives to leave the 123 surface.
We denote the vertices of the Shapley polygon by
and
. We need that
holds for
. Because 4 must not be the best reply to a vertex of the Shapley polygon as in this case the dynamics would leave the 123 surface and enter the interior of
. This leads to the following inequalities:
Clearly one of the
must be negative to fulfil these inequalities. We assume without loss of generality
holds. Note that these inequalities are trivially fulfilled if
holds for all
.