1. Introduction
With the intensification of market competition and the acceleration of technology updates, product life cycles are continually shrinking, leading to increased uncertainty in market demand. This makes it difficult for market participants, such as manufacturers and retailers, to predict and meet market demand accurately, so stockouts and excess inventory are common across various industries, including clothing, books, and magazines [1]. Stockouts have caused significant losses in the retail industry. For instance, the authors in [2] reported annual losses of USD 7–12 billion in the US supermarket retail industry due to stock shortages. Moreover, research conducted by Roland Berger in Beijing, Shanghai, and Shenzhen revealed that Chinese supermarkets experience, by a conservative estimate, a commodity shortage of around 10%, which amounts to an annual direct loss of USD 12 billion. A stockout rate of 20–30% has even become the norm in the retail industry [3]. Conversely, excess inventory is a prevalent issue in other retail sectors. In the first half of 2012, the total inventory of six Chinese sports brands, such as Li Ning and Anta, reached USD 800 million, while Youngor’s inventory had reached USD 4 billion by 2016. Consequently, it has become imperative to address the mismatch between supply and demand. Transshipment is one approach that many retailers consider, as it has yielded remarkable results. Transshipment, which is a form of inventory sharing, aligns with the principles of the sharing economy [4]. Previous studies have indicated that transshipment can effectively reduce inventory and improve service levels. The authors in [5] found that transshipment can reduce inventory costs by 15–20% and demand losses by 75%. Notably, this approach has been widely adopted in various industries, including clothing, automotive (e.g., Toyota, Volvo), publishing (e.g., Xinhua Group of China), household products, and healthcare. For instance, in the provincial and municipal companies affiliated with the State Grid of China, transshipment of electricity meters between power supply bureaus is allowed to ensure a response time within three days for residential users. Additionally, an inter-provincial transshipment mechanism for electricity meters was established in 2014 to better cope with natural disasters and accidents.
However, an underlying assumption commonly made in practice is that retailers exhibit perfect rationality. Factors such as information asymmetry and cognitive abilities can influence retailers’ ordering behavior, causing deviations from the optimal ordering quantities prescribed by traditional inventory transshipment models. A survey involving 54 inventory managers revealed that none of them relied solely on a purely theoretical approach when making ordering decisions [6]. Instead, 45 of them considered both traditional theory and behavioral factors. Therefore, it is essential to incorporate behavioral factors when investigating retailers’ ordering decisions. Motivated by these observations, our study aims to examine the transshipment or inventory-sharing problem through a behavioral lens. We address the following three questions: (i) Does a quantal response equilibrium (QRE) exist between retailers, considering behavioral factors? (ii) If such a QRE exists, what is the relationship between this QRE and the Nash equilibrium? (iii) How can we identify the QRE between retailers?
To address these research questions, we examine a system with transshipment involving two independent retailers, similar to the work conducted in [7], referred to hereafter as RKP. To investigate systematic ordering deviations, we incorporate bounded rationality into retailers’ ordering behaviors using the quantal response equilibrium (QRE). Building upon classical quantal choice theory and the theory of quantal response equilibria [8,9], we develop a bounded rationality decision model in this paper. Within this model, retailers’ ordering quantities are no longer deterministic but are treated as random variables, and ordering quantities leading to higher expected profits are chosen with higher probabilities. While bounded rationality can explain various phenomena such as cognitive limitations, psychological deviations, and heuristics, in this paper we focus on its role in explaining noisy decisions. We establish the existence of a QRE in the transshipment problem and identify the conditions for its uniqueness. Furthermore, we explore the relationship between the ordering quantity determined by the Nash equilibrium and the ordering quantity derived from the QRE. It is well known that the QRE model is typically solved numerically. In light of retailers’ learning effects, we propose an iterative algorithm to solve the QRE in the transshipment problem.
Our paper makes significant contributions to the fields of transshipment and behavioral operations management. This research is the first to investigate the quantal response equilibrium (QRE) between two independent retailers using analytical modeling. In addition to establishing the existence of QRE in their ordering decisions, we also determine the conditions for its uniqueness. Furthermore, we identify the condition under which QRE is equivalent to Nash equilibrium. To find the QRE between the two retailers, we develop an iterative algorithm. Our approach differs from the empirical perspective taken in [10], which primarily focused on how the rationality level of retailers indirectly influenced their ordering quantities through the transshipment price. Moreover, our research goes beyond the examination of the single newsvendor problem within a logit choice framework in [11]. Unlike [11], which solely considered the newsvendor problem without incorporating transshipment, our model allows for transshipment between the retailers, thereby adding complexity to the decision-making process of the newsvendors.
The remainder of this paper is organized as follows. Section 2 provides a brief literature review. The bounded rationality transshipment model is established in Section 3. In Section 4, in addition to proving the existence of the transshipment QRE and the condition that guarantees its uniqueness, we also discuss the relation between the optimal ordering strategy in RKP and the transshipment QRE ordering quantity. Based on retailers’ learning effects, the algorithm for computing the transshipment QRE is designed in Section 5. We present a numerical study in Section 6. Finally, we summarize our findings in Section 7.
3. Model
3.1. Traditional Transshipment Model
In this study, we analyze a decentralized system comprising a supplier and two independent retailers, denoted by $i$ and $j$ (where $i \neq j$), under the assumption of perfect rationality. At the onset of the selling season, retailer $i$ is faced with the task of determining an optimal non-negative ordering quantity, $Q_i$, to maximize expected profit, despite being unaware of the future demand, $D_i$. Throughout the selling season, retailer $i$ fulfills demand using its own inventory first. In the event of stockouts, retailer $i$ has the option to obtain products from retailer $j$, but only if retailer $j$ possesses excess inventory. Consequently, transshipment serves as a means to mitigate inventory costs and enhance service levels. Notably, both retailers must consider each other’s ordering behavior when making their own ordering decisions, necessitating trade-offs. Overshooting order quantities can result in elevated inventory costs and facilitate the opponent’s ability to meet demand through transshipment. Conversely, a lower order quantity diminishes service levels and provides the opponent with opportunities for additional profit through transshipment.
Let $c_i$, $r_i$, $s_i$, and $p_i$ denote retailer $i$’s per-unit cost, selling price, salvage value, and lost-sales penalty cost, respectively. We define $m_i$ as the marginal value of additional retail sales at retailer $i$. Let $c_{ij}$ denote the per-unit cost of transshipment from $i$ to $j$ and $\tau_{ij}$ denote the per-unit transportation cost of transshipment from $i$ to $j$, which is assumed to be incurred by retailer
. In this research, to avoid the triviality, we make the following assumptions:
,
,
,
,
. These assumptions guarantee that transshipment is not always beneficial, and that transshipment occurs only when one retailer has excess stock and, simultaneously, the other has excess demand. Once the random demand $D_i$ is realized (the demand distributions $F_i$ and $F_j$ are common knowledge and differentiable), transshipment occurs only when one retailer has surplus stock and the other has a stockout. We assume that the transshipment prices $c_{ij}$ and $c_{ji}$ are negotiated before the selling season and that, if the transshipment condition is met, the transshipment occurs automatically. We define $x^{+} = \max\{x, 0\}$; the units of inventory transferred from retailer $i$ to retailer $j$ can be written as $T_{ij} = \min\{(Q_i - D_i)^{+}, (D_j - Q_j)^{+}\}$. Then, let $R_i$, $U_i$, and $Z_i$ denote retailer $i$’s sales, unsold stock, and unmet demand, respectively. Given a certain pair of ordering quantities $(Q_i, Q_j)$, the expected profit of retailer $i$, $\pi_i(Q_i, Q_j)$, is given by Equation (1).
According to Equation (1) above and Rudi et al.’s proof method [7], there exists a unique Nash equilibrium $(Q_i^{*}, Q_j^{*})$. The logic behind the existence of the Nash equilibrium is that the orders of the two retailers are strategic substitutes. Neither retailer has an incentive to deviate from these Nash-equilibrium ordering quantities, because any deviation leads to a lower expected profit. Under the perfect-rationality assumption, retailers choose the Nash-equilibrium ordering quantities with certainty.
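The profit structure described in this subsection can be sketched in a few lines of Python. This is a minimal illustration, not the paper's Equation (1): it assumes symmetric retailers, demand that is uniform on a small integer grid, and that the sending retailer bears the transport cost. The parameter values and the names `PARAMS`, `transshipped`, `realized_profit`, `expected_profit`, and `best_response` are hypothetical.

```python
from itertools import product

# Illustrative parameters only (assumed, not taken from the paper):
# r = selling price, c = unit cost, s = salvage value, p = lost-sales penalty,
# c_t = transshipment price, tau = transport cost (assumed paid by the sender).
PARAMS = dict(r=10.0, c=6.0, s=2.0, p=0.0, c_t=7.0, tau=1.0)
DEMANDS = range(0, 11)  # demand assumed uniform on {0, ..., 10}


def transshipped(q_from, d_from, q_to, d_to):
    """Units moved from a retailer with surplus to a retailer with a stockout."""
    return min(max(q_from - d_from, 0), max(d_to - q_to, 0))


def realized_profit(q_i, q_j, d_i, d_j, r, c, s, p, c_t, tau):
    """Profit of retailer i for one realization of the two demands."""
    local_sales = min(d_i, q_i)
    t_out = transshipped(q_i, d_i, q_j, d_j)  # shipped from i to j
    t_in = transshipped(q_j, d_j, q_i, d_i)   # received by i from j
    unsold = q_i - local_sales - t_out
    unmet = d_i - local_sales - t_in
    return (r * (local_sales + t_in) - c * q_i + s * unsold - p * unmet
            + (c_t - tau) * t_out - c_t * t_in)


def expected_profit(q_i, q_j, demands=DEMANDS, **params):
    """Average profit of retailer i over independent, uniform demands."""
    pairs = list(product(demands, demands))
    total = sum(realized_profit(q_i, q_j, d_i, d_j, **params)
                for d_i, d_j in pairs)
    return total / len(pairs)


def best_response(q_j, orders=DEMANDS, **params):
    """Order quantity of retailer i that maximizes expected profit given q_j."""
    return max(orders, key=lambda q: expected_profit(q, q_j, **params))


print(best_response(5, **PARAMS))
```

A pair of ordering quantities is a pure-strategy Nash equilibrium of this discretized sketch when each quantity is a best response to the other; the sketch does not guarantee that such a pair exists on an arbitrary grid.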
3.2. Bounded Rationality Transshipment Model
The aforementioned context resembles the RKP model, wherein retailers exclusively select Nash-equilibrium ordering quantities. Within this subsection, we develop a boundedly rational transshipment model using quantal response equilibrium (QRE), whereby retailers probabilistically determine their ordering quantities. The model operates under the following two assumptions.
Assumption 1. Retailers employ stochastic responses when selecting ordering quantities, rather than solely optimizing profits. This implies that all feasible ordering quantities are chosen with a positive probability, even though those with higher expected profits are more likely to be selected.
Assumption 2. Retailers experience uncertainty concerning their competitors’ decisions, as their competitors also stochastically determine their ordering quantities.
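We define $N = \{i, j\}$ as the set of retailers, $S_i = \{Q_{i1}, \ldots, Q_{iJ_i}\}$ as the full set of retailer $i$’s feasible ordering quantities, and $J_i$ as the number of feasible ordering quantities. The feasible order quantities here are assumed to be discrete, which is consistent with practice. If retailers are assumed to be boundedly rational, they may stochastically choose a pair of ordering quantities in $S = S_i \times S_j$. Hence, the ordering decisions of retailers can be viewed as a mixed ordering strategy. In other words, retailer $i$’s ordering strategy is a probability distribution on $S_i$. Let $\pi_i : S \to \mathbb{R}$ denote the expected profit function of retailer $i$ and $\Delta_i$ denote the set of probability distributions on $S_i$. Let $p_i = (p_{i1}, \ldots, p_{iJ_i})$; this is an element in the set $\Delta_i$, where $\sum_{k=1}^{J_i} p_{ik} = 1$ and $p_{ik} \geq 0$ for all $k$. For convenience, we use the notation $p_{ik} = p_i(Q_{ik})$ to represent the probability of choosing the ordering quantity $Q_{ik}$. Hence, the set of all probability distributions on $S_i$ is $\Delta_i = \{p_i \in \mathbb{R}^{J_i} : \sum_{k} p_{ik} = 1,\ p_{ik} \geq 0\}$. We write the set of mixed ordering strategy profiles by $\Delta = \Delta_i \times \Delta_j$ and denote elements in $\Delta$ by $p = (p_i, p_j)$. Hence, given a certain mixed ordering strategy profile $p$, the expected profit of retailer $i$ is given by $\pi_i(p) = \sum_{Q \in S} p(Q)\,\pi_i(Q)$, where $p(Q) = p_i(Q_i)\,p_j(Q_j)$. We denote $\bar{\pi}_{ik}(p) = \pi_i(Q_{ik}, p_j)$ as the expected profit of retailer $i$ choosing ordering quantity $Q_{ik}$ and retailer $j$ choosing a mixed ordering strategy. The space of profit vectors of retailer $i$ choosing a certain ordering quantity is $X_i = \mathbb{R}^{J_i}$, with $X = X_i \times X_j$; we define $\bar{\pi} : \Delta \to X$ by $\bar{\pi}(p) = (\bar{\pi}_i(p), \bar{\pi}_j(p))$, where $\bar{\pi}_i(p) = (\bar{\pi}_{i1}(p), \ldots, \bar{\pi}_{iJ_i}(p))$ is the profit vector of retailer $i$ choosing different ordering quantities in $S_i$. Therefore, with the notations defined above, we can denote a transshipment game between retailer $i$ and retailer $j$ with $\Gamma = (N, S, \pi)$.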
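As a small illustration of the mixed-strategy notation, the following Python sketch computes the profit vector of one retailer against the rival's mixed ordering strategy, and the expected profit of a mixed strategy profile. The 2x2 payoff table and the probabilities are hypothetical numbers chosen only to make the arithmetic easy to follow; the function names are likewise illustrative.

```python
def profit_vector(payoff, p_rival):
    """Expected profit of each own ordering quantity against the rival's mix.

    payoff[k][l] is the retailer's expected profit when it orders its k-th
    feasible quantity and the rival orders its l-th feasible quantity.
    """
    return [sum(u * q for u, q in zip(row, p_rival)) for row in payoff]


def mixed_profit(p_own, payoff, p_rival):
    """Expected profit when both retailers use mixed ordering strategies."""
    return sum(p * u for p, u in zip(p_own, profit_vector(payoff, p_rival)))


# Hypothetical payoff table and mixed strategies (not from the paper).
payoff_i = [[4.0, 2.0],
            [3.0, 5.0]]
p_j = [0.5, 0.5]
p_i = [0.25, 0.75]

print(profit_vector(payoff_i, p_j))      # one entry per own ordering quantity
print(mixed_profit(p_i, payoff_i, p_j))  # expectation over both mixtures
```

Because the rival's strategy enters only through a weighted sum, each entry of the profit vector is linear in the rival's probabilities.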
4. Quantal Response Equilibrium of the Transshipment Game
To incorporate bounded rationality into game-theoretic analysis, [9] introduced the concept of quantal response equilibrium (QRE). The key idea behind QRE is the inclusion of a payoff disturbance for each pure strategy. Within a QRE framework, retailers often make the “better response” instead of the “best response”, enabling a more accurate interpretation of the actual situation. One widely utilized quantal response function is the logit response function, which has a long-standing tradition in the study of individual choice behavior. In this study, we adopt the logit QRE model. While certain prior works have examined retailers’ bounded-rationality ordering behavior using the QRE framework [10,33], none have explored the existence and uniqueness of such QRE. In this section, we establish the existence of QRE within our transshipment game, as defined in Section 3, and outline the conditions necessary to ensure uniqueness of the QRE. Additionally, we demonstrate the relationship between the QRE solution and the Nash-equilibrium solution previously addressed in RKP.
4.1. Existence of QRE
Because the logit QRE model has a single parameter, the bounded rationality parameter, it is convenient to work with. We now provide the definition of the logit QRE model of our transshipment game.
Definition 1. For any given $\lambda \geq 0$, a logit QRE is any $p \in \Delta$ such that, for each $i \in N$ and $k \in \{1, \ldots, J_i\}$,
$$p_{ik} = \frac{e^{\lambda \bar{\pi}_{ik}(p)}}{\sum_{l=1}^{J_i} e^{\lambda \bar{\pi}_{il}(p)}}, \qquad (2)$$
where $\bar{\pi}_{ik}(p)$ is the expected profit of retailer $i$ from choosing its $k$-th feasible ordering quantity against the other retailer’s mixed ordering strategy.
We denote the set of logit QRE with one bounded rationality parameter $\lambda$ by $p^{*}(\lambda)$. Intuitively, if retailer $j$ determines a mixed ordering strategy, a certain ordering quantity of retailer $i$ which can result in more profit will be chosen with a higher probability. There is only one parameter, $\lambda$, which is called the bounded rationality parameter in Equation (2). From [9], we can state that, as $\lambda$ goes to infinity, the limit points of the QRE form a subset of the set of Nash equilibria and that, if $\lambda = 0$, retailers choose any ordering quantity with equal probability. Hence, retailers become more rational as $\lambda$ gets larger and the ordering quantities are closer to Nash-equilibrium ordering quantities. The QRE model captures the essence of Assumption 1 proposed in Section 3 well. From the system of the QRE model, any feasible ordering quantity would be chosen with a strictly positive probability even though it may cause negative or zero profit. Now, we provide the following proposition.
Proposition 1. For any transshipment game satisfying and , there exists a QRE.
Proof. According to RKP, retailers’ profit functions are strictly convex when , which limits the amount of ordering quantities to a certain range. For any , Let with . With definitions of and , it can be easily verified that is a continuous mapping from to itself since is a convex and compact nonempty set. An application of Brouwer’s fixed point theorem leads to the conclusion that there exists a QRE as a fixed point of . Hence, the proposition immediately follows. □
Proposition 1 asserts the existence of at least one quantal response equilibrium (QRE) within the defined transshipment game. While [10] demonstrated that a QRE may fail to exist when ordering quantities are unbounded, we avoid this by constraining the transshipment price to maintain the convexity of the retailers’ profit function. This restriction narrows the range of ordering quantities and ensures the presence of QRE in this transshipment game. An essential characteristic of transshipment QRE is that a higher probability is assigned to the selection of the superior ordering quantity. In practical scenarios, even when the retailer’s ordering quantity deviates from the optimal value, there is still a higher likelihood of selecting the optimal quantity or those in close proximity. As a result, retailers are more likely to generate greater profits. Multiple QREs may exist in the transshipment game, leading to increased complexity in the ordering decisions. To enhance the predictive capability of retailers’ ordering behavior, we will explore the notion of QRE uniqueness.
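The logit response underlying Definition 1 can be illustrated with a short Python sketch. It maps a vector of expected profits to choice probabilities for a given bounded rationality parameter; the function name `logit_response` and the sample profits are hypothetical. Subtracting the largest profit before exponentiating is a standard numerical safeguard and does not change the probabilities.

```python
import math


def logit_response(profits, lam):
    """Logit choice probabilities for a vector of expected profits.

    lam is the bounded rationality parameter: lam = 0 gives the uniform
    distribution, and larger lam puts more weight on higher profits.
    """
    shift = max(profits)  # subtract the maximum to avoid overflow
    weights = [math.exp(lam * (u - shift)) for u in profits]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical expected profits of three feasible ordering quantities.
profits = [1.0, 2.0, 3.0]
for lam in (0.0, 1.0, 50.0):
    print(lam, logit_response(profits, lam))
```

Every ordering quantity keeps a positive probability for finite values of the parameter, which is the "better response" behavior described above.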
4.2. Uniqueness of QRE
RKP proved that, for any feasible transshipment price, there exists a unique Nash equilibrium in the transshipment game, and [13] also considered the uniqueness of the First-Best Nash equilibrium in a general framework for the transshipment problem. In this subsection, we discuss the uniqueness of QRE in the transshipment game.
Intuitively, when $\lambda = 0$, retailers choose any feasible ordering quantity with the same probability, which is the unique QRE. This intuitive property raises the question of whether the uniqueness of QRE in the transshipment problem depends on the value of $\lambda$. Hence, we provide the following proposition.
Proposition 2. For a sufficiently small $\lambda$, there exists a unique QRE in the transshipment game $\Gamma$.
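Proof. To prove Proposition 2, we just prove that $p^{*}(\lambda)$ is a singleton for a sufficiently small $\lambda$. From Definition 1 and Proposition 1, it is indicated that $p \in p^{*}(\lambda)$ if and only if $p$ is a fixed point of $F_\lambda$, the logit response mapping given by the right-hand side of Equation (2). According to the definition of $F_\lambda$ in the proof of Proposition 1, we notice that $F_\lambda$ is Lipschitz continuous in $\bar{\pi}$ and that $\bar{\pi}$ is smooth. $\|\cdot\|$ represents the sup norm. For any $p, p' \in \Delta$, there are $K_1 > 0$ and $K_2 > 0$ such that
$$\|F_\lambda(p) - F_\lambda(p')\| \leq \lambda K_1 \|\bar{\pi}(p) - \bar{\pi}(p')\| \leq \lambda K_1 K_2 \|p - p'\|.$$
Let $K = K_1 K_2$ and $\bar{\lambda} = 1/K$. Then, for any $\lambda < \bar{\lambda}$, it is true that $F_\lambda$ is a contraction mapping. As an application of the contraction mapping theorem, $p^{*}(\lambda)$ is a singleton. This completes the proof. □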
Proposition 2 demonstrates that retailers possess limited information about the transshipment game due to cognitive and computational limitations when the bounded rationality parameter is small. Consequently, retailers opt for an equal probability of selecting each feasible ordering quantity, leading to a single unique quantal response equilibrium (QRE). However, smaller bounded rationality parameters correspond to lower expected profits. To achieve higher anticipated profits, retailers progressively enhance their cognitive and computational capabilities through repeated participation in transshipment games, thereby increasing the bounded rationality parameter. As a result, QRE in transshipment games initiates from a singular trajectory, with cognitive abilities gradually improving over time via the learning effect, ultimately yielding greater expected profits.
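The contraction argument behind Proposition 2 can be checked numerically on a small example. The sketch below builds a payoff table for two symmetric retailers under assumed parameters (uniform demand on a small grid, no lost-sales penalty, transport cost paid by the sender), then iterates the logit response mapping from two very different starting points with a small bounded rationality parameter. All names and parameter values are illustrative assumptions rather than the paper's.

```python
import math
from itertools import product


def payoff_table(n=6, r=10.0, c=6.0, s=2.0, c_t=7.0, tau=1.0):
    """table[a][b]: expected profit of ordering a when the rival orders b."""
    def profit(q_i, q_j, d_i, d_j):
        sales = min(d_i, q_i)
        t_out = min(max(q_i - d_i, 0), max(d_j - q_j, 0))
        t_in = min(max(q_j - d_j, 0), max(d_i - q_i, 0))
        return (r * (sales + t_in) - c * q_i + s * (q_i - sales - t_out)
                + (c_t - tau) * t_out - c_t * t_in)
    grid = range(n + 1)
    pairs = list(product(grid, grid))  # demand assumed uniform on the grid
    return [[sum(profit(a, b, d_i, d_j) for d_i, d_j in pairs) / len(pairs)
             for b in grid] for a in grid]


def logit(u, lam):
    """Logit choice probabilities for the profit vector u."""
    m = max(u)
    w = [math.exp(lam * (x - m)) for x in u]
    z = sum(w)
    return [x / z for x in w]


def logit_map(p_i, p_j, table, lam):
    """One application of the logit response mapping to both retailers."""
    n = len(table)
    u_i = [sum(row[b] * p_j[b] for b in range(n)) for row in table]
    u_j = [sum(row[b] * p_i[b] for b in range(n)) for row in table]
    return logit(u_i, lam), logit(u_j, lam)


def iterate(p_i, p_j, table, lam, steps=200):
    """Repeatedly apply the logit response mapping."""
    for _ in range(steps):
        p_i, p_j = logit_map(p_i, p_j, table, lam)
    return p_i, p_j


table = payoff_table()
size = len(table)
lam = 0.001  # small enough for the mapping to be a contraction in this example
uniform = [1.0 / size] * size
lowest = [1.0] + [0.0] * (size - 1)   # all mass on the smallest order
highest = [0.0] * (size - 1) + [1.0]  # all mass on the largest order

qre_a = iterate(uniform, uniform, table, lam)
qre_b = iterate(lowest, highest, table, lam)
gap = max(abs(x - y) for x, y in zip(qre_a[0] + qre_a[1], qre_b[0] + qre_b[1]))
print(gap)
```

For larger values of the parameter the mapping need not be a contraction, so plain iteration of this kind may fail to converge or may depend on the starting point.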
However, another special case where the retailers’ bounded rationality parameter is infinite (i.e., $\lambda = \infty$) should also be considered. Intuitively, as $\lambda \to \infty$, retailers become perfectly rational. Hence, the QRE of the transshipment game becomes a Nash equilibrium, which will be discussed in the next subsection.
4.3. The Limiting Point of QRE
In RKP, for any feasible transshipment prices, there exists a unique Nash-equilibrium solution in a two-location decentralized inventory system with transshipment. Retailers who are assumed to be perfectly rational choose the Nash-equilibrium ordering quantities with the probability equal to one. When it comes to our proposed transshipment game where retailers are assumed to be boundedly rational, retailers may not choose the Nash-equilibrium ordering quantities with certainty. Instead, in any QRE, they may choose any feasible ordering quantity, but the probability of choosing the optimal ordering quantity is higher than other ordering quantities.
Moreover, we find that the solution in the RKP is a special case in our transshipment game based QRE. Therefore, we provide the following proposition and briefly state the proof of it.
Proposition 3. As $\lambda \to \infty$, retailers’ ordering behavior in our transshipment game will converge to the unique Nash-equilibrium ordering quantity (we assume that it is included in the set of retailers’ feasible ordering quantities) which was discussed in the RKP.
Proof. From Theorems 2 and 3 in [9], the graph of QRE, $\{(\lambda, p) : p \in p^{*}(\lambda)\}$, contains a unique branch which starts at the centroid for $\lambda = 0$ and converges to a unique Nash equilibrium as $\lambda$ goes to infinity. Hence, as the transshipment game repeats and retailers become more rational (as $\lambda$ increases), a sequence of retailers’ QRE results. Retailers’ mixed ordering strategies differ across these QRE. However, the sequence of retailers’ QRE converges to a unique Nash equilibrium as $\lambda \to \infty$. Hence, it can be easily verified by contradiction that retailers’ ordering quantities at the limiting point of the sequence of retailers’ QRE are the same as the Nash-equilibrium ordering quantities in the RKP. □
The implication of Proposition 3 is that with the strengthening of learning effects, retailers gradually overcome the limitations of cognition and information, and the bounded rationality parameter gradually increases, so retailers reach the optimal ordering quantities step by step. The RKP has proved that there is a unique Nash equilibrium in a perfect-rationality transshipment model when . In our bounded-rationality transshipment model, when the bounded rationality parameter reaches a certain degree, the retailers realize that the optimal ordering quantity will lead to the optimal profit. Therefore, the retailer will choose the optimal ordering quantity of the perfect-rationality transshipment model, so as to achieve the same unique Nash equilibrium.
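The branch described in the proof of Proposition 3 can be followed numerically by increasing the bounded rationality parameter in steps and warm-starting each step from the previous solution. The Python sketch below does this with a damped fixed-point iteration. It is an illustration under assumed parameters (symmetric retailers, uniform demand on a small grid, no lost-sales penalty, transport cost paid by the sender), not the algorithm of Section 5, and the damped iteration is not guaranteed to converge for every parameter value.

```python
import math
from itertools import product


def payoff_table(n=6, r=10.0, c=6.0, s=2.0, c_t=7.0, tau=1.0):
    """table[a][b]: expected profit of ordering a when the rival orders b."""
    def profit(q_i, q_j, d_i, d_j):
        sales = min(d_i, q_i)
        t_out = min(max(q_i - d_i, 0), max(d_j - q_j, 0))
        t_in = min(max(q_j - d_j, 0), max(d_i - q_i, 0))
        return (r * (sales + t_in) - c * q_i + s * (q_i - sales - t_out)
                + (c_t - tau) * t_out - c_t * t_in)
    grid = range(n + 1)
    pairs = list(product(grid, grid))  # demand assumed uniform on the grid
    return [[sum(profit(a, b, d_i, d_j) for d_i, d_j in pairs) / len(pairs)
             for b in grid] for a in grid]


def logit(u, lam):
    """Logit choice probabilities for the profit vector u."""
    m = max(u)
    w = [math.exp(lam * (x - m)) for x in u]
    z = sum(w)
    return [x / z for x in w]


def trace_qre(table, lambdas, sweeps=200, damping=0.5):
    """Follow the QRE from the centroid as lambda increases (warm starts)."""
    n = len(table)
    p_i = [1.0 / n] * n  # centroid: the QRE at lambda = 0
    p_j = [1.0 / n] * n
    path = []
    for lam in lambdas:
        for _ in range(sweeps):
            u_i = [sum(row[b] * p_j[b] for b in range(n)) for row in table]
            u_j = [sum(row[b] * p_i[b] for b in range(n)) for row in table]
            new_i, new_j = logit(u_i, lam), logit(u_j, lam)
            # Damping: move only part of the way toward the logit response.
            p_i = [(1 - damping) * a + damping * b for a, b in zip(p_i, new_i)]
            p_j = [(1 - damping) * a + damping * b for a, b in zip(p_j, new_j)]
        path.append((lam, p_i, p_j))
    return path


path = trace_qre(payoff_table(), [0.0, 0.1, 0.5, 1.0, 2.0, 5.0])
for lam, p_i, _ in path:
    mode = max(range(len(p_i)), key=p_i.__getitem__)
    print(lam, mode, round(p_i[mode], 3))
```

Each printed row reports the parameter value, the most likely ordering quantity of one retailer, and its probability, which makes it easy to see how the distribution changes along the path.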
6. Numerical Study
We use the algorithm designed above to calculate a case where the demand at two locations is independently and identically distributed. We assume that demand is distributed uniformly and discretely over an interval . Hence, retailer will choose any ordering quantity in . We allow the initial bounded rationality parameter to be 0, i.e., , which means that the initial probability of retailer choosing any ordering quantity in is . The two retailers are assumed to be homogeneous; each retailer can procure unit inventory at cost, , and sell it at unit price, . Unit salvage value , and we ignore the penalty for lost sales, . When the transshipment occurs, the unit transshipment price and the unit transport cost .
With parameters described as above, one can easily find the Nash-equilibrium ordering quantity; say
. We use an exponential learning curve in this numerical case.
Figure 1 shows the distribution of ordering quantities of retailer
corresponding to different bounded rationality parameters. The initial probability of any ordering quantity is
when
. As
increases, the probability of choosing a Nash-equilibrium ordering quantity gradually becomes larger. In addition, each line in
Figure 1 corresponds to a QRE. Since the probability of retailer
choosing the Nash-equilibrium ordering quantity is close to 1 at
, the retailer
can be seen as perfectly rational at this bounded rationality parameter. Another interesting finding is that retailer
is more likely to choose larger ordering quantities near the Nash-equilibrium ordering quantity. It seems that the retailer prefers to have a surplus rather than a shortage. A similar finding was also discussed in [11]. The main reason for this phenomenon is that, once the product is out of stock, it can be observed at any time during the selling season, while the product surplus only appears at the end of the selling season.
Due to the learning effect, the retailers’ bounded rationality parameter increases with game repetition. It can be seen in
Figure 2 that the bounded rationality parameter does not change significantly in the first 60 periods and then increases dramatically, which is consistent with the characteristics of an exponential learning effect. We can find that the probability of retailer
choosing any ordering quantity changes with the repetition of the transshipment game, as shown in
Figure 3. The probability of retailer
choosing the Nash-equilibrium ordering quantity increases with
, and finally converges to one at
. The probability of choosing
which is near the Nash-equilibrium ordering quantity increases first, then decreases, and finally converges to zero. Other ordering quantities are chosen with a decreasing probability which finally converges to zero at
.
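An exponential learning curve of the kind used in this numerical study can be sketched as follows. The functional form, the number of periods, the final parameter value, and the growth rate are all assumptions made for illustration; the paper's exact curve is not reproduced here.

```python
import math


def exponential_learning_curve(periods, lam_final, rate):
    """Bounded rationality parameter per period, starting at zero.

    Assumed form: lam_t = lam_final * (e^(rate*t) - 1) / (e^(rate*T) - 1)
    for t = 0, ..., T, with T = periods.
    """
    scale = math.exp(rate * periods) - 1.0
    return [lam_final * (math.exp(rate * t) - 1.0) / scale
            for t in range(periods + 1)]


# Hypothetical settings: 100 periods, final parameter 5.0, growth rate 0.1.
schedule = exponential_learning_curve(periods=100, lam_final=5.0, rate=0.1)
for t in (0, 20, 40, 60, 80, 100):
    print(t, round(schedule[t], 4))
```

With these assumed settings the parameter stays close to zero for the first 60 periods and rises sharply afterwards, which mirrors the qualitative shape described for the learning effect.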
7. Conclusions
Inventory transshipment has the potential to simultaneously reduce inventory levels and increase service levels. However, traditional models solely assume perfectly rational decision makers and overlook the possibility of biased decision making by human actors. To explore systematic ordering deviations, we incorporate bounded rationality into retailers’ ordering behaviors by utilizing the concept of quantal response equilibrium (QRE) in a system involving two independent retailers, akin to the RKP model. Our research yields valuable managerial insights:
Firstly, we establish the existence of QRE for the ordering decisions made independently by two retailers. For transshipment prices satisfying the conditions of Proposition 1, there is always at least one QRE in the defined transshipment game. In practical settings, even when retailers’ ordering quantities deviate from the optimal amount, there is still a higher likelihood of selecting the optimal quantity or those in close proximity to it.
Secondly, we investigate the conditions under which such QRE becomes unique. Retailers choose each feasible ordering quantity with equal probability, resulting in a unique QRE when the bounded rationality parameter is sufficiently small. To maximize profits, retailers need to gradually enhance their cognitive or computational abilities through repeated participation in transshipment games, leveraging the learning effect.
Thirdly, we identify the condition in which the QRE and Nash equilibrium are equivalent. RKP developed the notion that a unique Nash equilibrium exists in traditional transshipment models for any feasible transshipment price. Through the strengthening of the learning effect, the retailer gradually overcomes the limitations of cognition and information, leading to an increase in the bounded rationality parameter. This progression allows retailers to make optimal ordering decisions incrementally, ensuring the realization of maximum profits.
Finally, we demonstrate that the right side of the optimal ordering quantity is chosen with a higher probability than the left side. This can be attributed to the fact that stockouts may occur at any point during the selling season, whereas product surpluses typically emerge towards the end. To avoid penalties associated with stockouts during the selling season, retailers prefer having a surplus rather than a shortage. Consequently, it is crucial to modify the evaluation mechanisms for inventory managers in practical scenarios.
There are several avenues for future research. Firstly, considering that the retailer adopts a mixed ordering strategy at quantal response equilibrium (QRE), an intriguing question emerges regarding the distribution of ordering quantities when the demand distribution is known. Su demonstrated that the ordering quantities of a single newsvendor follow a truncated normal distribution in the case of uniformly distributed demand [11]. Therefore, it is of interest to investigate whether the ordering quantities in our transshipment model exhibit a similar property. Secondly, our study focuses solely on the scenario involving two independent retailers. It is worth exploring whether QRE still exists when multiple retailers are involved. Lastly, the same question arises when considering multiple-stage transshipments.