Optimal Pricing and Inventory Policies with Reference Price Effect and Loss-Averse Customers


Journal Pre-proof

Qiang Wang, Nenggui Zhao, Jie Wu, Qingyuan Zhu

PII: S0305-0483(19)30292-0
DOI: https://doi.org/10.1016/j.omega.2019.102174
Reference: OME 102174

To appear in: Omega

Received date: 4 March 2019
Accepted date: 18 December 2019

Please cite this article as: Qiang Wang, Nenggui Zhao, Jie Wu, Qingyuan Zhu, Optimal Pricing and Inventory Policies with Reference Price Effect and Loss-Averse Customers, Omega (2019), doi: https://doi.org/10.1016/j.omega.2019.102174

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain. © 2019 Published by Elsevier Ltd.

Highlights
• A dynamic pricing model that jointly considers the inventory decision is studied.
• Customers in the market are loss-averse.
• The reference price has a direct impact on the customers’ purchase utility.
• Demand is determined from customers’ purchase utility through an MNL model.


Qiang Wang (a), E-mail address: [email protected]
Nenggui Zhao (b,*), E-mail address: [email protected]
Jie Wu (a,*), E-mail address: [email protected]
Qingyuan Zhu (c,d), E-mail address: [email protected]

a School of Management, University of Science and Technology of China, Hefei, China, 230026
b School of Management, Hefei University of Technology, Hefei, China, 230009
c College of Economics and Management, Nanjing University of Aeronautics and Astronautics, Nanjing, China, 210000
d Research Centre for Soft Energy Science, Nanjing University of Aeronautics and Astronautics, Nanjing, China, 211106
* Corresponding author


Abstract

Research on customer behavior and marketing has shown that reference prices significantly influence customers’ purchase behavior and retailers’ policies. In this paper, we consider a retailer selling a single product to loss-averse customers over a finite planning horizon, and we address the reference price effect on the retailer’s optimal pricing and inventory policies. The customers’ demand is determined from their purchasing utility through a Multinomial Logit (MNL) model, and the utility is contingent on the reference price and the current sales price. A generalized model is presented to jointly characterize the optimal pricing and inventory policies that maximize the retailer’s total expected profit. First, we derive the optimal myopic policy for the retailer in a single period and reveal the sensitivity of the optimal myopic policy to the reference price. Next, we perform an equivalent transformation on the model such that the single-period revenue is jointly concave, which is critical to characterizing the optimal policy in the multi-period case. Dynamic programming is then used to analyze the optimal policy for the retailer. Interestingly, we prove that a reference price-dependent base-stock-list-price policy is optimal in each period.

Keywords: Pricing, Inventory, Loss-averse customer, Reference price, Multinomial Logit model

1. Introduction

Motivation. With the rapid development of the economy and information technology, modern customers are becoming more and more sophisticated in their purchasing behavior, and retailers are increasingly aware that they should price and order dynamically to meet customers’ demands and maximize total profits. Previous literature has shown that customer behavior plays an important role in retailers’ decisions, and that detailed knowledge of customers’ behavior can help retailers develop successful business strategies.
This effect is well-recognized in the behavioral decision literature on retailers’ decisions and profitability (e.g., Kalyanaram and Winer 1995 and Zhang et al. 2014). With the rapid development of network technology, it is very convenient and efficient for online customers to use search engines or other tools (such as Youdao Shopping Assistant¹ and price-comparison websites²) to observe the historical prices of products (see Zhao and Yu 2012). This knowledge often leads customers to form their own “price expectations”, also referred to as reference prices, after observing the historical prices. The reference prices greatly influence the customers’ purchase intention and, in turn, the retailers’ decisions and profitability. For example, when a price promotion is implemented for a product, customers feel gains from purchasing the product, which stimulates purchasing and accelerates sales. Conversely, driving up the price reduces customers’ desire to purchase because it causes them to feel losses.

Customers react differently to the current sales price relative to the reference price. If the current sales price is higher (lower) than the reference price, they perceive it as a loss (gain), and this perception affects their purchasing utility significantly. Customers are generally classified into three types according to their sensitivity to gains and losses (Hu et al. 2016): loss-averse customers are more sensitive to losses than to gains and are more willing to wait for discounts; gain-seeking customers are more sensitive to gains, are very eager to get the new product, and fear the risk of stockout; loss-neutral customers are equally sensitive to gains and losses. Prospect Theory states that customers are more sensitive to losses than to gains (Kahneman and Tversky 1979), so it is reasonable to assume that customers in the market are loss-averse. This line of reasoning suggests that it is important to explore how a retailer should set prices and inventory levels over time to maximize profit while accounting for the reference price and loss-averse customers.

The research topic is also motivated by practical cases. For example, some fans of Apple purchase new products immediately when these products are released, while a large proportion of the fans purchase these products only when they are on sale. Customers who purchase later always prefer to purchase products at lower prices, and they feel a loss if they pay more for the same product, i.e., these customers are loss-averse. Moreover, these customers are generally accustomed to comparing the current sales prices with the “price expectations” that have formed in their minds before making purchase decisions.

The modeling literature on pricing generally assumes that the reference price has a direct impact on the demand function, a relationship that is often linear (see Wang 2016 and Chen et al. 2016a). To the best of our knowledge, few published studies have assumed that the reference price directly affects customers’ purchasing utility and, through it, the demand function (e.g., Kallio et al. 2009, Wang 2018, and Zhao et al. 2019a). In this paper, we assume that the customers’ purchasing utility is linear in the reference price, and that the demand function is determined from the utility through a Multinomial Logit (MNL) model, an approach which we believe is feasible and innovative.

¹ http://zhushou.huihui.cn/.
² See http://tool.manmanbuy.com/HistoryLowest.aspx.

Preprint submitted to Omega, December 19, 2019


Mental accounting theory (Thaler 1985) suggested that whether a customer purchases a product depends on her purchasing utility. Hu et al. (2017) also discussed in detail the relationship between the demand function and customer utility. The model of the demand function in the current paper is similar to that of Wang (2018), in which the author considered a multi-product setting where the reference price may be the lowest or the highest price among all the products. In the current paper, however, we consider a single product and assume that the reference price is formed from the historical prices of the product by an exponential smoothing process. Our goal is to operationalize the specific demand function in a joint pricing and inventory context and explore the optimal pricing and inventory policies for a retailer.

In this paper, we consider a retailer selling a single product to loss-averse customers with reference prices over a finite planning horizon. We address the reference price effect on the retailer’s optimal pricing and inventory policies in single-period and multi-period scenarios. A dynamic model is presented to determine the optimal policies that maximize the retailer’s total expected profit.

Our contributions. The contributions of this paper fall into two aspects. The first lies in the research framework; this is the first paper that determines joint pricing and inventory decisions with the reference price effect and loss-averse customers. The modeling literature on the reference price effect assumes that the effect is reflected in the demand function directly (generally in a linear relationship), while we assume it is reflected in the customers’ purchasing utility, which in turn affects the demand function. More specifically, we assume the utility is contingent on the reference price and the current sales price, and that the demand function is determined from the utility through an MNL model, which fits reality better than previous assumptions. We obtain the interesting result that a reference price-dependent base-stock-list-price policy is optimal in each period.
Since this particular angle has not yet appeared in the literature, we believe this paper is a valuable supplement to the literature on joint pricing and inventory decisions. In addition, this paper reflects the real business environment and could help managers make better decisions and generate higher profits. The second contribution is methodological: the special definitions of loss-averse customers and of the demand function make the analysis of the dynamic pricing and inventory problem substantially more difficult. Our problem is equivalent to solving a dynamic programming model with a nonlinear and nonsmooth demand function, which requires an equivalent transformation of the problem.

Organization of the paper. The organization of this paper is as follows: §2 reviews the relevant literature. In §3, we describe a demand function which is determined from the customers’ purchasing utility through an MNL model, where the utility depends on the reference price and the current sales price. §4 characterizes the optimal myopic policy for the retailer in a single-period scenario. In §5, we perform an equivalent transformation on the model and show the optimality of a reference price-dependent base-stock-list-price policy in the multi-period scenario. §6 concludes the paper and suggests future research directions.

2. Literature review

There is extensive literature on pricing and inventory models for revenue management (see Federgruen and Heching 1999 for a review). This paper focuses on investigating the optimal pricing and inventory policies with the reference price effect and loss-averse customers. The most representative and relevant literature is reviewed below.

The first stream of research focuses on pricing models with the reference price effect, which has been studied widely in academia (see Mazumdar et al. 2005 and Arslan and Kachani 2010 for a review). A reference price generally acts as an internal standard against which the current sales price is judged, and it evolves as a function of past prices (Kalyanaram and Winer 1995). Most of the literature on the reference price effect assumes that it influences sales demand and overall expected profit directly. Kopalle et al. (1996) considered two reference price formation processes and showed that cyclical prices may be optimal when demand is a function of the reference price. Popescu and Wu (2007) and Zhang et al. (2014) explored optimal pricing policies to maximize the total profit, where the demand functions are linear in the reference price and the current sales price. Dye and Yang (2016) proposed a joint dynamic pricing and preservation technology investment model with time-and-price sensitive demand and reference price effects. Chen et al. (2016a) analyzed a finite horizon dynamic pricing model in which demand is linear in the reference price. A significant challenge in their research is the asymmetric reference price effect, and the authors developed strongly polynomial-time algorithms to compute the optimal prices for several scenarios. Wang (2016) considered a dynamic pricing problem in which customers arrive in heterogeneous time periods and their purchase decisions are affected by reference prices; the author showed that the optimal pricing strategy is cyclic. Lu et al.
(2016) investigated a joint pricing and advertising problem with reference price effect, where the sales price and advertisement impact the reference price, studying fundamental marketing strategies.

All of the literature mentioned above posited that the reference price has a direct effect on the demand function; few studies assumed that the reference price impacts customers’ purchasing utility, valuation, or purchasing probability directly. Kallio et al. (2009) assumed that the customers’ preferences are given by a utility model, and that the preference is contingent on the reference price and the current sales price. Wang (2018) assumed that the reference price affects the customers’ utility directly and, through an MNL model, their demand. Zhao et al. (2019a) assumed that the customers’ reference prices affect their purchasing valuations directly, which greatly affects their purchase behavior. Cao et al. (2019) assumed that the customers’ reference prices affect their willingness-to-pay (WtP), which influences their purchasing probability and, in turn, their demand for the products.

It is worth mentioning that few references jointly consider the reference price effect and loss-averse customers in a pricing model, although this topic is of great relevance to our study. Loss-averse customers have been extensively researched in the existing literature. For example, Heidhues and Kőszegi (2008) assumed that the customers in the market are loss-averse, and that their sensitivity to losses increases the price sensitivity of demand, which enhances the intensity of price competition. Matzke et al. (2016) modeled a Stackelberg game involving a set of loss-averse customers and one manufacturer to assess the ability of upgrade auctions to allocate excess options to customer orders. Yang et al. (2018) considered a service system in which customers are loss-averse toward both price and delay attributes, and they found that with loss-averse customers a firm can gain a greater profit in a duopoly market than in a monopoly market. However, publications which jointly consider the reference price effect and loss-averse customers are still scarce. Tversky and Kahneman (1991) confirmed that loss aversion does exist and provided a reference-dependent model based on utility theory. They found that a utility function is concave for gains and convex for losses, and that the loss aversion utility function is S-shaped. Popescu and Wu (2007) considered a dynamic pricing problem with reference price, showing that the optimal prices converge to a constant steady-state price if the customers in the market are loss-averse.
Nasiry and Popescu (2011) analyzed the optimal pricing strategy with loss-averse customers and reference price effects, but they assumed that the reference price evolves according to a peak-end rule, while in this paper we assume that it evolves by an exponential smoothing process, which is the most commonly used model in the existing literature on the reference price effect. Zhao et al. (2019b) considered a pricing model with reference price effect and risk-preference customers, assuming customers in the market are split between loss-averse, loss-neutral, and loss-seeking. They also analyzed the pricing decisions when the customers are all loss-averse. One difference between our study and all of the above literature is that we also consider the inventory problem, which significantly affects the managers’ pricing decisions. Thus, we believe this study is a valuable supplement to the literature on pricing and inventory.

The second stream of research jointly considers pricing and inventory under the reference price effect; this topic has been gaining considerable attention in operations management in recent years. Urban (2008) analyzed a joint inventory and pricing model with reference price effect in a single-period scenario, assuming demand is uncertain and the price is elastic. The author provided a numerical analysis showing that the reference price substantially impacts the sellers’ optimal decisions and profitability. Gimpl-Heersink (2008) and Taudes and Rudloff (2012) proposed an integrated pricing and inventory control model with a two-period linear demand model, in which demand is contingent on the reference price. They proved that a base-stock-list-price (BSLP) policy is optimal for the replenishment opportunity. The BSLP policy was first introduced by Porteus (1990), and was first shown to be optimal in special models considered by Thowsen (1975). The policy states that an optimal inventory level and price pair exists in each period. If the on-hand inventory is below the optimal level, the retailer or firm places an order to reach the optimal level and sets the optimal price; otherwise, nothing is ordered and a markdown price is charged. Note that Federgruen and Heching (1999) combined pricing and inventory control without the reference price and showed the optimality of a BSLP policy. Zhang (2010) and Güler et al. (2014) considered joint pricing and inventory level problems with reference price effect, and showed that the optimal policy is a reference price-dependent order-up-to (BSLP) policy. Wu et al. (2015) studied the influence of the reference price on a retailer’s dynamic pricing and inventory strategies in a two-period horizon, exploring the changes of the optimal strategy under differing conditions. However, none of the literature mentioned above considers customers’ preference for gains and losses.

In the following, we review several papers that are most related to our work. Chen et al. (2016b) jointly considered inventory and pricing decisions with reference price effect and loss-averse customers, which is also our research topic. They showed that a reference price-dependent BSLP policy is optimal for firms, but one distinct feature of their paper is that they assume the reference price affects the customers’ demand linearly.
In the current paper, we assume the customers’ demand is determined from the customers’ purchasing utility through an MNL model, and that the utility is contingent on the reference price and the current sales price. Surprisingly, we can prove that a reference price-dependent BSLP policy is still optimal for firms. Another recent work by Wang (2018) studied a joint assortment planning and pricing problem in a multi-product setting, in which the customers’ reference prices affect their utility for each product and their choice probability for each product is determined from the utility through an MNL model. This scenario is similar to ours, but Wang (2018) did not consider the customers’ preference for gains and losses, and the reference price in his paper may be the lowest price or the highest price among all products in a consumer’s consideration set. In our paper, we consider a joint pricing and inventory problem in a single-product setting, the reference prices are formed from the historical prices of the product by an exponential smoothing process, and the customers in the market are loss-averse.
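
To make the base-stock-list-price rule reviewed in this section concrete, the following Python sketch encodes only its decision logic: order up to the base-stock level and charge the list price when on-hand inventory is below that level, otherwise order nothing and charge a markdown price. It is a minimal illustration under stated assumptions, not the policy derived in any of the cited papers. The callables `base_stock`, `list_price`, and `markdown_price` are hypothetical placeholders for the reference price-dependent quantities that the literature characterizes, and the constant values in the example are purely illustrative.

```python
from typing import Callable, NamedTuple


class Decision(NamedTuple):
    order_up_to: float  # inventory level after ordering
    price: float        # sales price charged in the period


def bslp_decision(
    x: float,
    r: float,
    base_stock: Callable[[float], float],
    list_price: Callable[[float], float],
    markdown_price: Callable[[float, float], float],
) -> Decision:
    """Reference price-dependent base-stock-list-price rule (decision logic only).

    x is the on-hand inventory and r the current reference price.
    The three callables are hypothetical stand-ins for model-derived quantities.
    """
    target = base_stock(r)
    if x < target:
        # Inventory is below the base-stock level: order up to it, charge the list price.
        return Decision(order_up_to=target, price=list_price(r))
    # Otherwise order nothing and charge a markdown price.
    return Decision(order_up_to=x, price=markdown_price(x, r))


if __name__ == "__main__":
    # Illustrative constants only; they are not taken from the paper.
    low = bslp_decision(2.0, 3.5, lambda r: 10.0, lambda r: 4.0, lambda x, r: 3.0)
    high = bslp_decision(15.0, 3.5, lambda r: 10.0, lambda r: 4.0, lambda x, r: 3.0)
    print(low)
    print(high)
```

Passing the three quantities as functions keeps the sketch agnostic about how the base-stock level and prices depend on the reference price.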


3. Model description

Consider a retailer selling a single product to loss-averse customers over a finite planning horizon with $T$ periods. The retailer faces demand through successive periods and simultaneously decides on the order quantity and sales price for each period to maximize the total expected profit. At the beginning of any period $t \in \{1, 2, \ldots, T\}$, he³ first observes the reference price $r_t$ and the initial inventory level $x_t$. He then orders to bring the inventory level up to $y_t$ and selects a price $p_t \in P$, where $P = [\underline{p}, \bar{p}]$ ($0 < \underline{p} < \bar{p}$) is a feasible price set, and $\underline{p}$ and $\bar{p}$ are the lowest and highest sales prices, respectively. The retailer dynamically places orders and sets prices over the $T$ periods to maximize the total expected profit.

3.1. Purchasing utility with reference price

Customers generally remember or observe past prices of a product and update their reference prices by balancing their existing reference prices and observed prices. In this paper, we model reference prices using an exponential smoothing process (e.g., Mazumdar et al. 2005 and Chen et al. 2016b), in which the reference price in period $t$, $r_t$, is a weighted average of the sales price $p_{t-1}$ and the reference price $r_{t-1}$ in period $t-1$. Specifically, given the initial reference price $r_0$, the reference price $r_t$ evolves as follows:

$$r_t = \alpha r_{t-1} + (1 - \alpha) p_{t-1}, \quad t \in \{1, 2, \ldots, T\}, \tag{1}$$

where $\alpha \in [0, 1)$ is the memory factor, reflecting how strongly past prices affect the reference price. As $\alpha$ increases, customers are less sensitive to new price information. In the extreme case of $\alpha = 1$, reference prices remain constant at $r_0$ over the whole planning horizon; we therefore restrict $\alpha < 1$ to rule out the case in which past prices have no impact on the customers. It is reasonable to assume that $r_0 \in P$, in which case it is easy to verify that all $r_t \in P$.

Mental accounting theory (Thaler 1985) has posited that the total utility from purchasing a product, $U(p_t, r_t)$, is composed of two parts: acquisition utility $w(p_t)$ and transaction utility $u(p_t, r_t)$, i.e., $U(p_t, r_t) = w(p_t) + u(p_t, r_t)$. In period $t$, a customer’s acquisition utility can be defined as $w(p_t) = a - p_t$, where $a$ is assumed to be the customer’s full acquisition utility. The effect of the reference price on utility operates via the difference between the current sales price and the reference price. Specifically, we define $u(p_t, r_t)$ as the customer’s transaction utility in period $t$; it can be written as follows:

$$u(p_t, r_t) = \lambda \max\{r_t - p_t, 0\} + \gamma \min\{r_t - p_t, 0\} =
\begin{cases}
\lambda (r_t - p_t), & \text{if } r_t \ge p_t, \\
\gamma (r_t - p_t), & \text{if } p_t > r_t,
\end{cases}$$

where $\lambda$ and $\gamma > 0$ capture the customers’ asymmetric anchoring on the reference dependence $r_t - p_t$, and we call $\lambda$ and $\gamma$ risk coefficients hereafter. Then, $U(p_t, r_t)$ can be expressed as follows (see Schweitzer and Cachon 2000 and Song et al. 2017 for similar definitions regarding utility):

$$U(p_t, r_t) =
\begin{cases}
a - p_t + \lambda (r_t - p_t), & \text{if } r_t \ge p_t, \\
a - p_t + \gamma (r_t - p_t), & \text{if } p_t > r_t.
\end{cases} \tag{2}$$

³ For convenience, we use “he” to refer to the retailer and “she” to refer to the customer in what follows.
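
As an illustration of Eq. (1) and Eq. (2), the following Python sketch updates the reference price by exponential smoothing and evaluates the kinked purchasing utility along a short price path. It is a minimal sketch rather than part of the original analysis: the function names are ours, the values of $a$, $\lambda$, and $\gamma$ are borrowed from the Figure 1 example, and the memory factor $\alpha = 0.6$ and the price path are assumptions chosen only for illustration.

```python
def update_reference_price(r_prev: float, p_prev: float, alpha: float) -> float:
    """Eq. (1): r_t = alpha * r_{t-1} + (1 - alpha) * p_{t-1}."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("memory factor alpha must lie in [0, 1)")
    return alpha * r_prev + (1.0 - alpha) * p_prev


def purchasing_utility(p: float, r: float, a: float, lam: float, gamma: float) -> float:
    """Eq. (2): acquisition utility a - p plus the kinked transaction utility."""
    gap = r - p  # positive gap is a perceived gain, negative gap a perceived loss
    transaction = lam * max(gap, 0.0) + gamma * min(gap, 0.0)
    return (a - p) + transaction


if __name__ == "__main__":
    # a, lam, gamma follow the Figure 1 example; alpha and the prices are assumed.
    a, lam, gamma, alpha = 10.6, 1.0, 1.5, 0.6
    r = 3.5
    for p in [4.0, 3.0, 3.0]:
        u = purchasing_utility(p, r, a, lam, gamma)
        print(f"r={r:.3f}  p={p:.2f}  U={u:.3f}")
        r = update_reference_price(r, p, alpha)
```

Because each new reference price is a convex combination of the previous reference price and the previous sales price, it stays inside the feasible price set whenever both inputs do.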

Customers in the market are generally classified into three types according to their asymmetric anchoring on the reference dependence: loss-averse, gain-seeking, and loss-neutral⁴. The reference dependence is referred to as a gain if $r_t \ge p_t$ and as a loss if $r_t < p_t$. Loss-averse customers are more sensitive to surcharges than to discounts of the same size; gain-seeking customers are just the opposite, and loss-neutral customers are equally sensitive to surcharges and discounts. Customers are loss-averse, loss-neutral, or gain-seeking depending on whether $\gamma > \lambda$, $\gamma = \lambda$, or $\gamma < \lambda$ (see Chen et al. 2016b for a similar classification). Positive $\lambda(r_t - p_t)$ can be interpreted as additional utility with respect to acquisition utility, and negative $\gamma(r_t - p_t)$ can be interpreted as a utility reduction. Prospect Theory (Kahneman and Tversky 1979) postulated that customers are more sensitive to losses than to gains in the market, and hence we focus on the loss-averse customer in this paper, i.e., $\gamma > \lambda$. Before proceeding, we make the following assumption on the customers’ purchasing utility.

Assumption 1. A customer may choose to purchase the product only when her purchasing utility is greater than 0, i.e., $U(p_t, r_t) > 0$.

The assumption is widely applied in the existing literature (e.g., Talluri and Van Ryzin 2004). It implies that if a customer’s purchasing utility is less than 0, she will not purchase the product.

⁴ For convenience, we use loss-neutral instead of gain/loss-neutral in what follows.

3.2. Demand function with Multinomial Logit model

In this subsection, we apply a Multinomial Logit (MNL) model to transform the customers’ purchasing utility into a purchase probability, which is closely tied to customers’ demand for the product. The MNL model is well-known and widely used in revenue management due to its simplicity and tractability in estimation, especially when modeling customer choice behavior. Anderson and De Palma (1992) offered a comprehensive description of consumer discrete choice models including the MNL model. Essentially, the MNL model is a special case of the stochastic utility model for a statistically homogeneous population. Kallio et al. (2009) converted customers’ utility to demand through an MNL model, in which an asymmetrical reference price effect is reflected in the utility. Another recent work by Wang (2018) considered a joint assortment planning and pricing problem in a multi-product setting, in which the customers’ choice (purchase) probability is determined from their utility through an MNL model. In that work, the utility is affected by the reference price, and the reference price may be the lowest price or the highest price among all products in the consideration set of a consumer. Here, we use a similar method to characterize the customers’ purchase probability and demand. Specifically, given sales price $p_t$ and reference price $r_t$ in period $t$, a customer’s purchasing utility is $U(p_t, r_t) > 0$ if she makes a purchase; otherwise the utility is denoted as $U_0 = 0$. Denote by $P(p_t, r_t)$ the probability that she chooses to purchase. Thus, following Dong et al. (2009) and Wang (2018), we have

$$P(p_t, r_t) = \frac{\exp(U(p_t, r_t))}{\exp(U_0) + \exp(U(p_t, r_t))} = \frac{\exp(U(p_t, r_t))}{1 + \exp(U(p_t, r_t))}. \tag{3}$$

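We model the demand in period $t$ as $D(p_t, r_t) = d(p_t, r_t) + \varepsilon_t$, where $d(p_t, r_t)$ is the expected demand, and $\varepsilon_t$ is a random variable with density function $\phi(\varepsilon_t)$ and cumulative distribution $\Phi(\varepsilon_t)$. We assume that $E(\varepsilon_t) = 0$ and $Var(\varepsilon_t) < +\infty$. Define

$$d_k(p_t, r_t) = \frac{N \exp(a - p_t + k(r_t - p_t))}{1 + \exp(a - p_t + k(r_t - p_t))}, \quad k \in \{\lambda, \gamma\},$$

where $N$ is the market size (see Zhao et al. (2019b) for a similar definition); when $r_t \ge p_t$, we have $k = \lambda$; otherwise, $k = \gamma$. Loss aversion ($\gamma > \lambda$) allows us to write the kinked expected demand function $d(p_t, r_t)$ as the minimum of two smooth functions $d_k(p_t, r_t)$, $k \in \{\lambda, \gamma\}$: since $d_k$ is increasing in the utility and $\lambda(r_t - p_t) \le \gamma(r_t - p_t)$ exactly when $r_t \ge p_t$, the branch that applies is always the smaller of the two. The expected demand can then be written as follows:

$$d(p_t, r_t) := E[D(p_t, r_t)] = \min\{d_\lambda(p_t, r_t), d_\gamma(p_t, r_t)\} = \min\left\{ \frac{N \exp(a - p_t + \lambda(r_t - p_t))}{1 + \exp(a - p_t + \lambda(r_t - p_t))}, \frac{N \exp(a - p_t + \gamma(r_t - p_t))}{1 + \exp(a - p_t + \gamma(r_t - p_t))} \right\}. \tag{4}$$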
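
The construction in Eq. (3) and Eq. (4) can be checked numerically. The Python sketch below computes both smooth branches $d_\lambda$ and $d_\gamma$ and compares their minimum with the piecewise definition that selects $k = \lambda$ when $r_t \ge p_t$ and $k = \gamma$ otherwise. It is only an illustrative sketch: the function names are ours, the parameters are those reported for the Figure 1 example, and treating the case without a reference price as $k = 0$ is our assumption.

```python
import math


def demand_branch(p: float, r: float, a: float, k: float, n: float) -> float:
    """Smooth branch d_k(p, r) = N exp(U) / (1 + exp(U)) with U = a - p + k (r - p)."""
    u = a - p + k * (r - p)
    return n / (1.0 + math.exp(-u))  # algebraically equal to N e^U / (1 + e^U)


def expected_demand(p: float, r: float, a: float, lam: float, gamma: float, n: float) -> float:
    """Eq. (4): kinked expected demand as the minimum of the two branches."""
    return min(demand_branch(p, r, a, lam, n), demand_branch(p, r, a, gamma, n))


def expected_demand_piecewise(p: float, r: float, a: float, lam: float, gamma: float, n: float) -> float:
    """Same quantity, selecting k = lam when r >= p and k = gamma otherwise."""
    k = lam if r >= p else gamma
    return demand_branch(p, r, a, k, n)


if __name__ == "__main__":
    # Parameters reported for the Figure 1 example.
    n, a, lam, gamma, r = 30.0, 10.6, 1.0, 1.5, 3.5
    for p in [2.0, 3.5, 5.0, 6.5]:
        with_rp = expected_demand(p, r, a, lam, gamma, n)
        without_rp = demand_branch(p, r, a, 0.0, n)  # assumption: no reference effect means k = 0
        print(f"p={p:.1f}  with r.p.={with_rp:.3f}  without r.p.={without_rp:.3f}")
```

The equality of the two formulations relies on $\gamma > \lambda$; for gain-seeking customers the piecewise branch would be the larger of the two, and the minimum would no longer apply.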

Obviously, it follows from Eq. (2)-Eq. (4) that customers with a higher reference price have a greater purchase probability and, correspondingly, a larger demand for the product. Figure 1 displays a demand function example with reference price $r = 3.5$, where “r.p.” denotes “reference price”. From Figure 1, we observe that the reference price has a significant negative effect on the demand if the sales price is set higher than the reference price, and a positive effect otherwise. In contrast to the existing literature, which assumes that the demand function varies linearly with the reference price and the current sales price (e.g., Wang 2016 and Chen et al. 2016a), the demand function in our example seems more realistic.

[Figure 1 plot: expected demand d(p, r) against price p; legend: with r.p., without r.p.]

Figure 1: Demand function example. The curve with ‘*’ displays an example of the demand function in (4) with parameters N = 30, a = 10.6, λ = 1, γ = 1.5. The reference price in this example is r = 3.5. The curve with ‘x’ shows the demand function with the same parameters in the absence of the reference price.

Lemma 1. For any period $t \in \{1, 2, \ldots, T\}$, we have the following results:
(1) $d(p_t, r_t)$ is decreasing in $p_t$ but increasing in $r_t$;
(2) $d(p_t, r_t)$ is continuous and jointly concave;
(3) $d(p_t, r_t)$ is increasing in $\lambda$ but decreasing in $\gamma$.

Lemma 1 shows some vital properties of the demand function. Customers with a higher reference price generally have a higher price expectation for the product, and are therefore more likely to purchase the product at the current sales price. Unless stated otherwise, by the concavity (convexity) of a function we mean joint concavity (convexity) of that function in all of its variables. The concavity of $d(p_t, r_t)$ reflects most realistic practical situations, i.e., demand is more sensitive at lower reference prices or higher sales prices. The concavity is also essential to determine the specific optimal pricing and ordering policy for the retailer. In addition, Lemma 1 suggests that a higher risk coefficient $\lambda$ or a lower $\gamma$ generates a higher demand, which can be attributed to the risk coefficient $\lambda$ ($\gamma$) having a positive (negative) effect on the customers’ purchasing utility.

3.3. The sequence of events and objective function

Figure 2 illustrates the sequence of events. At the beginning of period $t$, the retailer first observes the reference price $r_t$ and the initial inventory level $x_t$. Then, he simultaneously decides the order-up-to level $y_t$ and the sales price $p_t$. The order arrives immediately. Then demand is realized and all the unsatisfied demand is backlogged to the next period. Finally, the retailer collects the profit and incurs the holding and back-order costs. We assume that the realized demand is satisfied only by on-hand inventory, unsatisfied demand is fully backlogged, and any excess inventory is carried over to the next period. Our notation is summarized in Table 1 for later reference.

[Figure 2 timeline from period t to period t+1: observe initial inventory level x_t and reference price r_t; decide order-up-to inventory level y_t and price p_t; order arrives immediately; demand is realized and unsatisfied demand is backlogged; evaluate profit/cost and update reference price r_{t+1}.]

Figure 2: Sequence of events.
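
The sequence of events in Figure 2 can be read as a one-period state transition, sketched below in Python. This is an illustrative sketch under stated assumptions, not the authors' formulation: the `Market` container and the `step` function are hypothetical names, the carry-over rule $x_{t+1} = y_t - D_t$ is inferred from the statement that unsatisfied demand is fully backlogged and excess inventory is carried over, and the requirement $y_t \ge x_t$ reflects that the retailer orders up to $y_t$. The demand shock is passed in explicitly so that the transition is deterministic and easy to test.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Market:
    n: float      # market size N
    a: float      # full acquisition utility
    lam: float    # gain coefficient lambda
    gamma: float  # loss coefficient gamma (gamma > lam for loss-averse customers)
    alpha: float  # memory factor in [0, 1)


def expected_demand(p: float, r: float, m: Market) -> float:
    """Expected demand of Eq. (4), using the piecewise choice of k."""
    k = m.lam if r >= p else m.gamma
    u = m.a - p + k * (r - p)
    return m.n / (1.0 + math.exp(-u))


def step(x: float, r: float, y: float, p: float, eps: float, m: Market):
    """One period: returns (realized demand, next inventory, next reference price)."""
    if y < x:
        # Assumption: the retailer can only order up, never dispose of stock.
        raise ValueError("order-up-to level y must be at least the initial inventory x")
    demand = expected_demand(p, r, m) + eps     # D = d(p, r) + eps
    x_next = y - demand                         # negative values represent backlog
    r_next = m.alpha * r + (1.0 - m.alpha) * p  # exponential smoothing, Eq. (1)
    return demand, x_next, r_next


if __name__ == "__main__":
    # N, a, lambda, gamma follow the Figure 1 example; alpha, y, p, eps are assumed.
    market = Market(n=30.0, a=10.6, lam=1.0, gamma=1.5, alpha=0.6)
    demand, x_next, r_next = step(x=0.0, r=3.5, y=30.0, p=4.0, eps=0.0, m=market)
    print(f"demand={demand:.3f}  next inventory={x_next:.3f}  next reference price={r_next:.3f}")
```

Iterating `step` with a chosen pair $(y_t, p_t)$ in each period traces out one sample path of inventory levels and reference prices over the planning horizon.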

Table 1: Notation

Notation            Description
$N$                 Market size
$P$                 Feasible price set, $P = [\underline{p}, \bar{p}]$
$\alpha$            Memory factor, $\alpha \in [0, 1)$
$\delta$            Discount factor, $\delta \in [0, 1)$
$\lambda, \gamma$   Risk coefficients, $\lambda, \gamma > 0$
$h$                 Unit holding cost
$b$                 Unit back-order cost
$\phi(\cdot)$       Density function of random variable $\varepsilon_t$
$\Phi(\cdot)$       Distribution function of random variable $\varepsilon_t$
$w_t$               Customers’ acquisition utility in period $t$
$u_t$               Customers’ transaction utility in period $t$
$U_t$               Customers’ purchasing utility in period $t$
$P_t$               Customers’ purchase probability in period $t$
$x_t$               Initial inventory level at the beginning of period $t$
$y_t$               Inventory level after receiving order in period $t$
$H_t$ ($G_t$)       (Expected) inventory cost in period $t$
$D_t$ ($d_t$)       (Expected) demand in period $t$

Let $H_t(z)$ be the inventory cost when the retailer is left with inventory $z$ at the end of period $t$. A holding cost is incurred when $z > 0$ and a back-order cost when $z < 0$. Specifically, we consider $H_t(z) = h z^+ + b z^-$, where $z^+ = \max\{z, 0\}$, $z^- = \max\{-z, 0\}$, and $h\,(> 0)$ and $b\,(> 0)$ are the per-period unit holding cost and unit backlogging cost, respectively. The expected inventory cost in period $t$ is then given by

$$
\begin{aligned}
G_t(y_t, p_t, r_t) &= \mathbb{E} H_t\big(y_t - D(p_t, r_t)\big) \\
&= \int_{-\infty}^{y_t - d(p_t, r_t)} h\big(y_t - d(p_t, r_t) - \varepsilon_t\big)\phi(\varepsilon_t)\,d\varepsilon_t - \int_{y_t - d(p_t, r_t)}^{+\infty} b\big(y_t - d(p_t, r_t) - \varepsilon_t\big)\phi(\varepsilon_t)\,d\varepsilon_t. \qquad (5)
\end{aligned}
$$
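When the random term is standard normal, as in the numerical experiment of Section 4, the two integrals in Eq. (5) have a closed form. The sketch below is illustrative only: the function names are hypothetical, and the normality of ε_t is an assumption taken from that experiment rather than a requirement of Eq. (5). It evaluates the closed form, cross-checks it against a direct midpoint-rule integration, and exposes the marginal cost (h + b)Φ(y - d) - b that appears in Assumption 2.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist(0.0, 1.0)


def expected_inventory_cost(y, d, h, b):
    """Closed form of Eq. (5) for eps_t ~ N(0, 1); z = y - d is the mean ending inventory."""
    z = y - d
    pdf, cdf = STD_NORMAL.pdf(z), STD_NORMAL.cdf(z)
    expected_overage = z * cdf + pdf            # E[(z - eps)^+]
    expected_underage = pdf - z * (1.0 - cdf)   # E[(eps - z)^+]
    return h * expected_overage + b * expected_underage


def expected_inventory_cost_numeric(y, d, h, b, lo=-8.0, hi=8.0, steps=20000):
    """Midpoint-rule evaluation of the integrals in Eq. (5), used only as a cross-check."""
    z = y - d
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        eps = lo + (i + 0.5) * width
        left = z - eps                           # realized ending inventory
        cost = h * left if left > 0 else -b * left
        total += cost * STD_NORMAL.pdf(eps) * width
    return total


def marginal_cost_in_y(y, d, h, b):
    """Derivative of the expected cost with respect to y: (h + b) * Phi(y - d) - b."""
    return (h + b) * STD_NORMAL.cdf(y - d) - b
```

The marginal cost vanishes at y = d + Φ^{-1}(b / (b + h)), which is the critical-fractile level that reappears in Eq. (6).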

To facilitate the analysis, we impose the following general assumption.

Assumption 2. $b - (h + b)\Phi(y_t - d(p_t, r_t)) \le 0$.

This assumption is common in the inventory literature (e.g., Cheng and Sethi 1999 and Zhang et al. 2008). It rules out trivial cases such as "always order nothing" or "order infinity". We have the following result.

Lemma 2. For any period $t \in \{1, 2, \ldots, T\}$, if Assumption 2 holds, then $G_t(y_t, p_t, r_t)$ is jointly convex.

The convexity of the expected inventory cost is vital for characterizing the optimal policy for the retailer. Federgruen and Heching (1999) proved that the inventory cost is jointly convex in the order-up-to level and sales price, but they did not consider the reference price effect. Güler et al. (2014) transformed the problem by defining an inverse demand function and took the mean demand as the variable instead of price so that the inventory cost is jointly convex.

Loss aversion ($\gamma > \lambda$) allows us to write the single-period revenue $\Pi(p_t, r_t)$ as the minimum of two smooth revenue functions, $\Pi_k(p_t, r_t) = p_t d_k(p_t, r_t)$ for $k \in \{\lambda, \gamma\}$. That is, $\Pi(p_t, r_t) = \min\{\Pi_\lambda(p_t, r_t), \Pi_\gamma(p_t, r_t)\}$. Let $V_t(x_t, r_t)$ denote the maximum expected discounted profit at the beginning of period $t$ and $\hat{V}_t(y_t, p_t, r_t)$ denote the value function in period $t$ after receiving the order and setting the price. Note that we normalize the unit ordering cost to 0 in this paper; it is well known (e.g., Huh and Janakiraman 2008 and Shen et al. 2018) that a system with a positive unit ordering cost can be transformed equivalently into one in which the ordering cost is 0 and the other cost parameters are modified suitably. Then, the dynamic programming model can be written as follows:

Problem (O):

$$
V_t(x_t, r_t) = \max_{y_t,\, p_t:\; y_t \ge x_t} \hat{V}_t(y_t, p_t, r_t), \qquad \text{(O1)}
$$

$$
\hat{V}_t(y_t, p_t, r_t) = \Pi(p_t, r_t) - G_t(y_t, p_t, r_t) + \delta\, \mathbb{E} V_{t+1}\big(y_t - D(p_t, r_t),\; \alpha r_t + (1 - \alpha) p_t\big), \qquad \text{(O2)}
$$

where $\delta \in [0, 1)$ is the discount factor. The terminal value is $V_{T+1}(x_{T+1}, r_{T+1}) = 0$.

4. Optimal myopic pricing and inventory policy

In this section, we first analyze the optimal pricing and inventory policies for the retailer in a single period, i.e., the optimal myopic policy, which is chosen to maximize the profit of the current period without considering the impact of the policy on subsequent periods. Then, we compare the optimal myopic policy with the case in which the retailer ignores (or is not aware of) the reference price effect.

Given a reference price $r_t$, we define $[\tilde{p}(r_t), \tilde{y}(r_t)]$ as the optimal myopic pricing and inventory policy, which is obtained by maximizing the single-period profit

$$
\bar{V}_t(y_t, p_t) = \Pi(p_t, r_t) - G_t(y_t, p_t, r_t).
$$

When multiple optimal solutions exist for the above problem, we always select the lexicographically smallest one for convenience.

Lemma 3. For any period $t \in \{1, 2, \ldots, T\}$, $\bar{V}_t(y_t, p_t)$ is jointly concave in $y_t$ and $p_t$.

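The concavity of $\bar{V}_t(y_t, p_t)$ implies that the optimal solution, i.e., the optimal myopic policy, can be derived by applying the first-order conditions. Taking the derivatives of $\bar{V}_t(y_t, p_t)$ with respect to $y_t$ and $p_t$, respectively, we obtain

$$
\frac{\partial \bar{V}_t(y_t, p_t)}{\partial y_t} = -\frac{\partial G_t}{\partial y_t},
$$

$$
\frac{\partial \bar{V}_t(y_t, p_t)}{\partial p_t} = d_k(p_t, r_t) + p_t \cdot \frac{\partial d_k(p_t, r_t)}{\partial p_t} + \frac{\partial G_t}{\partial y_t} \frac{\partial d_k(p_t, r_t)}{\partial p_t},
$$

where $\frac{\partial G_t}{\partial p_t} = -\frac{\partial G_t}{\partial y_t} \frac{\partial d_k(p_t, r_t)}{\partial p_t}$, because $G_t$ depends on $p_t$ only through $y_t - d_k(p_t, r_t)$, and $\frac{\partial G_t}{\partial y_t} = (h + b)\Phi(y_t - d_k(p_t, r_t)) - b$, for $k \in \{\lambda, \gamma\}$. It follows from Assumption 2 that $\frac{\partial G_t}{\partial y_t} \ge 0$, which is consistent with the practical observation that the more products the retailer orders, the higher the inventory cost will be.

Define $[\tilde{p}_k(r_t), \tilde{y}_k(r_t)]$ as the solution to $\frac{\partial \bar{V}_t(y_t, p_t)}{\partial y_t} = 0$ and $\frac{\partial \bar{V}_t(y_t, p_t)}{\partial p_t} = 0$. Then, $[\tilde{p}_k(r_t), \tilde{y}_k(r_t)]$ satisfies:

$$
\tilde{y}_k(r_t) = d_k(\tilde{p}_k(r_t), r_t) + \Phi^{-1}\Big(\frac{b}{b + h}\Big), \qquad (6)
$$

$$
\tilde{p}_k(r_t) = \frac{1 + \exp\big(a - \tilde{p}_k(r_t) + k(r_t - \tilde{p}_k(r_t))\big)}{1 + k}. \qquad (7)
$$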
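Eq. (7) defines the myopic price only implicitly, so in practice it has to be solved numerically. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the expected demand is taken to be the MNL form N exp(u) / (1 + exp(u)) with u = a - p + k(r - p), which is inferred from Eq. (7) rather than quoted from the text, and ε_t is taken to be standard normal as in the numerical experiment. Bisection is used because the residual of Eq. (7) is strictly increasing in p.

```python
import math
from statistics import NormalDist


def demand_k(p, r, k, a, n):
    """Assumed MNL expected demand with utility a - p + k * (r - p)."""
    u = a - p + k * (r - p)
    return n / (1.0 + math.exp(-u))


def myopic_price_k(r, k, a, lo=0.0, hi=50.0, tol=1e-10):
    """Solve Eq. (7), p = (1 + exp(a - p + k (r - p))) / (1 + k), by bisection."""
    def residual(p):
        return p - (1.0 + math.exp(a - p + k * (r - p))) / (1.0 + k)

    # The residual is strictly increasing in p, so a sign change brackets the unique root.
    if residual(lo) > 0 or residual(hi) < 0:
        raise ValueError("root not bracketed; widen [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def myopic_inventory_k(r, k, a, n, h, b):
    """Eq. (6): expected demand at the myopic price plus the critical-fractile safety stock."""
    p = myopic_price_k(r, k, a)
    return demand_k(p, r, k, a, n) + NormalDist().inv_cdf(b / (b + h))
```

Note that this solves the unconstrained first-order condition; restricting the result to the feasible price set P is left to the caller.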

By analysing the above equations, we obtain the following results.

Proposition 1. $\tilde{p}_k(r_t)$ is increasing in $r_t$ and $k$; $\tilde{y}_k(r_t)$ is decreasing in $k$.

Proposition 1 implies that, given a reference price $r_t$, $\tilde{p}_\gamma(r_t) > \tilde{p}_\lambda(r_t)$ holds (since $\gamma > \lambda$). This is consistent with the model setting in which the risk coefficient is $\lambda$ when $r_t \ge p_t$ and $\gamma$ when $r_t < p_t$. It further implies that if $r_t < \tilde{p}_\lambda(r_t) < \tilde{p}_\gamma(r_t)$, the optimal myopic price is $\tilde{p}_\gamma(r_t)$ and the corresponding optimal myopic inventory level is $\tilde{y}_\gamma(r_t)$; if $r_t > \tilde{p}_\gamma(r_t) > \tilde{p}_\lambda(r_t)$, the optimal myopic price is $\tilde{p}_\lambda(r_t)$ and the corresponding optimal myopic inventory level is $\tilde{y}_\lambda(r_t)$. If $\tilde{p}_\lambda(r_t) \le r_t \le \tilde{p}_\gamma(r_t)$, the optimal myopic price is $r_t$, which is a common observation in the existing literature, e.g., Chen et al. (2016b) and Popescu and Wu (2007). Define $\tilde{y}_{r_t}$ as the optimal myopic inventory level when the optimal myopic price is $r_t$; it follows from Eq. (6) that

$$
\tilde{y}_{r_t} = \frac{N \exp(a - r_t)}{1 + \exp(a - r_t)} + \Phi^{-1}\Big(\frac{b}{b + h}\Big).
$$

In addition, it follows from Eq. (6) and Lemma 1 that the retailer should order a lower inventory level if he sets a higher price, i.e., $\tilde{y}_\gamma(r_t) < \tilde{y}_\lambda(r_t)$, which is a significant finding in the single-period setting. Clearly, the optimal myopic price is one of $\tilde{p}_\gamma(r_t)$, $\tilde{p}_\lambda(r_t)$ or $r_t$, i.e., $\tilde{p}(r_t) \in \{\tilde{p}_\gamma(r_t), \tilde{p}_\lambda(r_t), r_t\}$, and the optimal myopic inventory level satisfies $\tilde{y}(r_t) \in \{\tilde{y}_\gamma(r_t), \tilde{y}_\lambda(r_t), \tilde{y}_{r_t}\}$. The optimal myopic policy is therefore $[\tilde{p}_\lambda(r_t), \tilde{y}_\lambda(r_t)]$, $[\tilde{p}_\gamma(r_t), \tilde{y}_\gamma(r_t)]$ or $[r_t, \tilde{y}_{r_t}]$. We have the following result (the proof follows from the above discussion and is thus omitted).

Lemma 4. For a given reference price $r_t$ in period $t$,
(1) if $r_t < \tilde{p}_\lambda(r_t) < \tilde{p}_\gamma(r_t)$, the optimal myopic policy is $[\tilde{p}_\gamma(r_t), \tilde{y}_\gamma(r_t)]$;
(2) if $\tilde{p}_\lambda(r_t) < \tilde{p}_\gamma(r_t) < r_t$, the optimal myopic policy is $[\tilde{p}_\lambda(r_t), \tilde{y}_\lambda(r_t)]$;
(3) if $\tilde{p}_\lambda(r_t) \le r_t \le \tilde{p}_\gamma(r_t)$, the optimal myopic policy is $[r_t, \tilde{y}_{r_t}]$.

Given a reference price $r_t$, it follows from Eq. (6) that the optimal inventory level depends only on the optimal price, and hence three kinds of relationship between $\tilde{p}_\gamma(r_t)$, $\tilde{p}_\lambda(r_t)$ and $r_t$ arise. Lemma 4 implies that $[\tilde{p}_\lambda(r_t), \tilde{y}_\lambda(r_t)]$ is the optimal myopic policy if and only if $\tilde{p}_\gamma(r_t) < r_t$. Similarly, $[\tilde{p}_\gamma(r_t), \tilde{y}_\gamma(r_t)]$ is the optimal myopic policy if and only if $\tilde{p}_\lambda(r_t) > r_t$, and $[r_t, \tilde{y}_{r_t}]$ is the optimal myopic policy if and only if $\tilde{p}_\lambda(r_t) \le r_t \le \tilde{p}_\gamma(r_t)$.

Modern customers are becoming increasingly shrewd in their purchase behavior: they refer to past prices when considering the current sales price before completing a purchase, i.e., the reference price effect does exist among customers. This being the case, it is interesting to explore the retailer's optimal policies if he ignores the reference price effect, which in this paper can be interpreted as the reference price having no effect on the customers' purchasing utility, i.e., $\lambda = \gamma = 0$. Define $[\bar{p}_t, \bar{y}_t]$ as the optimal myopic policy if the retailer ignores the reference price. Then, $\bar{y}_t$ and $\bar{p}_t$ satisfy

$$
\bar{y}_t = \frac{N \exp(a - \bar{p}_t)}{1 + \exp(a - \bar{p}_t)} + \Phi^{-1}\Big(\frac{b}{h + b}\Big), \qquad (8)
$$

$$
\bar{p}_t = 1 + \exp(a - \bar{p}_t). \qquad (9)
$$
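As a short consistency check, Eqs. (8) and (9) can be read as the special case k = 0 of Eqs. (6) and (7). The display below is a sketch that assumes the expected demand has the MNL form $d_k(p_t, r_t) = N\exp(a - p_t + k(r_t - p_t)) / (1 + \exp(a - p_t + k(r_t - p_t)))$, which is inferred from Eq. (7) rather than restated in this section.

```latex
\begin{aligned}
\tilde{p}_k(r_t) &= \frac{1 + \exp\big(a - \tilde{p}_k(r_t) + k(r_t - \tilde{p}_k(r_t))\big)}{1 + k}
  &&\xrightarrow{\;k = 0\;}&
  \bar{p}_t &= 1 + \exp(a - \bar{p}_t), \\
\tilde{y}_k(r_t) &= d_k(\tilde{p}_k(r_t), r_t) + \Phi^{-1}\Big(\frac{b}{b + h}\Big)
  &&\xrightarrow{\;k = 0\;}&
  \bar{y}_t &= \frac{N \exp(a - \bar{p}_t)}{1 + \exp(a - \bar{p}_t)} + \Phi^{-1}\Big(\frac{b}{h + b}\Big).
\end{aligned}
```

With k = 0 the reference price $r_t$ drops out of both conditions, so the policy that ignores the reference price effect does not vary with $r_t$.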

Corollary 1. If the retailer ignores (or does not realize) the reference price effect in the system, the optimal myopic policy $[\bar{p}_t, \bar{y}_t]$ is given by Eq. (8) and Eq. (9).

In the following, we use a numerical experiment to illustrate the above theoretical results by comparing the optimal myopic policies in two cases: with and without the reference price effect. We assume the market size $N = 30$, the feasible price set $P = [1, 10]$, and the customers' full acquisition utility $a = 10.6$ ($a > \bar{p}$, so that the customers' acquisition utility is greater than 0). The risk coefficients are $\lambda = 1$ and $\gamma = 1.5$. The unit holding cost is $h = 0.1$, the unit back-order cost is $b = 0.6$, the memory factor is $\alpha = 0.6$, and the discount factor is $\delta = 0.9$. The random term $\varepsilon_t$ follows a standard normal distribution, i.e., $\varepsilon_t \sim N(0, 1)$.

In Figure 3, the red lines marked with "∗" show the optimal myopic policies with the reference price effect, while the solid blue lines represent the optimal myopic policies without the reference price effect. We observe that, with the reference price effect, both the optimal myopic price and the optimal myopic inventory are increasing in the reference price. From Figure 3(a), we find that there exists a threshold ($\tilde{r} = 8.55$) such that when the reference price is no higher than that threshold, i.e., $r_t \le \tilde{r}$, the optimal myopic price is $\tilde{p}_\gamma(r_t)$ and $\tilde{p}_\gamma(r_t) \ge r_t$; otherwise, the optimal myopic price is $\tilde{p}_\lambda(r_t)$ and $\tilde{p}_\lambda(r_t) < r_t$. Without the reference price effect, the optimal price is $\bar{p}_t = 8.7583$ and the optimal inventory level is $\bar{y}_t = 26.7566$. When the reference price $r_t \le 8.8808$, the optimal myopic price with the reference price effect is lower than that without the effect, and when the reference price $r_t \le 8.605$, the optimal myopic inventory with the reference price effect is lower than that without the effect.

[Figure 3 panels: (a) myopic optimal price p(r) versus reference price r, showing the optimal price, the line p = r, and the price without the reference price effect; (b) myopic optimal inventory y(r) versus reference price r, with and without the reference price effect.]

Figure 3: The retailer’s optimal myopic pricing and inventory policies with and without the reference price effect.

5. Optimal policy for the multi-period problem

In this section, we explore the optimal pricing and inventory policies for the retailer in the multi-period scenario. We first perform an equivalent transformation on the original problem. Then, we prove that a reference price-dependent base-stock-list-price (BSLP)^5 policy is optimal in each period. Next, we give an example to explore the optimal price and inventory level paths for the retailer. The BSLP policy is generally proved to be optimal in the literature that jointly considers pricing and inventory, and a common method to prove the optimality of BSLP is to show the concavity of the profit-to-go function and the value function (e.g., Gimpl-Heersink 2008 and Güler et al. 2014). Finally, we compare and discuss the optimal policies in two scenarios: the retailer considers or ignores the reference price effect.

^5 The term "base-stock-list-price" policy was coined by Porteus (1990). A BSLP policy is characterized by a base stock level and list price combination. If the initial inventory level is below the base stock level, it is increased to the base stock level and the list price is charged; if the initial inventory is above the base stock level, then nothing is ordered and a price discount is offered. Readers can refer to Federgruen and Heching (1999).

5.1. Problem equivalence

In the following, we perform a transformation similar to that of Chen et al. (2016b) on problem (O) so that the transformed single-period revenue function is jointly concave (the single-period revenue $\Pi(p_t, r_t)$ is not jointly concave; see the proof of Theorem 1). The joint concavity of the transformed function is a sufficient condition to show the concavity of the profit-to-go function and the value function. Specifically, we define the transformed profit-to-go function in period $t$, $U_t(x_t, r_t)$, by subtracting a positive quadratic term $\theta r_t^2$ from $V_t(x_t, r_t)$, i.e., $U_t(x_t, r_t) = V_t(x_t, r_t) - \theta r_t^2$. Then, problem (O) can be rewritten as

Problem (U):

$$
U_t(x_t, r_t) = \max_{y_t,\, p_t:\; y_t \ge x_t} \hat{U}_t(y_t, p_t, r_t), \qquad \text{(U1)}
$$

$$
\hat{U}_t(y_t, p_t, r_t) = \hat{\Pi}(p_t, r_t) - G_t(y_t, p_t, r_t) + \delta\, \mathbb{E} U_{t+1}\big(y_t - D(p_t, r_t),\; \alpha r_t + (1 - \alpha) p_t\big), \qquad \text{(U2)}
$$

where $\hat{\Pi}(p_t, r_t) = \Pi(p_t, r_t) - \theta r_t^2 + \theta\delta[\alpha r_t + (1 - \alpha) p_t]^2$ can be viewed as the transformed single-period revenue of problem (U), and the terminal value is $U_{T+1}(x_{T+1}, r_{T+1}) = -\theta r_{T+1}^2$. We denote by $[p(x_t, r_t), y(x_t, r_t)]$ the lexicographically smallest optimal solution when multiple optimal solutions exist for problem (U). The optimal solution to problem (U) also solves problem (O), and vice versa.

Through this transformation, the coefficient of $r_t^2$ becomes more negative while that of $p_t^2$ becomes less negative in the transformed revenue function $\hat{\Pi}(p_t, r_t)$, which bends the curvature of $\Pi(p_t, r_t)$ to an extent that yields joint concavity in $\hat{\Pi}(p_t, r_t)$. Therefore, it is possible to select an appropriate parameter such that $\hat{\Pi}(p_t, r_t)$ is jointly concave.

Theorem 1. The following results hold:
(1) $\Pi(p_t, r_t)$ is not jointly concave in $p_t$ and $r_t$, but there exists a positive constant $\theta$ such that $\hat{\Pi}(p_t, r_t)$ is jointly concave in $p_t$ and $r_t$;
(2) Both $\Pi(p_t, r_t)$ and $\hat{\Pi}(p_t, r_t)$ are supermodular in $(p_t, r_t)$.

Item (1) is similar to Chen et al. (2016b), in which the authors also proved that there exists a positive constant parameter such that the single-period revenue is concave; their result is based on the linearity of the demand function in the reference price. In this paper, we prove that the

result still holds even if the demand function is determined from the reference price through an MNL model. The concavity of the transformed single-period revenue is essential for characterizing the optimal policy in the multi-period case, and the specific policy is explored in §5.2. Item (2) confirms the intuition that myopic retailers, i.e., those focused on single-period revenue, should charge higher sales prices if the customers have higher price expectations (i.e., reference prices), which is consistent with the optimal myopic policy (see Proposition 1). Let $p^*(r_t)$ be the maximizer of $\hat{\Pi}(p_t, r_t)$, i.e., $p^*(r_t) = \arg\max_{p_t \in P} \hat{\Pi}(p_t, r_t)$. Then, it immediately follows from Theorem 2.8.2 in Topkis (2011) that $p^*(r_t)$ is increasing in $r_t$. The result is also similar to Lemma 1 in Nasiry and Popescu (2011), in which the reference price evolves by a peak-end rule and the demand function is linear in the reference price.

5.2. The optimal policy with reference price effect

In a model that jointly considers pricing and inventory, a sufficient condition for the optimality of a BSLP policy is the concavity of the profit-to-go function $U_t(x_t, r_t)$ and the value function $\hat{U}_t(y_t, p_t, r_t)$ (e.g., Gimpl-Heersink 2008 and Güler et al. 2014). We have the following statements.

Theorem 2. For any period $t \in \{1, 2, \ldots, T\}$:
(1) $V_t(x_t, r_t)$ and $U_t(x_t, r_t)$ are increasing in $r_t$ and $\lambda$, but decreasing in $\gamma$;
(2) $U_t(x_t, r_t)$ is decreasing in $x_t$ and jointly concave;
(3) $\hat{U}_t(y_t, p_t, r_t)$ is jointly concave and submodular in $(y_t, p_t)$.

Theorem 2 states some vital properties of the profit-to-go and value functions. Specifically, both $V_t(x_t, r_t)$ and $U_t(x_t, r_t)$ increase in the reference price $r_t$ and the risk coefficient $\lambda$, and they decrease in the risk coefficient $\gamma$. In addition, Theorem 2 states that $U_t(x_t, r_t)$ and $\hat{U}_t(y_t, p_t, r_t)$ can be made jointly concave by properly selecting $\theta$, and that $U_t(x_t, r_t)$ is decreasing in the initial inventory $x_t$. Another significant property obtained from Theorem 2 is the submodularity of $\hat{U}_t(y_t, p_t, r_t)$ with respect to $y_t$ and $p_t$, which implies that the optimal inventory level is decreasing in the optimal price. That is, if the retailer selects a higher price, then he needs to select a lower inventory level to maximize the total profit; this property also applies to the single-period scenario (see §4). This result fits common intuition: if a retailer raises his sales price, then he must order fewer products to reduce the inventory level; otherwise he will be left with much unsold product at the end of the selling horizon.

Based on the concavity shown in Theorem 2, problem (U) admits optimal solutions in period $t$, which we denote as $[p(x_t, r_t), y(x_t, r_t)]$. In the case of multiple maxima, we select $[p(x_t, r_t), y(x_t, r_t)]$ to be the lexicographically smallest. Now, we are ready to characterize and analyze the optimal policy for the multi-period scenario. Consider problem (U) with the constraint $y_t \ge x_t$ relaxed, that is,

$$
\max_{y_t,\, p_t} \Big\{ \hat{\Pi}(p_t, r_t) - G_t(y_t, p_t, r_t) + \delta\, \mathbb{E} U_{t+1}\big(y_t - D(p_t, r_t),\; \alpha r_t + (1 - \alpha) p_t\big) \Big\}.
$$

Suppose that $[p(r_t), y(r_t)]$ solves the above problem. We assume that $[p(r_t), y(r_t)]$ is the lexicographically smallest solution if there are multiple solutions. Based on the above theoretical results, we show that a reference price-dependent BSLP policy is optimal in each period; it is characterized by a base-stock level and list-price combination, $[p(r_t), y(r_t)]$. If the initial inventory level is below the base-stock level, it is optimal for the retailer to increase the inventory level to the base-stock level and charge the list price. If, however, the initial inventory level is above the base-stock level, then he orders nothing and charges a markdown price to clear excess inventory. Moreover, the higher the excess in the initial inventory level, the larger the optimal discount offered to clear the inventory. That is, the optimal price is nonincreasing in the initial inventory level, and no discounts are offered unless the product is overstocked. When the initial inventory level is below the base-stock level, the higher the initial inventory level the larger the base-stock level should be set because the list price is independent of the initial inventory level. The following theorem states our main result.

Theorem 3. For period $t$, given state $(x_t, r_t)$, there exist a reference price-dependent base-stock level $y(r_t)$, a list sales price $p(r_t)$, and a markdown price $p(x_t, r_t)$, such that the optimal pricing and inventory policy in period $t$ is given as follows:
(1) If $x_t \le y(r_t)$, it is optimal for the retailer to order up to the base-stock level $y(r_t)$ (i.e., order $y(r_t) - x_t$) and charge the list price $p(r_t) = p(y(r_t), r_t)$;
(2) If $x_t > y(r_t)$, it is optimal for the retailer to order nothing (i.e., $y(x_t, r_t) = x_t$) and charge a markdown price $p(x_t, r_t)$, with $p(x_t, r_t) \le p(r_t)$.
Moreover, $p(x_t, r_t) = p(r_t)$ if $x_t \le y(r_t)$, and $y(x_t, r_t) = \max\{x_t, y(r_t)\}$. Also, $p(x_t, r_t)$ is nonincreasing in $x_t$ and $y(x_t, r_t)$ is nondecreasing in $x_t$.

Theorem 3 shows the optimality of a reference price-dependent BSLP policy. The list price is independent of the initial inventory level, while the markdown price is closely related to the initial inventory level. The base-stock level is determined by the reference price if the initial inventory level is less than the base-stock level; otherwise, the inventory level after ordering equals the initial inventory level, i.e., the retailer orders nothing. Obviously, the list price and base-stock level are intimately tied to each other

via the reference price. Although Gimpl-Heersink (2008) and Chen et al. (2016b) showed results similar to Theorem 3, the former assumed that the expected inventory cost is jointly convex, and the latter assumed that the demand function is linear in the reference price. Our paper is the first study that shows the optimality of a reference price-dependent BSLP policy in each period when the demand function is determined from customers' purchasing utility through an MNL model, the utility is closely tied to the reference price, and the customers in the market are loss-averse. Next, we use an example to illustrate Theorem 3 and further investigate the optimal price and inventory level paths.

Example 1: We assume the initial (period 0) reference price is $r_0$, there is no initial on-hand inventory ($x_0 = 0$), and the optimal (list) price in this period is $p_0(r_0)$. The demand increases with the reference price at a given sales price (see Lemma 1), and a higher sales price generates a higher reference price by the formation of the reference price (see Eq. (1)); these two facts give the retailer an incentive to increase the sales price. Let $p_{m_1}(r_{m_1})$ be the optimal price in period $m_1$ such that in the next period $m_1 + 1$ the initial inventory level is higher than the optimal inventory level, i.e., $x_{m_1+1} > y_{m_1+1}(r_{m_1+1})$. If so, then the retailer will reduce the price in period $m_1 + 1$. In addition, since the markdown price $p(x_t, r_t)$ is decreasing in $x_t$ (see Theorem 3), we obtain that $p_{m_1+1}(x_{m_1+1}, r_{m_1+1})$ is less than $p_0(r_0)$. Similar to the optimal price path during periods $[0, m_1]$, the optimal price would then increase to a threshold $p_{m_2}(r_{m_2})$ such that in the next period the initial inventory level is higher than the optimal inventory level. Then, the optimal price in period $m_2 + 1$ would decrease. We conclude that in this example the optimal price path is cyclic as time goes on.

See Figure 4(a), where $p_0 = p_0(r_0)$, $p_1 = p_{m_i+1}(x_{m_i+1}, r_{m_i+1})$ and $p_2 = p_{m_i}(r_{m_i})$ ($i = 1, 2, \ldots$, $m_i \le T$). Figure 4(b) illustrates the optimal base-stock level path, and we can clearly observe that in the long run the optimal inventory is also cyclic. That level keeps decreasing until it reaches a threshold point such that the retailer orders nothing in the following periods, where $y_0 = y_0(r_0)$, and $y_1 = y_{m_i}(r_{m_i})$ denotes the points such that in the next period the initial inventory level is larger than the base-stock level, i.e., $x_{m_i+1} > y(r_{m_i+1})$; $y_2 = y_{m_i'}(r_{m_i'})$ denotes the points such that the initial inventory level is less than the base-stock level (i.e., $x_{m_i'} < y_{m_i'}(r_{m_i'})$), causing the retailer to begin to order again. In both subfigures, the $m_i$ are the same threshold values. The opposite monotonicity of the optimal price and base-stock level paths is also consistent with the submodularity of $\hat{U}_t(y_t, p_t, r_t)$ in Theorem 2. Note that the dashed lines in $[m_1, m_1']$ and $[m_2, m_2']$ denote the retailer ordering nothing in these periods.

[Figure 4 panels: (a) Optimal price paths in multi-period, with price levels p_0, p_1, p_2 over periods up to T; (b) Optimal inventory paths in multi-period, with inventory levels y_0, y_1, y_2 over periods up to T.]

Figure 4: Optimal Policy for Example 1.
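Paths such as those in Example 1 come from solving problem (O) by backward induction. The following is a coarse sketch of that recursion, not the authors' computation and not a reproduction of Figure 4. Everything specific in it is an assumption made for illustration: the MNL demand form (inferred from Eq. (7)), the three-point approximation of the demand shock, the integer grids, the short horizon, and the nearest-grid-point treatment of next-period states.

```python
import math

# Illustrative parameters (values borrowed from the Section 4 experiment; T is arbitrary).
N, A, LAM, GAM = 30.0, 10.6, 1.0, 1.5
H, B, ALPHA, DELTA, T = 0.1, 0.6, 0.6, 0.9, 3
SHOCKS = [(-1.0, 0.25), (0.0, 0.5), (1.0, 0.25)]   # (value, probability), mean zero
X_GRID = [float(x) for x in range(-10, 41, 2)]      # inventory levels (negative = backlog)
R_GRID = [float(r) for r in range(1, 11)]           # reference prices, reused as the price grid


def demand(p, r):
    """Assumed MNL expected demand; k = LAM for perceived gains (r >= p), GAM for losses."""
    k = LAM if r >= p else GAM
    return N / (1.0 + math.exp(-(A - p + k * (r - p))))


def snap(value, grid):
    """Index of the grid point closest to value (nearest-neighbour state aggregation)."""
    return min(range(len(grid)), key=lambda i: abs(grid[i] - value))


def solve():
    """Backward induction for problem (O); returns the period-1 values and all policies."""
    value = [[0.0] * len(R_GRID) for _ in X_GRID]    # terminal condition V_{T+1} = 0
    policies = []
    for _ in range(T):
        # Post-decision value from Eq. (O2) for every grid triple (y, p, r).
        post = {}
        for ri, r in enumerate(R_GRID):
            for p in R_GRID:
                d = demand(p, r)
                r_next = snap(ALPHA * r + (1 - ALPHA) * p, R_GRID)
                for yi, y in enumerate(X_GRID):
                    total = p * d                    # expected revenue
                    for eps, prob in SHOCKS:
                        left = y - d - eps           # ending inventory
                        cost = H * left if left > 0 else -B * left
                        total += prob * (-cost + DELTA * value[snap(left, X_GRID)][r_next])
                    post[(yi, p, ri)] = total
        # Eq. (O1): maximise over order-up-to levels y >= x and prices p.
        new_value = [[0.0] * len(R_GRID) for _ in X_GRID]
        policy = {}
        for xi in range(len(X_GRID)):
            for ri in range(len(R_GRID)):
                candidates = ((post[(yi, p, ri)], X_GRID[yi], p)
                              for yi in range(xi, len(X_GRID)) for p in R_GRID)
                # max returns the first maximiser, i.e. the smallest (y, p) in grid order.
                best = max(candidates, key=lambda c: c[0])
                new_value[xi][ri] = best[0]
                policy[(X_GRID[xi], R_GRID[ri])] = (best[1], best[2])
        value = new_value
        policies.append(policy)
    policies.reverse()                               # policies[0] belongs to period 1
    return value, policies
```

Looking up `policies[0][(x, r)]` returns the order-up-to level and price for state (x, r) in period 1. Because the feasible set {y : y ≥ x} shrinks as x grows, the computed values are nonincreasing in x by construction; whether the grid solution also displays the exact base-stock structure of Theorem 3 depends on how fine the grids are.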

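In the following, we explore the relationship between the optimal policies and the risk coefficients $\lambda$ and $\gamma$. Loss aversion ($\gamma > \lambda > 0$) allows us to write $\hat{\Pi}(p_t, r_t)$ as the minimum of two smooth functions $\hat{\Pi}_k(p_t, r_t)$ for $k \in \{\lambda, \gamma\}$, where $\hat{\Pi}_k(p_t, r_t) = p_t \cdot d_k(p_t, r_t) - \theta r_t^2 + \theta\delta[\alpha r_t + (1 - \alpha) p_t]^2$. Consider the following two problems:

$$
U_t^{\lambda}(r_t) = \max_{y_t,\, p_t} \Big\{ \hat{\Pi}_\lambda(p_t, r_t) - G_t(y_t, p_t, r_t) + \delta\, \mathbb{E} U_{t+1}^{\lambda}\big(y_t - D(p_t, r_t),\; \alpha r_t + (1 - \alpha) p_t\big) \Big\}, \qquad (10)
$$

$$
U_t^{\gamma}(r_t) = \max_{y_t,\, p_t} \Big\{ \hat{\Pi}_\gamma(p_t, r_t) - G_t(y_t, p_t, r_t) + \delta\, \mathbb{E} U_{t+1}^{\gamma}\big(y_t - D(p_t, r_t),\; \alpha r_t + (1 - \alpha) p_t\big) \Big\}. \qquad (11)
$$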

It follows from the concavity stated in Theorem 2 that optimal solutions exist for both problem (10) and problem (11). Denote by $[p_\lambda(r_t), y_\lambda(r_t)]$ and $[p_\gamma(r_t), y_\gamma(r_t)]$ the lexicographically smallest optimal solutions of (10) and (11), respectively. Similar to the analysis of the optimal myopic pricing and inventory policy in §4, the optimal price satisfies $p(r_t) \in \{p_\lambda(r_t), p_\gamma(r_t), r_t\}$. Define $y_{r_t}$ as the optimal inventory level if the optimal price is $r_t$; then the optimal inventory level satisfies $y(r_t) \in \{y_\lambda(r_t), y_\gamma(r_t), y_{r_t}\}$.

Because of the nonsmooth demand function, the complicated expected inventory cost function, and the curse of dimensionality, it would be arduous to characterize the specific monotonicity of the optimal price with respect to the reference price and risk coefficients (such monotonicity may not exist). However, given the submodularity shown in Theorem 2, we know that if $p_\lambda(r_t)$ is larger than $p_\gamma(r_t)$, then $y_\lambda(r_t) < y_\gamma(r_t)$, and vice versa. Then, we obtain the following results, which are similar to Lemma 4.

Corollary 2. For a given reference price $r_t$ in period $t$,
(1) when $p_\lambda(r_t) \le p_\gamma(r_t)$: if $r_t < p_\lambda(r_t) < p_\gamma(r_t)$, the optimal policy is $[p_\gamma(r_t), y_\gamma(r_t)]$; if $p_\lambda(r_t) < p_\gamma(r_t) < r_t$, the optimal policy is $[p_\lambda(r_t), y_\lambda(r_t)]$; if $p_\lambda(r_t) \le r_t \le p_\gamma(r_t)$, the optimal policy is $[r_t, y_{r_t}]$;
(2) when $p_\lambda(r_t) > p_\gamma(r_t)$: if $r_t < p_\gamma(r_t) < p_\lambda(r_t)$, the optimal policy is $[p_\gamma(r_t), y_\gamma(r_t)]$; if $p_\gamma(r_t) < p_\lambda(r_t) < r_t$, the optimal policy is $[p_\lambda(r_t), y_\lambda(r_t)]$; if $p_\gamma(r_t) \le r_t \le p_\lambda(r_t)$, the optimal policy is $[r_t, y_{r_t}]$.
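The case split in Corollary 2 reduces to comparing the reference price with the smaller and the larger of the two candidate prices. The helper below is a minimal sketch of that selection rule; the function name and argument names are hypothetical, and the candidate pairs are assumed to have been computed elsewhere as the solutions of problems (10) and (11).

```python
def select_policy(r, p_lam, y_lam, p_gam, y_gam, y_r):
    """Pick the optimal (price, inventory) pair following the case split in Corollary 2."""
    low, high = min(p_lam, p_gam), max(p_lam, p_gam)
    if r < low:
        # Both candidate prices exceed the reference price: the loss branch (gamma) applies.
        return p_gam, y_gam
    if r > high:
        # Both candidate prices are below the reference price: the gain branch (lambda) applies.
        return p_lam, y_lam
    # The reference price lies between the candidates: charge exactly the reference price.
    return r, y_r
```

For example, `select_policy(6.5, 6.0, 20.0, 7.0, 18.0, 19.0)` falls in the middle case and returns `(6.5, 19.0)`.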

5.3. The optimal policy without reference price effect

If the retailer ignores the reference price effect in the system, problem (U) can be viewed as a specific case of the model studied by Federgruen and Heching (1999), in which the authors combined pricing and inventory control in the face of demand uncertainty. To show the optimality of a BSLP policy, those authors made several assumptions about the demand function and expected inventory cost so that the objective function would be jointly concave. Without the reference price effect in the system, problem (U) can be written as

Problem (J):

$$
J_t(x_t) = \max_{y_t,\, p_t:\; y_t \ge x_t} \hat{J}_t(y_t, p_t), \qquad \text{(J1)}
$$

$$
\hat{J}_t(y_t, p_t) = \tilde{\Pi}(p_t) - \tilde{G}_t(y_t, p_t) + \delta\, \mathbb{E} J_{t+1}\big(y_t - D(p_t)\big), \qquad \text{(J2)}
$$

where $\tilde{\Pi}(p_t) = p_t \cdot d(p_t)$ is the single-period revenue in problem (J), $D(p_t) = d(p_t) + \varepsilon_t$ is the demand, and $d(p_t) = \frac{N \exp(a - p_t)}{1 + \exp(a - p_t)}$ is the expected demand. The terminal value is $J_{T+1}(x_{T+1}) = 0$.

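The expected inventory cost without the reference price effect is

$$
\tilde{G}_t(y_t, p_t) = \int_{-\infty}^{y_t - d(p_t)} h\big(y_t - d(p_t) - \varepsilon_t\big)\phi(\varepsilon_t)\,d\varepsilon_t - \int_{y_t - d(p_t)}^{+\infty} b\big(y_t - d(p_t) - \varepsilon_t\big)\phi(\varepsilon_t)\,d\varepsilon_t. \qquad (12)
$$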

Similar to the analysis in §5.2, we show that if no reference price effect is considered, then a BSLP policy is still optimal in each period; it is characterized by a base-stock level and list-price combination, $[p_t, y_t]$. The following theorem states our main results when the retailer ignores the reference price effect.

Theorem 4. For period $t \in \{1, 2, \ldots, T\}$, given state $(x_t, r_t)$,
(1) $\hat{J}_t(y_t, p_t)$ is jointly concave in $y_t$ and $p_t$, and $J_t(x_t)$ is concave and nonincreasing in $x_t$;
(2) If $x_t \le y_t$, it is optimal for the retailer to order up to the base-stock level $y_t$ and charge the list price $p_t = p(y_t)$; if $x_t > y_t$, it is optimal to order nothing and charge a markdown price $p(x_t)$. Moreover, $p(x_t) = p_t$ if $x_t \le y_t$, and $y(x_t) = \max\{x_t, y_t\}$; $p(x_t)$ is nonincreasing in $x_t$ and $y(x_t)$ is nondecreasing in $x_t$.

The results shown in Theorem 4 are similar to Theorem 1 in Federgruen and Heching (1999). Comparing Theorem 3 and Theorem 4, we find that whether or not the retailer ignores the reference price effect in our model setting, a BSLP policy is optimal in each period. The specific definition of the demand function in this paper is realistic and renders the objective functions jointly concave, which is essential for establishing the optimality of a BSLP policy. Also, the reference price is nonnegligible because it exists in reality among customers and closely links the optimal list price and base-stock level.

6. Conclusions

This paper sheds new light on joint pricing and inventory policy problems by incorporating the reference price effect. We consider a retailer selling a single product to loss-averse customers over a finite horizon. Distinctive features of our paper include the assumptions that the customers' demand is determined from their purchasing utility through an MNL model, and that the utility is contingent on the reference price and the current sales price. We first derive the optimal myopic policy for the retailer in a single period and perform sensitivity analysis on it. Then, we formulate the problem as a dynamic programming model in the multi-period scenario. Because the dynamic programming model cannot be solved directly, owing to the dimension of the state (reference price) and the nonsmooth demand function (loss-averse customers), we perform an equivalent transformation on the problem, which allows us to establish the optimality of a reference price-dependent BSLP policy in each period. The optimal list price and base-stock level are intimately tied to each other via the reference price. We give an example to illustrate the optimal price and inventory level paths for the retailer. Finally, we prove that the optimal strategy is still a BSLP policy if the retailer ignores the reference price effect in the system.

Several related topics are worthy of future research. First, we assume that unsatisfied demand is backlogged in this paper, although unsatisfied demand is often lost in practice. What happens to the optimal pricing and inventory policies, as well as profitability, if unsatisfied demand is lost? Second, are the optimal policies determined here still optimal if the customers in the market are gain-seeking? If not, how do they change? Third, in this paper, we assume the randomness in demand is an additive random term; the problem could be reconsidered assuming the randomness is a multiplicative random term.

Acknowledgements

The authors would like to thank the editor and anonymous reviewers for their insightful comments and suggestions. This research was financially supported by the National Natural Science Foundation of China (Grants 71571173, 71971203, and 71904084), the Natural Science Foundation for Jiangsu Province, China (Grant BK20190427), the Social Science Foundation of Jiangsu Province, China (Grant 19GLC017), and the Fundamental Research Funds for the Central Universities, China (Grant XAB19005).

References

Anderson, S.P., De Palma, A., 1992. Multiproduct firms: A nested logit approach. The Journal of Industrial Economics 261–276.
Arslan, H., Kachani, S., 2010. Dynamic pricing under consumer reference-price effects. Wiley Encyclopedia of Operations Research and Management Science. Wiley, pp. 1–17.
Cao, P., Zhao, N., Wu, J., 2019. Dynamic pricing with Bayesian demand learning and reference price effect. European Journal of Operational Research 279, 2, 540–556.
Chen, X., Hu, P., Hu, Z., 2016a. Efficient algorithms for the dynamic pricing problem with reference price effect. Management Science 63, 12, 4389–4408.
Chen, X., Hu, P., Shum, S., Zhang, Y., 2016b. Dynamic stochastic inventory management with reference price effects. Operations Research 64, 6, 1529–1536.
Cheng, F., Sethi, S.P., 1999. A periodic review inventory model with demand influenced by promotion decisions. Management Science 45, 11, 1510–1523.
Dong, L., Kouvelis, P., Tian, Z., 2009. Dynamic pricing and inventory control of substitute products. Manufacturing & Service Operations Management 11, 2, 317–339.
Dye, C.Y., Yang, C.T., 2016. Optimal dynamic pricing and preservation technology investment for deteriorating products with reference price effects. Omega 62, 52–67.
Federgruen, A., Heching, A., 1999. Combined pricing and inventory control under uncertainty. Operations Research 47, 3, 454–475.
Gimpl-Heersink, L., 2008. Joint pricing and inventory control under reference price effects. Ph.D. thesis, WU Vienna University of Economics and Business.
Güler, M.G., Bilgiç, T., Güllü, R., 2014. Joint inventory and pricing decisions with reference effects. IIE Transactions 46, 4, 330–343.
Heidhues, P., Kőszegi, B., 2008. Competition and price variation when consumers are loss averse. American Economic Review 98, 4, 1245–68.
Heyman, D.P., Sobel, M.J., 1984. Stochastic models in operations research: Stochastic optimization, Volume 2. McGraw-Hill, New York.
Hu, Z., Chen, X., Hu, P., 2016. Dynamic pricing with gain-seeking reference price effects. Operations Research 64, 1, 150–157.
Hu, Z., Nasiry, J., 2017. Are markets with loss-averse consumers more sensitive to losses? Management Science 64, 3, 1384–1395.


Huh, W., Janakiraman, G., 2008. (s, S) optimality in joint inventory-pricing control: An alternate approach. Operations Research 56, 3, 783–790.
Kahneman, D., Tversky, A., 1979. Prospect theory: An analysis of decision under risk. Econometrica 47, 2, 263–291.
Kallio, M., Halme, M., et al., 2009. Redefining loss averse and gain seeking consumer price behavior: based on demand response.
Kalyanaram, G., Winer, R.S., 1995. Empirical generalizations from reference price research. Marketing Science 14, 3 supplement, 161–169.
Kopalle, P.K., Rao, A.G., Assuncao, J.L., 1996. Asymmetric reference price effects and dynamic pricing policies. Marketing Science 15, 1, 60–85.
Lu, L., Gou, Q., Tang, W., Zhang, J., 2016. Joint pricing and advertising strategy with reference price effect. International Journal of Production Research 54, 17, 5250–5270.
Mazumdar, T., Raj, S.P., Sinha, I., 2005. Reference price research: Review and propositions. Journal of Marketing 69, 4, 84–102.
Matzke, A., Volling, T., Spengler, T.S., 2016. Upgrade auctions in build-to-order manufacturing with loss-averse customers. European Journal of Operational Research 250, 2, 470–479.
Nasiry, J., Popescu, I., 2011. Dynamic pricing with loss-averse consumers and peak-end anchoring. Operations Research 59, 6, 1361–1368.
Popescu, I., Wu, Y., 2007. Dynamic pricing strategies with reference effects. Operations Research 55, 3, 413–429.
Porteus, E.L., 1990. Stochastic inventory theory. Handbooks in Operations Research and Management Science 2, 605–652.
Seh, X., Bao, L., Yu, Y., 2018. Coordinating inventory and pricing decisions with general price-dependent demands. Production and Operations Management 27, 7, 1355–1367.
Schweitzer, M.E., Cachon, G.P., 2000. Decision bias in the newsvendor problem with a known demand distribution: Experimental evidence. Management Science 46, 3, 404–420.
Song, H., Ran, L., Shang, J., 2017. Multi-period optimization with loss-averse customer behavior: Joint pricing and inventory decisions with stochastic demand. Expert Systems with Applications 72, 421–429.
Talluri, K., Van Ryzin, G., 2004. Revenue management under a general discrete choice model of consumer behavior. Management Science 50, 1, 15–33.
Taudes, A., Rudloff, C., 2012. Integrating inventory control and a price change in the presence of reference price effects: a two-period model. Mathematical Methods of Operations Research 75, 1, 29–65.
Thaler, R., 1985. Mental accounting and consumer choice. Marketing Science 4, 3, 199–214.
Thowsen, G.T., 1975. A dynamic, nonstationary inventory problem for a price/quantity setting firm. Naval Research Logistics Quarterly 22, 3, 461–476.
Tversky, A., Kahneman, D., 1991. Loss aversion in riskless choice: A reference-dependent model. The Quarterly Journal of Economics 106, 4, 1039–1061.


Topkis, D.M., 2011. Supermodularity and complementarity. Princeton University Press.
Urban, T.L., 2008. Coordinating pricing and inventory decisions under reference price effects. International Journal of Manufacturing Technology and Management 13, 1, 78–94.
Wang, Z., 2016. Intertemporal price discrimination via reference price effects. Operations Research 64, 2, 290–296.
Wang, R., 2018. When prospect theory meets consumer choice models: Assortment and pricing management with reference prices. Manufacturing & Service Operations Management 20, 3, 583–600.
Wu, S., Liu, Q., Zhang, R.Q., 2015. The reference effects on a retailer's dynamic pricing and inventory strategies with strategic consumers. Operations Research 63, 6, 1320–1335.
Yang, L., Guo, P., Wang, Y., 2018. Service pricing with loss-averse customers. Operations Research 66, 3, 761–777.
Zhang, J., Chiang, W.Y.K., Liang, L., 2014. Strategic pricing with reference effects in a competitive supply chain. Omega 44, 126–135.
Zhang, Y., 2010. Essays on robust optimization, integrated inventory and pricing, and reference price effect. University of Illinois at Urbana-Champaign.
Zhang, J., Chen, J., Lee, C., 2008. Joint optimization on pricing, promotion and inventory control with stochastic demand. International Journal of Production Economics 116, 2, 190–198.
Zhao, W., Yu, H., 2012. Survive in the price war: Growth strategy for online individual entrepreneurship in China. In: Proceedings of 2012 Annual Meeting of the Academy of International Business-US North East Chapter: Business Without Borders, p. 257.
Zhao, N., Wang, Q., Cao, P., Wu, J., 2019a. Dynamic pricing with reference price effect and price-matching policy in the presence of strategic consumers. Journal of the Operational Research Society 70, 12, 2069–2083.
Zhao, N., Wang, Q., Cao, P., Wu, J., 2019b. Pricing decisions with reference price effect and risk preference customers. International Transactions in Operational Research. https://doi.org/10.1111/itor.12673.


Appendix

Proof of Lemma 1. The continuity of $d(p_t, r_t)$ follows immediately from Eq.(2)-Eq.(4), because $U(p_t, r_t)$ and $P(p_t, r_t)$ are continuous. Because $d(p_t, r_t) = \min_{k \in \{\lambda, \gamma\}} d_k(p_t, r_t)$, and the minimum of two decreasing, increasing or concave functions is again decreasing, increasing or concave, it remains to show the monotonicity and concavity of $d_k(p_t, r_t)$ for $k \in \{\lambda, \gamma\}$. Taking derivatives of $d_k(p_t, r_t)$ with respect to $p_t$ and $r_t$, respectively,

$$
\begin{aligned}
\frac{\partial d_k(p_t, r_t)}{\partial p_t} &= -\frac{N(1+k)\exp(a - p_t + k(r_t - p_t))}{(1 + \exp(a - p_t + k(r_t - p_t)))^2} < 0, \\
\frac{\partial d_k(p_t, r_t)}{\partial r_t} &= \frac{N k \exp(a - p_t + k(r_t - p_t))}{(1 + \exp(a - p_t + k(r_t - p_t)))^2} > 0,
\end{aligned}
$$

where the inequalities follow from Assumption 1. Thus, $d(p_t, r_t)$ is decreasing in $p_t$ but increasing in $r_t$. Moreover,

$$
\begin{aligned}
\frac{\partial^2 d_k(p_t, r_t)}{\partial p_t^2} &= -\frac{N(1+k)^2 \exp(a - p_t + k(r_t - p_t))}{(1 + \exp(a - p_t + k(r_t - p_t)))^2} \cdot \frac{\exp(a - p_t + k(r_t - p_t)) - 1}{1 + \exp(a - p_t + k(r_t - p_t))} < 0, \\
\frac{\partial^2 d_k(p_t, r_t)}{\partial r_t^2} &= \frac{N k^2 \exp(a - p_t + k(r_t - p_t))}{(1 + \exp(a - p_t + k(r_t - p_t)))^2} \cdot \frac{1 - \exp(a - p_t + k(r_t - p_t))}{1 + \exp(a - p_t + k(r_t - p_t))} < 0, \\
\frac{\partial^2 d_k(p_t, r_t)}{\partial p_t \partial r_t} = \frac{\partial^2 d_k(p_t, r_t)}{\partial r_t \partial p_t} &= \frac{N k(1+k) \exp(a - p_t + k(r_t - p_t))}{(1 + \exp(a - p_t + k(r_t - p_t)))^2} \cdot \frac{\exp(a - p_t + k(r_t - p_t)) - 1}{1 + \exp(a - p_t + k(r_t - p_t))} > 0.
\end{aligned}
$$

The Hessian matrix of $d_k(p_t, r_t)$ is

$$
H_d = \begin{pmatrix}
\dfrac{\partial^2 d_k(p_t, r_t)}{\partial p_t^2} & \dfrac{\partial^2 d_k(p_t, r_t)}{\partial p_t \partial r_t} \\[2ex]
\dfrac{\partial^2 d_k(p_t, r_t)}{\partial r_t \partial p_t} & \dfrac{\partial^2 d_k(p_t, r_t)}{\partial r_t^2}
\end{pmatrix}.
$$

Because $\frac{\partial^2 d_k(p_t, r_t)}{\partial p_t^2} < 0$, $\frac{\partial^2 d_k(p_t, r_t)}{\partial r_t^2} < 0$ and $\det(H_d) = 0$, we conclude that $d_k(p_t, r_t)$ is jointly concave in $(p_t, r_t)$. Furthermore,

$$
\begin{aligned}
\frac{\partial d_\lambda(p_t, r_t)}{\partial \lambda} &= \frac{N (r_t - p_t) \exp(a - p_t + \lambda(r_t - p_t))}{(1 + \exp(a - p_t + \lambda(r_t - p_t)))^2} \ge 0 \quad (r_t \ge p_t), \\
\frac{\partial d_\gamma(p_t, r_t)}{\partial \gamma} &= \frac{N (r_t - p_t) \exp(a - p_t + \gamma(r_t - p_t))}{(1 + \exp(a - p_t + \gamma(r_t - p_t)))^2} < 0 \quad (r_t < p_t).
\end{aligned}
$$

We obtain that $d(p_t, r_t)$ is nondecreasing in $\lambda$ but nonincreasing in $\gamma$.
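As an informal numerical sanity check of Lemma 1 (not part of the proof), the sketch below evaluates the logit demand and the closed-form second derivatives at a single point. The function names and the values of $N$, $a$, $k$, $p_t$ and $r_t$ are illustrative assumptions, chosen so that $a - p_t + k(r_t - p_t) > 0$.

```python
import math

def demand(p, r, k, a=5.0, N=100.0):
    """Logit demand d_k(p, r) = N e^x / (1 + e^x), with x = a - p + k (r - p)."""
    x = a - p + k * (r - p)
    return N * math.exp(x) / (1.0 + math.exp(x))

def second_derivatives(p, r, k, a=5.0, N=100.0):
    """Closed-form second derivatives of d_k, as stated in the proof of Lemma 1."""
    x = a - p + k * (r - p)
    e = math.exp(x)
    common = N * e / (1.0 + e) ** 2 * (e - 1.0) / (1.0 + e)
    d_pp = -(1.0 + k) ** 2 * common
    d_rr = -(k ** 2) * common
    d_pr = k * (1.0 + k) * common
    return d_pp, d_rr, d_pr

# Illustrative (assumed) values, not taken from the paper.
p, r, k = 2.0, 3.0, 0.5
d_pp, d_rr, d_pr = second_derivatives(p, r, k)
hessian_det = d_pp * d_rr - d_pr ** 2

# Monotonicity checked with simple finite differences.
step = 1e-4
decreasing_in_p = demand(p + step, r, k) < demand(p, r, k)
increasing_in_r = demand(p, r + step, k) > demand(p, r, k)
print(d_pp, d_rr, d_pr, hessian_det, decreasing_in_p, increasing_in_r)
```

At this point both diagonal entries are negative and the determinant vanishes up to rounding, which matches the negative semidefinite Hessian used in the proof.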

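Proof of Lemma 2. Take derivatives of $G_t(y_t, p_t, r_t)$ with respect to $y_t$, $p_t$ and $r_t$, respectively. Defining $\eta_1 = (h + b)\phi(y_t - d(p_t, r_t))$ and $\eta_2 = b - (h + b)\Phi(y_t - d(p_t, r_t))$, we obtain

$$
\frac{\partial^2 G_t}{\partial y_t^2} = \eta_1, \qquad
\frac{\partial^2 G_t}{\partial p_t \partial y_t} = -\eta_1 \frac{\partial d(p_t, r_t)}{\partial p_t}, \qquad
\frac{\partial^2 G_t}{\partial r_t \partial y_t} = -\eta_1 \frac{\partial d(p_t, r_t)}{\partial r_t},
$$

$$
\frac{\partial^2 G_t}{\partial p_t^2} = \eta_2 \frac{\partial^2 d(p_t, r_t)}{\partial p_t^2} + \eta_1 \left(\frac{\partial d(p_t, r_t)}{\partial p_t}\right)^2, \qquad
\frac{\partial^2 G_t}{\partial r_t^2} = \eta_2 \frac{\partial^2 d(p_t, r_t)}{\partial r_t^2} + \eta_1 \left(\frac{\partial d(p_t, r_t)}{\partial r_t}\right)^2,
$$

$$
\frac{\partial^2 G_t}{\partial p_t \partial r_t} = \eta_2 \frac{\partial^2 d(p_t, r_t)}{\partial p_t \partial r_t} + \eta_1 \frac{\partial d(p_t, r_t)}{\partial p_t} \frac{\partial d(p_t, r_t)}{\partial r_t}.
$$

The Hessian matrix of $G_t(y_t, p_t, r_t)$ is

$$
H_G = \begin{pmatrix}
\dfrac{\partial^2 G_t}{\partial y_t^2} & \dfrac{\partial^2 G_t}{\partial y_t \partial p_t} & \dfrac{\partial^2 G_t}{\partial y_t \partial r_t} \\[2ex]
\dfrac{\partial^2 G_t}{\partial y_t \partial p_t} & \dfrac{\partial^2 G_t}{\partial p_t^2} & \dfrac{\partial^2 G_t}{\partial p_t \partial r_t} \\[2ex]
\dfrac{\partial^2 G_t}{\partial r_t \partial y_t} & \dfrac{\partial^2 G_t}{\partial p_t \partial r_t} & \dfrac{\partial^2 G_t}{\partial r_t^2}
\end{pmatrix}.
$$

If we can show that $H_G$ is positive semidefinite, then $G_t(y_t, p_t, r_t)$ is jointly convex in $(y_t, p_t, r_t)$. $H_G$ is a symmetric matrix, so we examine its leading principal minors and show that none of them is less than zero. We have $\frac{\partial^2 G_t}{\partial y_t^2} > 0$ and

$$
\begin{vmatrix}
\dfrac{\partial^2 G_t}{\partial y_t^2} & \dfrac{\partial^2 G_t}{\partial y_t \partial p_t} \\[2ex]
\dfrac{\partial^2 G_t}{\partial y_t \partial p_t} & \dfrac{\partial^2 G_t}{\partial p_t^2}
\end{vmatrix}
= (h + b)\phi(y_t - d(p_t, r_t)) \big(b - (h + b)\Phi(y_t - d(p_t, r_t))\big) \frac{\partial^2 d(p_t, r_t)}{\partial p_t^2} > 0,
$$

where the inequalities follow from Assumption 2. If we can show $\det(H_G) \ge 0$, then the proof is completed. We have

$$
\begin{aligned}
\det(H_G) &= \begin{vmatrix}
\eta_1 & -\eta_1 \dfrac{\partial d(p_t, r_t)}{\partial p_t} & -\eta_1 \dfrac{\partial d(p_t, r_t)}{\partial r_t} \\[2ex]
-\eta_1 \dfrac{\partial d(p_t, r_t)}{\partial p_t} & \eta_2 \dfrac{\partial^2 d(p_t, r_t)}{\partial p_t^2} + \eta_1 \left(\dfrac{\partial d(p_t, r_t)}{\partial p_t}\right)^2 & \eta_2 \dfrac{\partial^2 d(p_t, r_t)}{\partial p_t \partial r_t} + \eta_1 \dfrac{\partial d(p_t, r_t)}{\partial p_t} \dfrac{\partial d(p_t, r_t)}{\partial r_t} \\[2ex]
-\eta_1 \dfrac{\partial d(p_t, r_t)}{\partial r_t} & \eta_2 \dfrac{\partial^2 d(p_t, r_t)}{\partial p_t \partial r_t} + \eta_1 \dfrac{\partial d(p_t, r_t)}{\partial p_t} \dfrac{\partial d(p_t, r_t)}{\partial r_t} & \eta_2 \dfrac{\partial^2 d(p_t, r_t)}{\partial r_t^2} + \eta_1 \left(\dfrac{\partial d(p_t, r_t)}{\partial r_t}\right)^2
\end{vmatrix} \\[1ex]
&= \eta_1^2 \eta_2 \left[ \left(\frac{\partial d(p_t, r_t)}{\partial p_t}\right)^2 \frac{\partial^2 d(p_t, r_t)}{\partial r_t^2} + \left(\frac{\partial d(p_t, r_t)}{\partial r_t}\right)^2 \frac{\partial^2 d(p_t, r_t)}{\partial p_t^2} - 2 \frac{\partial d(p_t, r_t)}{\partial p_t} \frac{\partial d(p_t, r_t)}{\partial r_t} \frac{\partial^2 d(p_t, r_t)}{\partial p_t \partial r_t} \right. \\
&\qquad\quad \left. + 2 \frac{\partial^2 d(p_t, r_t)}{\partial p_t \partial r_t} \frac{\partial d(p_t, r_t)}{\partial p_t} \frac{\partial d(p_t, r_t)}{\partial r_t} - \left(\frac{\partial d(p_t, r_t)}{\partial r_t}\right)^2 \frac{\partial^2 d(p_t, r_t)}{\partial p_t^2} - \left(\frac{\partial d(p_t, r_t)}{\partial p_t}\right)^2 \frac{\partial^2 d(p_t, r_t)}{\partial r_t^2} \right] \\
&\quad + \eta_1 \eta_2^2 \left[ \frac{\partial^2 d(p_t, r_t)}{\partial p_t^2} \frac{\partial^2 d(p_t, r_t)}{\partial r_t^2} - \left(\frac{\partial^2 d(p_t, r_t)}{\partial p_t \partial r_t}\right)^2 \right] \\
&= 0,
\end{aligned}
$$

where the equalities follow from $\frac{\partial^2 d(p_t, r_t)}{\partial p_t^2} \frac{\partial^2 d(p_t, r_t)}{\partial r_t^2} - \left(\frac{\partial^2 d(p_t, r_t)}{\partial p_t \partial r_t}\right)^2 = 0$ (see proof of Lemma 1).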
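One way to see why the determinant vanishes, sketched here as a cross-check under the same definitions of $\eta_1$ and $\eta_2$, is to row-reduce $H_G$ before expanding. The shorthand $d_p$, $d_r$, $d_{pp}$, $d_{rr}$, $d_{pr}$ for the first and second partial derivatives of $d(p_t, r_t)$ is introduced only for this sketch. Adding $d_p$ times the first row to the second row, and $d_r$ times the first row to the third row, leaves the determinant unchanged:

```latex
\det(H_G)
= \begin{vmatrix}
\eta_1 & -\eta_1 d_p & -\eta_1 d_r \\
0 & \eta_2 d_{pp} & \eta_2 d_{pr} \\
0 & \eta_2 d_{pr} & \eta_2 d_{rr}
\end{vmatrix}
= \eta_1 \eta_2^2 \left( d_{pp} d_{rr} - d_{pr}^2 \right)
= 0 .
```

The last equality uses $\det(H_d) = 0$ from the proof of Lemma 1, so the $\eta_1^2 \eta_2$ terms never need to be tracked individually.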

Proof of Lemma 3. According to the definition of $\bar{V}_t(y_t, p_t)$, if we can show that both $\Pi(p_t, r_t)$ and $-G_t(y_t, p_t, r_t)$ are jointly concave in $y_t$ and $p_t$, then $\bar{V}_t(y_t, p_t)$ is jointly concave in $(y_t, p_t)$. The concavity of $-G_t(y_t, p_t, r_t)$ is immediately obtained from Lemma 2. Furthermore, the single-period revenue can be written as

$$
\Pi(p_t, r_t) = p_t \cdot \min\{d_\lambda(p_t, r_t), d_\gamma(p_t, r_t)\} = \min\{\Pi_\lambda(p_t, r_t), \Pi_\gamma(p_t, r_t)\},
$$

so if we can show that $\Pi_k(p_t, r_t) = p_t \cdot d_k(p_t, r_t)$, $k \in \{\lambda, \gamma\}$, is jointly concave, then the proof is completed. Taking the partial derivatives of $\Pi_k(p_t, r_t)$ with respect to $y_t$ and $p_t$, respectively, we obtain

$$
\frac{\partial^2 \Pi_k(p_t, r_t)}{\partial p_t^2} = 2 \frac{\partial d_k(p_t, r_t)}{\partial p_t} + p_t \frac{\partial^2 d_k(p_t, r_t)}{\partial p_t^2} < 0 \ \text{(see proof of Lemma 1)}, \qquad
\frac{\partial^2 \Pi_k(p_t, r_t)}{\partial p_t \partial y_t} = \frac{\partial^2 \Pi_k(p_t, r_t)}{\partial y_t^2} = 0.
$$

The Hessian matrix of $\Pi_k(p_t, r_t)$ is

$$
H_{\Pi_k} = \begin{pmatrix}
\dfrac{\partial^2 \Pi_k(p_t, r_t)}{\partial p_t^2} & \dfrac{\partial^2 \Pi_k(p_t, r_t)}{\partial y_t \partial p_t} \\[2ex]
\dfrac{\partial^2 \Pi_k(p_t, r_t)}{\partial y_t \partial p_t} & \dfrac{\partial^2 \Pi_k(p_t, r_t)}{\partial y_t^2}
\end{pmatrix}.
$$

Obviously, $\det(H_{\Pi_k}) = 0$. Combining this with $\frac{\partial^2 \Pi_k(p_t, r_t)}{\partial p_t^2} < 0$, we obtain that $\Pi(p_t, r_t)$ is jointly concave in $p_t$ and $y_t$. Hence, $\bar{V}_t(y_t, p_t)$ is jointly concave.

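Proof of Proposition 1. Define $f(\tilde{p}_k, r_t, k) = (1 + k)\tilde{p}_k - 1 - \exp(a - \tilde{p}_k + k(r_t - \tilde{p}_k))$. With regard to the monotonicity of the optimal myopic policy, we can show that $f(\tilde{p}_k, r_t, k)$ is supermodular in $(\tilde{p}_k, k)$ and $(\tilde{p}_k, r_t)$. For any $\tilde{p}_k^h \ge \tilde{p}_k^l \in \mathcal{P}$ and $k^h \ge k^l \ge 0$, define

$$
\tilde{\Delta} = f(\tilde{p}_k^h, r_t, k^h) - f(\tilde{p}_k^l, r_t, k^h) - f(\tilde{p}_k^h, r_t, k^l) + f(\tilde{p}_k^l, r_t, k^l).
$$

If we can show $\tilde{\Delta} \ge 0$, then $f(\tilde{p}_k, r_t, k)$ is supermodular in $(\tilde{p}_k, k)$. We have

$$
\begin{aligned}
\tilde{\Delta} &= e^{a - \tilde{p}_k^l + k^h(r_t - \tilde{p}_k^l)} - e^{a - \tilde{p}_k^h + k^h(r_t - \tilde{p}_k^h)} + \tilde{p}_k^h k^h - \tilde{p}_k^l k^h \\
&\quad - \left[ e^{a - \tilde{p}_k^l + k^l(r_t - \tilde{p}_k^l)} - e^{a - \tilde{p}_k^h + k^l(r_t - \tilde{p}_k^h)} + \tilde{p}_k^h k^l - \tilde{p}_k^l k^l \right].
\end{aligned}
$$

Define $f_1(k) = e^{a - \tilde{p}_k^l + k(r_t - \tilde{p}_k^l)} - e^{a - \tilde{p}_k^h + k(r_t - \tilde{p}_k^h)} + \tilde{p}_k^h k - \tilde{p}_k^l k$. Because

$$
\frac{\partial f_1(k)}{\partial k} \ge \left(1 + e^{a - \tilde{p}_k^h + k(r_t - \tilde{p}_k^h)}\right)(\tilde{p}_k^h - \tilde{p}_k^l) \ge 0,
$$

we know that $f_1(k)$ is nondecreasing in $k$, and thus $\tilde{\Delta} \ge 0$ and $f(\tilde{p}_k, r_t, k)$ is supermodular in $(\tilde{p}_k, k)$. Hence, it follows from Theorem 2.8.2 in Topkis (2011) that $\tilde{p}_k$ is increasing in $k$. By the same method, we obtain that $f(\tilde{p}_k, r_t, k)$ is also supermodular in $(\tilde{p}_k, r_t)$, and then $\tilde{p}_k$ is increasing in $r_t$. It follows from the monotonicity of $\tilde{p}_k(r_t)$ with respect to $k$ ($k \in \{\lambda, \gamma\}$), Eq.(6) and Lemma 1 that $\tilde{y}_k(r_t)$ is decreasing in $k$.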
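The comparative statics above can be illustrated numerically. The sketch below assumes that the myopic price is characterized by $f(\tilde{p}_k, r_t, k) = 0$, which is the first-order condition of $p_t \cdot d_k(p_t, r_t)$, and finds the root by bisection. The function names, the bracket $[0, 100]$ and the parameter values are hypothetical, chosen so that the reference price lies above the resulting prices.

```python
import math

def foc(p, r, k, a):
    """f(p, r, k) = (1 + k) p - 1 - exp(a - p + k (r - p)) from the proof."""
    return (1.0 + k) * p - 1.0 - math.exp(a - p + k * (r - p))

def myopic_price(r, k, a=10.0, lo=0.0, hi=100.0, tol=1e-10):
    """Root of f in p by bisection; f is strictly increasing in p."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if foc(mid, r, k, a) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative (assumed) values with r above the resulting prices.
price_low_r = myopic_price(r=9.0, k=0.5)
price_high_r = myopic_price(r=10.0, k=0.5)
price_high_k = myopic_price(r=10.0, k=1.0)
print(price_low_r, price_high_r, price_high_k)
```

With these values the computed price rises when $r_t$ rises and when $k$ rises, in line with the proposition; other parameter choices would need to be checked separately.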

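Proof of Theorem 1. (1) The minimum of two concave functions is still a concave function; hence, if we can show that $\Pi_k(p_t, r_t)$, $k \in \{\lambda, \gamma\}$, is not jointly concave in $p_t$ and $r_t$, then $\Pi(p_t, r_t)$ is not jointly concave in $p_t$ and $r_t$. Taking derivatives of $\Pi_k(p_t, r_t)$ with respect to $p_t$ and $r_t$, respectively,

$$
\frac{\partial^2 \Pi_k}{\partial p_t^2} = \frac{-2N(1 + k)e^{a - p_t + k(r_t - p_t)}}{(1 + e^{a - p_t + k(r_t - p_t)})^2} - (1 + k)^2 A < 0, \qquad
\frac{\partial^2 \Pi_k}{\partial r_t^2} = -k^2 A < 0,
$$

$$
\frac{\partial^2 \Pi_k}{\partial p_t \partial r_t} = \frac{\partial^2 \Pi_k}{\partial r_t \partial p_t} = \frac{N k e^{a - p_t + k(r_t - p_t)}}{(1 + e^{a - p_t + k(r_t - p_t)})^2} + k(1 + k)A > 0,
$$

where

$$
A = \frac{N p_t e^{a - p_t + k(r_t - p_t)}}{(1 + e^{a - p_t + k(r_t - p_t)})^2} \cdot \frac{e^{a - p_t + k(r_t - p_t)} - 1}{1 + e^{a - p_t + k(r_t - p_t)}}.
$$

Then, the Hessian matrix of $\Pi_k(p_t, r_t)$ is

$$
H_\Pi = \begin{pmatrix}
\dfrac{\partial^2 \Pi_k}{\partial p_t^2} & \dfrac{\partial^2 \Pi_k}{\partial p_t \partial r_t} \\[2ex]
\dfrac{\partial^2 \Pi_k}{\partial r_t \partial p_t} & \dfrac{\partial^2 \Pi_k}{\partial r_t^2}
\end{pmatrix}.
$$

Because

$$
\det(H_\Pi) = -N^2 k^2 \left( \frac{e^{a - p_t + k(r_t - p_t)}}{(1 + e^{a - p_t + k(r_t - p_t)})^2} \right)^2 < 0,
$$

$\Pi_k(p_t, r_t)$ cannot be jointly concave in $p_t$ and $r_t$.

Next, we show that there exists a positive $\theta$ such that $\hat{\Pi}(p_t, r_t)$ is jointly concave. Define $\hat{\Pi}_k(p_t, r_t) = \Pi_k(p_t, r_t) - \theta r_t^2 + \theta\delta[\alpha r_t + (1 - \alpha)p_t]^2$, $k \in \{\lambda, \gamma\}$. We know that $\hat{\Pi}(p_t, r_t) = \min\{\hat{\Pi}_\lambda(p_t, r_t), \hat{\Pi}_\gamma(p_t, r_t)\}$, and the minimum of two smooth concave functions is still a concave function. Therefore, $\hat{\Pi}(p_t, r_t)$ is jointly concave if we can show that $\hat{\Pi}_k(p_t, r_t)$ is jointly concave. Taking the derivatives of $\hat{\Pi}_k(p_t, r_t)$ with respect to $p_t$ and $r_t$, we have

$$
\frac{\partial^2 \hat{\Pi}_k}{\partial p_t \partial r_t} = \frac{\partial^2 \Pi_k}{\partial p_t \partial r_t} + 2\theta\delta\alpha(1 - \alpha), \qquad
\frac{\partial^2 \hat{\Pi}_k}{\partial p_t^2} = \frac{\partial^2 \Pi_k}{\partial p_t^2} + 2\theta\delta(1 - \alpha)^2, \qquad
\frac{\partial^2 \hat{\Pi}_k}{\partial r_t^2} = \frac{\partial^2 \Pi_k}{\partial r_t^2} - 2\theta + 2\theta\delta\alpha^2,
$$

and the Hessian matrix of $\hat{\Pi}_k(p_t, r_t)$ is

$$
H_{\hat{\Pi}} = \begin{pmatrix}
\dfrac{\partial^2 \Pi_k}{\partial p_t^2} + 2\theta\delta(1 - \alpha)^2 & \dfrac{\partial^2 \Pi_k}{\partial p_t \partial r_t} + 2\theta\delta\alpha(1 - \alpha) \\[2ex]
\dfrac{\partial^2 \Pi_k}{\partial r_t \partial p_t} + 2\theta\delta\alpha(1 - \alpha) & \dfrac{\partial^2 \Pi_k}{\partial r_t^2} - 2\theta + 2\theta\delta\alpha^2
\end{pmatrix}.
$$

Clearly, $\frac{\partial^2 \hat{\Pi}_k}{\partial p_t^2}$ and $\frac{\partial^2 \hat{\Pi}_k}{\partial r_t^2}$ must be negative, that is,

$$
\frac{-2N(1 + k)e^{a - p_t + k(r_t - p_t)}}{(1 + e^{a - p_t + k(r_t - p_t)})^2} - (1 + k)^2 A + 2\theta\delta(1 - \alpha)^2 < 0
\quad \text{and} \quad
-k^2 A - 2\theta + 2\theta\delta\alpha^2 < 0.
$$

Considering that $\theta$ is positive, we obtain

$$
\theta \in \left( 0, \ \frac{\dfrac{2N(1 + k)e^{a - p_t + k(r_t - p_t)}}{(1 + e^{a - p_t + k(r_t - p_t)})^2} + (1 + k)^2 A}{2\delta(1 - \alpha)^2} \right). \tag{A.1}
$$

Then, the determinant of the Hessian matrix is

$$
\det(H_{\hat{\Pi}}) = \det(H_\Pi) + 2\theta \left[ -\frac{\partial^2 \Pi_k}{\partial p_t^2}(1 - \delta\alpha^2) + \delta(1 - \alpha)^2 \frac{\partial^2 \Pi_k}{\partial r_t^2} - 2\delta\alpha(1 - \alpha) \frac{\partial^2 \Pi_k}{\partial p_t \partial r_t} - 2\theta\delta(1 - \alpha)^2 \right].
$$

Furthermore,

$$
\begin{aligned}
& -\frac{\partial^2 \Pi_k}{\partial p_t^2}(1 - \delta\alpha^2) + \delta(1 - \alpha)^2 \frac{\partial^2 \Pi_k}{\partial r_t^2} - 2\delta\alpha(1 - \alpha) \frac{\partial^2 \Pi_k}{\partial p_t \partial r_t} - 2\theta\delta(1 - \alpha)^2 \\
&= \left[ \frac{2N(1 + k)e^{a - p_t + k(r_t - p_t)}}{(1 + e^{a - p_t + k(r_t - p_t)})^2} + (1 + k)^2 A \right](1 - \delta\alpha^2) - \delta(1 - \alpha)^2 k^2 A \\
&\quad - 2\delta\alpha(1 - \alpha) \left[ \frac{N k e^{a - p_t + k(r_t - p_t)}}{(1 + e^{a - p_t + k(r_t - p_t)})^2} + k(1 + k)A \right] - 2\theta\delta(1 - \alpha)^2 \\
&\ge 2\delta(1 - \alpha) \frac{N e^{a - p_t + k(r_t - p_t)}}{(1 + e^{a - p_t + k(r_t - p_t)})^2}(1 + \alpha + k) + \delta(1 - \alpha)(1 + k)A(1 + \alpha + k - \alpha k) \\
&\quad - \delta(1 - \alpha)^2 k^2 A - 2\theta\delta(1 - \alpha)^2 \\
&\ge 2\delta(1 - \alpha) \frac{N e^{a - p_t + k(r_t - p_t)}}{(1 + e^{a - p_t + k(r_t - p_t)})^2}(1 + \alpha + k) + \delta(1 - \alpha^2)kA - 2\theta\delta(1 - \alpha)^2,
\end{aligned} \tag{A.2}
$$

where the first inequality follows from $0 < \delta < 1$. It is easy to determine that $2\delta(1 - \alpha) \frac{N e^{a - p_t + k(r_t - p_t)}}{(1 + e^{a - p_t + k(r_t - p_t)})^2}(1 + \alpha + k) + \delta(1 - \alpha^2)kA > 0$. Define

$$
\begin{aligned}
B &= -\frac{\partial^2 \Pi_k}{\partial p_t^2}(1 - \delta\alpha^2) + \delta(1 - \alpha)^2 \frac{\partial^2 \Pi_k}{\partial r_t^2} - 2\delta\alpha(1 - \alpha) \frac{\partial^2 \Pi_k}{\partial p_t \partial r_t} \\
&= \left[ \frac{2N(1 + k)e^{a - p_t + k(r_t - p_t)}}{(1 + e^{a - p_t + k(r_t - p_t)})^2} + (1 + k)^2 A \right](1 - \delta\alpha^2) - \delta(1 - \alpha)^2 k^2 A \\
&\quad - 2\delta\alpha(1 - \alpha) \left[ \frac{N k e^{a - p_t + k(r_t - p_t)}}{(1 + e^{a - p_t + k(r_t - p_t)})^2} + k(1 + k)A \right] > 0.
\end{aligned} \tag{A.3}
$$

Then we have $\det(H_{\hat{\Pi}}) = \det(H_\Pi) + 2\theta(B - 2\theta\delta(1 - \alpha)^2)$. Observing this quadratic function of $\theta$, if $\theta < 0$ or $\theta > \frac{B}{2\delta(1 - \alpha)^2}$, then $\det(H_{\hat{\Pi}}) < 0$. Comparing the boundary value of $\theta$ in Eq.(A.1) and $\frac{B}{2\delta(1 - \alpha)^2}$, we obtain that

$$
\frac{B}{2\delta(1 - \alpha)^2} < \frac{\dfrac{2N(1 + k)e^{a - p_t + k(r_t - p_t)}}{(1 + e^{a - p_t + k(r_t - p_t)})^2} + (1 + k)^2 A}{2\delta(1 - \alpha)^2};
$$

this means that $\theta$ should satisfy $\theta \in \Theta = \left(0, \frac{B}{2\delta(1 - \alpha)^2}\right)$. Therefore, if we can show $\max_{\theta \in \Theta} \det(H_{\hat{\Pi}}) > 0$, then there exists a positive $\theta$ such that $\hat{\Pi}(p_t, r_t)$ is jointly concave. Clearly,

$$
\begin{aligned}
\max_{\theta \in \Theta} \det(H_{\hat{\Pi}}) &= \det(H_\Pi) + \frac{B^2}{4\delta(1 - \alpha)^2} = \det(H_\Pi) + \frac{1}{\delta} \left( \frac{B}{2(1 - \alpha)} \right)^2 \\
&\ge \det(H_\Pi) + \frac{1}{\delta} \left( \left[ \frac{N(1 + k)e^{a - p_t + k(r_t - p_t)}}{(1 + e^{a - p_t + k(r_t - p_t)})^2} + \frac{(1 + k)^2 A}{2} \right](1 + \alpha) - \frac{1 - \alpha}{2} k^2 A \right. \\
&\qquad\qquad\qquad \left. - \alpha \left[ \frac{N k e^{a - p_t + k(r_t - p_t)}}{(1 + e^{a - p_t + k(r_t - p_t)})^2} + k(1 + k)A \right] \right)^2 \\
&\ge \det(H_\Pi) + \frac{1}{\delta} \left( \frac{N(1 + k)e^{a - p_t + k(r_t - p_t)}}{(1 + e^{a - p_t + k(r_t - p_t)})^2}(1 + \alpha) - \alpha \frac{N k e^{a - p_t + k(r_t - p_t)}}{(1 + e^{a - p_t + k(r_t - p_t)})^2} \right. \\
&\qquad\qquad\qquad \left. + \frac{(1 + k)^2 A(1 + \alpha)}{2} - \frac{(1 - \alpha)k^2 A}{2} - \alpha A k(1 + k) \right)^2 \\
&\ge \det(H_\Pi) + \frac{1}{\delta} \left( \frac{N k e^{a - p_t + k(r_t - p_t)}}{(1 + e^{a - p_t + k(r_t - p_t)})^2} \right)^2 = \frac{1 - \delta}{\delta} \left( \frac{N k e^{a - p_t + k(r_t - p_t)}}{(1 + e^{a - p_t + k(r_t - p_t)})^2} \right)^2 > 0,
\end{aligned}
$$

where the inequalities follow from $0 < \delta < 1$. Hence, we obtain that there exists a positive $\theta \in \Theta = \left(0, \frac{B}{2\delta(1 - \alpha)^2}\right)$ such that $\hat{\Pi}(p_t, r_t)$ is jointly concave. If we can show that the lower bound of $B$ is positive and independent of $p_t$ and $r_t$, then we can say that there exists a positive constant $\theta$ such that $\hat{\Pi}(p_t, r_t)$ is jointly concave. It follows from Eq.(A.2) and Eq.(A.3) that

$$
\begin{aligned}
B &\ge 2\delta(1 - \alpha) \frac{N e^{a - p_t + k(r_t - p_t)}}{(1 + e^{a - p_t + k(r_t - p_t)})^2}(1 + \alpha + k) + \delta(1 - \alpha^2)kA \\
&\ge 2\delta(1 - \alpha) \frac{N e^{a - p_t + k(r_t - p_t)}}{(1 + e^{a - p_t + k(r_t - p_t)})^2}(1 + \alpha + k) > 0,
\end{aligned}
$$

where the first inequality follows from Eq.(A.2) and the second inequality follows from $A \ge 0$. Define a function $B(x) = 2\delta(1 - \alpha) \frac{N e^x}{(1 + e^x)^2}(1 + \alpha + k)$, where $x = a - p_t + k(r_t - p_t)$. It follows from Assumption 1 and $p_t, r_t \in [\underline{p}, \bar{p}]$ that $x \in [0, a - \underline{p} + k(\bar{p} - \underline{p})]$. Therefore, we have

$$
\frac{\partial B(x)}{\partial x} = 2\delta(1 - \alpha)(1 + \alpha + k) \frac{N e^x (1 - e^x)}{(1 + e^x)^3} \le 0.
$$

Hence,

$$
B \ge 2\delta(1 - \alpha) \frac{N e^{a - \underline{p} + k(\bar{p} - \underline{p})}}{(1 + e^{a - \underline{p} + k(\bar{p} - \underline{p})})^2}(1 + \alpha + k) > 0.
$$

Therefore, we obtain that there exists a positive constant

$$
\theta \in \Theta_0 = \left( 0, \ \frac{N e^{a - \underline{p} + k(\bar{p} - \underline{p})}}{(1 + e^{a - \underline{p} + k(\bar{p} - \underline{p})})^2} \cdot \frac{1 + \alpha + k}{1 - \alpha} \right)
$$

such that $\hat{\Pi}(p_t, r_t)$ is jointly concave.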
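The algebra in part (1) can be spot-checked numerically. The sketch below evaluates the closed-form second derivatives of $\Pi_k$ at one point, confirms that $\det(H_\Pi)$ is negative, and then evaluates the Hessian of $\hat{\Pi}_k$ at $\theta = B / (4\delta(1 - \alpha)^2)$, the maximizer of the quadratic in $\theta$. All parameter values are assumptions made for illustration, not values from the paper.

```python
import math

# Illustrative (assumed) parameter values, not taken from the paper.
N, a, k = 1.0, 5.0, 0.5
delta, alpha = 0.9, 0.5
p, r = 2.0, 3.0

x = a - p + k * (r - p)
e = math.exp(x)
s1 = e / (1.0 + e) ** 2                      # e^x / (1 + e^x)^2
A = N * p * s1 * (e - 1.0) / (1.0 + e)

# Second derivatives of Pi_k from part (1) of the proof.
pi_pp = -2.0 * N * (1.0 + k) * s1 - (1.0 + k) ** 2 * A
pi_rr = -(k ** 2) * A
pi_pr = N * k * s1 + k * (1.0 + k) * A
det_H = pi_pp * pi_rr - pi_pr ** 2

# B from Eq. (A.3) and the theta that maximizes det of the modified Hessian.
B = (-pi_pp * (1.0 - delta * alpha ** 2)
     + delta * (1.0 - alpha) ** 2 * pi_rr
     - 2.0 * delta * alpha * (1.0 - alpha) * pi_pr)
theta = B / (4.0 * delta * (1.0 - alpha) ** 2)

# Hessian entries of Pi_hat_k = Pi_k - theta r^2 + theta delta (alpha r + (1 - alpha) p)^2.
hat_pp = pi_pp + 2.0 * theta * delta * (1.0 - alpha) ** 2
hat_rr = pi_rr - 2.0 * theta + 2.0 * theta * delta * alpha ** 2
hat_pr = pi_pr + 2.0 * theta * delta * alpha * (1.0 - alpha)
det_hat = hat_pp * hat_rr - hat_pr ** 2
print(det_H, B, theta, hat_pp, hat_rr, det_hat)
```

At this point the original Hessian is indefinite while the modified one has negative diagonal entries and a positive determinant, which is the pattern the proof establishes in general.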

ˆ t , rt ). For any pl , ph , rl , rh ∈ (2) In the following, we show the supermodularity of Π(pt , rt ) and Π(p t t t t

P, rth > rtl and pht > plt .

Define ∆1 = Π(pht , rth ) − Π(plt , rth ) + Π(plt , rtl ) − Π(pht , rtl ), ∆2 =

ˆ ht , rth ) − Π(p ˆ lt , rth ) + Π(p ˆ lt , rtl ) − Π(p ˆ ht , rtl ). We need to show ∆1 > 0 and ∆2 > 0. It is easy Π(p to see that ∆2 = ∆1 + 2θδα(1 − α)(pht − plt )(rth − rtl ). Because rth > rtl and pht > plt , if we can prove

∆1 > 0 then ∆2 > 0. We consider all possible cases: (1) plt ≤ pht ≤ rtl ≤ rth ; (2) plt ≤ pht ≤ rtl ≤ rth ;

(3) plt ≤ pht ≤ rtl ≤ rth ; (4) plt ≤ pht ≤ rtl ≤ rth ; (5) plt ≤ pht ≤ rtl ≤ rth ; (6) plt ≤ pht ≤ rtl ≤ rth . For case 1 and 6, we define h

R(r) = pht ·

h

N ea−pt +k(rt −pt ) h

h

1 + ea−pt +k(rt −pt )

l

− plt ·

l

N ea−pt +k(rt −pt ) l

l

1 + ea−pt +k(rt −pt )

, k ∈ {λ, γ}.

Then, ∆1 = R(rth ) − R(rtl ), and we need to show R(rt ) is increasing in rt . Taking the derivative of R(rt ) with respect to rt , we have h

h

l

l

N kea−pt +k(rt −pt ) ∂R(rt ) N kea−pt +k(rt −pt ) l − p · . = pht · t h h l l ∂rt (1 + ea−pt +k(rt −pt ) )2 (1 + ea−pt +k(rt −pt ) )2 Since ∂

kea−pt +k(rt −pt ) (1+ea−pt +k(rt −pt ) )2



a−pt +k(rt −pt )

ke /∂pt = (1 + k) (1+e a−pt +k(rt −pt ) )2 (1 −

2 ) 1+ea−pt +k(rt −pt )

> 0 and pht > plt ,

ˆ t , rt ) are we know that R(rt ) is increasing in rt . Then, ∆1 > 0 and ∆2 > 0, i.e., Π(pt , rt ) and Π(p supermodular in these two cases. After rearranging terms, case 2 holds because

ea−p2 +γ(r1 −p2 ) 1+ea−p2 +γ(r1 −p2 )

in case 2 is smaller than

ea−p2 +λ(r1 −p2 ) 1+ea−p2 +λ(r1 −p2 )

in case 1, since case 1 holds we know case 2 also holds. By the same method, because case 6 holds,

33

we obtain case 4 and case 5 hold, and then case 3 holds. Therefore, we obtain that both Π(pt , rt ) ˆ t , rt ) are supermodular in pt and rt . and Π(p
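As a rough numerical illustration of part (2), the sketch below evaluates $\Delta_1$ for every pair of prices and every pair of reference prices on a small grid, so that rectangles on both sides of the kink $r_t = p_t$ are covered. The grid, the values $\lambda = 0.5$ and $\gamma = 1$, and the values of $a$ and $N$ are assumptions, with $a$ large enough that $a - p_t + k(r_t - p_t) > 0$ on the whole grid.

```python
import math
from itertools import combinations

def revenue(p, r, lam=0.5, gam=1.0, a=10.0, N=100.0):
    """Pi(p, r) = p * min{d_lambda, d_gamma} with logit demand d_k."""
    def d(k):
        x = a - p + k * (r - p)
        return N * math.exp(x) / (1.0 + math.exp(x))
    return p * min(d(lam), d(gam))

# Assumed grid; combinations() yields (low, high) pairs because it is sorted.
grid = [1.0, 2.0, 3.0, 4.0, 5.0]
deltas = []
for p_lo, p_hi in combinations(grid, 2):
    for r_lo, r_hi in combinations(grid, 2):
        delta_1 = (revenue(p_hi, r_hi) - revenue(p_lo, r_hi)
                   + revenue(p_lo, r_lo) - revenue(p_hi, r_lo))
        deltas.append(delta_1)

print(min(deltas))
```

A positive minimum over the grid is consistent with supermodularity of $\Pi(p_t, r_t)$; it is a spot check on assumed values, not a substitute for the case analysis.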

Proof of Theorem 2. We show the monotonicity of $V_t(x_t, r_t)$ and $U_t(x_t, r_t)$ by backward induction. Notice that $V_{t+1}(x_t, r_t)$ is increasing in $r_t$ at $t = T$. Suppose this property holds for $t \in \{1, 2, \ldots, T\}$. We know the following. (1) $d(p_t, r_t)$ is increasing in $r_t$ and $\lambda$ (Lemma 1). (2) Define $X_t = \Pi(p_t, r_t) - G_t(y_t, p_t, r_t)$. Then, for $k \in \{\lambda, \gamma\}$ we have

$$
\begin{aligned}
\frac{\partial X_t}{\partial r_t} &= p_t \cdot \frac{\partial d_k(p_t, r_t)}{\partial r_t} + \left[ (b + h)\Phi(y_t - d_k(p_t, r_t)) - b \right] \frac{\partial d_k(p_t, r_t)}{\partial r_t} > 0, \\
\frac{\partial X_t}{\partial \lambda} &= p_t \cdot \frac{\partial d_\lambda(p_t, r_t)}{\partial \lambda} + \left[ (b + h)\Phi(y_t - d_k(p_t, r_t)) - b \right] \frac{\partial d_\lambda(p_t, r_t)}{\partial \lambda} > 0, \\
\frac{\partial X_t}{\partial \gamma} &= p_t \cdot \frac{\partial d_\gamma(p_t, r_t)}{\partial \gamma} + \left[ (b + h)\Phi(y_t - d_k(p_t, r_t)) - b \right] \frac{\partial d_\gamma(p_t, r_t)}{\partial \gamma} < 0,
\end{aligned}
$$

where the inequalities follow from Assumption 2, $\frac{\partial d_k(p_t, r_t)}{\partial r_t} > 0$, $\frac{\partial d_\lambda(p_t, r_t)}{\partial \lambda} > 0$ and $\frac{\partial d_\gamma(p_t, r_t)}{\partial \gamma} < 0$. (3) The objective increases and the feasible set expands as $r_t$ and $\lambda$ increase or $\gamma$ decreases. (4) $\alpha r_t + (1 - \alpha)p_t$ is linear in $r_t$. Combining (1), (2), (3), (4), and the induction assumption, we obtain that $V_{t+1}(y_t - D_t(p_t, r_t), \alpha r_t + (1 - \alpha)p_t)$ is increasing in $r_t$ and $\lambda$. Then, it follows from Eq. (O1) and Eq. (O2) that $V_t(x_t, r_t)$ is increasing in $r_t$ and $\lambda$, and it is also decreasing in $\gamma$. Furthermore, the monotonicity of $U_t(x_t, r_t)$ with respect to $r_t$, $\lambda$ and $\gamma$ can be proved in the same way as for $V_t(x_t, r_t)$. The monotonicity of $U_t(x_t, r_t)$ in $x_t$ is straightforward because, for problem (U), the objective is independent of $x_t$ and the feasible set shrinks as $x_t$ increases.

In the following, we show the joint concavity of $U_t(x_t, r_t)$ and $\hat{U}_t(y_t, p_t, r_t)$. It is trivial for the terminal value function, since $U_{T+1}(x_{T+1}, r_{T+1}) = -\beta r_{T+1}^2$. Suppose it is true for $t + 1$, i.e., $U_{t+1}(x_{t+1}, r_{t+1})$ is jointly concave. We first prove that the last term in Eq. (U2) is jointly concave. Let $S_{t+1}(y_t, p_t, r_t) = U_{t+1}(y_t - D(p_t, r_t), \alpha r_t + (1 - \alpha)p_t)$. For any $\lambda \in (0, 1)$, $y_t^h \ge y_t^l > 0$, and $p_t^l, p_t^h, r_t^l, r_t^h \in \mathcal{P}$ with $r_t^h \ge r_t^l$ and $p_t^h \ge p_t^l$, we need to show $S_{t+1}(\lambda(y_t^l, p_t^l, r_t^l) + (1 - \lambda)(y_t^h, p_t^h, r_t^h)) \ge \lambda S_{t+1}(y_t^l, p_t^l, r_t^l) + (1 - \lambda)S_{t+1}(y_t^h, p_t^h, r_t^h)$. We have

$$
\begin{aligned}
& S_{t+1}\big(\lambda(y_t^l, p_t^l, r_t^l) + (1 - \lambda)(y_t^h, p_t^h, r_t^h)\big) \\
&= S_{t+1}\big(\lambda y_t^l + (1 - \lambda)y_t^h, \ \lambda p_t^l + (1 - \lambda)p_t^h, \ \lambda r_t^l + (1 - \lambda)r_t^h\big) \\
&= U_{t+1}\big(\lambda y_t^l + (1 - \lambda)y_t^h - d(\lambda p_t^l + (1 - \lambda)p_t^h, \lambda r_t^l + (1 - \lambda)r_t^h) - \varepsilon, \\
&\qquad\qquad \alpha(\lambda r_t^l + (1 - \lambda)r_t^h) + (1 - \alpha)(\lambda p_t^l + (1 - \lambda)p_t^h)\big) \\
&\ge U_{t+1}\big(\lambda y_t^l + (1 - \lambda)y_t^h - \lambda d(p_t^l, r_t^l) - (1 - \lambda)d(p_t^h, r_t^h) - \varepsilon, \\
&\qquad\qquad \alpha(\lambda r_t^l + (1 - \lambda)r_t^h) + (1 - \alpha)(\lambda p_t^l + (1 - \lambda)p_t^h)\big) \\
&= U_{t+1}\big(\lambda(y_t^l - d(p_t^l, r_t^l) - \varepsilon) + (1 - \lambda)(y_t^h - d(p_t^h, r_t^h) - \varepsilon), \\
&\qquad\qquad \lambda(\alpha r_t^l + (1 - \alpha)p_t^l) + (1 - \lambda)(\alpha r_t^h + (1 - \alpha)p_t^h)\big) \\
&\ge \lambda U_{t+1}\big(y_t^l - d(p_t^l, r_t^l) - \varepsilon, \ \alpha r_t^l + (1 - \alpha)p_t^l\big) + (1 - \lambda)U_{t+1}\big(y_t^h - d(p_t^h, r_t^h) - \varepsilon, \ \alpha r_t^h + (1 - \alpha)p_t^h\big) \\
&= \lambda S_{t+1}(y_t^l, p_t^l, r_t^l) + (1 - \lambda)S_{t+1}(y_t^h, p_t^h, r_t^h),
\end{aligned}
$$

where the first inequality follows because $d(p_t, r_t)$ is jointly concave in $(p_t, r_t)$ (see Lemma 1) and $U_{t+1}(x_{t+1}, r_{t+1})$ is decreasing in $x_{t+1}$. The second inequality follows from the induction hypothesis that $U_{t+1}(x_{t+1}, r_{t+1})$ is jointly concave in $(x_{t+1}, r_{t+1})$. Combining Theorem 1 and the convexity of the expected inventory cost $G_t(y_t, p_t, r_t)$ (see Lemma 2), we obtain that the objective function in problem (U) is jointly concave. Furthermore, because the expected demand function $d(p_t, r_t)$ is jointly concave and $\alpha r_t + (1 - \alpha)p_t$ is linear in $p_t$ and $r_t$, the feasible set in problem (U) is convex in $(x_t, y_t, p_t, r_t)$. Thus, problem (U) is a concave maximization problem, which implies that $U_t(x_t, r_t)$ and $\hat{U}_t(y_t, p_t, r_t)$ are jointly concave.

Finally, we show that $\hat{U}_t(y_t, p_t, r_t)$ is submodular in $(y_t, p_t)$. Since the sum of submodular functions is still submodular, it suffices to establish submodularity for each term on the right-hand side of Eq.(U2). The first and second terms are trivially submodular since they depend on only one of the two variables $y_t$, $p_t$. To show that $G_t(y_t, p_t, r_t)$ has isotone differences in $y_t$ and $p_t$, fix $\varepsilon_t$ and consider an arbitrary pair of inventory levels $(y_t^h, y_t^l)$ and any pair of price levels $(p_t^h, p_t^l)$ with $y_t^h > y_t^l$ and $p_t^h > p_t^l$. Let $\iota = y_t - d(p_t, r_t)$, $\iota_1 = y_t^h - d(p_t^h, r_t)$, $\iota_2 = y_t^h - d(p_t^l, r_t)$, $\iota_3 = y_t^l - d(p_t^h, r_t)$ and $\iota_4 = y_t^l - d(p_t^l, r_t)$. By the monotonicity of the demand function with respect to $p_t$, we have $\iota_3 > \iota_4$. Since the expected inventory cost can be written as

$$
G_t(\iota) = \int_{-\infty}^{\iota} h(\iota - \varepsilon_t)\phi(\varepsilon_t)\,d\varepsilon_t - \int_{\iota}^{+\infty} b(\iota - \varepsilon_t)\phi(\varepsilon_t)\,d\varepsilon_t,
$$

we take the second derivative with respect to $\iota$ and obtain $\frac{\partial^2 G_t(\iota)}{\partial \iota^2} = (b + h)\phi(\iota) > 0$, that is, $G_t(\iota)$ is convex in $\iota$. Then, by the convexity of $G_t(\iota)$ we have

$$
G_t(\iota_1) - G_t(\iota_3) = G_t\big(\iota_3 + (y_t^h - y_t^l)\big) - G_t(\iota_3) \ge G_t(\iota_4 + y_t^h - y_t^l) - G_t(\iota_4) = G_t(\iota_2) - G_t(\iota_4).
$$

We conclude that $G_t(y_t, p_t, r_t)$ has isotone differences in $y_t$ and $p_t$. The submodularity proof for the last term in Eq.(U2) is identical to that of $-G_t$ since $U_{t+1}$ is concave.

Proof of Theorem 3. Because problem (O) has the same optimal solutions as the concave maximization problem (U), the optimality of the BSLP policy follows immediately.

To show that $p(x_t, r_t)$ is nonincreasing in $x_t$, we can show that $\hat{U}_t(y_t, p_t, r_t)$ is submodular in $(y_t, p_t)$. Since the sum of submodular functions is submodular, it suffices to establish submodularity for each term on the right-hand side of Eq.(U2). The first and second terms are trivially submodular since they depend on only one of the two variables $y_t$ and $p_t$. Next, we show that $G_t(y_t, p_t, r_t)$ is submodular in $(y_t, p_t)$ for fixed $r_t$ and $\varepsilon_t$. We only discuss the case $r_t \ge p_t$, since the same technique proves the case $r_t < p_t$. We consider an arbitrary pair of inventory levels $(y_t^h, y_t^l)$ and an arbitrary pair of price levels $(p_t^h, p_t^l)$ with $y_t^h > y_t^l$ and $p_t^h > p_t^l$. Note that $G_t(y_t, p_t, r_t) = \mathbb{E}H_t(y_t - D(p_t, r_t))$. Let $l_1 = y_t^h - D(p_t^h, r_t)$, $l_2 = y_t^h - D(p_t^l, r_t)$, $l_3 = y_t^l - D(p_t^h, r_t)$ and $l_4 = y_t^l - D(p_t^l, r_t)$. Since $D(p_t, r_t)$ is decreasing in $p_t$, we have $l_1 > l_2$ and $l_3 > l_4$. Because of the specific definition of $H_t(\cdot)$, we know that it is convex, and then we have

$$
H_t(l_1) - H_t(l_3) = H_t(l_3 + y_t^h - y_t^l) - H_t(l_3) \ge H_t(l_4 + y_t^h - y_t^l) - H_t(l_4) = H_t(l_2) - H_t(l_4).
$$

We conclude that the function $H_t(y_t - D(p_t, r_t))$ has isotone differences in $y_t$ and $p_t$. Finally, the submodularity proof for the last term in (U2) is identical to that of $-G_t$ since $U_{t+1}$ is concave.

Fix $t = 1, 2, \ldots, T$. Since $\hat{U}_{t+1}(y_t, p_t, r_t)$ is jointly concave (see Theorem 2 (2)), $[p(r_t), y(r_t)]$ is the optimal decision pair when $x_t \le y(r_t)$. Similarly, for $x_t > y(r_t)$, it is optimal to choose $y(x_t, r_t) = x_t$. That is, $y(x_t, r_t) = \max\{x_t, y(r_t)\}$. We conclude in particular that $y(x_t, r_t)$ is nondecreasing in $x_t$. Furthermore, the decisions in period $t$ can be viewed as consisting of two stages. In the first stage, the inventory decision $y(x_t, r_t)$ is chosen, and in the second stage the corresponding price $p(x_t, r_t)$ is set. Since $\hat{U}_{t+1}(y_t, p_t, r_t)$ is strictly concave in $p_t$ (see Theorem 2), the optimal price $p(y_t(r_t), r_t)$ is unique. Since $\hat{U}_{t+1}(y_t, p_t, r_t)$ is submodular, it follows from Theorem 8-4 in Heyman and Sobel (1984) that the optimal price $p(x_t, r_t)$ is nonincreasing in the "state" $y_t$, and hence in $x_t$ because $y(x_t, r_t)$ is nondecreasing in $x_t$.

Proof of Theorem 4. If we can show that $\hat{J}_t(y_t, p_t)$ is jointly concave in $y_t$ and $p_t$, then it follows from Eq. (J1) that $J_t(x_t)$ is concave in $x_t$. We proceed by backward induction: it is clear that $\hat{J}_{T+1}(y_{T+1}, p_{T+1})$ is jointly concave. Assume that it holds for $\hat{J}_{t+1}$. Because

$$
\frac{\partial^2 \tilde{\Pi}(p_t)}{\partial p_t^2} = \frac{\exp(a - p_t)}{(1 + \exp(a - p_t))^2} \left[ -2 - p_t \cdot \frac{\exp(a - p_t) - 1}{1 + \exp(a - p_t)} \right] < 0,
$$

$\tilde{\Pi}(p_t)$ must be concave in $p_t$. Next, we show that $\tilde{G}_t(y_t, p_t)$ is jointly convex in $y_t$ and $p_t$. Taking derivatives of $\tilde{G}_t(y_t, p_t)$ with respect to $y_t$ and $p_t$, respectively,

$$
\frac{\partial^2 \tilde{G}_t}{\partial y_t^2} = \tilde{\eta}, \qquad
\frac{\partial^2 \tilde{G}_t}{\partial y_t \partial p_t} = -\tilde{\eta} \frac{\partial d(p_t)}{\partial p_t}, \qquad
\frac{\partial^2 \tilde{G}_t}{\partial p_t^2} = (b - \tilde{\eta}) \frac{\partial^2 d(p_t)}{\partial p_t^2} + \tilde{\eta} \left( \frac{\partial d(p_t)}{\partial p_t} \right)^2,
$$

where $\tilde{\eta} = (h + b)\phi(y_t - d(p_t))$. Then, the Hessian matrix of $\tilde{G}_t(y_t, p_t)$ can be written as

$$
\tilde{H} = \begin{pmatrix}
\dfrac{\partial^2 \tilde{G}_t}{\partial y_t^2} & \dfrac{\partial^2 \tilde{G}_t}{\partial y_t \partial p_t} \\[2ex]
\dfrac{\partial^2 \tilde{G}_t}{\partial y_t \partial p_t} & \dfrac{\partial^2 \tilde{G}_t}{\partial p_t^2}
\end{pmatrix}.
$$

$\tilde{G}_t(y_t, p_t)$ is jointly convex if we can show that $\tilde{H}$ is positive definite. We have $\frac{\partial^2 \tilde{G}_t}{\partial y_t^2} > 0$, $\frac{\partial^2 \tilde{G}_t}{\partial p_t^2} > 0$ and $\det(\tilde{H}) = (b - \tilde{\eta}) \frac{\partial^2 d(p_t)}{\partial p_t^2} \tilde{\eta} > 0$, where the inequalities follow from Assumption 2 and $\frac{\partial^2 d(p_t)}{\partial p_t^2} < 0$. Hence, $\tilde{H}$ is positive definite. The first term of Eq. (J2) is concave in $p_t$, the second term is linear in $y_t$, while the third term is jointly convex in $y_t$ and $p_t$. Therefore, it follows from Eq. (J2) and the inductive assumption that $\hat{J}_t(y_t, p_t)$ is jointly concave. Thus, $J_t(x_t)$ is concave in $x_t$. Proving the monotonicity of $J_t(x_t)$ in $x_t$ is straightforward because, for problem (J), the objective is independent of $x_t$ and the feasible set shrinks as $x_t$ increases. For any period $t \in \{1, 2, \ldots, T\}$, because problem (J) is a concave maximization problem, it immediately follows that the BSLP policy $[p_t, y_t]$ is optimal.
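The convexity of the expected inventory cost, which drives the isotone-differences arguments in the proofs above, can be illustrated with a closed form. The sketch below assumes a standard normal demand shock, which is an assumption made only for this example, along with illustrative values of $h$ and $b$; in that case $\mathbb{E}[h(\iota - \varepsilon)^+ + b(\varepsilon - \iota)^+] = (h + b)(\iota\Phi(\iota) + \phi(\iota)) - b\iota$.

```python
import math

def std_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_cost(iota, h=1.0, b=4.0):
    """E[h (iota - eps)^+ + b (eps - iota)^+] for eps ~ N(0, 1) (assumed)."""
    return (h + b) * (iota * std_normal_cdf(iota) + std_normal_pdf(iota)) - b * iota

# Isotone differences: with iota_3 > iota_4 and a common inventory gap,
# the cost increment at iota_3 is at least the increment at iota_4.
iota_3, iota_4, gap = 0.3, -0.5, 1.0
increment_high = expected_cost(iota_3 + gap) - expected_cost(iota_3)
increment_low = expected_cost(iota_4 + gap) - expected_cost(iota_4)
print(increment_high, increment_low)
```

Here `gap` plays the role of $y_t^h - y_t^l$, and the ordering of the two increments is the inequality used to establish isotone differences.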


Author Statement

Qiang Wang: Conceptualization, Methodology, Formal analysis, Writing - Original Draft
Nenggui Zhao: Methodology, Formal analysis, Writing - Reviewing and Editing
Jie Wu: Supervision, Writing - Reviewing and Editing
Qingyuan Zhu: Discussion and Editing