19 Pricing

Price is the only element of the marketing mix that directly produces revenue; every other element (product, promotion, place) creates cost. It is also the most quickly adjusted lever a firm controls, the one most visible to competitors, and the one most freighted with psychological meaning for buyers. A one-percent improvement in realized price typically lifts operating profit by far more than an equivalent improvement in volume or variable cost, because the extra revenue flows to the bottom line undiluted. Yet pricing decisions are routinely delegated to spreadsheets and gut feel, precisely because the object the manager most needs, each buyer’s willingness to pay, is unobserved and actively concealed during exchange.

This chapter treats pricing as the problem of recovering a latent demand object and acting on it. We begin with the primitive: the reservation price, the maximum a buyer will pay, and the demand curve that aggregates reservation prices across a market. From the demand curve we derive the monopolist’s optimal price and the central quantity managers actually estimate—the price elasticity of demand—along with its identification problem. We then study how firms extract more surplus than a single price allows through price discrimination in its three textbook degrees, and how that machinery is implemented as versioning, bundling, and nonlinear tariffs. The second half of the chapter turns behavioral: buyers do not respond to prices as the textbook supposes. They evaluate prices against reference points, they treat a discount differently from a surcharge, they read price as a signal of quality, and they are moved by the framing of a promotion as much as by its depth. We close on the contemporary frontier—dynamic and personalized pricing—where firms set prices that vary across time, context, and individual, and where the gains from discrimination collide with consumer perceptions of fairness and with privacy regulation.

The through-line is measurement. A price is only as good as the demand estimate behind it, and every method we present—from a logit demand model to a conjoint study to a field experiment—is in the end a strategy for estimating how quantity responds to price when price is set by a firm that already knows something the analyst does not. We lead with the economic intuition, give each method its estimator and the assumption that breaks its identification, and supply seeded, runnable code.

19.1 Reservation Prices and Willingness to Pay

The atomic construct of pricing is the reservation price, also called willingness to pay (WTP): the maximum amount a particular consumer would pay for a unit of a product rather than forgo it. Formally, let a consumer derive gross utility (in monetary units) $v$ from acquiring one unit. Facing price $p$, the consumer buys if and only if consumer surplus is nonnegative,

\[ \text{buy} \iff v - p \ge 0 \iff p \le v, \tag{19.1}\]

so $v$ is the reservation price: the indifference price at which the consumer is exactly willing to transact. Reservation prices are heterogeneous across consumers; the distribution of $v$ in the population is the economic content of “demand.”

19.1.1 From Reservation Prices to the Demand Curve

Let $F(\cdot)$ be the cumulative distribution function of reservation prices across a unit mass of potential buyers, so $F(p) = \Pr(v \le p)$. At posted price $p$, the buyers are exactly those with $v \ge p$, a fraction $1 - F(p)$. Market demand is therefore

\[ Q(p) = M\,\bigl[1 - F(p)\bigr], \tag{19.2}\]

where $M$ is market size. The demand curve is the survival function of the reservation-price distribution, scaled by $M$. This identity is the conceptual bridge between the behavioral object (what individuals will pay) and the aggregate object (how quantity responds to price): the slope of demand is governed by how reservation prices are dispersed. A market in which everyone values the good identically yields a step-function demand curve, in which all $M$ buyers purchase at any price up to the common value and none purchase above it; wide dispersion in $v$ yields a smooth, gently sloped curve.

Reservation price is not the same as the price a consumer expects or prefers

The reservation price is a threshold—the worst deal the consumer would still accept—not the price the consumer hopes to pay or considers fair. Survey methods that ask “what is a reasonable price?” measure a reference price (covered in Section 19.5), not the reservation price. Conflating the two systematically understates WTP and leaves money on the table.

19.1.2 The Monopolist’s Price and the Elasticity

A firm with constant marginal cost $c$ choosing a single price solves $\max_p (p - c)\,Q(p)$. The first-order condition can be written in the canonical inverse-elasticity (Lerner) form

\[ \frac{p^\star - c}{p^\star} = -\frac{1}{\varepsilon(p^\star)}, \qquad \varepsilon(p) \equiv \frac{\partial Q}{\partial p}\frac{p}{Q}, \tag{19.3}\]

where $\varepsilon < 0$ is the price elasticity of demand, the percentage change in quantity per percentage change in price. The optimal markup over cost, expressed as a fraction of price, equals the reciprocal of the (absolute) elasticity: inelastic demand ($|\varepsilon|$ small) supports a high markup; elastic demand forces price toward cost. Profit maximization requires operating where $|\varepsilon| > 1$; no firm optimally prices on the inelastic portion of its demand curve, because there a price increase would raise revenue and cut cost. The elasticity is thus the single most consequential number in pricing, and most of empirical pricing research is, at bottom, the estimation of $\varepsilon$.
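The step from the first-order condition to the Lerner form is short enough to spell out. The derivation below uses only the symbols already defined ($p$, $c$, $Q$, $\varepsilon$); the closed form in the last line additionally assumes that $\varepsilon$ is constant and below $-1$.

```latex
\[
\begin{aligned}
\frac{d}{dp}\bigl[(p-c)\,Q(p)\bigr] &= Q(p) + (p-c)\,Q'(p) = 0 \\
\Longrightarrow\quad p - c &= -\frac{Q(p)}{Q'(p)} \\
\Longrightarrow\quad \frac{p-c}{p} &= -\frac{Q(p)}{p\,Q'(p)} = -\frac{1}{\varepsilon(p)} \\
\text{constant } \varepsilon < -1:\quad p^\star &= c\,\frac{\varepsilon}{1+\varepsilon}
\end{aligned}
\]
```

For example, with $c = 15$ and $\varepsilon = -2.5$ the last line gives $p^\star = 15 \times (-2.5)/(-1.5) = 25$.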

A workhorse functional form is constant-elasticity (log-log) demand, $\log Q = \alpha + \varepsilon \log p + u$, under which $\varepsilon$ is a single parameter. The following seeded example simulates a market of heterogeneous reservation prices, recovers the demand curve as the empirical survival function per Equation 19.2, and locates the profit-maximizing price.

Code

set.seed(17)
M <- 100000                       # market size
v <- rlnorm(M, meanlog = log(40), sdlog = 0.5)   # reservation prices
c <- 15                           # marginal cost

price_grid <- seq(15, 120, by = 1)
demand  <- sapply(price_grid, function(p) mean(v >= p)) * M   # Equation 19.2: Q(p) = M * [1 - F(p)]
profit  <- (price_grid - c) * demand

p_star  <- price_grid[which.max(profit)]
q_star  <- demand[which.max(profit)]

# point elasticity at the optimum via numerical derivative
dQ <- diff(demand) / diff(price_grid)
elas_at_star <- dQ[which.max(profit)] * p_star / q_star

cat("Optimal price p*:        ", p_star, "\n")
#> Optimal price p*:         41
cat("Quantity at p*:          ", round(q_star), "\n")
#> Quantity at p*:           48208
cat("Elasticity at p*:        ", round(elas_at_star, 2), "\n")
#> Elasticity at p*:         -1.64
cat("Lerner markup (p*-c)/p*: ", round((p_star - c) / p_star, 3),
    " vs. -1/elasticity =", round(-1 / elas_at_star, 3), "\n")
#> Lerner markup (p*-c)/p*:  0.634  vs. -1/elasticity = 0.608

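The recovered markup (0.634) is close to $-1/\varepsilon$ (0.608), consistent with Equation 19.3. The remaining gap reflects the one-unit price grid, the finite-difference approximation to the slope, and sampling noise in the simulated demand curve.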

19.1.3 Eliciting Willingness to Pay

Because reservation prices are private, firms must elicit them. Four families of methods dominate, trading off realism against control.

Direct survey methods ask consumers their WTP, either openly or through a sequence of yes/no purchase questions (the contingent-valuation tradition). They are cheap but suffer hypothetical bias: respondents overstate WTP when no money changes hands, and the open-ended format invites strategic understatement.

Incentive-compatible mechanisms make truth-telling a dominant strategy by binding the respondent to a real transaction. In the Becker–DeGroot–Marschak (BDM) procedure the respondent states a bid $b$; a price $p$ is then drawn at random, and the respondent buys at $p$ if $b \ge p$. Because the stated bid never sets the price the respondent pays, only whether they transact, the respondent’s optimal bid is exactly their true reservation price $v$. Second-price (Vickrey) auctions share this incentive property.

Conjoint analysis infers WTP indirectly from trade-offs. Respondents choose among profiles that vary attributes (including price); a discrete-choice model recovers part-worth utilities, and the monetary value of an attribute is the ratio of its part-worth to the price coefficient. If utility is $U_{ij} = \mathbf{x}_j'\boldsymbol{\beta} + \beta_p\,p_j + \varepsilon_{ij}$, the WTP for a unit change in attribute $k$ is $-\beta_k/\beta_p$, and the reservation price for a whole profile is the price that drives its choice probability against the outside option to one-half. Choice-based conjoint with hierarchical Bayes estimation is the industry standard for new-product pricing precisely because it disciplines the hypothetical task with realistic competitive trade-offs and recovers individual-level heterogeneity in $\boldsymbol{\beta}$ (Toubia and Stephen 2013; Ding, Li, and Chatterjee 2015). A complementary structural approach maps the conjoint primitives directly into demand and competitive equilibrium so that the elicited utilities deliver an optimal price rather than a ranking (Jedidi, Jagpal, and Manchanda 2003; Jedidi et al. 2021).

Revealed-preference / market data methods recover WTP from actual purchases. These are the most credible because choices are consequential, but they are silent about prices never observed in the data and carry the identification problem we turn to next.
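Returning to the BDM procedure described above, a small simulation can make its incentive property concrete. The sketch below assumes a hypothetical respondent with true value `v = 30` and a random price drawn uniformly on [0, 60] (both illustrative choices, not taken from any study), and evaluates the expected surplus of each candidate bid.

```r
# Expected surplus from bidding b under BDM, for a respondent with true value v.
# Assumptions (illustrative): v = 30; the random price is Uniform(0, 60).
set.seed(19)
v <- 30
prices <- runif(200000, min = 0, max = 60)   # simulated BDM price draws

expected_surplus <- function(b) {
  buys <- prices <= b                  # transact only when the drawn price is at or below the bid
  mean(ifelse(buys, v - prices, 0))    # the respondent pays the drawn price, never the bid
}

bids <- seq(0, 60, by = 5)
surplus <- sapply(bids, expected_surplus)
best_bid <- bids[which.max(surplus)]

print(data.frame(bid = bids, expected_surplus = round(surplus, 3)))
cat("Surplus-maximizing bid on the grid:", best_bid, "\n")
```

Bidding below `v` forgoes trades at prices the respondent would gladly pay, and bidding above `v` adds trades at a loss, so the grid maximum sits at the true value.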

Estimator box: WTP from a choice model

Given choice data, estimate $\boldsymbol{\beta}$ and $\beta_p$ by maximum likelihood (logit) or hierarchical Bayes (mixed logit). The WTP for attribute $k$ is $\widehat{\text{WTP}}_k = -\hat\beta_k/\hat\beta_p$. What breaks it: the ratio is fragile when $\hat\beta_p \approx 0$ (a weak or wrong-signed price coefficient sends WTP to $\pm\infty$); price endogeneity biases $\hat\beta_p$ toward zero and inflates WTP; and a fixed $\beta_p$ across respondents imposes the implausible restriction that the marginal utility of income is identical for all, which the mixed-logit random coefficient relaxes (Dubé, Hortaçsu, and Joo 2021).
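The ratio estimator can be sketched on simulated data. The example below assumes a binary choice (buy versus an outside option), a single 0/1 attribute called `feature`, and randomized prices; all parameter values are hypothetical. It uses base R's `glm` for the logit fit and the delta method for the standard error of the ratio.

```r
# Simulate binary choices (buy vs. outside option) and recover WTP = -beta_k / beta_p.
# All parameter values below are illustrative assumptions.
set.seed(83)
n <- 20000
feature <- rbinom(n, 1, 0.5)            # attribute k: 1 if the profile has the feature
price   <- runif(n, 5, 25)              # price is randomized across profiles, so it is exogenous here
beta_0 <- 2; beta_k <- 1.5; beta_p <- -0.25   # true WTP for the feature = 1.5 / 0.25 = 6

utility <- beta_0 + beta_k * feature + beta_p * price
buy <- rbinom(n, 1, plogis(utility))    # logit choice probability

fit <- glm(buy ~ feature + price, family = binomial())
b <- coef(fit)
wtp_hat <- unname(-b["feature"] / b["price"])

# Delta-method standard error for the ratio -b_k / b_p
V <- vcov(fit)[c("feature", "price"), c("feature", "price")]
grad <- c(-1 / b["price"], b["feature"] / b["price"]^2)   # derivatives w.r.t. (b_k, b_p)
se_wtp <- sqrt(as.numeric(t(grad) %*% V %*% grad))

cat("Estimated WTP for the feature:", round(wtp_hat, 2),
    "(delta-method SE", round(se_wtp, 2), ")\n")
```

The delta-method standard error grows with $1/\hat\beta_p^2$, which is the fragility near $\hat\beta_p \approx 0$ noted in the box.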

19.2 The Identification Problem in Demand Estimation

Estimating elasticity from observational price–quantity data confronts the oldest problem in econometrics: price is endogenous. Firms set prices in response to demand conditions the analyst does not see—a popular item is priced high because it is popular—so a naïve regression of quantity on price recovers a tangle of the demand curve and the firm’s pricing rule, biasing the elasticity toward zero (often to the wrong sign). Identification requires an instrument or structure that shifts price without shifting demand.
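The direction of the bias is easy to reproduce in a simulation. The sketch below assumes log-log demand with a true elasticity of -2, an unobserved shock `xi` that the firm prices on, and a cost shifter `z` that serves as the instrument; every coefficient is a hypothetical choice. The two-stage procedure is written out by hand with `lm` to keep the logic visible.

```r
# Log-log demand with an unobserved demand shifter xi that the firm prices on.
# True elasticity is -2; the cost shifter z moves price but not demand.
set.seed(87)
n  <- 5000
xi <- rnorm(n)                           # unobserved demand shock (seen by the firm, not the analyst)
z  <- rnorm(n)                           # cost shifter: the instrument
log_p <- 1 + 0.5 * z + 0.4 * xi + rnorm(n, 0, 0.2)   # firm prices higher when xi is high
log_q <- 5 - 2 * log_p + xi + rnorm(n, 0, 0.2)

ols <- coef(lm(log_q ~ log_p))["log_p"]

# Two-stage least squares by hand: project price on the instrument, then regress on the projection.
# (Point estimate only; the second-stage lm standard errors are not valid 2SLS standard errors.)
log_p_hat <- fitted(lm(log_p ~ z))
iv <- coef(lm(log_q ~ log_p_hat))["log_p_hat"]

cat("True elasticity: -2\n")
cat("OLS estimate:   ", round(ols, 2), "\n")
cat("IV estimate:    ", round(iv, 2), "\n")
```

In this design the OLS estimate is pulled toward zero because high-`xi` observations have both high prices and high quantities, while the instrumented estimate uses only the cost-driven part of price variation.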

19.2.1 The Logit Demand System and BLP

The modern standard for differentiated-products demand is the random-coefficients logit, estimated by the method of Berry, Levinsohn, and Pakes (1995) (BLP). Consumer $i$’s indirect utility for product $j$ in market $t$ is

\[ u_{ijt} = \mathbf{x}_{jt}'\boldsymbol{\beta}_i - \alpha_i\,p_{jt} + \xi_{jt} + \varepsilon_{ijt}, \tag{19.4}\]

where $\mathbf{x}_{jt}$ are observed characteristics, $p_{jt}$ is price, $\xi_{jt}$ is the unobserved (to the analyst) product quality that consumers and firms see, and $\varepsilon_{ijt}$ is an extreme-value taste shock. The coefficients $(\boldsymbol{\beta}_i, \alpha_i)$ are random across consumers, generating realistic substitution patterns. The endogeneity is explicit and structural: price is correlated with $\xi_{jt}$ because firms price higher where unobserved quality is higher, $\mathbb{E}[p_{jt}\,\xi_{jt}] \ne 0$.

Estimator. Invert market shares to recover the mean utility $\delta_{jt}(\boldsymbol{\theta}) = \mathbf{x}_{jt}'\boldsymbol{\beta} - \alpha p_{jt} + \xi_{jt}$, form the structural error $\xi_{jt}(\boldsymbol{\theta})$, and choose the parameters by GMM to satisfy the moment condition $\mathbb{E}[\xi_{jt} \mid \mathbf{z}_{jt}] = 0$ for instruments $\mathbf{z}_{jt}$. What identifies it: valid instruments shift price (markups) but are mean-independent of $\xi_{jt}$—cost shifters, and the classic BLP instruments formed from characteristics of rival products, which move a firm’s markup through competition without entering its own demand shock. What breaks it: weak instruments (rival characteristics that barely move price) inflate standard errors and bias estimates toward OLS; instruments correlated with $\xi_{jt}$ (e.g., rival quality that proxies own quality in a correlated market) reintroduce the bias the design was meant to remove. The structural demand literature in marketing builds directly on this machinery to estimate price response with panel and scanner data (Chintagunta et al. 2006; Reiss 2011; Nair et al. 2017).
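In the special case without random coefficients ($\boldsymbol{\beta}_i = \boldsymbol{\beta}$, $\alpha_i = \alpha$), the share inversion has a closed form, which helps fix ideas before the numerical inversion that the full model requires. Here $s_{jt}$ denotes product $j$'s market share and $s_{0t}$ the outside-good share; this notation is introduced for the illustration, and the outside good's mean utility is normalized to zero.

```latex
\[
s_{jt} = \frac{\exp(\delta_{jt})}{1 + \sum_{k} \exp(\delta_{kt})},
\qquad
s_{0t} = \frac{1}{1 + \sum_{k} \exp(\delta_{kt})}
\]
\[
\Longrightarrow\quad
\ln s_{jt} - \ln s_{0t} = \delta_{jt}
= \mathbf{x}_{jt}'\boldsymbol{\beta} - \alpha\,p_{jt} + \xi_{jt}
\]
```

The left-hand side is observed, so in this special case the model is a linear regression with $\xi_{jt}$ as the error, and the instruments enter through ordinary linear IV. With random coefficients the inversion has no closed form and is computed numerically.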

19.2.2 Why Field Experiments Are the Clean Benchmark

A randomized price experiment severs the link between price and $\xi_{jt}$ by construction: when the firm assigns price at random, $\mathbb{E}[p\,\xi] = 0$ by design, and the elasticity is identified by the experimental contrast alone. This is why pricing field experiments are the gold standard for elasticity, and why much recent work prices via controlled tests rather than relying on historical variation (Simester and Zhang 2010). The cost is external validity—an experiment identifies the elasticity at the tested prices, in the tested context, for the tested population—and the standard practical hazard is that competitors or consumers detect and react to the test. Figure 19.1 summarizes the identification logic that organizes the remainder of the methods in this chapter.

flowchart TD
    XI["Unobserved demand<br/>shifter (quality, ξ)"] -->|firm prices higher| P["Price p"]
    XI -->|raises demand| Q["Quantity Q"]
    P -->|true causal effect ε| Q
    P -.->|"naïve OLS conflates<br/>both arrows"| BIAS["Biased elasticity"]
    IV["Instrument:<br/>cost / rival shifters"] -->|shifts p, not ξ| P
    EXP["Randomized price"] -->|cuts ξ→p link| P
    STR["Structural model:<br/>BLP moments"] -->|"E[ξ|z]=0"| P
    style XI fill:#f8d7da,stroke:#842029
    style EXP fill:#d1e7dd,stroke:#0f5132
    style IV fill:#fff3cd,stroke:#664d03
    style STR fill:#cfe2ff,stroke:#084298

Figure 19.1: The price-endogeneity problem and three identification strategies. Unobserved demand shifters drive both price and quantity; credible elasticity estimates require breaking that back-door path.
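The randomized-price branch of the figure can be sketched in a few lines. The example below assumes three hypothetical test prices assigned at random and a true elasticity of -2; the unobserved shock `xi` is still present in demand but is independent of price by design, so a plain regression recovers the elasticity.

```r
# Demand still contains an unobserved shock, but the firm now assigns price at random.
set.seed(116)
n  <- 5000
xi <- rnorm(n)                                   # unobserved demand shock
test_prices <- c(8, 10, 12)                      # hypothetical test cells
p  <- sample(test_prices, n, replace = TRUE)     # randomized: independent of xi by design
log_q <- 5 - 2 * log(p) + xi + rnorm(n, 0, 0.2)  # true elasticity is -2

fit <- lm(log_q ~ log(p))
est <- coef(fit)[2]
ci  <- confint(fit)[2, ]

cat("Elasticity from the randomized test:", round(est, 2),
    " 95% CI [", round(ci[1], 2), ",", round(ci[2], 2), "]\n")
cat("Correlation of price with xi:", round(cor(p, xi), 3), "\n")
```

The estimate is informative only over the tested range (8 to 12 here), which is the external-validity cost described above.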

19.3 Price Discrimination

A single price leaves surplus uncaptured on both sides: high-WTP buyers pocket consumer surplus, and low-WTP buyers who would have paid more than cost are priced out entirely. Price discrimination—charging different prices that do not reflect differences in cost—aims to recover both. The classical taxonomy distinguishes three degrees by the information the firm exploits.

Table 19.1: The three degrees of price discrimination, by the information the firm exploits.

Degree	Information used	Mechanism	Surplus captured	Real-world example
First	Each buyer’s exact $v$	Personalized price	All surplus (in theory)	Negotiated B2B deals; algorithmic personalization
Second	$v$ self-revealed through choice	Versioning, bundling, quantity discounts, nonlinear tariffs	Partial; buyers sort themselves	Software tiers; airline fare classes; family-size packs
Third	Observable group membership correlated with $v$	Different price per segment	Partial; bounded by arbitrage	Student/senior discounts; geographic pricing

19.3.1 First-Degree (Perfect) Discrimination

Under first-degree discrimination the firm knows each buyer’s reservation price and charges exactly $v$, extracting the entire surplus and selling to everyone whose $v$ exceeds marginal cost. Output is efficient (no buyer whose valuation exceeds marginal cost is excluded), but all surplus accrues to the firm. Perfect discrimination is an idealization, but it is the conceptual ceiling that personalized pricing (Section 19.8) approaches as the firm’s information about individual WTP improves.
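To get a sense of how far a single price sits below this ceiling, the sketch below compares the two profits on simulated reservation prices. The lognormal distribution and the marginal cost of 15 mirror the simulation in Section 19.1.2 and are illustrative assumptions; `mc` is a name introduced here for marginal cost.

```r
# Profit under perfect discrimination vs. the best single price, on simulated reservation prices.
set.seed(131)
v  <- rlnorm(100000, meanlog = log(40), sdlog = 0.5)   # reservation prices
mc <- 15                                               # marginal cost

# First degree: charge each buyer exactly v and serve everyone with v >= mc
profit_first <- sum(pmax(v - mc, 0))

# Uniform: best single price on a grid
grid <- seq(15, 120, by = 1)
profit_uniform <- max(sapply(grid, function(p) (p - mc) * sum(v >= p)))

cat("Uniform-price profit:      ", round(profit_uniform), "\n")
cat("First-degree profit:       ", round(profit_first), "\n")
cat("Share of ceiling captured: ", round(profit_uniform / profit_first, 2), "\n")
```

The gap between the two figures is the surplus that second- and third-degree schemes try to recover without observing $v$.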

19.3.2 Second-Degree Discrimination: Self-Selection

When the firm cannot observe $v$ but knows its distribution, it offers a menu and lets buyers sort themselves—second-degree discrimination via self-selection. The design problem is a screening problem: construct options so that each consumer type prefers the option intended for it. Two constraints bind. Individual rationality (IR): each type must (weakly) prefer its option to no purchase. Incentive compatibility (IC): each type must (weakly) prefer its own option to every other option on the menu. The IC constraint is what forces the firm to leave information rent to high-WTP types: to stop a high type from masquerading as a low type, the firm must discount the high type’s option relative to first-degree, a markdown that is the price of not observing $v$ directly.
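A two-type example shows how the binding constraints pin down the menu. The sketch assumes utility of the form `theta * quality - price`, two hypothetical types, and two qualities taken as given (the quality choice itself is not optimized here); all numbers are illustrative.

```r
# Two-type screening: utility = theta * quality - price, outside option = 0.
# Types and qualities are hypothetical numbers chosen for illustration.
theta <- c(low = 1, high = 2)        # marginal valuation of quality
q     <- c(basic = 1, premium = 2)   # qualities on the menu (taken as given here)

# Low type's IR binds: the basic option extracts the low type's full surplus.
p_basic <- theta[["low"]] * q[["basic"]]
# High type's IC binds: the premium price leaves the high type exactly what it
# would earn by buying the basic option instead.
rent_high <- theta[["high"]] * q[["basic"]] - p_basic
p_premium <- theta[["high"]] * q[["premium"]] - rent_high

surplus <- function(type, quality, price) theta[[type]] * q[[quality]] - price
checks <- c(
  IR_low  = surplus("low",  "basic",   p_basic)   >= 0,
  IR_high = surplus("high", "premium", p_premium) >= 0,
  IC_low  = surplus("low",  "basic",   p_basic)   >= surplus("low",  "premium", p_premium),
  IC_high = surplus("high", "premium", p_premium) >= surplus("high", "basic",   p_basic)
)
print(checks)
cat("Basic price:", p_basic, "| premium price:", p_premium,
    "| first-degree premium price:", theta[["high"]] * q[["premium"]],
    "| information rent:", rent_high, "\n")
```

With these numbers the premium option sells at 3 rather than the first-degree price of 4; the difference of 1 is the high type's information rent.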

Three implementations recur:

Versioning offers quality-differentiated variants—a full-featured product and a deliberately degraded one—so that high-WTP buyers self-select into the premium version. The damaged-goods logic explains why firms incur cost to remove features: the degraded version exists to make the premium version’s price incentive-compatible for high types.

Bundling sells goods together at a price below the sum of standalone prices. When reservation prices for components are negatively correlated across buyers, bundling reduces the dispersion of total WTP and lets the firm price closer to the common bundle value, raising profit—the classic Stigler result. Mixed bundling, offering components both separately and as a bundle, dominates pure bundling when component valuations are heterogeneous, and is the architecture behind cable tiers and software suites.

Nonlinear (quantity) pricing charges a per-unit price that depends on quantity—a two-part tariff (fixed fee plus marginal price) or block tariff. With heterogeneous demand intensity, the optimal nonlinear schedule again leaves information rent to high-volume buyers and distorts the low-volume option downward. The marketing literature studies how such schedules interact with competition and with consumers’ misperception of their own future usage (Iyer, Soberman, and Villas-Boas 2005).
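The bundling argument above can be checked with a minimal example. The sketch assumes two buyers with perfectly negatively correlated reservation prices for two goods and zero marginal cost, so revenue equals profit; the valuations are hypothetical.

```r
# Two goods, two buyers with negatively correlated reservation prices (hypothetical values).
wtp <- data.frame(
  buyer  = c("A", "B"),
  good_1 = c(8, 2),
  good_2 = c(2, 8)
)
wtp$bundle <- wtp$good_1 + wtp$good_2      # each buyer values the bundle at 10

# Best single price for one item: try each reservation price as the posted price.
best_revenue <- function(values) {
  max(sapply(values, function(p) p * sum(values >= p)))
}

rev_separate <- best_revenue(wtp$good_1) + best_revenue(wtp$good_2)
rev_bundle   <- best_revenue(wtp$bundle)

cat("Revenue, goods priced separately:", rev_separate, "\n")  # 8 + 8 = 16
cat("Revenue, pure bundle:            ", rev_bundle, "\n")    # 10 * 2 = 20
```

Priced separately, each good sells to one buyer at 8; the bundle sells to both at 10 because the dispersion in total WTP has vanished.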

19.3.3 Third-Degree Discrimination: Observable Segments

When the firm observes a segment indicator correlated with WTP—age, location, student status, purchase history—it sets a separate optimal price per segment, each satisfying its own Lerner condition: charge the more inelastic segment the higher price. Third-degree discrimination is the most common in practice (coupons, geographic and channel pricing, demographic discounts) and is bounded by arbitrage: if low-price buyers can resell to high-price buyers, the scheme unravels, which is why discriminated goods are typically services, perishable, or identity-linked (a student ID, a loyalty account). The following example contrasts uniform pricing with optimal third-degree pricing across two segments with different elasticities.

Code

set.seed(170)
c <- 15
# Two segments with constant-elasticity demand Q = A * p^eps
seg <- data.frame(
  name = c("price-sensitive", "price-insensitive"),
  A    = c(5e6, 2e6),
  eps  = c(-2.5, -1.4)
)

# Optimal monopoly price under constant elasticity: p* = c * eps/(eps+1)
seg$p_opt  <- c * seg$eps / (seg$eps + 1)
seg$q_opt  <- seg$A * seg$p_opt^seg$eps
seg$profit <- (seg$p_opt - c) * seg$q_opt

# Uniform price: maximize total profit over a single price
pg <- seq(16, 80, by = 0.5)
tot <- sapply(pg, function(p) sum((p - c) * seg$A * p^seg$eps))
p_uniform <- pg[which.max(tot)]
profit_uniform <- max(tot)

knitr::kable(
  transform(seg, p_opt = round(p_opt, 2), profit = round(profit)),
  col.names = c("Segment", "Scale A", "Elasticity", "Optimal price", "Quantity", "Profit"),
  caption = "Optimal third-degree prices differ across segments by elasticity."
)

Optimal third-degree prices differ across segments by elasticity.
Segment	Scale A	Elasticity	Optimal price	Quantity	Profit
price-sensitive	5e+06	-2.5	25.0	1600.000	16000
price-insensitive	2e+06	-1.4	52.5	7812.805	292980

Code

cat("Uniform price:        ", p_uniform, " | profit:", round(profit_uniform), "\n")
#> Uniform price:         49.5  | profit: 302692
cat("Discrimination profit:", round(sum(seg$profit)),
    " | gain:", round(100 * (sum(seg$profit) / profit_uniform - 1), 1), "%\n")
#> Discrimination profit: 308980  | gain: 2.1 %

The price-insensitive segment optimally pays the higher price, and segmenting raises profit over the best single price—the basic dividend of discrimination.

19.4 Promotions and Discounting

Few products sell at a constant price for long. Temporary price promotions—a discount of limited duration—are ubiquitous in consumer packaged goods, where a large share of volume moves on deal. The empirical study of promotions is among the most mature in marketing because scanner data make deal incidence, depth, and timing directly observable.

19.4.1 Why Promotions Exist: Price Discrimination Over Time

A leading rationale for promotions is intertemporal price discrimination. Consumers differ in search and storage costs: price-sensitive “cherry pickers” pay attention to deals and stockpile, while loyal or time-pressed buyers pay the regular price. A pattern of high regular prices punctuated by deep discounts lets the firm charge the two groups different effective prices without an observable segment indicator, a temporal analogue of second-degree discrimination. Narasimhan (1988) models this coupon-and-deal logic, and Varian and Purohit (1980) show that even with homogeneous goods, the mix of informed and uninformed consumers supports an equilibrium in which firms randomize prices (run sales) rather than converging on a single price. The implication is structural, not merely tactical: price dispersion and promotional cycling are equilibrium outcomes, not pricing errors.

19.4.2 Decomposing the Sales Spike

A promotion’s observed sales bump aggregates several distinct consumer responses, and a credible promotion model must separate them because they have different, sometimes opposite, implications for incremental profit:

Brand switching—buyers substitute toward the promoted brand from competitors; genuinely incremental to the brand.
Category expansion—buyers consume more of the category; incremental to the category.
Purchase acceleration—buyers who would have bought later buy now; a timing shift that borrows from future sales.
Stockpiling—buyers increase inventory at home, depressing post-promotion demand (the post-promotion dip).

Estimating these shares is the central task of promotion analytics. Household panel decompositions consistently find that the large majority of a promotion’s sales spike comes from brand switching rather than category expansion. Much promotional volume is therefore incremental to the brand but taken from rivals rather than from a growing pie, while the acceleration and stockpiling components borrow from the brand’s own future sales (Van Heerde, Helsen, and Dekimpe 2007; Bell and Lattin 1998). Guyt and Gijsbrechts (2014) show that the decomposition itself depends on whether the analyst uses store or household data and on the deal-frequency environment, a caution against reading a single elasticity off aggregate data.

19.4.3 The Reference-Price Trap and Promotion Dynamics

Frequent discounting is not free even when each deal is individually profitable. Promotions erode the reference price, the internal benchmark against which buyers judge the next price (Section 19.5), so that a brand which deals often trains its customers to wait, raising deal elasticity and depressing baseline sales over time (Kopalle and Hoffman 1992). The dynamic optimal promotion policy therefore trades the short-run volume gain against the long-run reference-price cost; Kopalle and Hoffman (1992) characterize this as a dynamic-programming problem in which the state is the prevailing reference price. Promotions also have asymmetric competitive effects: a stronger or higher-tier brand’s promotion steals more share than a weaker brand’s promotion of equal depth, so promotional power is not symmetric across price tiers (Sethuraman, Tellis, and Briesch 2011).
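The erosion mechanism can be illustrated with the exponential-smoothing reference rule of Section 19.5.2. The sketch below assumes a regular price of 10, a deal price of 7, a memory parameter of 0.7, and two hypothetical promotion calendars; it tracks only the reference price and does not model the demand response.

```r
# Reference price under two promotion calendars, using the memory rule
# r_t = theta * r_{t-1} + (1 - theta) * p_{t-1}. All numbers are illustrative assumptions.
theta   <- 0.7
regular <- 10
deal    <- 7
weeks   <- 104

reference_path <- function(deal_every) {
  p <- rep(regular, weeks)
  p[seq(deal_every, weeks, by = deal_every)] <- deal   # a deal every `deal_every` weeks
  r <- numeric(weeks); r[1] <- regular
  for (t in 2:weeks) r[t] <- theta * r[t - 1] + (1 - theta) * p[t - 1]
  r
}

r_rare     <- reference_path(deal_every = 13)   # about four deals a year
r_frequent <- reference_path(deal_every = 2)    # a deal every other week

cat("Mean reference price, rare deals:    ", round(mean(r_rare), 2), "\n")
cat("Mean reference price, frequent deals:", round(mean(r_frequent), 2), "\n")
```

Under the frequent calendar the reference price settles well below the regular price, so the regular price itself is coded as a loss in the weeks between deals.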

19.4.4 Promotion Framing and the Multitier Discount

How a discount is communicated affects response beyond its arithmetic depth, a behavioral theme we develop fully in Section 19.7. One frontier result concerns multitier discounting: presenting a quantity discount alongside an additional discount tier (for example, a “buy-one” price displayed next to a “buy-more” price) makes the quantity discount more attractive by inflating the perceived savings, even though the additional tier need not be the option chosen. Yang (2024) shows that such multitier displays raise clickthrough; the effect strengthens when the additional tier’s price is lower (amplifying perceived savings) and among frequent category purchasers, and it weakens among consumers who distrust advertised reference prices. The mechanism is reference-dependent: the extra tier resets the comparison standard against which the chosen option is evaluated, which is why the same discount performs differently under different framings.

19.5 Reference Prices

The standard model treats demand as a function of the absolute price. Decades of evidence contradict this: consumers judge a price relative to an internal standard, the reference price, and respond to the deviation between the observed price and that standard. The reference price may be internal (recalled from past purchases, formed from price history) or external (a manufacturer’s suggested price, a “regular price” struck through on the tag, a competitor’s displayed price). Reference effects are the empirical foundation of behavioral pricing.

19.5.1 Loss Aversion and Asymmetric Response

The dominant theoretical account imports prospect theory (Kahneman and Tversky 1979; Tversky and Kahneman 1981): buyers code a price below their reference point as a gain and a price above it as a loss, and because losses loom larger than gains (loss aversion), the demand response to a price increase above the reference exceeds the response to an equal price decrease below it. Let $r$ denote the reference price and $p$ the observed price. A reference-dependent demand specification splits the price deviation into gain and loss regions,

\[ \log Q = \alpha + \beta\,\log p + \gamma_{\text{gain}}\,(r - p)^{+} - \gamma_{\text{loss}}\,(p - r)^{+} + u, \tag{19.5}\]

where $(\cdot)^{+} = \max(\cdot, 0)$ and loss aversion predicts $\gamma_{\text{loss}} > \gamma_{\text{gain}} > 0$. The empirical literature, beginning with brand-choice models on scanner panels, robustly finds this asymmetry: surcharges relative to the reference price suppress purchase more than equal discounts stimulate it. The asymmetry is the reason a sequence of small increases is less damaging than one large increase, and why “everyday low pricing” can outperform high-low promotion for some categories: it stabilizes the reference point.

19.5.2 Estimating the Reference Price

The reference price is latent, which raises an identification problem of its own: the analyst must posit a formation rule before estimating response. The two leading formulations are a memory-based rule, in which the reference adapts toward observed prices via exponential smoothing, $r_t = \theta\,r_{t-1} + (1-\theta)\,p_{t-1}$, and a stimulus-based (contextual) rule, in which the reference is constructed from prices currently on the shelf. The two are observationally distinct and frequently both contribute; a model that omits the operative rule mis-attributes reference effects to absolute-price response. Crucially, the smoothing parameter $\theta$ and the response coefficients $(\gamma_{\text{gain}}, \gamma_{\text{loss}})$ are jointly identified only with sufficient independent variation in price and price history—if price follows a deterministic high-low cycle, the reference and the current price move together and the gain/loss coefficients are not separately identified from the level coefficient $\beta$. This is the reference-price analogue of the endogeneity problem in Section 19.2: collinearity between the price and its own lagged transform, not firm behavior, is what breaks identification here.

Code

set.seed(1717)
n <- 400
theta <- 0.7                      # reference-price memory
p <- 10 + 2 * sin(seq_len(n) / 5) + rnorm(n, 0, 0.8)   # noisy price path

# build the memory-based reference price
r <- numeric(n); r[1] <- p[1]
for (t in 2:n) r[t] <- theta * r[t - 1] + (1 - theta) * p[t - 1]

gain <- pmax(r - p, 0); loss <- pmax(p - r, 0)
# data-generating demand with loss aversion: loss coef > gain coef
logQ <- 6 - 0.8 * log(p) + 0.5 * gain - 1.6 * loss + rnorm(n, 0, 0.05)

fit <- lm(logQ ~ log(p) + gain + loss)
round(coef(summary(fit))[, 1:2], 3)
#>             Estimate Std. Error
#> (Intercept)    5.982      0.043
#> log(p)        -0.796      0.019
#> gain           0.505      0.005
#> loss          -1.594      0.004

The fitted loss coefficient is larger in magnitude than the gain coefficient, recovering the loss-aversion asymmetry the data were built with—provided the price path carries enough independent variation to separate the two.
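The identification caveat can be seen by replacing the noisy price path with a deterministic high-low cycle. The sketch below assumes prices that alternate strictly between 12 and 8 (illustrative values) with the same memory parameter as the simulation above, and inspects the rank of the Equation 19.5 design matrix after the reference price has settled into its cycle.

```r
# A deterministic high-low cycle: price alternates 12, 8, 12, 8, ...
# Numbers are illustrative; theta matches the simulation above.
theta <- 0.7
n <- 400
p <- rep(c(12, 8), length.out = n)

r <- numeric(n); r[1] <- p[1]
for (t in 2:n) r[t] <- theta * r[t - 1] + (1 - theta) * p[t - 1]

keep <- 201:n                                   # drop the burn-in so r has settled into its cycle
gain <- pmax(r - p, 0)[keep]
loss <- pmax(p - r, 0)[keep]
X <- cbind(1, log(p[keep]), gain, loss)         # design matrix of Equation 19.5

cat("Columns in the design matrix:", ncol(X), "\n")
cat("Rank of the design matrix:   ", qr(X)$rank, "\n")
```

Once the cycle has settled, the log price, the gain term, and the loss term each take only two values and switch in lockstep, so the four columns span only two dimensions and the three slope coefficients cannot be separated.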

19.6 Theoretical Foundations

The first half of this chapter rests on the economic baseline: a consumer with a stable reservation price $v$ buys when $v \ge p$ (Equation 19.1), demand is the survival function of the $v$ distribution, the monopolist prices by the Lerner condition (Equation 19.3), and price discrimination recovers surplus a single price forgoes (Table 19.1). In that world price is a number that enters utility linearly and the only psychology is the dispersion of $v$. The behavioral half of the chapter is organized by the theories that explain where this baseline fails, and behavioral economics supplies them. Naming the governing theories shows that the pricing effects below are not a grab-bag of tricks but consequences of a small set of results about how prices are judged.

The anchor is prospect theory (Kahneman and Tversky 1979; Tversky and Kahneman 1981). Consumers do not evaluate a price against final wealth; they evaluate it against a reference price, coding an outcome below the reference as a gain and one above it as a loss. Because the value function is steeper for losses than for gains, loss aversion produces asymmetric price response: a surcharge relative to the reference suppresses purchase more than an equal discount stimulates it, the asymmetry formalized in Equation 19.5 and given its empirical-generalization statement by Kalyanaram and Winer (1995) and Tversky and Kahneman (1991). The same asymmetry is why discount framing (a lower price offered) provokes less resistance than surcharge framing (a fee added) for an identical net schedule, and why a sequence of small increases is less damaging than one large jump.

Mental accounting (Thaler 1985) explains the second cluster of effects. Consumers code outcomes into separate, non-fungible cognitive accounts and apply prospect-theory value within each, so the partitioning of a total price changes its perceived magnitude. Partitioned and drip pricing, the discount-versus-rebate-versus-bonus-pack framing of a promotion, and the silver-lining logic of bundling a small loss with a larger gain all follow from coding rules rather than from the arithmetic of the total. Anchoring (Tversky and Kahneman 1974) supplies the third: an external reference (a struck-through “regular price,” a manufacturer’s suggested price, a left-digit 9-ending) sets the standard against which the focal price is judged, and insufficient adjustment away from that anchor moves willingness to pay. Anchoring and reference-price formation are two views of the same comparison machinery that the reference-dependent demand model in Section 19.5 makes precise.

A fourth theory governs the intertemporal side of pricing. Hyperbolic discounting, the finding that near-term costs and benefits are over-weighted relative to distant ones (present bias; Laibson 1997), explains why subscriptions and “buy now, pay later” are powerful: deferring or fragmenting payment exploits the gap between a present-biased purchase decision and the future stream of charges the consumer discounts too steeply at the moment of choice. It is also why free-trial-to-paid conversion and subscription inertia are reliable revenue mechanisms, and why a self-aware consumer may want a commitment device against their own present bias. Together these four theories (prospect theory and reference dependence, mental accounting, anchoring, and present bias) are the behavioral departures from the utility-and-discrimination baseline that the rest of this chapter develops and quantifies.

19.7 Behavioral Pricing

Reference dependence is one of a broader family of departures from the rational single-price model. Behavioral pricing studies how the psychology of price perception—framing, fluency, signaling, and mental accounting—shapes demand in ways the firm can exploit (or be exploited by). We organize the main effects below; each has a clean experimental signature and a pricing implication.

19.7.1 Price as a Quality Signal

When quality is unobservable before purchase, price itself carries information: buyers infer that a higher price signals higher quality, especially in categories where quality is hard to judge and stakes are high. This price–quality inference inverts the usual demand logic over a range—raising price can raise perceived value and, in thin information environments, raise demand. The mechanism is the signaling logic of Chapter 11 and the lemons problem of Akerlof (1970): price can support a separating equilibrium in which only high-quality firms can profitably sustain a high price, so a high price is credible quality information. The pricing implication is that underpricing a premium product can suppress demand by signaling inferior quality, a failure mode invisible to a model in which demand is monotone decreasing in price.

19.7.2 Price Endings, Precision, and Fluency

Prices are not processed as raw numbers. Just-below (9-ending) pricing—$9.99 rather than $10.00—raises demand beyond the trivial one-cent saving, through a left-digit anchoring effect (the leading digit dominates magnitude perception) and a learned association of 9-endings with discounts. The framing of a price’s precision also matters: round prices are processed more fluently and feel “right” for emotional or hedonic purchases, while precise prices signal that a figure was carefully computed and can anchor negotiations more effectively (Janiszewski, Labroo, and Rucker 2016). These effects are robust enough to be exploited routinely in retail, but they are also category- and context-dependent, and a 9-ending that signals “discount” can undercut a premium positioning.

19.7.3 Partitioned Pricing, Drip Pricing, and Mental Accounting

How a total price is split changes its perceived magnitude. Partitioned pricing, which separates a base price from a surcharge (shipping, handling, a resort fee), can lower the perceived total because attention anchors on the salient base component and underweights add-ons. Its aggressive cousin, drip pricing, reveals mandatory charges sequentially during checkout, exploiting the sunk cost of search effort already invested. The mechanism is mental accounting (Thaler 1985): consumers evaluate gains and losses in separate cognitive accounts rather than integrating them, so splitting a price into components, or framing a promotion as a discount versus a rebate versus a bonus pack, changes evaluation even when the net economics are identical. The same logic explains why integrating a small loss with a larger gain, segregating a small gain from a larger loss (the silver lining), and segregating multiple gains each raise perceived value.
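The coding-rule logic of mental accounting can be checked numerically. The sketch below compares the summed value of outcomes coded separately against the value of the same outcomes coded as one net amount. The value function, its parameters (`alpha`, `lam`), and the dollar amounts are hypothetical, chosen with strong curvature so that all four cases are visible; they are not estimates from the chapter.

```python
def value(x, alpha=0.5, lam=2.0):
    """Illustrative loss-averse value function (assumed form and parameters)."""
    if x >= 0:
        return x ** alpha
    return -lam * ((-x) ** alpha)


def segregated(*outcomes):
    """Each outcome is coded in its own mental account."""
    return sum(value(x) for x in outcomes)


def integrated(*outcomes):
    """Outcomes are netted first and coded as a single amount."""
    return value(sum(outcomes))


# Hypothetical outcome pairs, one per coding rule.
cases = {
    "two gains (+50, +50)": (50, 50),
    "two losses (-50, -50)": (-50, -50),
    "large gain, small loss (+100, -20)": (100, -20),
    "large loss, small gain (-100, +5)": (-100, 5),
}

preferred = {}
for label, outcomes in cases.items():
    seg, integ = segregated(*outcomes), integrated(*outcomes)
    preferred[label] = "segregate" if seg > integ else "integrate"
    print(f"{label:36s} segregated={seg:8.3f} integrated={integ:8.3f} -> {preferred[label]}")
```

With these assumed parameters the higher-valued coding is to segregate gains, integrate losses, integrate a small loss into a larger gain, and segregate a small gain from a larger loss. The last result is sensitive to curvature: with a flatter value function or a larger gain relative to the loss, integration can come out ahead, so the silver lining should be treated as conditional rather than universal.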

19.7.4 The Pennies-a-Day Effect and Temporal Reframing

Reframing a price’s temporal unit alters its acceptability: $365 per year framed as “a dollar a day” recruits a more favorable comparison standard (a trivial daily expenditure) and lifts compliance, the pennies-a-day effect. Subscription pricing exploits this directly. The common thread across these effects is that the number the consumer compares against—the reference, the salient digit, the temporal unit—is malleable, and the firm partly controls it through framing. Figure 19.2 organizes the behavioral effects by the cognitive mechanism each exploits.

flowchart LR
    subgraph REF["Reference dependence"]
      A1["Loss aversion<br/>(asymmetric response)"]
      A2["Promotion erosion<br/>of reference price"]
      A3["Multitier discount<br/>framing"]
    end
    subgraph FLU["Fluency & anchoring"]
      B1["9-ending / left-digit"]
      B2["Round vs. precise prices"]
    end
    subgraph MA["Mental accounting"]
      C1["Partitioned / drip pricing"]
      C2["Pennies-a-day reframing"]
      C3["Discount vs. rebate framing"]
    end
    subgraph SIG["Inference"]
      D1["Price–quality signaling"]
    end
    REF --> OUT["Demand deviates from<br/>the single-price benchmark"]
    FLU --> OUT
    MA --> OUT
    SIG --> OUT
    style OUT fill:#d1e7dd,stroke:#0f5132

Figure 19.2: A map of behavioral pricing effects by underlying mechanism. Each effect is a lever on the comparison standard or the perceived magnitude of a price, not on its arithmetic level.

19.8 Dynamic and Personalized Pricing

The classical theory sets one price for all buyers at one time. Digital commerce relaxes both restrictions. Dynamic pricing varies price over time in response to demand, inventory, and competition; personalized pricing varies price across individuals in response to what the firm infers about each buyer’s WTP. Together they push pricing toward the first-degree ideal of Table 19.1—and toward its ethical and regulatory limits.

19.8.1 Dynamic Pricing: Learning and Inventory

Two distinct forces drive price over time. The first is demand learning: a firm uncertain about the demand curve experiments with prices to estimate elasticity, then exploits the estimate—a price-setting bandit problem in which the firm balances the exploration value of an informative price against the exploitation value of the currently estimated optimum. The second is intertemporal capacity allocation: when capacity is fixed and perishable (airline seats, hotel nights, event tickets), the firm raises price as the sell-by date approaches and inventory tightens, the logic of revenue management. Both are now studied structurally on transaction data (Elberg et al. 2019; Cosguner and Seetharaman 2022), and both interact with forward-looking consumers who anticipate price paths: when buyers expect a clearance, they delay, and the firm’s optimal path must account for strategic waiting—the durable-goods monopolist’s commitment problem.
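The demand-learning force can be sketched as a small price-setting bandit. The example below uses Thompson sampling with a Beta prior on the purchase probability at each candidate price, which is one common way to balance exploration and exploitation and is swapped in here for illustration; it is not the estimator used in the studies cited above. The price grid, the true purchase probabilities, the horizon, and the function name `simulate_price_bandit` are all hypothetical.

```python
import random


def simulate_price_bandit(prices, true_buy_prob, rounds=5000, seed=0):
    """Thompson sampling over a discrete price grid (illustrative sketch).

    Each price has an unknown purchase probability with a Beta(1, 1) prior.
    Returns how often each price was offered and the total revenue earned.
    """
    rng = random.Random(seed)
    successes = [1] * len(prices)   # Beta alpha parameters (prior = 1)
    failures = [1] * len(prices)    # Beta beta parameters (prior = 1)
    pulls = [0] * len(prices)
    revenue = 0.0

    for _ in range(rounds):
        # Draw a plausible purchase rate per price, convert to expected revenue.
        sampled = [p * rng.betavariate(successes[k], failures[k])
                   for k, p in enumerate(prices)]
        k = max(range(len(prices)), key=lambda j: sampled[j])

        # Offer price k to one simulated buyer and update the posterior.
        if rng.random() < true_buy_prob[k]:
            successes[k] += 1
            revenue += prices[k]
        else:
            failures[k] += 1
        pulls[k] += 1

    return pulls, revenue


prices = [6.0, 8.0, 10.0, 12.0]
true_buy_prob = [0.55, 0.50, 0.30, 0.15]   # hypothetical demand, unknown to the firm
# Implied expected revenue per offer: 3.3, 4.0, 3.0, 1.8, so 8.0 is the best price.

pulls, revenue = simulate_price_bandit(prices, true_buy_prob)
most_offered = prices[max(range(len(prices)), key=lambda j: pulls[j])]
print("offers per price:", dict(zip(prices, pulls)))
print("most-offered price:", most_offered)
```

Early rounds spread offers across the grid because the posteriors are wide; as evidence accumulates, offers concentrate on the price with the highest expected revenue. The sketch ignores inventory limits and forward-looking buyers, both of which the paragraph above identifies as first-order in practice.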

19.8.2 Personalized Pricing and the Value of Data

Personalized pricing conditions the offer on individual data—browsing history, device, location, purchase record—to approximate each buyer’s reservation price. Online, the firm can in principle observe a behavioral proxy for WTP and price against it, an operational route toward first-degree discrimination. The economics are double-edged. Shapiro (2016) and the price-discrimination-with-data literature show that better consumer information lets the firm extract more surplus, but competition can flip the sign: when rivals also personalize, the same data that enables extraction also intensifies price competition for each contestable consumer, and consumers can be made better off in aggregate. The welfare and profit consequences of personalization thus depend on market structure—monopoly personalization transfers surplus to the firm, competitive personalization can dissipate it (Chen et al. 2009; Aguirre et al. 2016). The data itself becomes a strategic asset whose value is exactly the incremental surplus the firm can extract by conditioning price on it.
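The claim that data is worth the incremental surplus it lets the firm extract can be illustrated for the monopoly case. In the sketch below, the firm's data is idealized as a perfect ranking of buyers into equal-sized willingness-to-pay segments, and it sets the best single price within each segment; more segments stand in for richer data. The uniform WTP distribution, the marginal cost, the seed, and the function names are assumptions made for this example, and competition is not modeled.

```python
import random


def best_single_price_profit(wtps, cost):
    """Maximum profit from one posted price; the optimum sits at some buyer's WTP."""
    ranked = sorted(wtps, reverse=True)
    # Pricing at the (i+1)-th highest WTP sells to at least i + 1 buyers.
    candidates = [(p - cost) * (i + 1) for i, p in enumerate(ranked)]
    return max(candidates + [0.0])   # the firm can always decline to sell


def profit_with_segments(wtps, cost, n_segments):
    """Profit when buyers are sorted by WTP into equal-sized segments,
    each charged its own optimal price (an idealized stand-in for data)."""
    ranked = sorted(wtps)
    size = len(ranked) / n_segments
    total = 0.0
    for s in range(n_segments):
        segment = ranked[round(s * size):round((s + 1) * size)]
        total += best_single_price_profit(segment, cost)
    return total


rng = random.Random(19)
cost = 4.0                                             # hypothetical marginal cost
wtps = [rng.uniform(0.0, 20.0) for _ in range(1000)]   # hypothetical reservation prices

profits = {k: profit_with_segments(wtps, cost, k) for k in (1, 2, 4, 8, 1000)}
value_of_data = {k: profits[k] - profits[1] for k in profits}

for k in profits:
    print(f"segments={k:5d}  profit={profits[k]:9.1f}  value of data={value_of_data[k]:9.1f}")
```

One segment is the single-price monopolist; one segment per buyer is first-degree discrimination, where profit equals the total surplus above cost. The gap between any row and the first row is the value of that level of data to a monopolist. Under competition the sign of the effect can differ, as the paragraph above notes, so this sketch speaks only to the extraction side.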

19.8.3 Algorithmic Pricing and Collusion

When competing firms delegate pricing to learning algorithms, a new hazard emerges: the algorithms may independently learn to set supra-competitive prices, sustaining tacit collusion without any explicit agreement. Because they reach a collusive outcome through repeated interaction rather than communication, the conduct may fall outside the reach of antitrust doctrine built for human agreements, an active concern at the marketing–policy frontier. The same automation that lets a firm respond to demand in real time can, in a market of similar algorithms, soften price competition without anyone intending it.

19.8.4 Fairness and the Backlash Constraint

The binding constraint on personalized and dynamic pricing is often not the law but perceived fairness. Consumers judge price differences they cannot attribute to cost as unfair, and react with reduced purchase, reduced loyalty, and reputational punishment. The reference-transaction model of fairness holds that buyers anchor on a fair reference price (the price others pay, or the price they paid before) and treat a personalized surcharge as a loss imposed on them—linking fairness directly to the reference-dependence machinery of Section 19.5. The canonical cautionary case is the consumer backlash against early online price experiments that charged different customers different prices for the same item; the lesson generalizes to surge pricing, where transparency and a cost-based rationale moderate the backlash. Surcharge framing (a fee added) provokes more outrage than discount framing (a lower price offered) for the identical price schedule, a direct consequence of loss aversion. The managerial upshot is that the feasible degree of discrimination is set not by the firm’s data but by what consumers will tolerate before the fairness penalty exceeds the discrimination dividend.

19.8.5 Privacy Regulation as a Constraint on Personalization

Personalized pricing runs on personal data, so privacy regulation directly bounds it. Restrictions on tracking and data collection—and consumers’ own privacy choices—shrink the information set on which prices can be conditioned, attenuating the firm’s move toward first-degree discrimination. The evidence is that privacy regulation and opt-out reduce the granularity of targeting available to firms, with measurable effects on the data economy that underwrites personalization (Acquisti, John, and Loewenstein 2012; Johnson, Faraj, and Kudaravalli 2014). Privacy is thus not merely a compliance overlay but a first-order determinant of how far personalized pricing can go: the legal ceiling on data acquisition is, in effect, a ceiling on price discrimination.

19.9 Key Takeaways

The primitive of pricing is the reservation price; the demand curve is the survival function of the reservation-price distribution (Equation 19.2), and the elasticity governs the optimal markup through the Lerner condition (Equation 19.3). Estimating elasticity is the core empirical task.
Observational price data are endogenous—firms price on information the analyst cannot see—so credible elasticities require an instrument (cost or rival shifters), a structural model (BLP, Equation 19.4), or a randomized price experiment.
Price discrimination recovers surplus a single price forgoes, in three degrees distinguished by information (Table 19.1); second-degree menus must satisfy incentive compatibility, which forces the firm to leave information rent to high-WTP buyers.
Promotions are largely intertemporal price discrimination; most of a deal’s spike is brand switching, and frequent dealing erodes the reference price, imposing a dynamic cost on tactical gains.
Buyers respond to prices relative to a reference point with loss-averse asymmetry (Equation 19.5), and to framing, fluency, signaling, and mental accounting—levers on the comparison standard rather than the price level.
Dynamic and personalized pricing push toward first-degree extraction, but their feasible reach is bounded by competition, by perceived fairness (itself a reference-dependence phenomenon), and by privacy regulation.

19.10 Further Reading

For the structural estimation of demand and price response, the differentiated-products literature beginning with Berry, Levinsohn, and Pakes (1995) and developed in marketing by Chintagunta et al. (2006) and Nair et al. (2017) is the entry point; Dubé, Hortaçsu, and Joo (2021) treat random-coefficient logit estimation in depth. On promotions, Van Heerde, Helsen, and Dekimpe (2007) and Kopalle and Hoffman (1992) cover decomposition and the dynamic reference-price cost. The behavioral strand connects to Chapter 11 (price–quality signaling) and to the prospect-theoretic foundations in Kahneman and Tversky (1979), Tversky and Kahneman (1981), and Thaler (1985), with the reference-price empirical generalizations of Kalyanaram and Winer (1995), the riskless-choice loss-aversion model of Tversky and Kahneman (1991), the anchoring foundation of Tversky and Kahneman (1974), and the present-bias account of subscriptions in Laibson (1997). The personalization frontier and its welfare economics are surveyed through Shapiro (2016), Chen et al. (2009), and Aguirre et al. (2016), with privacy constraints in Acquisti, John, and Loewenstein (2012).

Acquisti, Alessandro, Leslie K. John, and George Loewenstein. 2012. “The Impact of Relative Standards on the Propensity to Disclose.” Journal of Marketing Research 49 (2): 160–74.

Aguirre, Elizabeth, Anne L. Roggeveen, Dhruv Grewal, and Martin Wetzels. 2016. “The Personalization-Privacy Paradox: Implications for New Media.” Journal of Consumer Marketing 33 (2): 98–110. https://doi.org/10.1108/jcm-06-2015-1458.

Akerlof, George A. 1970. “The Market for ‘Lemons’: Quality Uncertainty and the Market Mechanism.” The Quarterly Journal of Economics 84 (3): 488–500. https://doi.org/10.2307/1879431.

Bell, David R., and James M. Lattin. 1998. “Shopping Behavior and Consumer Preference for Store Price Format: Why ‘Large Basket’ Shoppers Prefer EDLP.” Marketing Science 17 (1): 66–88. https://doi.org/10.1287/mksc.17.1.66.

Berry, Steven, James Levinsohn, and Ariel Pakes. 1995. “Automobile Prices in Market Equilibrium.” Econometrica 63 (4): 841–90. https://doi.org/10.2307/2171802.

Chen, Yuxin, Yogesh V. Joshi, Jagmohan S. Raju, and Z. John Zhang. 2009. “A Theory of Combative Advertising.” Marketing Science 28 (1): 1–19. https://doi.org/10.1287/mksc.1080.0385.

Chintagunta, Pradeep, Tülin Erdem, Peter E. Rossi, and Michel Wedel. 2006. “Structural Modeling in Marketing: Review and Assessment.” Marketing Science 25 (6): 604–16.

Cosguner, Koray, and P. B. Seetharaman. 2022. “Dynamic Pricing for New Products Using a Utility-Based Generalization of the Bass Diffusion Model.” Management Science 68 (3): 1904–22.

Ding, Amy Wenxuan, Shibo Li, and Patrali Chatterjee. 2015. “Learning User Real-Time Intent for Optimal Dynamic Web Page Transformation.” Information Systems Research 26 (2): 339–59.

Dubé, Jean-Pierre, Ali Hortaçsu, and Joonhwi Joo. 2021. “Random-Coefficients Logit Demand Estimation with Zero-Valued Market Shares.” Marketing Science 40 (4): 637–60.

Elberg, Andrés, Pedro M. Gardete, Rosario Macera, and Carlos Noton. 2019. “Dynamic Effects of Price Promotions: Field Evidence, Consumer Search, and Supply-Side Implications.” Quantitative Marketing and Economics 17: 1–58.

Guyt, Jonne Y., and Els Gijsbrechts. 2014. “Take Turns or March in Sync? The Impact of the National Brand Promotion Calendar on Manufacturer and Retailer Performance.” Journal of Marketing Research 51 (6): 753–72. https://doi.org/10.1509/jmr.14.0193.

Iyer, Ganesh, David Soberman, and J. Miguel Villas-Boas. 2005. “The Targeting of Advertising.” Marketing Science 24 (3): 461–76. https://doi.org/10.1287/mksc.1050.0117.

Janiszewski, Chris, Aparna A. Labroo, and Derek D. Rucker. 2016. “A Tutorial in Consumer Research: Knowledge Creation and Knowledge Appreciation in Deductive-Conceptual Consumer Research.” Journal of Consumer Research 43 (2): 200–209. https://doi.org/10.1093/jcr/ucw023.

Jedidi, Kamel, Sharan Jagpal, and Puneet Manchanda. 2003. “Measuring Heterogeneous Reservation Prices for Product Bundles.” Marketing Science 22 (1): 107–30. https://doi.org/10.1287/mksc.22.1.107.12850.

Jedidi, Kamel, Bernd H. Schmitt, Malek Ben Sliman, and Yanyan Li. 2021. “R2M Index 1.0: Assessing the Practical Relevance of Academic Marketing Articles.” Journal of Marketing 85 (5): 22–41. https://doi.org/10.1177/00222429211028145.

Johnson, Steven L., Samer Faraj, and Srinivas Kudaravalli. 2014. “Emergence of Power Laws in Online Communities.” MIS Quarterly 38 (3): 795–A13.

Kahneman, Daniel, and Amos Tversky. 1979. “Prospect Theory: An Analysis of Decision Under Risk.” Econometrica 47 (2): 263–91. https://doi.org/10.2307/1914185.

Kalyanaram, Gurumurthy, and Russell S. Winer. 1995. “Empirical Generalizations from Reference Price Research.” Marketing Science 14 (3, supplement): G161–69. https://doi.org/10.1287/mksc.14.3.g161.

Kopalle, Praveen K., and Donna L. Hoffman. 1992. “Generalizing the Sensitivity Conditions in an Overall Index of Product Quality.” Journal of Consumer Research 18 (4): 530. https://doi.org/10.1086/209279.

Laibson, David. 1997. “Golden Eggs and Hyperbolic Discounting.” The Quarterly Journal of Economics 112 (2): 443–78. https://doi.org/10.1162/003355397555253.

Nair, Harikesh S., Sanjog Misra, William J. Hornbuckle, Ranjan Mishra, and Anand Acharya. 2017. “Big Data and Marketing Analytics in Gaming: Combining Empirical Models and Field Experimentation.” Marketing Science 36 (5): 699–725. https://doi.org/10.1287/mksc.2017.1039.

Narasimhan, Chakravarthi. 1988. “Competitive Promotional Strategies.” The Journal of Business 61 (4): 427. https://doi.org/10.1086/296442.

Reiss, Peter C. 2011. “Structural Workshop Paper: Descriptive, Structural, and Experimental Empirical Methods in Marketing Research.” Marketing Science 30 (6): 950–64. https://doi.org/10.1287/mksc.1110.0681.

Sethuraman, Raj, Gerard J. Tellis, and Richard A. Briesch. 2011. “How Well Does Advertising Work? Generalizations from Meta-Analysis of Brand Advertising Elasticities.” Journal of Marketing Research 48 (3): 457–71. https://doi.org/10.1509/jmkr.48.3.457.

Shapiro, Bradley. 2016. “Positive Spillovers and Free Riding in Advertising of Prescription Pharmaceuticals: The Case of Antidepressants.” SSRN Electronic Journal. https://doi.org/10.2139/ssrn.2477877.

Simester, Duncan, and Juanjuan Zhang. 2010. “Why Are Bad Products So Hard to Kill?” Management Science 56 (7): 1161–79. https://doi.org/10.1287/mnsc.1100.1169.

Thaler, Richard. 1985. “Mental Accounting and Consumer Choice.” Marketing Science 4 (3): 199–214. https://doi.org/10.1287/mksc.4.3.199.

Toubia, Olivier, and Andrew T. Stephen. 2013. “Intrinsic Vs. Image-Related Utility in Social Media: Why Do People Contribute Content to Twitter?” Marketing Science 32 (3): 368–92. https://doi.org/10.1287/mksc.2013.0773.

Tversky, Amos, and Daniel Kahneman. 1974. “Judgment Under Uncertainty: Heuristics and Biases.” Science 185 (4157): 1124–31. https://doi.org/10.1126/science.185.4157.1124.

Tversky, Amos, and Daniel Kahneman. 1981. “The Framing of Decisions and the Psychology of Choice.” Science 211 (4481): 453–58. https://doi.org/10.1126/science.7455683.

Tversky, Amos, and Daniel Kahneman. 1991. “Loss Aversion in Riskless Choice: A Reference-Dependent Model.” The Quarterly Journal of Economics 106 (4): 1039–61. https://doi.org/10.2307/2937956.

Van Heerde, Harald, Kristiaan Helsen, and Marnik G. Dekimpe. 2007. “The Impact of a Product-Harm Crisis on Marketing Effectiveness.” Marketing Science 26 (2): 230–45. https://doi.org/10.1287/mksc.1060.0227.

Varian, Hal R. 1980. “A Model of Sales.” American Economic Review 70 (4): 651–59. https://www.jstor.org/stable/1803562.

Yang, Haiyang. 2024. “The Multitier Discount Effect.” Journal of Marketing Research, 00222437241295719.