flowchart TD
A[Define attributes & levels<br/>actionable, feasible, bounded] --> B[Construct design<br/>fractional factorial / orthogonal array]
B --> C[Choose elicitation<br/>rate · rank · choose · MaxDiff]
C --> D[Collect responses]
D --> E[Estimate partworths<br/>OLS · MNL · mixed logit · HB]
E --> F[Derive quantities<br/>WTP · importance · share-of-preference]
F --> G[Managerial action<br/>design · price · segment · position]
37 Preference Measurement
Preference measurement is the problem of recovering, from observable choices or judgments, the latent valuation a consumer places on a product and on each of the attributes that compose it. It is the empirical bridge between what people do—rate, rank, click, buy—and what firms need to know: how much a feature is worth, what a segment will pay, which configuration to launch, and at what price. Where Chapter 11 asks what a brand is worth, this chapter asks the more granular question of what each part of an offering is worth, and supplies the estimators that turn survey or transaction data into those numbers.
The construct that anchors the chapter is the reservation price (also called willingness to pay, WTP): the maximum amount a consumer would pay for a product before preferring to forgo it. Reservation price is not a primitive that can simply be asked for—people are reluctant to reveal price sensitivity, and direct questions bias the answer downward—so it must be inferred from a model of utility. The dominant inferential machinery is conjoint analysis, a family of designed experiments and discrete-choice models that decompose overall preference into attribute-level partworths and, with a price attribute included, express those partworths in dollars.
The chapter proceeds from intuition to formalism to estimation. We first define reservation price and derive it from an additive utility model, making explicit the assumptions that license the derivation. We then develop conjoint analysis as the general framework: its experimental-design logic, its measurement variants (full-profile, choice-based, adaptive, hybrid, self-explicated, MaxDiff), the random-utility model that underlies the modern choice-based form, and the estimators (OLS, multinomial and mixed logit, hierarchical Bayes) that recover heterogeneous partworths. Throughout we flag what breaks identification—confounded designs, scale–heterogeneity confounds, hypothetical bias—because a partworth that is not identified is a number without a referent. Worked, reproducible R code accompanies each major method.
37.1 Reservation Price and the Additive Utility Model
The commercial questions that surround pricing (segmentation, positioning, product-line design, the go/no-go decision on a feature) reduce to a single quantity per consumer: how many dollars of value the consumer attaches to a given configuration. Two routes lead to that quantity. The direct route asks the consumer outright, but self-reported reservation prices are systematically biased downward: respondents understate WTP because they do not wish to appear price insensitive, and because a hypothetical question carries no budget consequence.1 The indirect route, which this chapter develops, recovers reservation price as a derived quantity from a fitted utility model, most commonly the coefficients of a conjoint study.
The indirect route rests on one structural assumption, which should be stated before any result: additive separability of utility across attributes. Consumption utility is taken to equal the sum of the utilities contributed by the product’s attributes, with no interaction among them. Let a product be described by a price \(P\) and by \(N\) non-price attributes, each taking one of several discrete levels. Index the attributes by \(k = 1,\dots,N\) and let \(A_k\) denote the level of attribute \(k\). The additive model writes a consumer’s utility (or rating) as
\[ U = \beta_0 + \beta_p P + \sum_{k=1}^{N} \beta_k A_k + \epsilon, \tag{37.1}\]
where the \(\beta\)’s are utility weights (partworths in conjoint language), \(\beta_0\) absorbs consumer-specific shifts in the rating scale, \(\beta_p\) is the marginal utility of price (expected to be negative), and \(\epsilon\) is an idiosyncratic error. Equation 37.1 is the linear, additive special case of the random-utility model formalized in Section 37.2.2; the categorical attributes \(A_k\) are in practice expanded into level dummies, so each attribute contributes one coefficient per non-baseline level.
The dollar value of one util follows immediately. Because price enters linearly, the marginal rate of substitution between money and utility is the negative reciprocal of the price coefficient, once that coefficient is expressed per dollar. If price is coded in level steps, so that \(\beta_p\) is the utility change per step and \(\Delta p\) is the price difference between two adjacent levels of the price attribute, the dollar value of a unit of utility is
\[ \text{\$ per util} = -\frac{\Delta p}{\beta_p}, \tag{37.2}\]
and the utility a consumer derives from attribute \(k\), expressed per unit of that attribute, is \((\beta_k / \Delta A_k)\,A_k\), where \(\Delta A_k\) is the spacing between adjacent levels of attribute \(k\). The reservation price for a configuration is then the total utility of its non-price attributes converted to dollars:
\[ r(P) = -\frac{\Delta p}{\beta_p}\sum_{k=1}^{N}\frac{\beta_k}{\Delta A_k}\,A_k. \tag{37.3}\]
The interpretation is exactly the intuition with which we began. A consumer’s reservation price for a product is the sum of the dollar values she places on each of its attributes: each attribute’s partworth, scaled by the dollar value of a util in Equation 37.2. Two assumptions make Equation 37.3 legitimate, and both fail in identifiable circumstances. First, additivity: if attributes interact (a fast processor is worth more in a laptop with a good screen), the cross-terms omitted from Equation 37.1 are loaded onto the main effects and the partworths are biased. Second, linearity of the price disutility: if \(\beta_p\) is not constant across the price range (for example, if marginal price sensitivity rises near a budget ceiling), then \(\Delta p / \beta_p\) is not a single number and Equation 37.2 is only a local approximation. We return to both threats once the design machinery is in place.
We write \(\beta_p < 0\), so the dollar-per-util conversion in Equation 37.2 carries an explicit negative sign that makes \(r(P)\) positive for value-adding attributes. In a ratings model the scale of \(U\) is set by the response scale and \(\beta_0\) absorbs per-respondent scale heterogeneity; in a choice model (Section 37.2.2) the scale is fixed by normalizing the error variance, which has consequences for cross-respondent comparison that we take up under the scale–heterogeneity confound.
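As a numerical illustration of Equations 37.2 and 37.3, the sketch below converts a set of partworths into a reservation price. Every number in it (the price coefficient, the level spacings, the attribute names) is hypothetical and chosen only to make the arithmetic easy to follow; it assumes price is coded in level steps, so that the price coefficient is a utility change per step.

```r
# Hypothetical partworths; none of these numbers come from a real study.
beta_p  <- -0.8          # utility change per one-step price increase
delta_p <- 10            # dollars between adjacent price levels

# Dollar value of one util (Equation 37.2)
dollars_per_util <- -delta_p / beta_p

# Non-price attributes: partworth per level step, level spacing, chosen level
attrs <- data.frame(
  attribute = c("battery_hours", "warranty_years"),
  beta_k    = c(1.2, 0.4),   # utility change per adjacent-level step
  delta_A   = c(10, 1),      # spacing between adjacent levels (hours, years)
  A_k       = c(20, 3)       # level of the configuration being priced
)

# Utility contributed by each attribute, then converted to dollars (Equation 37.3)
attrs$utils   <- attrs$beta_k / attrs$delta_A * attrs$A_k
attrs$dollars <- dollars_per_util * attrs$utils
reservation_price <- sum(attrs$dollars)

attrs
reservation_price
```

With these inputs one util is worth 12.50 dollars, the battery contributes 30 dollars and the warranty 15 dollars, so the configuration's reservation price is 45 dollars.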
37.2 Conjoint Analysis
Conjoint analysis—originally conjoint measurement, a term inherited from mathematical psychology and psychometrics—is the workhorse for estimating the partworths in Equation 37.1. Its premise is that consumers evaluate products holistically, as bundles, and that the analyst can recover the weight on each attribute by observing evaluations of systematically varied bundles. The name encodes the logic: the joint (“con-joint”) effect of attributes considered together is decomposed into separable contributions.
Conjoint analysis is a decompositional method that estimates the structure of a consumer’s preferences given her overall evaluations of a set of alternatives that are prespecified in terms of levels of different attributes.
The method’s enduring appeal is twofold. It improves the analyst’s ability to assess customers’ true wants—above all their price sensitivity, which direct questioning distorts—and it dissolves the “everything is important” problem that plagues self-reported importance ratings, forcing trade-offs that reveal which attributes genuinely move choice. Netzer et al. (2008) survey the field and organize it around three managerial tasks that conjoint serves—product development, pricing and segmentation, and positioning—together with parallel uses for consumers (recommendation agents), policymakers, and academic researchers; the review traces the field’s movement from designs that maximize statistical efficiency toward designs and adaptive elicitation that also respect managerial and respondent constraints.
A foundational distinction motivates why preference, not perception, is the target. Perception—beliefs about what products exist and what they are like—is largely homogeneous across consumers: people broadly agree that one car is faster than another. Preference—which product a consumer actually wants—is heterogeneous: people who agree on the facts still choose differently (Netzer et al. 2008). Conjoint is a tool for the second object, and its modern forms are built to estimate the distribution of partworths across consumers, not merely their average. The original combinatorial treatment of Green and Wind (1975) already contained the core recipe: use an orthogonal array to construct a balanced subset of attribute combinations so that each attribute’s contribution is estimated independently of the others, then collect rankings or ratings over that reduced set. Their catalog of applications—new-product development, package design, pricing and brand alternatives, service design, and industrial (organizational) buying—remains an accurate map of where conjoint is used today.
37.2.1 Designing the Attribute Space
The quality of a conjoint study is fixed before any data are collected, in the choice of attributes and levels. Four design principles recur, and each protects a specific inferential property rather than being mere good practice. Attributes must be actionable—a partworth on a feature the firm cannot build is uninterpretable as a decision input. Their number must be bounded, because respondent cognitive load grows with the dimensionality of each profile and noisy responses inflate \(\epsilon\) in Equation 37.1. Levels must span a feasible and relevant range: every level should be one the firm could actually offer and one the consumer would plausibly encounter, since extrapolating partworths outside the studied range is unsupported. Finally, the number of levels per attribute is itself a design variable with a known artifact—the number-of-levels effect, whereby an attribute is estimated to be more important simply because it is split into more levels, independent of its true importance.2
Under additivity and independence, the design must contain at least as many profiles as there are parameters to estimate: one per non-baseline level dummy, plus the intercept. When the analyst suspects an interaction (a violation of the additive separability that Equation 37.1 assumes), two remedies are available, and they trade respondent effort against bias. One can fold the interacting pair into a single higher-level “super-attribute” whose levels enumerate the relevant combinations, absorbing the interaction into main effects; or one can enlarge the design to include enough profiles to estimate the interaction term explicitly. The first is cheaper but coarsens the attribute space; the second is faithful but costly in profiles.
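The cost of each remedy can be counted directly from the design matrix. The sketch below is a minimal base-R illustration with a hypothetical three-attribute grid (the attribute names and levels are assumptions, not taken from any study in this chapter); it compares the number of parameters under main effects only, under an explicit interaction, and under a super-attribute that enumerates every combination of the interacting pair.

```r
# Hypothetical attribute grid: 3 x 3 x 2 levels
grid <- expand.grid(
  screen    = c("basic", "good", "best"),
  processor = c("slow", "mid", "fast"),
  warranty  = c("2yr", "3yr")
)

# Main-effects-only model: intercept + (3-1) + (3-1) + (2-1) columns
X_main <- model.matrix(~ screen + processor + warranty, data = grid)

# Adding the screen x processor interaction costs (3-1)*(3-1) extra columns
X_int <- model.matrix(~ screen * processor + warranty, data = grid)

# "Super-attribute": one 9-level factor enumerating screen-processor combos
grid$screen_proc <- interaction(grid$screen, grid$processor)
X_super <- model.matrix(~ screen_proc + warranty, data = grid)

c(main = ncol(X_main), interaction = ncol(X_int), super = ncol(X_super))
```

When the super-attribute enumerates all nine combinations it needs as many parameters as the explicit interaction model (10 against 6 for main effects), so the saving comes only from restricting the super-attribute to the combinations that are actually relevant.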
The mechanism that makes a reduced design estimable is the fractional factorial: a carefully chosen subset of the full factorial that preserves orthogonality among main effects so that partworths remain independently identified. A full factorial over \(N\) attributes with \(L\) levels each contains \(L^N\) profiles, infeasible for any realistic \(N\), so the analyst selects a fraction in which the columns of the design matrix remain uncorrelated. Figure 37.1 summarizes the workflow from attribute definition through to managerial action.
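A textbook way to build such a fraction by hand, shown here only as a sketch, is the half fraction of a two-level design: take the full factorial in three attributes and define the fourth column as their product. The column names A to D and the effects coding (-1/+1) are illustrative assumptions; dedicated design packages automate the same idea for mixed level counts.

```r
# Full 2^3 factorial in effects coding (-1 / +1) for three attributes
base <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1))

# Half fraction of a 2^4 design: define the fourth column by D = A*B*C
design <- transform(base, D = A * B * C)

n_full     <- 2^4            # 16 profiles in the full factorial
n_fraction <- nrow(design)   # 8 profiles in the half fraction

# Orthogonality: X'X is diagonal, so main effects are mutually uncorrelated
XtX <- crossprod(as.matrix(design))
XtX
```

The off-diagonal entries of `XtX` are all zero, so the four main effects are estimated independently from 8 profiles instead of 16. The price is aliasing: in this fraction each main effect is confounded with a three-way interaction, which is harmless only if such interactions are negligible.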
Conjoint is rarely run in isolation. It pairs naturally with factor analysis for reducing a large attribute set to underlying dimensions, with perceptual mapping (multidimensional scaling) for positioning the estimated preferences in a perceptual space, and with cluster analysis for grouping consumers by their partworth vectors into actionable segments—the empirical expression of preference heterogeneity.
37.2.2 The Random-Utility Foundation
The ratings model in Equation 37.1 treats \(U\) as an observed (or directly reported) quantity. Modern conjoint, especially its choice-based form, instead treats \(U\) as latent and observes only the consumer’s choice among alternatives. This is the random-utility model (RUM), the foundation McFadden built for discrete choice (McFadden 1986, 2001). Consumer \(i\) derives utility from alternative \(j\) in choice set \(\mathcal{S}_t\),
\[ U_{ijt} = \mathbf{x}_{jt}^{\top}\boldsymbol{\beta}_i + \varepsilon_{ijt}, \tag{37.4}\]
where \(\mathbf{x}_{jt}\) collects the attribute levels of alternative \(j\) (including price), \(\boldsymbol{\beta}_i\) is consumer \(i\)’s partworth vector, and \(\varepsilon_{ijt}\) is an unobserved component. The consumer chooses the alternative of greatest utility, so the analyst observes \(y_{it} = \arg\max_{j \in \mathcal{S}_t} U_{ijt}\). The systematic part \(\mathbf{x}_{jt}^{\top}\boldsymbol{\beta}_i\) is exactly the additive partworth sum of Equation 37.1; what RUM adds is an explicit stochastic structure on \(\varepsilon\) that delivers a likelihood for choices.
If the \(\varepsilon_{ijt}\) are independent and identically distributed type-I extreme value, the choice probabilities take the multinomial logit (MNL) form,
\[ \Pr(y_{it} = j) = \frac{\exp(\mathbf{x}_{jt}^{\top}\boldsymbol{\beta}_i)} {\sum_{l \in \mathcal{S}_t}\exp(\mathbf{x}_{lt}^{\top}\boldsymbol{\beta}_i)}. \tag{37.5}\]
The aggregate logit of Equation 37.5 (with \(\boldsymbol{\beta}_i = \boldsymbol{\beta}\) common to all consumers) is the model behind the Guadagni and Little (1983) scanner-data application that launched choice modeling in marketing, and it is the workhorse for demand estimation more broadly (Berry, Levinsohn, and Pakes 1995). Two identification facts must be stated. First, only utility differences are identified: adding a constant to every alternative’s utility leaves choices unchanged, so one alternative’s intercept is normalized to zero. Second, the scale is not separately identified from the partworths: the error variance is normalized (to \(\pi^2/6\) in the logit), which means estimated \(\boldsymbol{\beta}\) are confounded with scale. This scale normalization is the root of the scale–heterogeneity confound discussed below. MNL also carries the independence of irrelevant alternatives (IIA) property (the odds of choosing \(j\) over \(l\) are independent of other alternatives), which is implausible when alternatives are close substitutes and which the richer models below relax.
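The MNL probabilities and two of the properties just stated can be checked in a few lines. The sketch below is a minimal base-R illustration; the function name `mnl_prob`, the three-alternative choice set, and the partworth values are all hypothetical.

```r
# MNL choice probabilities (Equation 37.5) for one choice set.
# X: alternatives in rows, attributes in columns; beta: partworth vector.
mnl_prob <- function(X, beta) {
  v <- drop(X %*% beta)        # systematic utilities, keeping row names
  v <- v - max(v)              # subtract the max for numerical stability
  exp(v) / sum(exp(v))
}

# Hypothetical choice set: columns are price (dollars) and a battery dummy
X <- rbind(a = c(10, 0), b = c(20, 1), c = c(30, 1))
beta <- c(price = -0.15, battery = 1.2)

p3 <- mnl_prob(X, beta)

# Identification fact 1: a constant added to every utility changes nothing.
X_shift  <- cbind(X, const = 1)
p3_shift <- mnl_prob(X_shift, c(beta, const = 5))

# IIA: the odds of a over b are the same with or without alternative c.
p2 <- mnl_prob(X[c("a", "b"), ], beta)
odds_with_c    <- p3["a"] / p3["b"]
odds_without_c <- p2["a"] / p2["b"]

rbind(p3, p3_shift)
c(odds_with_c, odds_without_c)
```

The two probability rows coincide, which is why an intercept common to all alternatives cannot be estimated, and the two odds coincide, which is the IIA property in its most literal form.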
37.2.3 Heterogeneity: Mixed Logit and Hierarchical Bayes
Because preference is heterogeneous, a single \(\boldsymbol{\beta}\) misrepresents the market and, worse, can produce biased aggregate elasticities. The mixed (random-coefficients) logit lets partworths vary across consumers, \(\boldsymbol{\beta}_i \sim g(\boldsymbol{\beta}\mid\boldsymbol{\theta})\), and integrates them out of the likelihood:
\[ \Pr(y_{it} = j) = \int \frac{\exp(\mathbf{x}_{jt}^{\top}\boldsymbol{\beta})} {\sum_{l \in \mathcal{S}_t}\exp(\mathbf{x}_{lt}^{\top}\boldsymbol{\beta})}\, g(\boldsymbol{\beta}\mid\boldsymbol{\theta})\,d\boldsymbol{\beta}. \tag{37.6}\]
Mixed logit relaxes IIA and accommodates flexible substitution, at the cost of a likelihood with no closed form that must be simulated. In marketing the dominant route to Equation 37.6 is hierarchical Bayes (HB): a Gaussian (or richer) population prior \(\boldsymbol{\beta}_i \sim \mathcal{N}(\bar{\boldsymbol{\beta}}, \boldsymbol{\Sigma})\) is placed over individual partworths, and Markov chain Monte Carlo recovers a full posterior for each respondent’s partworth vector by borrowing strength across the panel (Rossi 2014; Allenby, Leone, and Jen 1999). HB is what makes individual-level WTP estimable from a handful of choices per respondent—too few to fit a separate model per person—and it is the engine behind commercial choice-based conjoint. The same Bayesian decomposition recovers brand value from choice data by splitting utility into attribute-driven and name-driven components (Kamakura and Russell 1993; Jedidi, Jagpal, and Manchanda 2003).
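To make "must be simulated" concrete, the sketch below approximates the integral in Equation 37.6 by averaging logit probabilities over random draws of the partworth vector. It is a minimal illustration, not an estimator: the population distribution is assumed to be independent normals with hypothetical means and standard deviations, and the helper name `logit_prob` and the number of draws are arbitrary choices.

```r
set.seed(1)

# Logit probabilities for one choice set, given one partworth vector
logit_prob <- function(X, beta) {
  v <- drop(X %*% beta)
  exp(v - max(v)) / sum(exp(v - max(v)))
}

# Hypothetical choice set: price (dollars) and a battery dummy
X <- rbind(a = c(10, 0), b = c(20, 1), c = c(30, 1))

# Assumed population distribution g(beta | theta): independent normals
beta_bar <- c(-0.15, 1.2)
beta_sd  <- c(0.05, 0.8)

# Simulate the integral in Equation 37.6 by averaging over R draws of beta
R <- 5000
draws <- cbind(rnorm(R, beta_bar[1], beta_sd[1]),
               rnorm(R, beta_bar[2], beta_sd[2]))
prob_by_draw <- apply(draws, 1, function(b) logit_prob(X, b))  # 3 x R matrix
mixed_prob   <- rowMeans(prob_by_draw)

# Compare with the plain MNL evaluated at the population mean
mnl_at_mean <- logit_prob(X, beta_bar)
rbind(mixed_prob, mnl_at_mean)
```

The simulated mixed-logit shares generally differ from the MNL shares at the mean partworths, because the logit formula is nonlinear in the partworths. A normal distribution also puts some mass on positive price coefficients, which is one reason applied work often uses a sign-constrained distribution for price.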
Because the logit fixes the error scale, what looks like heterogeneity in partworths can instead be heterogeneity in scale—how deterministically a respondent chooses. Two respondents with identical relative preferences but different choice consistency will appear to have proportionally different \(\boldsymbol{\beta}_i\). WTP, a ratio of two coefficients (\(-\beta_k/\beta_p\)), partially cancels scale, which is one reason WTP is often more stable across respondents than raw partworths—but only if scale is common to the numerator and denominator. Mis-attributing scale to preference is a standard threat to identification in choice-based conjoint.
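The cancellation is easy to verify numerically. In the sketch below, two hypothetical respondents share the same relative preferences but differ in choice consistency by an assumed scale factor of 3; the partworth values are illustrative only.

```r
# Two hypothetical respondents with identical relative preferences;
# respondent 2 chooses more consistently (scale factor mu = 3).
beta_1 <- c(price = -0.10, battery = 0.8, warranty = 0.4)
mu     <- 3
beta_2 <- mu * beta_1

# Raw partworths differ proportionally ...
beta_2 / beta_1

# ... but WTP = -beta_k / beta_p is identical, because mu cancels in the ratio
wtp <- function(beta) -beta[c("battery", "warranty")] / beta["price"]
wtp_1 <- wtp(beta_1)
wtp_2 <- wtp(beta_2)
rbind(wtp_1, wtp_2)
```

Both respondents have a WTP of 8 dollars for the battery upgrade and 4 dollars for the warranty, even though every raw partworth of the second respondent is three times larger. The cancellation relies on the same scale factor multiplying the price and non-price coefficients.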
37.2.4 Variants of Conjoint
Conjoint is a family, not a single method. The variants differ in how preferences are elicited and therefore in what they assume about the respondent and what they can estimate. Table 37.1 contrasts them.
| Variant | Elicitation task | Estimator | When it fits |
|---|---|---|---|
| Full-profile | Rate or rank complete profiles | OLS / regression on partworths | Few attributes; ratings interpretable |
| Choice-based (CBC) | Choose one profile from a set | MNL / mixed logit / HB | Realistic; recovers WTP and shares |
| Adaptive (ACA) | Tailored questions by prior answers | Self-explicated + tradeoff, often HB | Many attributes; shorten survey |
| Hybrid | Self-explicated stage + full-profile/choice | Two-stage / HB | Many attributes with efficiency |
| Self-explicated | Rate level desirability and attribute importance | Weighted compositional sum | Many attributes; minimal load |
| MaxDiff | Pick most- and least-preferred items | Best–worst MNL | Importance ranking of many items |
Full-profile conjoint presents complete product descriptions for rating or ranking and estimates the partworths of Equation 37.1 by regression. A classic result is that ranking and rating data, though they look different, recover the same preference structure and need not be treated as fundamentally distinct response modes (Kalish and Nelson 1991).
Choice-based conjoint (CBC), also called discrete-choice conjoint, replaces the rating with a choice among profiles, often including a no-choice option. It is the most behaviorally realistic variant—people choose in markets, they do not rate—and it is estimated with the RUM machinery of Section 37.2.2, yielding WTP and market-share simulations directly. CBC is now the default in commercial practice.
Adaptive conjoint (ACA) varies the choice sets dynamically, conditioning each question on the respondent’s earlier answers to concentrate information where the respondent’s preferences are most uncertain, thereby shortening the survey for high-dimensional attribute spaces. The polyhedral methods of Toubia and Stephen (2013) formalize adaptive question selection as iteratively shrinking the set of partworth vectors consistent with a respondent’s answers, and incentive-aligned and incentive-compatible variants address the hypothetical-bias problem at the elicitation stage. Hybrid conjoint combines a self-explicated stage with full-profile or choice tasks, and self-explicated conjoint dispenses with profiles entirely, asking respondents to rate each level’s desirability and each attribute’s importance and composing partworths from the product—cheap and scalable but vulnerable to the “everything is important” inflation that holistic methods avoid. MaxDiff (best–worst scaling) asks respondents to identify the most- and least-preferred items in a set, a task that produces sharper discrimination among many items than rating scales and is estimated as a best–worst extension of MNL.
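The compositional rule behind self-explicated conjoint, importance times desirability, can be sketched in a few lines. The answers below are hypothetical, and the 0 to 10 desirability scale and constant-sum importance points are assumptions about the questionnaire format rather than a prescribed standard.

```r
# Hypothetical self-explicated answers from one respondent.
# Desirability of each level on a 0-10 scale, importance of each attribute.
desirability <- list(
  battery  = c("10h" = 2, "20h" = 9),
  warranty = c("2yr" = 5, "3yr" = 8)
)
importance <- c(battery = 70, warranty = 30)   # constant-sum points

# Compositional partworth = attribute importance weight x level desirability
weights   <- importance / sum(importance)
partworth <- lapply(names(desirability),
                    function(k) weights[[k]] * desirability[[k]])
names(partworth) <- names(desirability)

# Utility of a profile is the sum of the partworths of its levels
profile <- c(battery = "20h", warranty = "2yr")
utility <- sum(sapply(names(profile),
                      function(k) partworth[[k]][[profile[[k]]]]))
utility
```

For the profile with a 20-hour battery and a 2-year warranty the composed utility is 0.7 × 9 + 0.3 × 5 = 7.8. Nothing in this calculation forces a trade-off, which is where the inflation of stated importances enters.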
A persistent practical problem is missing data in partial profiles: respondents do not see every attribute or level. Bradlow, Hu, and Ho (2004) develop a learning-based model that imputes the missing levels by treating the partial profile as informative about the respondent’s inferences, rather than discarding incomplete responses; modern machine-learning approaches extend this to learn preferences from sparse, adaptively collected data (Ding, Li, and Chatterjee 2015). The broader frontier integrates conjoint with text and image data and with structural demand models so that designed-experiment partworths and revealed-preference market data can be reconciled (Netzer, Lattin, and Srinivasan 2008; Rao and Wang 2017; Wedel and Kannan 2016).
37.3 Worked Examples
The examples below use the conjoint and AlgDesign packages, which implement the orthogonal-array design and OLS-based partworth estimation of full-profile conjoint. They are deliberately small and reproducible; production CBC studies would substitute the HB estimator of Section 37.2.3.
37.3.1 Generating an Efficient Design
The first task is to reduce a full factorial to an estimable fractional design whose columns remain (near-)orthogonal. The example below specifies a four-attribute product and extracts an orthogonal fraction, then verifies that the selected profiles are mutually near-uncorrelated—the property that keeps partworths independently identified.
Code
library(conjoint)
set.seed(123)
# A four-attribute product; full factorial = 2 x 2 x 2 x 2 = 16 profiles
experiment <- expand.grid(
  Weight   = c("light", "heavy"),
  Price    = c("low", "high"),
  Warranty = c("2yr", "3yr"),
  Battery  = c("10h", "20h")
)
# Reduce to an orthogonal fraction
design <- caFactorialDesign(data = experiment, type = "orthogonal")
encoded <- caEncodedDesign(design)
# Orthogonality check: off-diagonal correlations should be near zero
round(cor(encoded), 2)
#>          Weight Price Warranty Battery
#> Weight        1     0        0       0
#> Price         0     1        0       0
#> Warranty      0     0        1       0
#> Battery       0     0        0       1
37.3.2 Estimating Partworths and Importances
With a design and a respondent’s preferences in hand, OLS recovers the partworths. The example uses the tea data shipped with the conjoint package: preferences of respondents over tea profiles described by price, variety, kind, and aroma. We estimate one respondent’s partworths, then the aggregate importances—the share of total preference range each attribute commands.
Code
library(conjoint)
data(tea)
# Single-respondent partworths via OLS on the encoded design
caModel(y = tprefm[1, ], x = tprof)
#>
#> Call:
#> lm(formula = frml)
#>
#> Residuals:
#> 1 2 3 4 5 6 7 8 9 10
#> 1.1345 -1.4897 0.3103 -0.2655 0.3103 0.1931 1.5931 -1.4310 -1.4310 1.1207
#> 11 12 13
#> 0.3690 1.1931 -1.6069
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 3.3937 0.5439 6.240 0.00155 **
#> factor(x$price)1 -1.5172 0.7944 -1.910 0.11440
#> factor(x$price)2 -1.1414 0.6889 -1.657 0.15844
#> factor(x$variety)1 -0.4747 0.6889 -0.689 0.52141
#> factor(x$variety)2 -0.6747 0.6889 -0.979 0.37234
#> factor(x$kind)1 0.6586 0.6889 0.956 0.38293
#> factor(x$kind)2 -1.5172 0.7944 -1.910 0.11440
#> factor(x$aroma)1 0.6293 0.5093 1.236 0.27150
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 1.78 on 5 degrees of freedom
#> Multiple R-squared: 0.8184, Adjusted R-squared: 0.5642
#> F-statistic: 3.22 on 7 and 5 DF, p-value: 0.1082
# Aggregate attribute importances across all respondents
caImportance(y = tpref, x = tprof)
#> [1] 24.76 32.22 27.15 15.88
The importances sum to 100% and quantify the share of the total partworth range that each attribute commands. They are the empirical answer to “what matters?” and, unlike self-reported importance ratings, they are forced to respect trade-offs, which is precisely why conjoint dissolves the “everything is important” pathology.
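The range-based definition of importance can be written out by hand. The sketch below uses hypothetical partworths for a single respondent (the attribute and level names echo the tea example but the values are invented) and is meant to show the definition, not to reproduce the internals of `caImportance`, which are not verified here.

```r
# Hypothetical partworths for one respondent, baseline level fixed at 0
partworths <- list(
  price   = c(low = 0, mid = -1.1, high = -1.5),
  variety = c(black = 0, green = -0.5, red = -0.7),
  aroma   = c(yes = 0, no = 0.6)
)

# Importance = an attribute's partworth range as a share of the summed ranges
ranges     <- sapply(partworths, function(pw) diff(range(pw)))
importance <- 100 * ranges / sum(ranges)
round(importance, 1)
```

Because the measure depends on the range of each attribute's partworths, widening the spread of levels studied for one attribute raises its importance mechanically, which is worth keeping in mind alongside the number-of-levels effect.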
37.3.3 From Partworths to Willingness to Pay
The managerial payoff is dollar-valued WTP, obtained by dividing each attribute’s partworth by the (negative) price partworth, as in Equation 37.2. The synthetic example below fits an aggregate model with a numeric price attribute so that the price coefficient is a single slope, then converts non-price partworths to dollars.
Code
set.seed(2024)
# Synthetic full-profile ratings for a 3-attribute product with numeric price
profiles <- expand.grid(
  price    = c(10, 20, 30), # dollars
  battery  = c(0, 1),       # 0 = 10h, 1 = 20h
  warranty = c(0, 1)        # 0 = 2yr, 1 = 3yr
)
# True partworths (utils): price hurts, battery and warranty help
beta_true <- c(intercept = 5, price = -0.15, battery = 1.2, warranty = 0.6)
profiles$rating <- with(profiles,
  beta_true["intercept"] + beta_true["price"] * price +
    beta_true["battery"] * battery + beta_true["warranty"] * warranty +
    rnorm(nrow(profiles), sd = 0.1)
)
fit <- lm(rating ~ price + battery + warranty, data = profiles)
b <- coef(fit)
# Dollar value of one util = -1 / price coefficient (Delta p = 1 here)
dollars_per_util <- -1 / b["price"]
wtp <- c(
battery_20h_vs_10h = b["battery"] * dollars_per_util,
warranty_3yr_vs_2yr = b["warranty"] * dollars_per_util
)
round(wtp, 2)
#> battery_20h_vs_10h.battery warranty_3yr_vs_2yr.warranty
#>                       7.95                         3.26
The recovered WTP (the dollar premium a consumer would pay to upgrade battery life or extend the warranty) is the reservation-price increment of Equation 37.3, attribute by attribute. Summing these increments across a configuration’s upgraded attributes yields the configuration’s total reservation price relative to the baseline product, which is the input that pricing and product-line decisions actually consume.
37.4 Pitfalls and Identification
Three threats recur, each tracing to an assumption made explicit above.
Hypothetical bias. Stated-preference partworths can diverge from revealed behavior because survey choices carry no consequence. Incentive-aligned designs—where the respondent faces a real chance of receiving the chosen option—and the reconciliation of conjoint partworths against market data are the standard defenses (Toubia and Netzer 2017; Ding, Li, and Chatterjee 2015). WTP, being a ratio, is somewhat insulated, but level effects on the absolute scale are not.
Confounded designs. If the fractional factorial is poorly chosen, attribute columns are correlated and their partworths are not separately identified—the design, not the data, determines the estimates. The orthogonality check in the worked example is not decorative; it is the diagnostic that the partworths in Equation 37.1 are recoverable at all.
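The extreme case of a confounded design makes the point directly. In the sketch below, a deliberately bad hypothetical design lets the warranty column copy the battery column, and ratings are simulated from known partworths; all names and values are illustrative.

```r
set.seed(42)

# A deliberately bad "fraction": the warranty column copies the battery column
bad <- data.frame(
  price    = c(10, 20, 30, 10, 20, 30),
  battery  = c(0, 0, 0, 1, 1, 1),
  warranty = c(0, 0, 0, 1, 1, 1)
)
cor(bad$battery, bad$warranty)   # perfectly correlated columns

# Hypothetical ratings generated from known partworths
bad$rating <- 5 - 0.15 * bad$price + 1.2 * bad$battery + 0.6 * bad$warranty +
  rnorm(nrow(bad), sd = 0.1)

fit_bad <- lm(rating ~ price + battery + warranty, data = bad)
coef(fit_bad)   # warranty is NA: it cannot be separated from battery
```

`lm` drops the aliased column and reports `NA` for warranty, while the battery coefficient absorbs the sum of the two true partworths (about 1.8 rather than 1.2). No amount of additional respondents fixes this; only a different design does.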
Heterogeneity and scale. Treating a heterogeneous market with a single aggregate logit biases elasticities through composition; the mixed-logit and HB estimators of Section 37.2.3 exist to fix this. But heterogeneity in choice consistency (scale) masquerades as heterogeneity in preference, so individual-level partworths must be interpreted with the scale–heterogeneity confound in mind. Segmenting on partworth vectors via cluster analysis is sound only after this confound is addressed.
A final, structural caveat returns to additivity. Equation 37.1 and the reservation-price formula Equation 37.3 assume attributes contribute independently. Where they do not (where a feature’s value depends on the presence of another), the analyst must either model the interaction explicitly (more profiles) or absorb it into a higher-level attribute, as discussed in Section 37.2.1. A partworth estimated under a falsely additive model is biased toward the configurations over-represented in the design, and the WTP derived from it inherits that bias.
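The dependence of the bias on the design can be seen in a noise-free toy case. The sketch below assumes a hypothetical utility with a processor-by-screen interaction and fits the (wrong) additive model to a balanced design and to one that over-represents good-screen profiles; the function names and coefficients are illustrative.

```r
# True utility has an interaction: the processor is worth more with a good screen
true_utility <- function(processor, screen) {
  1 * processor + 1 * screen + 2 * processor * screen
}

cells <- expand.grid(processor = c(0, 1), screen = c(0, 1))

# Balanced design: every cell once. Skewed design: good-screen cells three times.
balanced <- cells
skewed   <- cells[c(1, 2, 3, 3, 3, 4, 4, 4), ]

fit_additive <- function(d) {
  d$u <- true_utility(d$processor, d$screen)   # noise-free for clarity
  coef(lm(u ~ processor + screen, data = d))["processor"]
}

b_balanced <- fit_additive(balanced)   # averages the effect over both screens
b_skewed   <- fit_additive(skewed)     # pulled toward the good-screen effect
c(balanced = unname(b_balanced), skewed = unname(b_skewed))
```

The true processor effect is 1 with a basic screen and 3 with a good one. The additive fit returns 2 under the balanced design and 2.5 under the skewed one, so the "partworth" is a design-weighted average rather than a property of the consumer.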
37.5 Key Takeaways
- Reservation price is derived, not asked. Direct WTP questions bias downward; the defensible route recovers WTP from a fitted utility model as the partworth-to-price ratio in Equation 37.2.
- Additive separability is the load-bearing assumption. Equations 37.1 and 37.3 hold only when attributes do not interact and price disutility is locally linear; both fail in identifiable ways.
- Choice-based conjoint with random utility is the modern default. The MNL/mixed-logit/HB stack of Sections 37.2.2 and 37.2.3 estimates heterogeneous, individual-level partworths from realistic choice tasks (McFadden 1986; Rossi 2014).
- Design determines identification. Orthogonal fractional factorials keep partworths separately estimable; a confounded design produces numbers without referents.
- Mind the confounds. Hypothetical bias, the number-of-levels effect, and the scale–heterogeneity confound each threaten the interpretation of partworths and the WTP derived from them.
1. Direct elicitation is not worthless: open-ended WTP, the Becker–DeGroot–Marschak mechanism, and contingent valuation all have their uses. For multi-attribute products, however, the indirect route dominates because it yields attribute-level dollar values, not just a single product-level number, and it embeds the valuation in a choice context that disciplines hypothetical bias.
2. The number-of-levels effect is partly a halo of the level count and partly an ordering artifact; it means importance comparisons across attributes with different numbers of levels are not clean. The defensive design is to equalize level counts where possible, or to interpret importances within, not across, attributes.