The BLP Contraction Mapping

Summary

Estimation requires inverting the random-coefficients share system to recover the vector of mean utilities $δ_{t}$ that rationalizes the observed market shares $S_{t}$ for a given guess of $θ_{2}$. Berry (1994) and BLP (1995) show that this inversion can be computed by iterating the fixed point $δ_{t}^{h + 1} \leftarrow δ_{t}^{h} + \log S_{t} - \log s_{t} (δ_{t}^{h}, θ_{2})$, which is a contraction mapping with a unique fixed point. This is the “inner loop” of the nested fixed-point algorithm. Modern best practice replaces plain iteration with acceleration (SQUAREM) or Jacobian-based (Levenberg-Marquardt) solvers, which are 3-12x faster.

Overview

For each candidate $θ_{2}$, and separately (so potentially in parallel) for each market $t$, we must solve the $J_{t}$ equations in the $J_{t}$ unknowns $δ_{t}$ that set predicted shares equal to observed shares. Holding $θ_{2}$ fixed turns one large $N$-dimensional nonlinear system into $T$ small $J_{t}$-dimensional systems, which is the source of BLP’s scalability. The mapping $δ_{t} \equiv D_{t}^{- 1} (S_{t}, θ_{2})$ is the Berry inversion.

Main Content

The system to solve ^share-system

In market $t$ we seek the $J_{t}$-vector $δ_{t}$ satisfying
$S_{j t} = s_{j t} (δ_{t} \mid θ_{2}) = \int \frac{\exp (δ_{j t} + μ_{i j t})}{\sum_{k \in J_{t}} \exp (δ_{k t} + μ_{i k t})} f (μ_{i t} \mid θ_{2}) \, d μ_{i t} .$
A unique solution exists mathematically, but it cannot be computed exactly in floating-point arithmetic; instead we solve to a tolerance on the sup-norm of the log difference in shares:
$\| \log S_{t} - \log s_{t} (δ_{t}, θ_{2}) \|_{\infty} \leq ϵ^{tol} .$
Tolerance is a tradeoff: too loose and numerical error propagates to $\hat{θ}$ (Dubé et al. 2012); too tight and it can never be met. The recommended $ϵ^{tol}$ is between 1E-14 and 1E-12 (machine epsilon is on the order of 1E-16 in double precision).
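The share integral and the tolerance check can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes the integral is approximated by averaging over simulated consumers, that the outside good's utility is normalized to zero, and that $μ_{i j t} = σ ν_{i} x_{j t}$ for a single characteristic. The names `simulate_shares`, `log_share_error`, `sigma`, `nu`, and `x` are hypothetical.

```python
import numpy as np

def simulate_shares(delta, mu):
    """Approximate s_t(delta | theta_2) by averaging over simulated consumers.

    delta: (J,) mean utilities of the inside goods.
    mu:    (I, J) draws of mu_ijt (assumed to be built from theta_2 upstream).
    The outside good's utility is normalized to 0, so it adds 1 to each denominator.
    """
    ev = np.exp(delta[None, :] + mu)                     # (I, J) exponentiated utilities
    probs = ev / (1.0 + ev.sum(axis=1, keepdims=True))   # individual choice probabilities
    return probs.mean(axis=0)                            # Monte Carlo integral

def log_share_error(S, delta, mu):
    """Sup-norm of log S_t - log s_t(delta, theta_2), the quantity compared to the tolerance."""
    return np.max(np.abs(np.log(S) - np.log(simulate_shares(delta, mu))))

rng = np.random.default_rng(0)
J, I, sigma = 3, 500, 0.8
x = rng.normal(size=J)                   # one product characteristic (illustrative)
nu = rng.normal(size=I)                  # consumer taste draws
mu = sigma * np.outer(nu, x)             # mu_ijt = sigma * nu_i * x_jt
delta_true = np.array([-1.0, -0.5, -2.0])
S = simulate_shares(delta_true, mu)      # "observed" shares generated at delta_true

logit_guess = np.log(S) - np.log(1.0 - S.sum())
print("error at true delta:      ", log_share_error(S, delta_true, mu))
print("error at plain-logit guess:", log_share_error(S, logit_guess, mu))
```

The plain-logit guess misses the tolerance because it ignores the heterogeneity in `mu`; closing that gap is the job of the solvers discussed in the rest of this note.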

The BLP fixed point is a contraction ^contraction

Berry et al. (1995) show that the relation $f (δ_{t}) = δ_{t}$ given by
$f : δ_{t}^{h + 1} \leftarrow δ_{t}^{h} + \log S_{t} - \log s_{t} (δ_{t}^{h}, θ_{2})$
is a contraction mapping. Iteration is linearly convergent at a rate proportional to $L (θ_{2}) / [1 - L (θ_{2})]$, where the Lipschitz constant (Dubé et al. 2012) is
$L (θ_{2}) = \max_{δ_{t}} \left\| I_{J_{t}} - \frac{\partial \log s_{t}}{\partial δ_{t}} (δ_{t}, θ_{2}) \right\|_{\infty} < 1 .$
A smaller $L$ converges faster. A larger outside-good share generally implies a smaller Lipschitz constant; as the outside share shrinks, convergence takes increasingly many steps. (Empirically, shrinking $s_{0 t}$ from 0.91 to 0.27 raised iteration counts by ~5x.)
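A minimal sketch of the plain iteration, run once with a large and once with a small outside share, illustrates the slowdown. It assumes simulated shares with the outside utility normalized to zero; `shares`, `blp_contraction`, and the data-generating choices are hypothetical, and the tolerance of 1e-12 is the loose end of the recommended range.

```python
import numpy as np

def shares(delta, mu):
    """Simulated market shares; outside-good utility normalized to 0 (assumption)."""
    ev = np.exp(delta[None, :] + mu)
    return (ev / (1.0 + ev.sum(axis=1, keepdims=True))).mean(axis=0)

def blp_contraction(S, mu, tol=1e-12, max_iter=100_000):
    """Iterate delta <- delta + log S - log s(delta) until the log-share error <= tol."""
    log_S = np.log(S)
    delta = log_S - np.log(1.0 - S.sum())        # plain-logit starting value
    for h in range(max_iter):
        resid = log_S - np.log(shares(delta, mu))
        if np.max(np.abs(resid)) <= tol:
            return delta, h                      # h = number of updates taken
        delta = delta + resid
    raise RuntimeError("contraction did not converge")

rng = np.random.default_rng(1)
mu = 0.5 * np.outer(rng.normal(size=300), rng.normal(size=4))
base = np.array([0.0, 0.3, -0.2, 0.5])
for shift in (-3.0, 2.0):                        # low vs. high inside-good utility
    delta_true = shift + base
    S = shares(delta_true, mu)
    delta_hat, n_iter = blp_contraction(S, mu)
    print(f"outside share {1.0 - S.sum():.2f}: {n_iter} iterations, "
          f"max |delta error| {np.max(np.abs(delta_hat - delta_true)):.1e}")
```

The second run, with most of the market choosing an inside good, needs many more updates to reach the same tolerance.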

RCNL: the contraction must be dampened ^rcnl-dampen

For random-coefficients nested logit, the plain update is no longer a contraction; it must be dampened by $(1 - ρ)$:
$δ_{t} \leftarrow δ_{t} + (1 - ρ) [\log S_{t} - \log s_{t} (δ_{t}, θ_{2})] .$
Convergence becomes arbitrarily slow as $ρ \to 1$ (more within-nest substitution), making RCNL harder to estimate. Without random coefficients ($μ_{i j t} = 0$) the inversion has the closed form $δ_{j t} = \log S_{j t} - \log S_{0 t} - ρ \log S_{j \mid h t}$ (Berry 1994).
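As an illustration of the special case $μ_{i j t} = 0$, the sketch below computes nested-logit shares, applies the closed-form inversion, and checks that the dampened iteration reaches the same $δ_{t}$. The share formula assumes the usual nested-logit parametrization with an outside good of utility zero in its own nest; the integer nest labels and function names are hypothetical.

```python
import numpy as np

def nested_logit_shares(delta, nests, rho):
    """Nested-logit shares (no random coefficients); outside utility normalized to 0."""
    _, idx = np.unique(nests, return_inverse=True)   # idx[j] = position of product j's nest
    e = np.exp(delta / (1.0 - rho))
    D = np.bincount(idx, weights=e)                  # within-nest sums
    within = e / D[idx]                              # s_{j|h}
    nest = D ** (1.0 - rho)
    return within * (nest / (1.0 + nest.sum()))[idx] # s_{j|h} * s_h

def invert_closed_form(S, nests, rho):
    """delta_j = log S_j - log S_0 - rho * log S_{j|h} (Berry 1994)."""
    _, idx = np.unique(nests, return_inverse=True)
    S_within = S / np.bincount(idx, weights=S)[idx]
    return np.log(S) - np.log(1.0 - S.sum()) - rho * np.log(S_within)

def dampened_contraction(S, nests, rho, tol=1e-12, max_iter=100_000):
    """delta <- delta + (1 - rho) * [log S - log s(delta)]."""
    delta = np.log(S) - np.log(1.0 - S.sum())
    for h in range(max_iter):
        resid = np.log(S) - np.log(nested_logit_shares(delta, nests, rho))
        if np.max(np.abs(resid)) <= tol:
            return delta, h
        delta = delta + (1.0 - rho) * resid
    raise RuntimeError("dampened contraction did not converge")

delta_true = np.array([0.5, -0.2, 0.1, -1.0])
nests = np.array([0, 0, 1, 1])                       # two nests of two products each
rho = 0.5
S = nested_logit_shares(delta_true, nests, rho)

delta_closed = invert_closed_form(S, nests, rho)
delta_iter, n_iter = dampened_contraction(S, nests, rho)
print("closed form:", delta_closed)
print("iterated:   ", delta_iter, f"({n_iter} iterations)")
```

With random coefficients the closed form is no longer available, and only the iterative route (with simulated shares in place of `nested_logit_shares`) remains.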

Faster solvers: Newton / Levenberg-Marquardt and SQUAREM ^accelerated

Two families improve on plain iteration:

Jacobian-based. Newton-Raphson updates $δ_{t}^{h + 1} \leftarrow δ_{t}^{h} - λ Ψ_{t}^{- 1} [s_{t} (δ_{t}^{h}, θ_{2}) - S_{t}]$, where $Ψ_{t} = \partial s_{t} / \partial δ_{t}$ is evaluated at $δ_{t}^{h}$ and $λ$ is a step length. The Levenberg-Marquardt (LM) least-squares solver for $\min_{δ_{t}} \sum_{j} [S_{j t} - s_{j t}]^{2}$ is the fastest, most reliable Jacobian method; its update solves $[Ψ_{t}^{'} Ψ_{t} + λ \operatorname{diag} (Ψ_{t}^{'} Ψ_{t})] x_{t} = Ψ_{t}^{'} [S_{t} - s_{t}]$ for the step $x_{t}$, where $λ$ now acts as a damping parameter that interpolates between Gauss-Newton ($λ = 0$) and gradient descent (large $λ$). Cost per iteration is high, because each step needs a $J_{t} \times J_{t}$ Jacobian whose entries are numerical integrals.
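The LM update can be sketched with an analytic Jacobian of the simulated shares. This is an illustration under assumptions, not the routine any particular library uses: the factor-of-10 rule for adjusting $λ$ is a common textbook choice, and the function names and starting value for `lam` are hypothetical.

```python
import numpy as np

def shares_and_jacobian(delta, mu):
    """Simulated shares and Psi = d s / d delta; outside utility normalized to 0."""
    ev = np.exp(delta[None, :] + mu)
    P = ev / (1.0 + ev.sum(axis=1, keepdims=True))    # (I, J) individual probabilities
    s = P.mean(axis=0)
    Psi = np.diag(s) - P.T @ P / P.shape[0]           # average of diag(s_i) - s_i s_i'
    return s, Psi

def lm_invert(S, mu, tol=1e-12, lam=1e-3, max_iter=200):
    """Minimize sum_j (S_j - s_j(delta))^2 with damped Gauss-Newton steps."""
    delta = np.log(S) - np.log(1.0 - S.sum())         # plain-logit starting value
    s, Psi = shares_and_jacobian(delta, mu)
    for h in range(max_iter):
        if np.max(np.abs(np.log(S) - np.log(s))) <= tol:
            return delta, h
        A = Psi.T @ Psi
        step = np.linalg.solve(A + lam * np.diag(np.diag(A)), Psi.T @ (S - s))
        s_new, Psi_new = shares_and_jacobian(delta + step, mu)
        if np.sum((S - s_new) ** 2) < np.sum((S - s) ** 2):
            delta, s, Psi = delta + step, s_new, Psi_new
            lam /= 10.0                               # accepted: move toward Gauss-Newton
        else:
            lam *= 10.0                               # rejected: move toward gradient descent
    raise RuntimeError("LM did not converge")

rng = np.random.default_rng(2)
mu = 0.5 * np.outer(rng.normal(size=200), rng.normal(size=4))
delta_true = np.array([-1.0, -0.5, 0.0, -1.5])
S, _ = shares_and_jacobian(delta_true, mu)

delta_hat, n_steps = lm_invert(S, mu)
print(f"{n_steps} LM steps, max |delta error| {np.max(np.abs(delta_hat - delta_true)):.1e}")
```

Each step here costs one share evaluation plus a $J_{t} \times J_{t}$ matrix product and linear solve, which is why the per-iteration cost grows quickly with the number of products.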

Accelerated fixed points. SQUAREM (Varadhan & Roland 2008) uses the residual $r^{h} = f (δ_{t}^{h}) - δ_{t}^{h}$ and curvature to take an approximate Newton step without forming the Jacobian, at essentially the cost of plain iterations. It is 3-6x faster than direct iteration (3-8x fewer iterations in simulations) and is the recommended default. (No formal convergence guarantee, as the accelerated iteration is no longer a contraction; DF-SANE is a similar but slightly slower/less robust alternative.)
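A sketch of one common SQUAREM variant, with step length $α = -\| r \| / \| v \|$ followed by a stabilizing application of $f$, is given below. Two parts are assumptions of this sketch rather than of the published scheme: the cap $α \leq -1$, and the fallback to two plain iterations whenever the extrapolated point has a larger residual. Function names are hypothetical, and counts are in share evaluations so that plain and accelerated iteration are comparable.

```python
import numpy as np

def shares(delta, mu):
    """Simulated market shares; outside-good utility normalized to 0 (assumption)."""
    ev = np.exp(delta[None, :] + mu)
    return (ev / (1.0 + ev.sum(axis=1, keepdims=True))).mean(axis=0)

def plain_invert(S, mu, tol=1e-12, max_evals=100_000):
    log_S = np.log(S)
    x = log_S - np.log(1.0 - S.sum())
    for evals in range(1, max_evals + 1):
        r = log_S - np.log(shares(x, mu))
        if np.max(np.abs(r)) <= tol:
            return x, evals
        x = x + r
    raise RuntimeError("plain iteration did not converge")

def squarem_invert(S, mu, tol=1e-12, max_evals=100_000):
    log_S = np.log(S)
    f = lambda d: d + log_S - np.log(shares(d, mu))     # plain BLP map
    x = log_S - np.log(1.0 - S.sum())                   # plain-logit starting value
    fx, evals = f(x), 1
    while evals < max_evals:
        r = fx - x                                      # residual: log S - log s(x)
        if np.max(np.abs(r)) <= tol:
            return x, evals
        f2x = f(fx)
        evals += 1
        v = (f2x - fx) - r                              # change in the residual
        norm_v = np.linalg.norm(v)
        if norm_v == 0.0:                               # nothing to extrapolate from
            x, fx = fx, f2x
            continue
        alpha = min(-np.linalg.norm(r) / norm_v, -1.0)  # alpha = -1 is two plain steps
        x_new = f(x - 2.0 * alpha * r + alpha ** 2 * v) # extrapolate, then stabilize
        fx_new = f(x_new)
        evals += 2
        grew = np.max(np.abs(fx_new - x_new)) > np.max(np.abs(r))
        if grew or not np.all(np.isfinite(fx_new)):     # safeguard (assumption of this sketch)
            x_new, fx_new = f2x, f(f2x)
            evals += 1
        x, fx = x_new, fx_new
    raise RuntimeError("SQUAREM did not converge")

rng = np.random.default_rng(3)
mu = 0.5 * np.outer(rng.normal(size=200), rng.normal(size=4))
delta_true = 1.0 + np.array([0.0, 0.3, -0.2, 0.5])      # small outside share
S = shares(delta_true, mu)

d_plain, n_plain = plain_invert(S, mu)
d_sq, n_sq = squarem_invert(S, mu)
print(f"plain: {n_plain} share evaluations, SQUAREM: {n_sq} share evaluations")
```

No Jacobian is formed anywhere: each cycle uses only applications of the same map $f$ that plain iteration uses.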

Examples

Inner loop for one guess of $θ_{2}$ (NFXP step (a)):

For each market $t$, start $δ_{t}^{0}$ at the plain-logit solution $\log S_{t} - \log S_{0 t}$ (or the previous $θ_{2}$'s solution, a “hot start” saving 10-20% of iterations).
Run SQUAREM (or LM) until $\| \log S_{t} - \log s_{t} \|_{\infty} \leq$ 1E-14.
Return $\hat{δ}_{t} (θ_{2})$ to the outer GMM loop. Markets run in parallel across processors.
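The steps above can be sketched as a loop over markets with an optional hot start. To stay short, this sketch uses plain contraction iteration as a stand-in for SQUAREM or LM and a tolerance of 1e-12 rather than 1E-14; the dictionary layout, the single nonlinear parameter `sigma`, and all function names are hypothetical.

```python
import numpy as np

def shares(delta, mu):
    """Simulated market shares; outside-good utility normalized to 0 (assumption)."""
    ev = np.exp(delta[None, :] + mu)
    return (ev / (1.0 + ev.sum(axis=1, keepdims=True))).mean(axis=0)

def solve_market(S, mu, delta0, tol=1e-12, max_iter=100_000):
    """Stand-in solver (plain contraction); SQUAREM or LM would slot in here."""
    delta = delta0.copy()
    for h in range(max_iter):
        resid = np.log(S) - np.log(shares(delta, mu))
        if np.max(np.abs(resid)) <= tol:
            return delta, h
        delta = delta + resid
    raise RuntimeError("inner loop did not converge")

def inner_loop(S_by_market, mu_by_market, delta_prev=None):
    """Invert shares market by market for one guess of theta_2."""
    delta_hat, iters = {}, {}
    for t, S in S_by_market.items():          # markets are independent, so this loop
        if delta_prev is not None:            # could be dispatched to parallel workers
            start = delta_prev[t]             # hot start from the previous theta_2
        else:
            start = np.log(S) - np.log(1.0 - S.sum())   # plain-logit start
        delta_hat[t], iters[t] = solve_market(S, mu_by_market[t], start)
    return delta_hat, iters

rng = np.random.default_rng(4)
nu = rng.normal(size=200)                                 # shared consumer draws
x = {t: rng.normal(size=J_t) for t, J_t in enumerate((2, 3, 4))}
make_mu = lambda sigma: {t: sigma * np.outer(nu, x[t]) for t in x}

mu_true = make_mu(0.7)
S_obs = {t: shares(np.full(len(x[t]), -1.0), mu_true[t]) for t in x}

delta_a, it_cold = inner_loop(S_obs, make_mu(0.50))                    # first guess of theta_2
delta_b, it_hot = inner_loop(S_obs, make_mu(0.55), delta_prev=delta_a) # next guess, hot start
print("iterations, cold start:", sum(it_cold.values()))
print("iterations, hot start: ", sum(it_hot.values()))
```

The returned dictionary plays the role of $\hat{δ}_{t} (θ_{2})$ handed back to the outer GMM loop.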

Connections

Random Coefficients Logit Model — defines the share integrals being inverted.
GMM Estimation and Instruments for Price Endogeneity — the recovered $δ_{t}$ feeds the linear index and moment conditions (this is the “inner loop” of NFXP).
Numerical Integration and Optimization in PyBLP — each share evaluation requires numerical integration; the contraction is the “inner loop,” optimization the “outer loop.”
Method of Simulated Moments — the simulated shares make this a simulated fixed point.
