Practical Issues in Simulation Estimation

Summary

Implementing simulation-based estimators requires attention to several practical concerns: using common random numbers to ensure convergence of the iterative optimizer, variance reduction techniques (antithetic variates, control variates) to reduce Monte Carlo noise, careful selection of the auxiliary model for indirect inference, appropriate step sizes for numerical derivatives, and understanding the trade-off between simulation size and computational cost. This note covers implementation guidance from both Liesenfeld & Breitung (1998) and Oh & Patton (2011).

Common Random Numbers

Common Random Numbers Are Essential for Convergence

At every iteration step of the optimization over $θ$, the criterion function is estimated via simulations. For convergence of the iterative algorithm, it is critical to use common random numbers: the same set of simulated random variables $\{ε_{t}^{(r)}\}$ is used to generate simulated values $y_{t}^{(r)} (θ)$ for every value of $θ$ during the optimization.

If new random draws were made at each iteration, the randomness would introduce additional noise into the objective function surface, and the algorithm would fail to converge (Hendry, 1984).

For the reduced form $y_{t} = ϱ (z_{t}, ε_{t}; θ)$:

Before optimization: draw and store $\{ε_{t}^{(r)}\}_{r = 1}^{R}$ for $t = 1, \dots, T$
During optimization: for each candidate $θ$, compute $y_{t}^{(r)} (θ) = ϱ (z_{t}, ε_{t}^{(r)}; θ)$ using the stored draws
The criterion function then varies smoothly in $θ$ (modulo the inherent non-smoothness from indicator functions in EDF-based moments)
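
The three steps above can be sketched as follows. This is a minimal illustration rather than the procedure of either source paper: the toy reduced form $y_{t} = θ ε_{t}$, the second-moment criterion, and the helper names `draw_shocks`, `simulate`, and `criterion` are assumptions made for brevity.

```python
import random

def draw_shocks(rng, R, T):
    """Draw R simulated paths of T standard normal shocks."""
    return [[rng.gauss(0.0, 1.0) for _ in range(T)] for _ in range(R)]

def simulate(theta, shocks):
    """Hypothetical reduced form y_t = theta * eps_t, applied to stored shocks."""
    return [[theta * eps for eps in path] for path in shocks]

def criterion(theta, target_moment, shocks):
    """Squared gap between a target second moment and its simulated counterpart."""
    paths = simulate(theta, shocks)
    n = sum(len(path) for path in paths)
    simulated_moment = sum(y * y for path in paths for y in path) / n
    return (target_moment - simulated_moment) ** 2

rng = random.Random(0)
R, T = 10, 200
target = 4.0  # second moment implied by theta = 2 in the toy model

# Common random numbers: draw once, before optimization, and reuse.
fixed_shocks = draw_shocks(rng, R, T)
crn_first = criterion(1.5, target, fixed_shocks)
crn_second = criterion(1.5, target, fixed_shocks)

# Fresh draws at every evaluation: same theta, different criterion value.
fresh_first = criterion(1.5, target, draw_shocks(rng, R, T))
fresh_second = criterion(1.5, target, draw_shocks(rng, R, T))

print(crn_first == crn_second)      # stored draws: criterion depends on theta only
print(fresh_first == fresh_second)  # fresh draws: criterion also depends on the draw
```

With stored draws the optimizer sees a fixed function of $θ$. With fresh draws, two evaluations at the same $θ$ disagree, so differences in the criterion across iterations mix the effect of $θ$ with simulation noise.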

Variance Reduction Techniques

The overall variance of simulation-based estimators consists of two components:

Irreducible component: the variance the estimator would have if based on the exact criterion function
Monte Carlo sampling variance: additional variance from evaluating the criterion function by simulation

The first component is irreducible; the second can be reduced by increasing $R$ or by variance reduction techniques.

Antithetic Variates

Definition: Antithetic Variates

To estimate a quantity $ω$ by simulations, construct two negatively correlated estimates $\hat{ω}_{1}$ and $\hat{ω}_{2}$ such that their average $\frac{1}{2} (\hat{ω}_{1} + \hat{ω}_{2})$ has lower variance than either individual estimate.

Implementation: If the reduced form error term $ε_{t}$ has a symmetric distribution around zero:

Compute $\hat{ω}_{1}$ using simulated values $\{ε_{t}^{(r)}\}$

Compute $\hat{ω}_{2}$ using $\{- ε_{t}^{(r)}\}$ (same draws, opposite sign)

The two estimates are negatively correlated, and their average has reduced variance. The additional computing cost is negligible (one extra pass through the model with pre-generated draws).
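
A small Monte Carlo check of this construction is sketched below. The target $ω = E [\exp (ε)]$ with $ε \sim N (0, 1)$, the function `g`, and the replication counts are hypothetical choices for illustration; they are not taken from the source papers.

```python
import math
import random
import statistics

def g(eps):
    """Hypothetical simulated quantity; omega = E[g(eps)] for eps ~ N(0, 1)."""
    return math.exp(eps)

def one_replication(rng, n):
    """Return (omega_1, antithetic average), both built from the same n draws."""
    draws = [rng.gauss(0.0, 1.0) for _ in range(n)]
    omega_1 = sum(g(e) for e in draws) / n    # original draws
    omega_2 = sum(g(-e) for e in draws) / n   # same draws, opposite sign
    return omega_1, 0.5 * (omega_1 + omega_2)

rng = random.Random(1)
results = [one_replication(rng, n=100) for _ in range(1000)]

# Sampling variance across replications of each estimator
var_single = statistics.variance([r[0] for r in results])
var_antithetic = statistics.variance([r[1] for r in results])
print(var_single, var_antithetic)
```

The gain depends on how strongly the simulated quantity responds to the sign of the shock. For a quantity that is an even function of $ε$ (for example $ε^{2}$) the two estimates coincide and averaging them does not reduce the variance.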

Control Variates

Definition: Control Variates

The control variate technique uses two components for the final Monte Carlo estimate of a quantity $ω$ :

The natural Monte Carlo estimate $\hat{ω}^{*}$

An estimate $\bar{ω}$ created from the same set of simulated random numbers as $\hat{ω}^{*}$, with known expectation and positive correlation with $\hat{ω}^{*}$

The control variate estimate is:
$\tilde{ω} = (\hat{ω}^{*} - \bar{ω}) + E (\bar{ω})$
Under suitable conditions, $\operatorname{var} (\tilde{ω}) ≪ \operatorname{var} (\hat{ω}^{*})$.
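
The formula can be illustrated with a toy example. In this sketch the target is $ω = E [\exp (ε)]$ with $ε \sim N (0, 1)$ and the control variate is the sample mean of $1 + ε$, whose expectation is known to be 1. Both choices are assumptions made for illustration and are unrelated to the indirect inference application discussed in this note.

```python
import math
import random
import statistics

def one_replication(rng, n):
    """Return (natural estimate, control variate estimate) from the same n draws."""
    draws = [rng.gauss(0.0, 1.0) for _ in range(n)]
    natural = sum(math.exp(e) for e in draws) / n   # natural Monte Carlo estimate
    control = sum(1.0 + e for e in draws) / n       # same draws, positively correlated
    control_mean = 1.0                              # known expectation of the control
    adjusted = (natural - control) + control_mean   # control variate estimate
    return natural, adjusted

rng = random.Random(3)
results = [one_replication(rng, n=100) for _ in range(1000)]

# Sampling variance across replications, and the average of the adjusted estimate
var_natural = statistics.variance([r[0] for r in results])
var_adjusted = statistics.variance([r[1] for r in results])
mean_adjusted = statistics.fmean([r[1] for r in results])
print(var_natural, var_adjusted, mean_adjusted)
```

The adjustment leaves the expectation unchanged because $E (\bar{ω})$ is added back exactly. The variance falls only when the control is sufficiently correlated with the natural estimate.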

Application to indirect inference (Calzolari, Di Iorio, and Fiorentini, 1998): For the parameter-based indirect inference estimator, the control variate adjusts by the difference $(\hat{λ} - \bar{λ}_{T})$, where $\hat{λ}$ is the auxiliary model estimate from observed data and $\bar{λ}_{T}$ is estimated from simulated data using $\hat{λ}_{T}$ as the parameter vector. Monte Carlo experiments show that combining indirect inference with control variates substantially reduces the Monte Carlo sampling variance, especially for continuous-time models.

Auxiliary Model Selection (Indirect Inference)

For Indirect Inference, the choice of auxiliary model determines efficiency. Two approaches:

Strategy 1: Simple, Close Auxiliary Model

Choose a tractable model that captures the salient features of the structural model:

Structural Model	Natural Auxiliary Model	Rationale
Stochastic volatility	GARCH	Both capture volatility clustering
CIR interest rate	Discrete-time approximation	Both model mean reversion
Jump-diffusion	GARCH with fat tails	Both produce leptokurtic returns

Advantages: Simple to implement, few auxiliary parameters, stable estimation.

Disadvantages: May miss features of the structural model, leading to efficiency loss.

Strategy 2: Data-Dependent SNP Model

Use the semi-nonparametric (SNP) auxiliary model, increasing its dimension with sample size.

Advantages: Asymptotically efficient; captures all features of the data.

Disadvantages:

Over-parameterized SNP models can lead to substantial efficiency loss in small samples
Choosing the SNP dimension ($l_{μ}$, $l_{s}$, $k_{u}$, $k_{z}$) requires model selection (AIC/BIC)
Computational cost is higher

Over-Parameterization

Andersen, Chung, and Sørensen (1998) find evidence that score generators based on an over-parameterized SNP model lead to a substantial loss of efficiency, especially in smaller samples. Substituting an ARCH-type scale function for the polynomial scale in the SNP model improves efficiency because it directly captures the autocorrelation in variance typical of financial data.

Step Size for Numerical Derivatives

When estimating the asymptotic covariance matrix of the SMM estimator, the Jacobian $G_{0}$ must be estimated by numerical differentiation (see Proposition 3).

Step-Size Guidelines (Oh & Patton, 2011)

The step size $ε_{T, S}$ must satisfy:

$ε_{T, S} \to 0$ (consistency)

$ε_{T, S} \times \min (\sqrt{T}, \sqrt{S}) \to \infty$ (convergence rate requirement)

Practical rule: For sample size $T$, the step size should be large relative to $1/ \sqrt{T}$:
$ε_{T, S} ≫ \frac{1}{\sqrt{T}}$

$T$	Lower bound ( $1/ \sqrt{T}$ )	Recommended $ε_{T, S}$
250	0.063	0.1
1,000	0.032	0.01 – 0.1
5,000	0.014	0.01 – 0.05

Warning: Standard numerical differentiation defaults (e.g., MATLAB’s $6 \times 10^{-6}$, or the forward-difference step $\sqrt{ϵ_{machine}} \approx 1.5 \times 10^{-8}$) are far too small for this application. Using these defaults can produce coverage rates as low as 2% for a nominal 95% confidence interval (see step-size sensitivity results).
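
One way to see why very small steps can fail is that simulated moments built from indicator functions are step functions of $θ$ once the draws are fixed. The sketch below uses a hypothetical toy model, $y = θ + ε$ with the moment "share of simulated $y$ at or below zero", and a central difference. Neither the model nor the helper names come from Oh & Patton (2011).

```python
import math
import random

def simulated_moment(theta, shocks):
    """EDF-type moment: share of simulated y = theta + eps at or below zero."""
    return sum(1 for eps in shocks if theta + eps <= 0.0) / len(shocks)

def central_difference(theta, step, shocks):
    """Two-sided numerical derivative of the simulated moment, using fixed shocks."""
    upper = simulated_moment(theta + step, shocks)
    lower = simulated_moment(theta - step, shocks)
    return (upper - lower) / (2.0 * step)

rng = random.Random(2)
shocks = [rng.gauss(0.0, 1.0) for _ in range(5000)]  # common random numbers

# In this toy model the population moment is Phi(-theta),
# so the exact derivative at theta = 0 is -phi(0).
exact = -1.0 / math.sqrt(2.0 * math.pi)
tiny_step = central_difference(0.0, 1e-8, shocks)   # step near a software default
large_step = central_difference(0.0, 0.1, shocks)   # step in the recommended range
print(exact, tiny_step, large_step)
```

With the tiny step, no simulated observation crosses the threshold between the two evaluations, so the estimated derivative carries no information about the true slope. The larger step averages over enough observations to recover it approximately.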

Simulation Size ( $R$ or $S$ ) Trade-offs

The number of simulations affects both efficiency and computation time:

$R$ (or $S / T$ )	Variance Inflation	When to Use
$R = 1$	$2 \times$ GMM variance	Never (too noisy)
$R = 5$	$1.2 \times$	Quick preliminary analysis
$R = 20$	$1.05 \times$	Standard practice
$R = 25$ (Oh & Patton)	$1.04 \times$	Recommended for copula SMM
$R = 100$	$1.01 \times$	Final results if computation permits
$R \to \infty$	$1 \times$ (= GMM)	Infeasible but useful benchmark

For the factor copula model, Oh and Patton use $S = 25 \times T$ . The 4% efficiency loss relative to $S = \infty$ is negligible compared to the ~20-40% loss from using moments rather than the likelihood.
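
The variance inflation column above is consistent with the factor $1 + 1/ R$ (equivalently $1 + T / S$), the usual result for simulation estimators with independent simulation draws. The table does not state this formula, so the helper `variance_inflation` below is an assumption inferred from its entries.

```python
def variance_inflation(R):
    """Variance relative to the R -> infinity benchmark, assuming the factor 1 + 1/R."""
    return 1.0 + 1.0 / R

# Reproduce the finite-R rows of the table
for R in (1, 5, 20, 25, 100):
    print(f"R = {R:>3}: {variance_inflation(R):.2f} x")
```

Under this formula, doubling $R$ from 25 to 50 lowers the inflation only from 4% to 2%, which is why moderate values of $R$ are usually considered sufficient.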

Small Sample Properties of Indirect Inference

Andersen, Chung, and Sørensen (1998) conduct a comprehensive Monte Carlo study of the EMM estimator for the stochastic volatility model:

EMM vs. GMM: EMM is generally more efficient than standard GMM
EMM vs. MLE: Likelihood-based estimators are generally more efficient, but EMM approaches their efficiency as sample size increases
Key practical finding: Substituting an ARCH-type scale function for the polynomial scale in the SNP model improves efficiency — it more directly captures the autocorrelation in variance implied by the SV model
Over-parameterized SNP specifications lose efficiency in small samples

Implementation Checklist

For practitioners implementing simulation-based estimation:

Common random numbers: Draw and fix $\{ε_{t}^{(r)}\}$ before optimization begins
Sufficient simulations: Use $R \geq 20$ (or $S \geq 20 T$) to keep variance inflation at or below 5%
Appropriate step size: Set $ε_{T, S}$ between $0.01$ and $0.1$ for numerical derivatives; never use software defaults
Bootstrap for $Σ_{0}$: Use $B \geq 1{,}000$ iid bootstrap replications
Weight matrix: Identity matrix is simple and stable; efficient weight matrix gives $χ^{2}$ J-test but may be numerically unstable
Convergence: Check that the optimizer converges from multiple starting values
Variance reduction: Consider antithetic variates if the model has symmetric errors

Connections

Implementation details for Method of Simulated Moments, Indirect Inference, and Efficient Method of Moments
Step-size guidance directly affects Proposition 3 variance estimation
Monte Carlo evidence in SMM Copula Simulation and Application validates these practical recommendations

Sources

tdb136.pdf — Liesenfeld & Breitung (1998), Section 6
Oh_Patton_SMM_copulas_nov11.pdf — Oh & Patton (2011), Sections 2.4, 3
Hendry, D.F. (1984), “Monte Carlo Experimentation in Econometrics,” Handbook of Econometrics Vol. 2
Calzolari, G., F. Di Iorio, and G. Fiorentini (1998), “Control Variates for Variance Reduction in Indirect Inference,” The Econometrics Journal, forthcoming
