Estimation and Structure Selection for Vines

Summary

Inference on R-vines splits into three tasks: (i) selecting the tree structure, (ii) choosing a copula family for each of the $d (d - 1) /2$ pair-copulae, and (iii) estimating the parameters. Ideally (i)-(ii) are done jointly, but in practice everything is done stepwise. The dominant approach is Dißmann’s algorithm — a greedy, bottom-up maximum-spanning-tree heuristic that maximizes dependence in the lowest trees — combined with AIC/BIC family selection and sequential (level-by-level) parameter estimation. Truncation and pruning control the parameter explosion in high dimensions.

Overview

The flexibility of vines comes at the cost of a combinatorial structure space and a parameter count that grows quadratically with dimension. The number of distinct R-vines on $d$ variables is

$2^{\binom{d-2}{2} - 1} \, d!,$

so globally optimal structure search is infeasible beyond small $d$ . The practical strategy exploits the fact that lower trees are estimated more precisely, so the structure is built bottom-up to capture the strongest dependencies first.
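To get a feel for how quickly this count grows, the following minimal sketch evaluates it for small $d$, reading the count as $2^{\binom{d-2}{2} - 1} d!$ (equivalently $\tfrac{d!}{2} \cdot 2^{\binom{d-2}{2}}$). The function name `n_rvines` is made up for illustration.

```python
from math import comb, factorial

def n_rvines(d: int) -> int:
    """Number of distinct R-vines on d >= 2 variables: d!/2 * 2^C(d-2, 2)."""
    if d < 2:
        raise ValueError("an R-vine needs at least two variables")
    # Multiply first, then halve, so the arithmetic stays in exact integers.
    return factorial(d) * 2 ** comb(d - 2, 2) // 2

for d in range(3, 9):
    print(d, n_rvines(d))
```

Four variables already admit 24 structures, and eight variables more than $6 \times 10^{8}$, which is why exhaustive search is replaced by a greedy heuristic.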

Main Content

Structure selection — Dißmann's algorithm

The sequential approach was originally proposed for C-/D-vines by Aas et al. (2009) and extended to general R-vines by Dißmann et al. (2013). The procedure is:

Compute a pairwise dependence measure (e.g. absolute Kendall’s $τ$ ) for all variable pairs and use them as edge weights.

Find the maximum spanning tree over the $d$ nodes (Prim’s algorithm) — the tree maximizing the summed edge weights — to form $T_{1}$ .

Estimate each $T_{1}$ pair-copula (family + parameters), then compute the implied conditional (“pseudo-”)observations via the h-functions.

Build $T_{2}$ as a maximum spanning tree whose nodes are the edges of $T_{1}$, with edge weights computed from the pseudo-observations and subject to the proximity condition; repeat for the higher trees.
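The first two steps of the procedure can be sketched as follows. This is a minimal, standard-library-only illustration, not the reference implementation: `kendall_tau` is a plain $O(n^2)$ tau-a that assumes no ties, and the helper names and the toy weight matrix are made up for the example.

```python
from itertools import combinations

def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs; assumes no ties."""
    s = 0
    for i, j in combinations(range(len(x)), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        s += (prod > 0) - (prod < 0)
    n = len(x)
    return s / (n * (n - 1) / 2)

def abs_tau_matrix(columns):
    """Edge weights |tau_ij| for all pairs of variables (one list per variable)."""
    d = len(columns)
    w = [[0.0] * d for _ in range(d)]
    for i, j in combinations(range(d), 2):
        w[i][j] = w[j][i] = abs(kendall_tau(columns[i], columns[j]))
    return w

def max_spanning_tree(w):
    """Prim's algorithm on a dense symmetric weight matrix; returns d-1 edges (i, j)."""
    d = len(w)
    in_tree, edges = {0}, []
    while len(in_tree) < d:
        # Pick the heaviest edge that leaves the current tree.
        i, j = max(
            ((i, j) for i in in_tree for j in range(d) if j not in in_tree),
            key=lambda e: w[e[0]][e[1]],
        )
        edges.append((i, j))
        in_tree.add(j)
    return edges

# Toy weights for four variables (made-up numbers, not estimates from data).
w = [
    [0.0, 0.8, 0.1, 0.2],
    [0.8, 0.0, 0.7, 0.1],
    [0.1, 0.7, 0.0, 0.6],
    [0.2, 0.1, 0.6, 0.0],
]
print(max_spanning_tree(w))
```

With these weights the first tree is the path 0-1-2-3. On real data, `abs_tau_matrix` would be applied to the uniform-scale columns to produce `w`.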

This greedy bottom-up scheme is by far the most used in practice and requires simultaneous selection of pair-copula types and parameter estimation at each level. An alternative (Kurowicka 2011) starts by assigning the weakest conditional dependencies to the highest trees. Bayesian posterior-over-structure methods (Gruber & Czado 2015) exist but are little used in finance.
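The proximity condition restricts which edges are even eligible at the next level: two edges of $T_{1}$ may be joined in $T_{2}$ only if they share a node. A small sketch of that filter, for the step from $T_{1}$ to $T_{2}$ only (the function name `t2_candidates` and the two toy trees are illustrative assumptions):

```python
from itertools import combinations

def t2_candidates(t1_edges):
    """Candidate T2 edges: pairs of T1 edges that share exactly one variable.

    Returns tuples ((a, b), c), read as the pair copula of a and b given c.
    """
    out = []
    for e, f in combinations(t1_edges, 2):
        common = set(e) & set(f)
        if len(common) == 1:
            a, b = sorted(set(e) ^ set(f))  # the two non-shared variables
            out.append(((a, b), common.pop()))
    return out

path = [(0, 1), (1, 2), (2, 3)]  # D-vine-like first tree
star = [(0, 1), (0, 2), (0, 3)]  # C-vine-like first tree
print(t2_candidates(path))
print(t2_candidates(star))
```

For the path there are exactly two admissible edges, so $T_{2}$ is fully determined by $T_{1}$. For the star all three pairs are admissible, and the spanning-tree step chooses two of them according to the dependence weights of the pseudo-observations.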

Choosing copula families

Families (Gaussian, $t$ , Gumbel, Clayton, …) are typically selected one pair at a time using a model-selection criterion — AIC, BIC, the copula information criterion (CIC), or a copula goodness-of-fit test. In a comparison of four strategies (Manner), AIC was the most reliable selection criterion. Because of sequential estimation, the family chosen at a given level depends on choices at preceding levels (observations at one level are partial derivatives of preceding-level copulae), so selection uncertainty accumulates up the trees and the final model must be carefully validated.
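As an illustration of per-pair selection by AIC, the sketch below compares only two candidate families, Gaussian and Clayton, on simulated Clayton data. It is a simplification under stated assumptions: parameters are obtained by inverting Kendall's $τ$ rather than by maximum likelihood, each family has one parameter so $\mathrm{AIC} = -2\ell + 2$, and all function names are hypothetical.

```python
import math
import random
from itertools import combinations
from statistics import NormalDist

_N = NormalDist()

def _clip(p, eps=1e-12):
    """Keep probabilities strictly inside (0, 1) for logs and quantiles."""
    return min(max(p, eps), 1 - eps)

def kendall_tau(x, y):
    """Kendall's tau-a; assumes no ties."""
    s = 0
    for i, j in combinations(range(len(x)), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        s += (prod > 0) - (prod < 0)
    return s / (len(x) * (len(x) - 1) / 2)

def loglik_gauss(u, v, rho):
    """Log-likelihood of the Gaussian copula with correlation rho."""
    ll = 0.0
    for a, b in zip(u, v):
        x, y = _N.inv_cdf(_clip(a)), _N.inv_cdf(_clip(b))
        ll += -0.5 * math.log(1 - rho**2)
        ll -= (rho**2 * (x * x + y * y) - 2 * rho * x * y) / (2 * (1 - rho**2))
    return ll

def loglik_clayton(u, v, theta):
    """Log-likelihood of the Clayton copula with parameter theta > 0."""
    ll = 0.0
    for a, b in zip(u, v):
        a, b = _clip(a), _clip(b)
        s = a**-theta + b**-theta - 1
        ll += math.log(1 + theta) - (1 + theta) * (math.log(a) + math.log(b))
        ll -= (2 + 1 / theta) * math.log(s)
    return ll

def select_family(u, v):
    """Return the family with the smallest AIC and the AIC of every candidate."""
    tau = kendall_tau(u, v)
    aic = {"gaussian": -2 * loglik_gauss(u, v, math.sin(math.pi * tau / 2)) + 2}
    if tau > 0:  # Clayton only covers positive dependence
        aic["clayton"] = -2 * loglik_clayton(u, v, 2 * tau / (1 - tau)) + 2
    return min(aic, key=aic.get), aic

def sample_clayton(n, theta, rng):
    """Draw n pairs from a Clayton copula by conditional inversion."""
    u, v = [], []
    for _ in range(n):
        a, w = 1.0 - rng.random(), 1.0 - rng.random()  # both in (0, 1]
        b = (a**-theta * (w ** (-theta / (1 + theta)) - 1) + 1) ** (-1 / theta)
        u.append(a)
        v.append(b)
    return u, v

u, v = sample_clayton(1000, 3.0, random.Random(0))
best, aic = select_family(u, v)
print(best, aic)
```

In a vine, the same comparison would be repeated for every edge, with the inputs at higher levels being pseudo-observations that already depend on the families chosen below.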

Parameter estimation — sequential vs joint

A PCC is a multivariate copula, so in principle parameters can be estimated by any multivariate-copula estimator: the inference-functions-for-margins (IFM) method or the maximum pseudo-likelihood (MPL) estimator. But the parameter count grows fast, making full joint MLE computationally demanding in medium/high dimensions. Aas et al. (2009) therefore proposed a sequential method: estimate parameters level by level, conditioning on the parameters from preceding levels; the sequential estimates can serve as starting values for a final joint MLE. Asymptotic properties are studied by Hobæk Haff (2013). With temporal dependence, vines are fitted on standardized residuals from ARIMA-GARCH filtering of the original series.
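The level-by-level idea can be sketched for the smallest non-trivial case, a three-variable D-vine 1-2-3 with Gaussian pair-copulae. This is a hedged illustration rather than the estimator of Aas et al.: each parameter is obtained by inverting Kendall's $τ$ instead of maximizing a likelihood, and the function names and simulated data are assumptions made for the example.

```python
import math
import random
from itertools import combinations
from statistics import NormalDist

_N = NormalDist()

def kendall_tau(x, y):
    """Kendall's tau-a; assumes no ties."""
    s = 0
    for i, j in combinations(range(len(x)), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        s += (prod > 0) - (prod < 0)
    return s / (len(x) * (len(x) - 1) / 2)

def rho_from_tau(u, v):
    """Gaussian-copula correlation obtained by inverting Kendall's tau."""
    return math.sin(math.pi * kendall_tau(u, v) / 2)

def h_gauss(u, v, rho):
    """h-function of the Gaussian copula: conditional cdf of U given V = v."""
    x, y = _N.inv_cdf(u), _N.inv_cdf(v)
    return _N.cdf((x - rho * y) / math.sqrt(1 - rho**2))

def fit_dvine3_sequential(u1, u2, u3):
    """Level-by-level fit of the D-vine 1-2-3 with Gaussian pair-copulae."""
    # Tree 1: unconditional pairs (1,2) and (2,3).
    rho12, rho23 = rho_from_tau(u1, u2), rho_from_tau(u2, u3)
    # Pseudo-observations for tree 2, conditioning on the tree-1 estimates.
    u1_2 = [h_gauss(a, b, rho12) for a, b in zip(u1, u2)]
    u3_2 = [h_gauss(c, b, rho23) for c, b in zip(u3, u2)]
    # Tree 2: pair (1,3) given 2.
    return rho12, rho23, rho_from_tau(u1_2, u3_2)

# Simulated Gaussian chain: variables 1 and 3 are independent given 2.
rng = random.Random(1)
u1, u2, u3 = [], [], []
for _ in range(500):
    z1 = rng.gauss(0, 1)
    z2 = 0.7 * z1 + math.sqrt(0.51) * rng.gauss(0, 1)
    z3 = 0.7 * z2 + math.sqrt(0.51) * rng.gauss(0, 1)
    u1.append(_N.cdf(z1))
    u2.append(_N.cdf(z2))
    u3.append(_N.cdf(z3))

rho12, rho23, rho13_2 = fit_dvine3_sequential(u1, u2, u3)
print(rho12, rho23, rho13_2)
```

The three sequential estimates could then be passed as starting values to a joint likelihood optimizer, which is the role the text assigns to them.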

Truncation and pruning (Eq. 7)

To curb the parameter explosion, replace as many pair-copulae as possible by the independence copula (a set of conditional independencies):

Pruning — test individual copulae for independence and set them to independence (e.g. a Kendall’s- $τ$ test, valid as an independence test only for Gaussian copulae, or the Cramér-von Mises test of Hobæk Haff & Segers).

Truncation at level $K$ — replace all pair-copulae in trees above level $K$ by independence copulae. The density of an R-vine truncated at level $K$ is

$c_{tRV(K)}(u) = \prod_{j = d - 1}^{1} \prod_{i = d}^{\max\{j + 1, \, d - K + 1\}} c_{m_{j, j}, m_{i, j} \mid m_{i + 1, j}, \dots, m_{d, j}}, \quad u = (u_{1}, \dots, u_{d}) \in [0, 1]^{d} .$
At $K = 1$ the truncated R-vine is a Markov tree that models only unconditional pairwise dependence. Truncation is justified because Dißmann’s bottom-up construction places the strongest dependence in the first trees, while upper-level estimates are more uncertain (they rest on repeated transformations of the data) and barely affect lower-order dependencies. One way to choose the level is the procedure of Brechmann et al. (2012): start at $K = 1$, increase $K$ by one tree at a time, and stop at level $K_{0}$ once Vuong’s likelihood-ratio test shows that the gain from an additional tree is negligible.
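Both devices in this subsection reduce to short computations once the ingredients are available. The sketch below shows (a) the asymptotic Kendall's-$τ$ independence test that could drive pruning, (b) the number of pair-copulae that survive truncation at level $K$, namely $\sum_{k=1}^{K} (d - k)$, and (c) a simplified stopping rule in the spirit of the Vuong-test procedure. It assumes the per-observation log-density contributions of each tree are already computed (the `tree_loglik` input is hypothetical), uses the Vuong statistic without an AIC/BIC correction, and runs on made-up toy numbers.

```python
import math
from statistics import NormalDist, mean, stdev

def tau_independence_pvalue(tau, n):
    """Two-sided p-value of the asymptotic Kendall's tau independence test."""
    z = abs(tau) * math.sqrt(9 * n * (n - 1) / (2 * (2 * n + 5)))
    return 2 * (1 - NormalDist().cdf(z))

def n_pair_copulas(d, K):
    """Pair-copulae that are not set to independence when truncating at level K."""
    return sum(d - k for k in range(1, K + 1))

def vuong_statistic(m):
    """Vuong statistic from pointwise log-likelihood differences m_i.

    Needs at least two observations and a non-zero sample standard deviation.
    """
    return math.sqrt(len(m)) * mean(m) / stdev(m)

def select_truncation_level(tree_loglik, alpha=0.05):
    """tree_loglik[k][i]: log-density contribution of tree k+1 for observation i.

    The (K+1)-truncated and K-truncated vines differ only by tree K+1, so that
    tree's pointwise contributions are the differences passed to the test.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    K = 1
    while K < len(tree_loglik) and vuong_statistic(tree_loglik[K]) > z_crit:
        K += 1
    return K

toy = [
    [0.50, 0.60, 0.40, 0.70],   # tree 1
    [0.20, 0.30, 0.25, 0.25],   # tree 2: clear gain
    [0.01, -0.02, 0.00, 0.01],  # tree 3: negligible gain
]
print(select_truncation_level(toy))
print(n_pair_copulas(10, 2), "of", n_pair_copulas(10, 9), "pair-copulae kept")
```

For $d = 10$, truncating after the second tree keeps 17 of the 45 pair-copulae, which shows how strongly truncation cuts the parameter count.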

Model validation (goodness-of-fit)

GOF tests assess whether the fitted vine fits the data. Early proposals use the probability integral transform (PIT) (Rosenblatt 1952; Aas et al. 2009) and the Breymann et al. transformation, plus tests based on the empirical copula and Kendall’s process. Newer high-power tests come from the information-matrix equality and a specification test (Schepsmeier 2015, 2016), shown to have excellent size and power in high dimensions.
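A PIT-based check can be illustrated in two dimensions. Under a correctly specified model the Rosenblatt transform yields independent uniforms; squaring their normal scores and summing gives a $χ^{2}_{2}$ variable, whose cdf is $1 - e^{-s/2}$. The sketch below assumes a bivariate Gaussian copula with known `rho`, and it measures departure from uniformity with a Kolmogorov-Smirnov distance purely for brevity; that distance is a swapped-in choice for this example, not necessarily the statistic used in the cited papers. All function names are hypothetical.

```python
import math
from statistics import NormalDist

_N = NormalDist()

def h_gauss(u, v, rho):
    """h-function of the Gaussian copula: conditional cdf of U given V = v."""
    x, y = _N.inv_cdf(u), _N.inv_cdf(v)
    return _N.cdf((x - rho * y) / math.sqrt(1 - rho**2))

def rosenblatt_gauss(u1, u2, rho):
    """Rosenblatt PIT under a bivariate Gaussian copula: (u1, C(u2 | u1))."""
    return u1, h_gauss(u2, u1, rho)

def chi2_pit(e1, e2):
    """Sum of squared normal scores mapped through the chi^2_2 cdf.

    Inputs must lie strictly inside (0, 1).
    """
    s = _N.inv_cdf(e1) ** 2 + _N.inv_cdf(e2) ** 2
    return 1 - math.exp(-s / 2)

def ks_uniform(w):
    """Kolmogorov-Smirnov distance between a sample and U(0, 1)."""
    w = sorted(w)
    n = len(w)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(w))

def gof_distance(u1, u2, rho):
    """Distance from uniformity of the transformed sample; small values support the fit."""
    w = [chi2_pit(*rosenblatt_gauss(a, b, rho)) for a, b in zip(u1, u2)]
    return ks_uniform(w)

print(gof_distance([0.2, 0.5, 0.8, 0.35], [0.3, 0.4, 0.9, 0.25], 0.6))
```

For a full vine, the Rosenblatt transform chains the h-functions of all trees in the same way; turning the distance into a p-value requires the null distribution, which is typically obtained by bootstrap when parameters are estimated.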

Examples

Fitting a 4-stock portfolio vine (workflow)

Filter each return series with AR-GARCH; take standardized residuals; probability-integral-transform to $[0, 1]$ uniforms.

Build $T_{1}$ by maximum spanning tree on $|\hat{τ}_{ij}|$ (Dißmann); fit each edge’s family by AIC.

Propagate pseudo-observations via h-functions; build $T_{2}, T_{3}$ under the proximity condition.

Optionally truncate after the level where Vuong’s test shows no significant gain; refit jointly by MLE using sequential estimates as starting values.

Validate with a PIT/information-matrix GOF test before using the model for VaR/cVaR.
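The first step of the workflow ends with a transform to the unit interval. One common variant, assumed here, is the rank-based (empirical) transform $u_{i} = \mathrm{rank}(x_{i}) / (n + 1)$ applied to each series of standardized residuals; a parametric transform through the fitted innovation distribution would be the alternative. The function name and the residual values below are made up for illustration.

```python
def pseudo_observations(x):
    """Rank-transform a sample to (0, 1): u_i = rank(x_i) / (n + 1); assumes no ties."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u

residuals = [0.3, -1.2, 0.8, -0.1]  # made-up standardized residuals
print(pseudo_observations(residuals))
```

Dividing by $n + 1$ rather than $n$ keeps every value strictly below 1, which avoids infinite copula densities at the boundary.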

Connections

Pair-Copula Constructions — the density whose families/parameters are being selected and estimated.
C-vines, D-vines, and Regular Vines — the structure space ( $2^{\binom{d-2}{2} - 1} d!$ R-vines) Dißmann’s algorithm searches.
The Simplifying Assumption — what makes per-edge family selection and sequential estimation coherent.
Copula Estimation — estimation of the bivariate building blocks (IFM/MPL at the pair level).
Dependence Measures for Copulas — Kendall’s $τ$ / rank measures used as spanning-tree edge weights.
