The RTS Smoother

Summary

The Rauch–Tung–Striebel (RTS) smoother (a.k.a. the Kalman smoother) is the closed-form solution to the Bayesian smoothing problem for a linear-Gaussian model: it computes $p (x_{k} ∣ y_{1 : T})$, the posterior of each state given the entire data record, not just the past. It is a backward recursion run after the forward Kalman filter: starting from the last filtered estimate, it sweeps $k = T - 1, \dots, 0$, correcting each filtered $(m_{k}, P_{k})$ using future information through a smoother gain $G_{k}$. Smoothed estimates are always at least as precise as filtered ones.

Overview

The Kalman filter gives the filtering distribution $p (x_{k} ∣ y_{1 : k})$ — conditioned only on data up to $k$ . Bayesian smoothing instead conditions on all $T$ measurements, $p (x_{k} ∣ y_{1 : T})$ with $T > k$ , so each state estimate benefits from future observations. This is the natural object for retrospective / counterfactual analysis (e.g. inferring a latent trend across a whole sample). Särkkä derives the general backward recursion (Thm. 8.1) then its linear-Gaussian special case, the RTS smoother (Thm. 8.2).

Main Content

Theorem: Bayesian (fixed-interval) smoothing equations (Särkkä Thm. 8.1)

The smoothed distributions $p (x_{k} ∣ y_{1 : T})$ for $k < T$ satisfy the backward recursion
$p (x_{k + 1} ∣ y_{1 : k}) = \int p (x_{k + 1} ∣ x_{k}) \, p (x_{k} ∣ y_{1 : k}) \, d x_{k},$
$p (x_{k} ∣ y_{1 : T}) = p (x_{k} ∣ y_{1 : k}) \int \left[ \frac{p (x_{k + 1} ∣ x_{k}) \, p (x_{k + 1} ∣ y_{1 : T})}{p (x_{k + 1} ∣ y_{1 : k})} \right] d x_{k + 1},$ (8.2)
where $p (x_{k} ∣ y_{1 : k})$ is the filtering distribution and $p (x_{k + 1} ∣ y_{1 : k})$ is the one-step prediction. The recursion is run backwards, initialized at the filtering distribution of the last step, $p (x_{T} ∣ y_{1 : T})$.

Theorem: RTS smoother (Särkkä Thm. 8.2)

For the linear-Gaussian model, the smoothed distribution is Gaussian, $p (x_{k} ∣ y_{1 : T}) = N (x_{k} ∣ m_{k}^{s}, P_{k}^{s})$, computed by the backward recursion for $k = T - 1, \dots, 0$:
$m_{k + 1}^{-} = A_{k} m_{k}$
$P_{k + 1}^{-} = A_{k} P_{k} A_{k}^{T} + Q_{k}$
$G_{k} = P_{k} A_{k}^{T} [P_{k + 1}^{-}]^{- 1}$ (smoother gain)
$m_{k}^{s} = m_{k} + G_{k} [m_{k + 1}^{s} - m_{k + 1}^{-}]$
$P_{k}^{s} = P_{k} + G_{k} [P_{k + 1}^{s} - P_{k + 1}^{-}] G_{k}^{T}$ (8.6)
where $m_{k}, P_{k}$ are the filtered mean/covariance from the Kalman filter. The recursion is initialized at the last time step with $m_{T}^{s} = m_{T}$, $P_{T}^{s} = P_{T}$.

Reading the equations

The first two lines are exactly the Kalman prediction step (Eq. 4.20); since the filter already computes $m_{k + 1}^{-}, P_{k + 1}^{-}$ , they can be stored during the forward pass to avoid recomputation. The gains $G_{k}$ can likewise be precomputed.

$G_{k} = P_{k} A_{k}^{T} [P_{k + 1}^{-}]^{- 1}$ is the smoother gain; the bracket $m_{k + 1}^{s} - m_{k + 1}^{-}$ is the discrepancy between the smoothed future and what the filter predicted for it — the correction that future data injects into the present.
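
As a worked instance of a single backward step, consider a scalar model with illustrative values chosen here for the arithmetic (they are assumptions, not taken from Särkkä): $A_{k} = 1$, $Q_{k} = 1$, filtered $m_{k} = 0$, $P_{k} = 1$, and next-step smoothed values $m_{k + 1}^{s} = 1$, $P_{k + 1}^{s} = 0.8$.

```latex
\begin{aligned}
m_{k+1}^{-} &= 1 \cdot 0 = 0, \qquad P_{k+1}^{-} = 1 \cdot 1 \cdot 1 + 1 = 2, \\
G_{k} &= \frac{P_{k} A_{k}}{P_{k+1}^{-}} = \frac{1}{2}, \\
m_{k}^{s} &= 0 + \tfrac{1}{2}\,(1 - 0) = 0.5, \\
P_{k}^{s} &= 1 + \left(\tfrac{1}{2}\right)^{2} (0.8 - 2) = 1 - 0.3 = 0.7 .
\end{aligned}
```

With these numbers the mean moves halfway toward the smoothed successor and the variance shrinks from 1 to 0.7.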

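Smoothing never increases uncertainty: $P_{k}^{s} ⪯ P_{k}$ for all $k$, with equality at $k = T$ by initialization (the endpoint has no future to borrow from). The reason is that $P_{k + 1}^{s} ⪯ P_{k + 1} ⪯ P_{k + 1}^{-}$ (the first by induction, the second because a Kalman update never increases covariance), so the correction $G_{k} [P_{k + 1}^{s} - P_{k + 1}^{-}] G_{k}^{T}$ is negative semidefinite. For $k < T$ the inequality is typically strict, although degenerate cases such as $G_{k} = 0$ give equality. The overall procedure has a forward–backward structure: one Kalman pass forward, one RTS pass backward.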

Derivation (Särkkä §8.2): via the Gaussian conditioning lemmas applied to $p (x_{k}, x_{k + 1} ∣ y_{1 : k})$ and the Markov property $p (x_{k} ∣ x_{k + 1}, y_{1 : T}) = p (x_{k} ∣ x_{k + 1}, y_{1 : k})$ .
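
The two ingredients named above can be spelled out as the following sketch. It is a reconstruction from the standard Gaussian conditioning formulas, using the same symbols as Eq. (8.6), and is not a verbatim copy of Särkkä's proof.

```latex
% Step 1: joint of (x_k, x_{k+1}) given y_{1:k}
p(x_k, x_{k+1} \mid y_{1:k})
  = N\!\left(
      \begin{bmatrix} x_k \\ x_{k+1} \end{bmatrix}
      \,\middle|\,
      \begin{bmatrix} m_k \\ A_k m_k \end{bmatrix},
      \begin{bmatrix} P_k & P_k A_k^{T} \\ A_k P_k & A_k P_k A_k^{T} + Q_k \end{bmatrix}
    \right)

% Step 2: condition on x_{k+1}; by the Markov property this also equals
% p(x_k | x_{k+1}, y_{1:T})
p(x_k \mid x_{k+1}, y_{1:T})
  = N\!\left(x_k \,\middle|\, m_k + G_k\,(x_{k+1} - m_{k+1}^{-}),\;
      P_k - G_k P_{k+1}^{-} G_k^{T}\right),
\qquad G_k = P_k A_k^{T} [P_{k+1}^{-}]^{-1}

% Step 3: average over x_{k+1} \sim N(m_{k+1}^{s}, P_{k+1}^{s})
m_k^{s} = m_k + G_k\,[m_{k+1}^{s} - m_{k+1}^{-}],
\qquad
P_k^{s} = P_k - G_k P_{k+1}^{-} G_k^{T} + G_k P_{k+1}^{s} G_k^{T}
        = P_k + G_k\,[P_{k+1}^{s} - P_{k+1}^{-}]\,G_k^{T}
```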

An alternative two-filter smoother (Fraser–Potter / Kitagawa) factors $p (x_{k} ∣ y_{1 : T}) \propto p (x_{k} ∣ y_{1 : k - 1}) p (y_{k : T} ∣ x_{k})$ combining a forward and a backward filter; Särkkä prefers the RTS forward–backward form (§8.3).

Algorithm

run Kalman filter forward, store m_k, P_k, m⁻_{k+1}, P⁻_{k+1}  for all k
initialize  m_T^s = m_T ,  P_T^s = P_T
for k = T-1 down to 0:
  G_k   = P_k A_kᵀ (P⁻_{k+1})⁻¹
  m_k^s = m_k + G_k (m_{k+1}^s − m⁻_{k+1})
  P_k^s = P_k + G_k (P_{k+1}^s − P⁻_{k+1}) G_kᵀ
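
The pseudocode above can be sketched in NumPy as follows. This is a minimal illustration, assuming time-invariant $A$, $Q$, $H$, $R$; the function names `kalman_filter` and `rts_smoother`, the measurement model $y_{k} = H x_{k} + r_{k}$, and the demo values are hypothetical choices made here, not part of the source.

```python
import numpy as np

def kalman_filter(ys, A, Q, H, R, m0, P0):
    """Forward pass for x_k = A x_{k-1} + q, y_k = H x_k + r (assumed time-invariant).

    Returns filtered (ms, Ps) for k = 0..T and the predictions
    m^-_{k+1}, P^-_{k+1} for k = 0..T-1, stored at list index k.
    """
    ms, Ps, m_preds, P_preds = [m0], [P0], [], []
    for y in ys:
        m_pred = A @ ms[-1]
        P_pred = A @ Ps[-1] @ A.T + Q
        S = H @ P_pred @ H.T + R
        # K = P_pred H^T S^{-1}, computed with a solve instead of an explicit inverse
        K = np.linalg.solve(S, H @ P_pred).T
        ms.append(m_pred + K @ (y - H @ m_pred))
        Ps.append(P_pred - K @ S @ K.T)
        m_preds.append(m_pred)
        P_preds.append(P_pred)
    return ms, Ps, m_preds, P_preds

def rts_smoother(ms, Ps, m_preds, P_preds, A):
    """Backward pass of Eq. (8.6), reusing predictions stored by the filter."""
    T = len(ms) - 1
    ms_s, Ps_s = list(ms), list(Ps)  # index T already holds m_T, P_T
    for k in range(T - 1, -1, -1):
        # G_k = P_k A^T (P^-_{k+1})^{-1}; P^-_{k+1} is symmetric
        G = np.linalg.solve(P_preds[k], A @ Ps[k]).T
        ms_s[k] = ms[k] + G @ (ms_s[k + 1] - m_preds[k])
        Ps_s[k] = Ps[k] + G @ (Ps_s[k + 1] - P_preds[k]) @ G.T
    return ms_s, Ps_s

# Demo on a hypothetical 2-D position/velocity model with synthetic data.
rng = np.random.default_rng(0)
A = np.array([[1.0, 1.0], [0.0, 1.0]])
Q = 0.01 * np.eye(2)
H = np.array([[1.0, 0.0]])
R = np.array([[0.25]])
m0, P0 = np.zeros(2), np.eye(2)
ys = [np.array([v]) for v in rng.normal(size=20).cumsum()]

ms, Ps, m_preds, P_preds = kalman_filter(ys, A, Q, H, R, m0, P0)
ms_s, Ps_s = rts_smoother(ms, Ps, m_preds, P_preds, A)
```

Using `np.linalg.solve` rather than forming $[P_{k + 1}^{-}]^{- 1}$ explicitly is a numerical-stability choice; the result is the same gain $G_{k}$.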

Examples

RTS smoother for the Gaussian random walk (Särkkä Ex. 8.1)

For the scalar local-level model the backward recursion is
$m_{k + 1}^{-} = m_{k}, \qquad P_{k + 1}^{-} = P_{k} + Q,$
$m_{k}^{s} = m_{k} + \frac{P_{k}}{P_{k + 1}^{-}} (m_{k + 1}^{s} - m_{k + 1}^{-}),$
$P_{k}^{s} = P_{k} + \left( \frac{P_{k}}{P_{k + 1}^{-}} \right)^{2} [P_{k + 1}^{s} - P_{k + 1}^{-}],$ (8.16)
where $m_{k}, P_{k}$ are the filtered values from Kalman Ex. 4.2. Särkkä’s Fig. 8.2 shows the smoother variance is uniformly below the filter variance, except at the final step where they coincide.
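
The scalar recursion can be checked numerically with a short sketch. It assumes the random-walk model $x_{k} = x_{k - 1} + q_{k}$, $y_{k} = x_{k} + r_{k}$ with process variance `Q` and measurement variance `R`; the function name `rw_filter_smoother`, the data, and the parameter values are hypothetical and do not reproduce Särkkä's figure.

```python
def rw_filter_smoother(ys, Q, R, m0, P0):
    """Scalar Kalman filter followed by the RTS recursion (8.16)."""
    m, P = [m0], [P0]
    for y in ys:
        P_pred = P[-1] + Q                 # m_pred equals m[-1] for a random walk
        K = P_pred / (P_pred + R)          # scalar Kalman gain
        m.append(m[-1] + K * (y - m[-1]))
        P.append(P_pred - K * P_pred)
    ms, Ps = m[:], P[:]                    # last entries are m_T, P_T
    for k in range(len(ys) - 1, -1, -1):
        P_pred = P[k] + Q
        G = P[k] / P_pred                  # scalar smoother gain
        ms[k] = m[k] + G * (ms[k + 1] - m[k])
        Ps[k] = P[k] + G**2 * (Ps[k + 1] - P_pred)
    return m, P, ms, Ps

# Illustrative data and parameters (assumptions for this sketch).
ys = [0.3, 1.1, 0.8, 1.9, 2.4, 2.0]
m, P, ms, Ps = rw_filter_smoother(ys, Q=1.0, R=1.0, m0=0.0, P0=1.0)

for k in range(len(P)):
    print(k, round(P[k], 4), round(Ps[k], 4))  # filter vs smoother variance
```

In this run the smoother variance sits below the filter variance at every step except the last, where the two coincide by initialization.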

RTS smoother for car tracking (Särkkä Ex. 8.2)

Applying the backward recursion to the 4-D car-tracking filter lowers the position RMSE from $0.43$ (Kalman filter) to $0.27$ (RTS smoother): conditioning each position on the whole trajectory produces a visibly smoother, more accurate estimate.

Connections

The Kalman Filter — supplies the filtered $(m_{k}, P_{k})$ and predicted $(m_{k + 1}^{-}, P_{k + 1}^{-})$ this recursion consumes
Linear-Gaussian State-Space Models — the model being smoothed
Marginal Likelihood via the Kalman Filter — smoothing supplies the sufficient statistics for EM-based parameter estimation (Fisher’s identity)
State-Space Models and the Kalman Filter - Overview — pipeline context
