Survival Analysis

Summary

Survival analysis studies the time until an event of interest occurs. Its defining feature is censoring — some subjects don’t experience the event during the study, so their exact event time is unknown. The three core methods are Kaplan-Meier estimation, the log-rank test, and Cox proportional hazards regression.

Core Concepts

Survival Function $S (t)$

The probability of surviving (no event) past time $t$ :

$$S(t) = P(T > t)$$

Its empirical estimate is displayed as a step function that drops at each observed event time.

Hazard Function $h (t)$

The instantaneous rate of event occurrence at time $t$ , given survival to $t$ :

$$h(t) = \lim_{\Delta t \to 0} \frac{P(t \leq T < t + \Delta t \mid T \geq t)}{\Delta t}$$

The hazard ratio (HR) compares hazard rates between two groups: HR > 1 means the comparison group has a higher instantaneous event rate than the reference group, and HR < 1 a lower one.
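
The hazard and survival functions carry the same information. As a brief worked relation, assuming $T$ is continuous with density $f (t)$ (a symbol introduced here only for illustration), the two are linked by:

$$h(t) = \frac{f(t)}{S(t)} = -\frac{d}{dt} \log S(t), \qquad S(t) = \exp\left(-\int_{0}^{t} h(u)\, du\right)$$

For example, a constant hazard $h (t) = λ$ gives $S (t) = e^{-λ t}$ , which is the exponential model.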

Censoring

Type	Description
Right censoring	Event hasn’t occurred by study end or subject lost to follow-up (most common)
Left censoring	Event occurred before observation began
Interval censoring	Event occurred between two known timepoints

Warning

Censoring must be noninformative — the reason for censoring should be unrelated to the event risk. If sicker patients drop out more (informative censoring), results are biased.

The Three Core Methods

1. Kaplan-Meier Estimator

Nonparametric estimate of $S (t)$ :

$$\hat{S}(t) = \prod_{t_{i} \leq t} \left(1 - \frac{d_{i}}{n_{i}}\right)$$

where $d_{i}$ = events at time $t_{i}$ and $n_{i}$ = subjects at risk just before $t_{i}$ .

Produces the familiar step-function survival curve
Reports median survival (the earliest time at which $\hat{S} (t) \leq 0.5$ ) and survival at fixed timepoints (e.g., 5-year survival)
Cannot adjust for covariates
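
The product formula above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names `kaplan_meier` and `median_survival` and the toy data are made up for this note, and subjects censored at an event time are assumed to still be at risk at that time (the usual convention).

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t_i, S_hat(t_i))] at each distinct event time.

    times  : follow-up time for each subject
    events : 1 if the event was observed at that time, 0 if right-censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)  # every subject leaves the risk set at their own time
    at_risk = len(times)    # n_i: subjects at risk just before the current time
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)  # d_i: events at this time
        if d > 0:
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= exits[t]   # remove events and censored subjects afterwards
    return curve


def median_survival(curve):
    """Earliest event time at which the estimate is <= 0.5 (None if never reached)."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical toy data: 8 subjects, 3 of them right-censored.
times = [2, 3, 3, 5, 6, 8, 9, 12]
events = [1, 1, 0, 1, 0, 1, 1, 0]
curve = kaplan_meier(times, events)
median = median_survival(curve)
```

On this toy data the estimate steps down at times 2, 3, 5, 8, and 9, and the censored subjects only shrink the risk set without producing a step.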

2. Log-Rank Test

Tests $H_{0}$ : no difference in survival between groups.

Compares the entire survival distribution, not just specific timepoints
Distribution-free (no parametric assumptions)
Limitation: poor power when survival curves cross (one group favored early, another late)
Cannot estimate effect size or adjust for confounders
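
As a rough sketch of how the two-group test works: at each event time, compare the observed events in one group with the number expected if both groups shared the same hazard, then sum over event times. The function name `logrank_test` and the data are hypothetical; the variance uses the standard hypergeometric form, and the p-value comes from a chi-square distribution with 1 degree of freedom.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi-square statistic, p-value)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e == 1})

    obs_minus_exp = 0.0  # sum of (observed - expected) events in group A
    variance = 0.0
    for t in event_times:
        n = sum(1 for u, _, _ in data if u >= t)                # at risk, both groups
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)   # at risk, group A
        d = sum(1 for u, e, _ in data if u == t and e == 1)     # events, both groups
        d_a = sum(1 for u, e, g in data if u == t and e == 1 and g == 0)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("variance is zero; the groups never share a risk set")
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical data: group A survives clearly longer than group B.
chi2, p_value = logrank_test(
    [10, 12, 14, 16, 18, 20], [1, 1, 1, 1, 1, 1],
    [1, 2, 3, 4, 5, 6], [1, 1, 1, 1, 1, 1],
)
```

Because the statistic accumulates signed differences over time, early and late differences of opposite sign cancel, which is the reason for the low power with crossing curves noted above.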

3. Cox Proportional Hazards Model

The workhorse for multivariable survival analysis:

$$h(t \mid X) = h_{0}(t) \exp(β_{1} X_{1} + β_{2} X_{2} + \dots + β_{p} X_{p})$$

$h_{0} (t)$ : baseline hazard (left unspecified — semiparametric)
$\exp (β_{j})$ : hazard ratio for a one-unit increase in covariate $X_{j}$ , holding the other covariates fixed
Adjusts for confounders while estimating treatment effects
No distributional assumptions on survival times beyond proportional hazards
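
Because $h_{0} (t)$ is left unspecified, the coefficients are estimated from the partial likelihood, which only compares the subject who fails with everyone still at risk at that time. The sketch below fits a single-covariate model by Newton-Raphson. It is an illustration under stated assumptions, not production code: the function names and toy data are invented for this note, ties are handled in the simple Breslow style, and real analyses should use an established survival library.

```python
import math


def cox_score_info(beta, times, events, x):
    """Score and information of the partial log-likelihood (one covariate)."""
    score, info = 0.0, 0.0
    for t_i, e_i, x_i in zip(times, events, x):
        if e_i != 1:
            continue  # censored subjects contribute only through risk sets
        s0 = s1 = s2 = 0.0
        for t_j, x_j in zip(times, x):
            if t_j >= t_i:  # subject j is still at risk at time t_i
                w = math.exp(beta * x_j)
                s0 += w
                s1 += w * x_j
                s2 += w * x_j * x_j
        mean = s1 / s0  # risk-weighted mean of x in the risk set
        score += x_i - mean
        info += s2 / s0 - mean ** 2
    return score, info


def cox_fit_single(times, events, x, n_iter=25, tol=1e-9):
    """Newton-Raphson estimate of beta and its standard error."""
    beta = 0.0
    for _ in range(n_iter):
        score, info = cox_score_info(beta, times, events, x)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    _, info = cox_score_info(beta, times, events, x)
    return beta, 1.0 / math.sqrt(info)


# Hypothetical toy data: x = 1 marks a group that tends to fail earlier.
times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
events = [1, 1, 1, 1, 0, 1, 1, 0, 1, 1]
x = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]
beta_hat, se = cox_fit_single(times, events, x)
hazard_ratio = math.exp(beta_hat)
```

Note that the baseline hazard never appears in the computation: it cancels out of each risk-set ratio, which is what makes the model semiparametric.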

Proportional Hazards Assumption

Warning

The model assumes that hazard ratios are constant over time. If the treatment effect changes over the study period (e.g., surgery helps early but not late), the PH assumption is violated. Always check this before interpreting results, for example with Schoenfeld residuals or log-minus-log survival plots.

When to Use Each

Method	Purpose	Covariates?	Effect size?
Kaplan-Meier	Visualize & describe survival	No	No
Log-Rank	Compare groups (unadjusted)	No	No
Cox PH	Multivariable analysis	Yes	Yes (HR)

Sample Size for Survival Studies

Power depends on the number of events, not total sample size:

Calculate events needed to detect a minimum HR at desired power
Estimate proportion of subjects who will experience the event
Derive total sample size

Rule of thumb: at least 10 events per covariate in a Cox model.
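
The three steps above can be sketched with Schoenfeld's approximation for the number of events required by a two-sided log-rank or Cox test of a single binary treatment. The function names, the default allocation, and the 60% event probability in the example are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def required_events(hr, alpha=0.05, power=0.80, alloc=0.5):
    """Events needed to detect hazard ratio `hr` (Schoenfeld approximation).

    alloc is the proportion of subjects assigned to one arm (0.5 = 1:1).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    events = (z_alpha + z_beta) ** 2 / (alloc * (1 - alloc) * math.log(hr) ** 2)
    return math.ceil(events)


def required_sample_size(n_events, event_prob):
    """Total subjects needed if a fraction `event_prob` will have the event."""
    return math.ceil(n_events / event_prob)


# Step 1: events needed to detect HR = 0.7 at 80% power, two-sided alpha = 0.05.
n_events = required_events(0.7)
# Steps 2-3: assume (hypothetically) 60% of subjects have the event during follow-up.
n_total = required_sample_size(n_events, 0.6)
```

With these inputs the calculation gives 247 events and 412 subjects. The total sample size enters only through the assumed event probability, which is why a longer follow-up can substitute for more enrollment.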

Advanced Extensions

Parametric models: Weibull, exponential — more efficient if distributional assumptions hold
Competing risks: multiple event types that preclude each other (e.g., death from cancer vs. death from other causes)
Recurrent events: events that can happen multiple times (e.g., hospitalizations)
Frailty models: random effects for clustered data (analogous to Hierarchical Models)

Connection to Bayesian Methods

Bayesian survival analysis places priors on hazard functions or regression coefficients:

Bayesian Cox models: priors on $β$ provide regularization, especially useful with many covariates
Nonparametric Bayesian: Dirichlet process priors on the survival distribution, or gamma process priors on the cumulative baseline hazard
Posterior predictive checks (Model Checking) apply directly — simulate event times and compare to data
