Impact of Informative Censoring on Kaplan-Meier Estimate
Charlie Ahn
September 30, 2023
Outline
Lung data
Leung, Elashoff, and Afifi (1997) reported survival times in weeks of 61 patients with inoperable adenocarcinoma of the lung.
Survival probabilities from Kaplan-Meier (KM)
0.14* | 3.43 | 11.86* | 29.14 |
0.14* | 3.43 | 13.00 | 32.14* |
0.29* | 3.71 | 14.43 | 33.14* |
0.43 | 3.86 | 15.57* | 40.57 |
0.43* | 6.00* | 15.71 | 47.29* |
0.57* | 6.14 | 16.57* | 48.57 |
0.57* | 6.14* | 17.29* | 49.43 |
1.86* | 6.86 | 18.43 | 53.86 |
2.86 | 8.71* | 18.57 | 61.86 |
3.00* | 9.00 | 18.71* | 66.57 |
3.00* | 9.43 | 20.71 | 68.71 |
3.14 | 10.57* | 21.29* | 68.96 |
3.14 | 10.71 | 23.86* | 72.86 |
3.29* | 10.86 | 26.00* | 72.86 |
3.29* | 11.14 | 27.57* | |
Survival probabilities from Kaplan-Meier (KM)
Leung, Elashoff, and Afifi (1997) stated “Observations were censored because the patients experienced metastatic disease or a significant increase in the size of their primary lesion. Such disease progression usually indicates a shortened residual survival time.”
In such situations, censoring is informative: the KM method will overestimate the actual survival function, and the standard procedure, which assumes non-informative censoring, cannot be applied.
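The overestimation can be illustrated with a small simulation. This is only a sketch under assumed conditions, not the lung data: the exponential rates (0.05 and 0.2), the sample size, the seed, and the function names `kaplan_meier` and `simulate` are all hypothetical. Each simulated patient is censored at disease progression, after which the residual lifetime has a higher hazard.

```python
import math
import random


def kaplan_meier(times, events, t):
    """Product-limit estimate of S(t); events[i] == 1 marks an observed death."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    i = 0
    while i < len(data) and data[i][0] <= t:
        x = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing the same observed time x.
        while i < len(data) and data[i][0] == x:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / at_risk
        at_risk -= leaving
    return surv


def simulate(n=2000, seed=1):
    """Hypothetical model: progression censors the patient and shortens residual life."""
    rng = random.Random(seed)
    lifetimes, observed, events = [], [], []
    for _ in range(n):
        death_without_progression = rng.expovariate(0.05)
        progression = rng.expovariate(0.05)
        if progression < death_without_progression:
            # Censored at progression; residual life then has a higher hazard (0.2).
            c = progression
            t_true = progression + rng.expovariate(0.2)
        else:
            c = math.inf
            t_true = death_without_progression
        lifetimes.append(t_true)
        observed.append(min(t_true, c))          # observed time: min of lifetime and censoring time
        events.append(1 if t_true <= c else 0)   # 1 = death observed, 0 = censored
    return lifetimes, observed, events


lifetimes, observed, events = simulate()
t = 20.0
km_estimate = kaplan_meier(observed, events, t)
true_survival = sum(1 for x in lifetimes if x > t) / len(lifetimes)
print(f"KM estimate of S({t:g}): {km_estimate:.3f}")
print(f"Proportion actually surviving past {t:g}: {true_survival:.3f}")
```

Because the simulation keeps each patient's eventual lifetime, the KM estimate can be compared directly with the proportion who actually survive past t. Under these assumptions the KM estimate should sit well above that proportion.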
Survival probabilities with eventual survival times
As expected, the KM rates overestimate the survival probabilities computed from the eventual survival times.
Q: Can we correct it?
Informative Censoring
Lagakos (1979, Biometrics) mentioned the following three situations in which the independence assumption (censoring times independent of survival times) is of questionable validity:
Models Proposed for Informative Censoring
Model for Censored Survival Analysis
X = min(T, C),
where T and C are non-negative random variables representing the lifetime and the censoring time, respectively.
Slud & Rubinstein (1983, Biometrika)
Slud & Rubinstein generalized the KME to situations with informative censoring. They assumed:
ρ(t) = h(t | C ≤ t) / h(t | C > t) measures the relative risk of failure at time t among previously censored subjects compared with subjects not yet censored, where h(t | ·) is the conditional hazard function of T.
Slud & Rubinstein
Numerator: P(t < T < t+δ | T > t, C ≤ t) for subjects previously censored
= P(t < T < t+δ, T > t, C ≤ t) / P(T > t, C ≤ t)
Applying the law of total probability, P(A) = P(AB) + P(ABᶜ), or P(AB) = P(A) - P(ABᶜ):
P(t < T < t+δ, T > t, C ≤ t) = P(t < T < t+δ) - P(t < T < t+δ, C > t) → f(t) - ψ(t)
P(T > t, C ≤ t) = P(T > t) - P(T > t, C > t) → S(t) - Sx(t)
Here the arrows give the limiting quantities as δ → 0 (the first after dividing by δ): f(t) is the density of T, ψ(t) is the limit of P(t < T < t+δ, C > t) / δ, S(t) = P(T > t), and Sx(t) = P(T > t, C > t) = P(X > t).
Denominator: P(t < T < t+δ | T > t, C > t) for subjects not yet censored
= P(t < T < t+δ, T > t, C > t) / P(T > t, C > t) → ψ(t) / Sx(t)
ρ(t) = { [f(t) - ψ(t)] / [S(t) - Sx(t)] } / { ψ(t) / Sx(t) } = [f(t)/ψ(t) - 1] [S(t)/Sx(t) - 1]^(-1)
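As a quick sanity check on the closing expression for ρ(t), the sketch below evaluates it in a case where the answer is known: if T and C are independent, ψ(t) = f(t) P(C > t) and Sx(t) = S(t) P(C > t), so ρ(t) should equal 1. The exponential distributions, the rates `lam` and `mu`, and the function names are assumptions made for illustration only.

```python
import math


def rho(f, psi, S, Sx):
    """rho(t) = [f/psi - 1] / [S/Sx - 1], the closing expression of the derivation."""
    return (f / psi - 1.0) / (S / Sx - 1.0)


def rho_long_form(f, psi, S, Sx):
    """Same quantity written as the ratio of the two conditional hazards."""
    return ((f - psi) / (S - Sx)) / (psi / Sx)


# Assumed example: T ~ Exp(lam) and C ~ Exp(mu), independent.
lam, mu, t = 0.10, 0.05, 5.0
S = math.exp(-lam * t)   # P(T > t)
G = math.exp(-mu * t)    # P(C > t)
f = lam * S              # density of T at t
psi = f * G              # limit of P(t < T < t+d, C > t) / d under independence
Sx = S * G               # P(T > t, C > t)

rho_independent = rho(f, psi, S, Sx)
print(rho_independent)   # close to 1 up to floating-point error

# Made-up values in which previously censored subjects fail faster (rho > 1).
rho_dependent = rho(0.06, 0.03, 0.60, 0.45)
print(rho_dependent)
```

Both forms of the formula are included so that the algebraic simplification in the last line of the derivation can be checked numerically.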
Slud & Rubinstein
ρ > 1: positive dependence between censoring and death
ρ < 1: negative dependence between censoring and death
ρ = 1: S-R estimates = KM rates
Survival probabilities with eventual survival times
As ρ increases from 1, the S-R estimates move closer to the true survival probabilities.
When ρ = 4, the S-R estimates are very close to the true survival probabilities.
Efron’s self-consistent estimator (Link, 1989 JASA)
Self-consistent algorithm to calculate S(t) = Pr( T > t )
Which subject would have higher chance of surviving past t=15?
Pr(T > 15)
Subject A censored (xA = 6)
Subject B censored (xB = 10)
[Figure: time axis t with marks at 0, 6, 10, and 15]
Which subject would have higher chance of surviving past t=15?
Pr(T > 15)
Subject A censored (xA = 6): P(T > 15 | T > 6)
Subject B censored (xB = 10): P(T > 15 | T > 10)
[Figure: time axis t with marks at 0, 6, 10, and 15]
Self-consistent estimator
Target time t = 15; n = 21 subjects.
Columns 1-7 use EVERYONE; columns 8-10 use CENSORED subjects only.
time xi | # at risk ri | events di | censored qi | 1 - di/ri | S(xi) | # I(xi > t) | S(k)(t)/S(xi) contribution | # qi's | contribution × qi
0 | 21 | 0 | 0 | 1.0000 | 1.0000 | 0 | | |
6 | 21 | 3 | 1 | 0.8571 | 0.8571 | 0 | 0.6111 | 1 | 0.6111
7 | 17 | 1 | 1 | 0.9412 | 0.8067 | 0 | 0.6493 | 1 | 0.6493
10 | 15 | 1 | 2 | 0.9333 | 0.7529 | 0 | 0.6957 | 2 | 1.3913
13 | 12 | 1 | 0 | 0.9167 | 0.6902 | 0 | | |
16 | 11 | 1 | 3 | 0.9091 | 0.6275 | 11 | | |
22 | 7 | 1 | 0 | 0.8571 | 0.5378 | | | |
23 | 6 | 1 | 5 | 0.8333 | 0.4482 | | | |
sum | | 9 | 12 | | | | | | 2.6517
∑ I(xi > t) = 11, and 11/21 = 0.5238
∑ S(k)(t)/S(xi) over censored subjects = 2.6517, and 2.6517/21 = 0.1263
S(k) = 0.5238, S(k+1) = 0.5238 + 0.1263 = 0.6501
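The update in this table can be sketched in a few lines of Python. The sketch follows the table's arithmetic as read here, which is an assumption about the intended algorithm: S(xi) is held at its Kaplan-Meier value, the first term is the fraction of subjects with xi > t, and each censored subject with xi ≤ t contributes S(k)(t)/S(xi). The variable and function names are illustrative.

```python
# Rows of the table: (time x_i, at risk r_i, events d_i, censored q_i).
rows = [
    (6, 21, 3, 1), (7, 17, 1, 1), (10, 15, 1, 2), (13, 12, 1, 0),
    (16, 11, 1, 3), (22, 7, 1, 0), (23, 6, 1, 5),
]
n = 21
t = 15

# Kaplan-Meier values S(x_i) as the running product of 1 - d_i / r_i.
km = {}
surv = 1.0
for x, r, d, q in rows:
    surv *= 1.0 - d / r
    km[x] = surv

# First term: fraction of subjects whose observed time exceeds t.
first_term = sum(d + q for x, r, d, q in rows if x > t) / n


def update(s_k):
    """One step: censored subjects with x_i <= t each contribute s_k / S(x_i)."""
    second_term = sum(q * s_k / km[x] for x, r, d, q in rows if x <= t) / n
    return first_term + second_term


s = first_term
history = [s]
for _ in range(200):
    s_next = update(s)
    history.append(s_next)
    if abs(s_next - s) < 1e-10:
        break
    s = s_next

print([round(v, 4) for v in history[:3]])  # starting value and first two updates
print(round(history[-1], 4))               # value at convergence
```

Starting from 0.5238, the first two updates reproduce the tabulated 0.6501 and 0.6805, and the iteration settles at 0.6902, the KM value S(13) that applies at t = 15.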
Which subject would have higher chance of surviving past t=15?
Pr(T > 15)
Subject A censored (xA = 6): P(T > 15 | T > 6) = 0.5238/0.8571 = 0.6111
Subject B censored (xB = 10): P(T > 15 | T > 10) = 0.5238/0.7529 = 0.6957
[Figure: time axis t with marks at 0, 6, 10, and 15]
Self-consistent estimator
Target time t = 15; n = 21 subjects.
Columns 1-7 use EVERYONE; columns 8-10 use CENSORED subjects only.
time xi | # at risk ri | events di | censored qi | 1 - di/ri | S(xi) | # I(xi > t) | S(k)(t)/S(xi) contribution | # qi's | contribution × qi
0 | 21 | 0 | 0 | 1.0000 | 1.0000 | 0 | | |
6 | 21 | 3 | 1 | 0.8571 | 0.8571 | 0 | 0.7585 | 1 | 0.7585
7 | 17 | 1 | 1 | 0.9412 | 0.8067 | 0 | 0.8059 | 1 | 0.8059
10 | 15 | 1 | 2 | 0.9333 | 0.7529 | 0 | 0.8634 | 2 | 1.7268
13 | 12 | 1 | 0 | 0.9167 | 0.6902 | 0 | | |
16 | 11 | 1 | 3 | 0.9091 | 0.6275 | 11 | | |
22 | 7 | 1 | 0 | 0.8571 | 0.5378 | | | |
23 | 6 | 1 | 5 | 0.8333 | 0.4482 | | | |
sum | | 9 | 12 | | | | | | 3.2911
∑ I(xi > t) = 11, and 11/21 = 0.5238
∑ S(k)(t)/S(xi) over censored subjects = 3.2911, and 3.2911/21 = 0.1567
S(k) = 0.6501, S(k+1) = 0.5238 + 0.1567 = 0.6805
Self-consistent estimator
Target time t = 15; n = 21 subjects.
Columns 1-7 use EVERYONE; columns 8-10 use CENSORED subjects only.
time xi | # at risk ri | events di | censored qi | 1 - di/ri | S(xi) | # I(xi > t) | S(k)(t)/S(xi) contribution | # qi's | contribution × qi
0 | 21 | 0 | 0 | 1.0000 | 1.0000 | 0 | | |
6 | 21 | 3 | 1 | 0.8571 | 0.8571 | 0 | 0.8052 | 1 | 0.8052
7 | 17 | 1 | 1 | 0.9412 | 0.8067 | 0 | 0.8556 | 1 | 0.8556
10 | 15 | 1 | 2 | 0.9333 | 0.7529 | 0 | 0.9167 | 2 | 1.8333
13 | 12 | 1 | 0 | 0.9167 | 0.6902 | 0 | | |
16 | 11 | 1 | 3 | 0.9091 | 0.6275 | 11 | | |
22 | 7 | 1 | 0 | 0.8571 | 0.5378 | | | |
23 | 6 | 1 | 5 | 0.8333 | 0.4482 | | | |
sum | | 9 | 12 | | | | | | 3.4941
∑ I(xi > t) = 11, and 11/21 = 0.5238
∑ S(k)(t)/S(xi) over censored subjects = 3.4941, and 3.4941/21 = 0.1664
S(k) = 0.6902, S(k+1) = 0.5238 + 0.1664 = 0.6902
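One way to see why the iteration stops at 0.6902: with S(xi) held at the tabulated KM values, each update is linear in S(k)(t), so the limit can be solved for directly. The following is a sketch using only numbers from the table (n = 21, t = 15); the symbols a, b, and S* are introduced here for illustration.

```latex
S^{(k+1)}(t) = a + b\,S^{(k)}(t), \qquad
a = \frac{1}{n}\sum_i I(x_i > t) = \frac{11}{21} \approx 0.5238, \qquad
b = \frac{1}{n}\sum_{x_i \le t} \frac{q_i}{S(x_i)}
  = \frac{1}{21}\left(\frac{1}{0.8571} + \frac{1}{0.8067} + \frac{2}{0.7529}\right) \approx 0.2411

S^{*} = a + b\,S^{*}
\;\Longrightarrow\;
S^{*} = \frac{a}{1-b} \approx \frac{0.5238}{0.7589} \approx 0.6902
```

The fixed point matches S(xi) = 0.6902 at xi = 13, the Kaplan-Meier value that applies at t = 15, consistent with S(k) = S(k+1) = 0.6902 in the table.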
Modified Kaplan-Meier Estimator (MKME)
Link considered the survival function conditional on frailty Z = z.
S(t | Z = z) is decreasing in z, so individuals with high frailties tend to have shorter lifetimes.
A = { z | z ≥ a }: KME overestimates S(t)
A = { z | z ≤ a }: KME underestimates S(t)
Frailty a = 0.333 provides the true S(t) in the Link method
Lung data, t = 40: True S(40) = 0.2712
1st term, ∑ I(xi > t): 0.18644 (same for KM and Link)
iteration | KM 2nd term | KM S(k) | KM S(k+1) | (unlabeled) | Link 2nd term | Link S(k) | Link S(k+1)
iter #1 | 0.114688 | 0.18644 | 0.3011 | 0.400000 | 0.0351561 | 0.18644 | 0.2216
iter #2 | 0.185219 | 0.3011 | 0.3717 | 0.400000 | 0.0536489 | 0.2216 | 0.2401
iter #3 | 0.228648 | 0.3717 | 0.4151 | 0.400000 | 0.0645791 | 0.2401 | 0.2510
iter #6 | | | | | | |
converged | 0.297975 | 0.4844 | 0.4844 | | 0.0838937 | 0.2703 | 0.2703
References
Thank you