1 of 28

Impact of Informative Censoring on Kaplan-Meier Estimate

Charlie Ahn

September 30, 2023

2 of 28

Outline

  • Lung Data
  • Informative Censoring
  • Slud-Rubinstein Method
  • Efron’s Self-consistent Algorithm for KM estimate
  • Link Method Based on Self-consistent Algorithm

3 of 28

Lung data

Leung, Elashoff, and Afifi (1997) reported survival times, in weeks, of 61 patients with inoperable adenocarcinoma of the lung.

4 of 28

Survival probabilities from Kaplan-Meier (KM)

 0.14*    3.43    11.86*   29.14
 0.14*    3.43    13.00    32.14*
 0.29*    3.71    14.43    33.14*
 0.43     3.86    15.57*   40.57
 0.43*    6.00*   15.71    47.29*
 0.57*    6.14    16.57*   48.57
 0.57*    6.14*   17.29*   49.43
 1.86*    6.86    18.43    53.86
 2.86     8.71*   18.57    61.86
 3.00*    9.00    18.71*   66.57
 3.00*    9.43    20.71    68.71
 3.14    10.57*   21.29*   68.96
 3.14    10.71    23.86*   72.86
 3.29*   10.86    26.00*   72.86
 3.29*   11.14    27.57*

5 of 28

Survival probabilities from Kaplan-Meier (KM)

Leung, Elashoff, and Afifi (1997) stated: “Observations were censored because the patients experienced metastatic disease or a significant increase in the size of their primary lesion. Such disease progression usually indicates a shortened residual survival time.”

In such situations the KM method will overestimate the actual survival function, because the standard procedure assumes non-informative censoring and that assumption does not hold here.
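The overestimation can be illustrated with a small simulation. This is only a sketch under assumptions that are not in the slides: a hypothetical two-point frailty z (0.5 or 2.0) that drives both the death time and the censoring time, so that subjects who are censored early also tend to die early. The function names `km_survival` and `simulate` are made up for this example.

```python
import random

def km_survival(times, events, t):
    """Kaplan-Meier estimate of S(t); events[i] is 1 for a death, 0 for censoring."""
    # Sort by time, placing deaths before censorings at tied times.
    obs = sorted(zip(times, events), key=lambda r: (r[0], -r[1]))
    n = len(obs)
    surv = 1.0
    for i, (x, d) in enumerate(obs):
        if x > t:
            break
        if d == 1:
            at_risk = n - i  # subjects at positions i, i+1, ... are still at risk
            surv *= (at_risk - 1) / at_risk
    return surv

def simulate(n=20000, t=1.0, seed=1):
    """Compare the KM estimate with the true fraction surviving past t."""
    rng = random.Random(seed)
    times, events, true_deaths = [], [], []
    for _ in range(n):
        z = rng.choice([0.5, 2.0])   # hypothetical frailty (assumption)
        death = rng.expovariate(z)   # high frailty: earlier death
        cens = rng.expovariate(z)    # high frailty: earlier censoring too
        true_deaths.append(death)
        times.append(min(death, cens))
        events.append(1 if death <= cens else 0)
    km = km_survival(times, events, t)
    truth = sum(1 for d in true_deaths if d > t) / n
    return km, truth

km, truth = simulate()
print(f"KM estimate of S(1): {km:.3f}; fraction truly alive at t = 1: {truth:.3f}")
```

In this setup the censored subjects are disproportionately high-frailty, so the risk set is enriched with low-frailty subjects and the KM estimate sits above the true survival fraction.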

6 of 28

Survival probabilities with eventual survival times

As expected, the KM rates are overestimated.

Q: Can we correct it?

7 of 28

Informative Censoring

Lagakos (1979, Biometrics) mentioned the following three situations in which the independence assumption (censoring times independent of survival times) is of questionable validity:

  1. Patients remove themselves from the study for reasons possibly related to therapy and thereby censor their survival time.
  2. Patients experiencing a specific critical event, such as metastatic spread of disease, are removed from the study.
  3. Failure times from causes of secondary interest are recorded as censored observations of the failure times from the causes of primary interest.

8 of 28

Models Proposed for Informative Censoring

  • Efron (1967, Proceedings of the 5th Berkeley Symposium) introduced the self-consistent estimate, which coincides with Kaplan and Meier's product-limit estimate.
  • Link (1989, JASA) used Efron's self-consistency algorithm and proposed a modified Kaplan-Meier estimator (MKME).
  • Slud and Rubinstein (1983, Biometrika) modeled ρ(t) = h(t | C ≤ t) / h(t | C > t), which led to the generalized Kaplan-Meier estimate.
  • Leung et al. (1997, Annu. Rev. Public Health) concluded that assuming ignorable censoring can lead to a biased estimate of the survival function. They compared several models for informative censoring and concluded that the Slud-Rubinstein estimate outperformed the others.

9 of 28

Model for Censored Survival Analysis

  • Observations X are of the form,

X = min(T, C),

where T and C are non-negative RVs representing lifetime and censoring time, respectively.

  • Under the usual assumption that T and C are independent, the Kaplan-Meier estimator (KME) is the appropriate estimator of S(t) = Pr(T > t).
  • The KME can substantially overestimate survival probabilities if censoring indicates an unfavorable prognosis for future survival.
  • We consider two alternative models based on (1) a self-consistent estimator of the survival function and (2) the ratio of two hazards: the hazard of those who have been censored and the hazard of those who are still in the study.
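As a reference point for the alternatives, the KME itself can be sketched in a few lines. The records below are rebuilt from the 21-subject worked example tabulated later in these slides; placing every censoring at its row's time is an assumption made here so that the at-risk counts match the table. The function name `kaplan_meier` is illustrative.

```python
def kaplan_meier(times, events):
    """Return [(death time, S(death time))] using the product-limit formula."""
    death_times = sorted({x for x, d in zip(times, events) if d == 1})
    surv, curve = 1.0, []
    for x in death_times:
        at_risk = sum(1 for u in times if u >= x)  # r_i
        deaths = sum(1 for u, d in zip(times, events) if u == x and d == 1)  # d_i
        surv *= 1.0 - deaths / at_risk
        curve.append((x, surv))
    return curve

# (time x_i, deaths d_i, censored q_i), as in the worked example's table.
rows = [(6, 3, 1), (7, 1, 1), (10, 1, 2), (13, 1, 0),
        (16, 1, 3), (22, 1, 0), (23, 1, 5)]
times, events = [], []
for x, d, q in rows:
    times += [x] * (d + q)
    events += [1] * d + [0] * q   # 1 = death, 0 = censored

for x, s in kaplan_meier(times, events):
    print(f"S({x}) = {s:.4f}")
```

The printed values can be compared with the S(xi) column of the worked example.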

10 of 28

Slud & Rubinstein (1983, Biometrika)

Slud & Rubinstein generalized the KME to situations with informative censoring. They assumed that

ρ(t) = h(t | C ≤ t) / h(t | C > t)

measures the relative risk of failure at time t among previously censored subjects as compared with subjects not yet censored, where h(t | ·) is the conditional hazard function of T.

11 of 28

Slud & Rubinstein

Numerator: P(t < T < t+δ | T > t, C ≤ t), for subjects previously censored

= P(t < T < t+δ, T > t, C ≤ t) / P(T > t, C ≤ t)

Applying the law of total probability, P(A) = P(AB) + P(AB^c), so P(AB) = P(A) − P(AB^c):

P(t < T < t+δ, T > t, C ≤ t) = P(t < T < t+δ) − P(t < T < t+δ, C > t)  →  f(t) − ψ(t)

P(T > t, C ≤ t) = P(T > t) − P(T > t, C > t)  →  S(t) − S_X(t)

Denominator: P(t < T < t+δ | T > t, C > t), for subjects not yet censored

= P(t < T < t+δ, T > t, C > t) / P(T > t, C > t)  →  ψ(t) / S_X(t)

ρ(t) = { [f(t) − ψ(t)] / [S(t) − S_X(t)] } / { ψ(t) / S_X(t) } = [f(t)/ψ(t) − 1] · [S(t)/S_X(t) − 1]^(−1)

12 of 28

Slud & Rubinstein

ρ > 1: positive dependence between censoring and death

ρ < 1: negative dependence between censoring and death

ρ = 1: S-R estimates = KM rates

  • The Kaplan-Meier estimator may overestimate the survival function if censoring and death are positively correlated, and underestimate it if they are negatively correlated (Leung et al., 1997).
  • Let’s apply the S-R method to the lung data with ρ > 1.
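The effect of ρ can be sketched in code. The estimator's formula is not stated on these slides, so the form used below is an assumption to be checked against Slud and Rubinstein (1983): with constant ρ, each subject censored before t contributes the product, over later deaths up to t, of (n_i − 1)/(n_i + ρ − 1), where n_i is the number at risk just before the i-th death. The eventual survival times for the lung data are not listed here, so the sketch uses the 21-subject worked example from the self-consistency slides, with censorings placed at their row's time (also an assumption).

```python
def slud_rubinstein(times, events, t, rho=1.0):
    """Sketch of a constant-rho generalized KM estimate of S(t) (assumed form)."""
    if rho <= 0:
        raise ValueError("rho must be positive")
    n = len(times)
    # Sort by time, placing deaths before censorings at tied times.
    obs = sorted(zip(times, events), key=lambda r: (r[0], -r[1]))
    total = float(sum(1 for x, _ in obs if x > t))  # still observed beyond t
    for j, (xj, dj) in enumerate(obs):
        if dj == 1 or xj > t:
            continue  # only subjects censored at or before t contribute here
        surv = 1.0
        for i in range(j + 1, n):
            xi, di = obs[i]
            if xi > t:
                break
            if di == 1:
                at_risk = n - i  # number at risk just before this death
                surv *= (at_risk - 1) / (at_risk + rho - 1)
        total += surv
    return total / n

# (time, deaths, censored) rows of the worked example.
rows = [(6, 3, 1), (7, 1, 1), (10, 1, 2), (13, 1, 0),
        (16, 1, 3), (22, 1, 0), (23, 1, 5)]
times, events = [], []
for x, d, q in rows:
    times += [x] * (d + q)
    events += [1] * d + [0] * q

for rho in (0.25, 1.0, 4.0):
    print(f"rho = {rho}: S(15) = {slud_rubinstein(times, events, 15, rho):.4f}")
```

With ρ = 1 each factor is (n_i − 1)/n_i and the sketch reproduces the KM value, matching the statement that the S-R estimates equal the KM rates in that case; ρ > 1 pulls the estimate below KM.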

13 of 28

Survival probabilities with eventual survival times

14 of 28

ρ > 1: positive dependence between censoring and death

ρ < 1: negative dependence between censoring and death

15 of 28

As ρ increases from 1, the S-R method produces estimates closer to the true survival probabilities.

When ρ = 4, the S-R estimates are very close to the true survival probabilities.

16 of 28

Efron’s self-consistent estimator (Link, 1989 JASA)

17 of 28

Self-consistent algorithm to calculate S(t) = Pr( T > t )

  • This alternative expression of the KME is the sum of two parts:
  1. the proportion of observations (censored or uncensored) beyond time t;
  2. the estimated proportion that would have survived beyond time t but were censored before t.

  • Part 1 has the form of the usual right-sided sample c.d.f.
  • Part 2 is the contribution of censored patients to the KME. S(t)/S(xi) in part 2 is the conditional probability that T > t given T > xi. The closer xi is to t, the larger the contribution to the probability of surviving past t (t = 15 in the example that follows).
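The two-part expression can be iterated directly. The sketch below uses the numbers from the 21-subject worked example in these slides (t = 15, 11 observations beyond t, four subjects censored before t with S(xi) = 0.8571, 0.8067, 0.7529, 0.7529). As in the slides' tables, S(xi) is held at its KM value and only S(t) is updated; a fuller treatment would update the whole curve. The function name is illustrative.

```python
def self_consistent_update(s_t, n, n_beyond, censored_surv):
    """One update: S_new(t) = [#{x_i > t} + sum over censored x_i <= t of S(t)/S(x_i)] / n."""
    return (n_beyond + sum(s_t / s_x for s_x in censored_surv)) / n

n = 21                                            # total number of subjects
n_beyond = 11                                     # observations with x_i > 15
censored_surv = [0.8571, 0.8067, 0.7529, 0.7529]  # S(x_i) for those censored before t

s = n_beyond / n        # starting value: part 1 only
history = [s]
for _ in range(200):
    s_next = self_consistent_update(s, n, n_beyond, censored_surv)
    history.append(s_next)
    if abs(s_next - s) < 1e-10:
        break
    s = s_next

print("first iterates:", [round(v, 4) for v in history[:3]])
print("limit:", round(history[-1], 4))
```

The first iterates can be compared with the S(k) and S(k+1) values in the tables, and the limit with the KM value S(13) = 0.6902, which is the KM estimate of S(15) because no death occurs between 13 and 15.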

18 of 28

Which subject would have higher chance of surviving past t=15?

Pr(T > 15)

Subject A: censored at xA = 6

Subject B: censored at xB = 10

[Timeline figure: t axis marked at 0, 6, 10, 15]

19 of 28

Which subject would have higher chance of surviving past t=15?

Pr(T > 15)

Subject A: censored at xA = 6, P(T > 15 | T > 6)

Subject B: censored at xB = 10, P(T > 15 | T > 10)

[Timeline figure: t axis marked at 0, 6, 10, 15]

20 of 28

Self-consistent estimator

                    EVERYONE                                         CENSORED
time   # at risk   events   censored   1 - di/ri   S(xi)    # I(xi > t)   S(k)(t)/S(xi)   # qi's   total
 xi       ri         di        qi                             (t = 15)    (contribution)
  0       21          0         0       1.0000     1.0000        0
  6       21          3         1       0.8571     0.8571        0           0.6111          1      0.6111
  7       17          1         1       0.9412     0.8067        0           0.6493          1      0.6493
 10       15          1         2       0.9333     0.7529        0           0.6957          2      1.3913
 13       12          1         0       0.9167     0.6902        0
 16       11          1         3       0.9091     0.6275       11
 22        7          1         0       0.8571     0.5378
 23        6          1         5       0.8333     0.4482
sum                   9        12

∑ I(xi > t) = 11, and 11/21 = 0.5238
∑ S(k)(t)/S(xi) = 2.6517, and 2.6517/21 = 0.1263
S(k) = 0.5238
S(k+1) = 0.5238 + 0.1263 = 0.6501

21 of 28

Which subject would have higher chance of surviving past t=15?

Pr(T > 15)

Subject A: censored at xA = 6, P(T > 15 | T > 6) = 0.5238/0.8571 = 0.6111

Subject B: censored at xB = 10, P(T > 15 | T > 10) = 0.5238/0.7529 = 0.6957

[Timeline figure: t axis marked at 0, 6, 10, 15]

22 of 28

Self-consistent estimator

                    EVERYONE                                         CENSORED
time   # at risk   events   censored   1 - di/ri   S(xi)    # I(xi > t)   S(k)(t)/S(xi)   # qi's   total
 xi       ri         di        qi                             (t = 15)    (contribution)
  0       21          0         0       1.0000     1.0000        0
  6       21          3         1       0.8571     0.8571        0           0.7585          1      0.7585
  7       17          1         1       0.9412     0.8067        0           0.8059          1      0.8059
 10       15          1         2       0.9333     0.7529        0           0.8634          2      1.7268
 13       12          1         0       0.9167     0.6902        0
 16       11          1         3       0.9091     0.6275       11
 22        7          1         0       0.8571     0.5378
 23        6          1         5       0.8333     0.4482
sum                   9        12

∑ I(xi > t) = 11, and 11/21 = 0.5238
∑ S(k)(t)/S(xi) = 3.2911, and 3.2911/21 = 0.1567
S(k) = 0.6501
S(k+1) = 0.5238 + 0.1567 = 0.6805

23 of 28

Self-consistent estimator

                    EVERYONE                                         CENSORED
time   # at risk   events   censored   1 - di/ri   S(xi)    # I(xi > t)   S(k)(t)/S(xi)   # qi's   total
 xi       ri         di        qi                             (t = 15)    (contribution)
  0       21          0         0       1.0000     1.0000        0
  6       21          3         1       0.8571     0.8571        0           0.8052          1      0.8052
  7       17          1         1       0.9412     0.8067        0           0.8556          1      0.8556
 10       15          1         2       0.9333     0.7529        0           0.9167          2      1.8333
 13       12          1         0       0.9167     0.6902        0
 16       11          1         3       0.9091     0.6275       11
 22        7          1         0       0.8571     0.5378
 23        6          1         5       0.8333     0.4482
sum                   9        12

∑ I(xi > t) = 11, and 11/21 = 0.5238
∑ S(k)(t)/S(xi) = 3.4941, and 3.4941/21 = 0.1664
S(k) = 0.6902
S(k+1) = 0.5238 + 0.1664 = 0.6902
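Because each update in this worked example is linear in S(k)(15), the limit can also be written in closed form. The derivation below is a sketch that uses only the table values (n = 21, 11 observations beyond t = 15, and the S(xi) of the four subjects censored before t); the notation S* for the limit is introduced here for illustration.

```latex
\begin{align*}
S^{(k+1)}(15) &= \frac{11}{21} + \frac{S^{(k)}(15)}{21}
  \left( \frac{1}{S(6)} + \frac{1}{S(7)} + \frac{2}{S(10)} \right), \\
S^{*}(15) &= \frac{11/21}{1 - \frac{1}{21}\left( \frac{1}{0.8571} + \frac{1}{0.8067} + \frac{2}{0.7529} \right)}
  \approx \frac{0.5238}{1 - 0.2411} \approx 0.6902 = \hat{S}_{KM}(13).
\end{align*}
```

The multiplier 0.2411 is below 1, so the iteration contracts toward this fixed point, which agrees with the KM value at the last death time before t = 15.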

24 of 28

Modified Kaplan-Meier Estimator (MKME)

Link considered the survival function conditional on frailty Z = z.

S(t | Z = z) is decreasing in z, so individuals with high frailties tend to have shorter lifetimes.

A = { z | z > a }: KME overestimates S(t)

A = { z | z < a }: KME underestimates S(t)

25 of 28

Frailty a = 0.333 brings the Link estimate close to the true S(t)

Lung data, t = 40: true S(40) = 0.2712; 1st term ∑ I(xi > t) = 0.18644

                              KM                                           Link
iteration          2nd term   S(k)      S(k+1)    (unlabeled)   2nd term    S(k)      S(k+1)
iter #1            0.114688   0.18644   0.3011    0.400000      0.0351561   0.18644   0.2216
iter #2            0.185219   0.3011    0.3717    0.400000      0.0536489   0.2216    0.2401
iter #3            0.228648   0.3717    0.4151    0.400000      0.0645791   0.2401    0.2510
iter #6 converged  0.297975   0.4844    0.4844                  0.0838937   0.2703    0.2703

26 of 28

References

  • Efron, B. (1967), "The Two Sample Problem With Censored Data," in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability (Vol. 4), Berkeley: University of California Press, pp. 831-853.
  • Kaplan, E. L., and Meier, P. (1958), "Nonparametric Estimation From Incomplete Observations," Journal of the American Statistical Association, 53, 457-481.
  • Lagakos, S. W. (1979), "General Right Censoring and Its Impact on the Analysis of Survival Data," Biometrics, 35, 139-156.
  • Lagakos, S. W., and Williams, J. S. (1978), "Models for Censored Survival Analysis: A Cone Class of Variable-Sum Models," Biometrika, 65, 181-189.
  • Leung, K. M., Elashoff, R. M., and Afifi, A. A. (1997), "Censoring Issues in Survival Analysis," Annual Review of Public Health, 18, 83-104.
  • Link, W. A. (1989), "A Model for Informative Censoring," Journal of the American Statistical Association, 84, 749-752.
  • Robertson, J. B., and Uppuluri, V. R. R. (1984), "A Generalized Kaplan-Meier Estimator," The Annals of Statistics, 12, 366-371.
  • Slud, E. V., and Rubinstein, L. V. (1983), "Dependent Competing Risks and Summary Survival Curves," Biometrika, 70, 643-649.
  • Williams, J. S., and Lagakos, S. W. (1977), "Models for Censored Survival Analysis: Constant-Sum and Variable-Sum Models," Biometrika, 64, 215-224.

27 of 28

Thank you

Edwards Lifesciences • Route de l’Etraz 70, 1260 Nyon, Switzerland edwards.com

28 of 28