Kaplan-Meier Method for Calculating Cumulative Incidence

Calculating Cumulative Incidence with the Kaplan-Meier Method

To calculate cumulative incidence we must take into consideration varying follow-up times.

The Kaplan-Meier method: Every member of the cohort has to be assigned a date first seen and either a date last seen or a date diagnosed.

The analysis: Those dates divide the follow-up time of the cohort into a number of discrete pieces. The probability of surviving is calculated for each discrete piece, and the overall cumulative probability of surviving is calculated by multiplying together the individual probabilities.
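As a rough illustration of how the two dates translate into follow-up time, the sketch below computes the number of days each person was followed. The records, the field order, and the helper name `follow_up_days` are all hypothetical and are not taken from any data set in this text.

```python
from datetime import date

# Hypothetical cohort records (not from the source):
# (person id, date first seen, date last seen or diagnosed, diagnosed?)
records = [
    ("A", date(2020, 1, 15), date(2020, 7, 15), True),
    ("B", date(2020, 3, 1), date(2021, 3, 1), False),
    ("C", date(2020, 6, 10), date(2020, 9, 10), True),
]


def follow_up_days(first_seen, last_seen):
    """Return the length of follow-up in days between the two dates."""
    return (last_seen - first_seen).days


# Follow-up time differs from person to person, which is why a simple
# proportion over the whole cohort is not enough.
follow_up = {pid: follow_up_days(start, end) for pid, start, end, _ in records}
print(follow_up)
```

Each person contributes a different amount of follow-up, and the dates on which outcomes occur are what split the cohort's follow-up into discrete pieces.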

Cumulative Probability

In order to calculate cumulative incidence, you need to understand, or at least accept on faith, the following two points. First, the probability of two independent events both occurring is the product of the probabilities of each occurring alone. This is a fundamental theorem of probability, so the probability of flipping two heads in a row with a fair coin is 1/2 x 1/2 = 1/4. Second, the probability used for each later time period is a conditional one: the probability of living to time 2 given that one has already lived to time 1.

The Kaplan-Meier method of calculating the cumulative probability of the disease outcome is to treat each separate discrete piece of time as an independent trial. There was some probability of remaining free of the outcome during the first time period; there was another probability of remaining free of it during the second time period. The probability of remaining free of the outcome through both time periods together is the product of the individual probabilities.

Students sometimes balk at treating the two time periods as independent events. They say, "How can they be considered independent when it is many of the same persons in each time period?" The answer is that the probability in the second time period is conditional on a given person already having lived through the first time period. So the probability of the outcome in the second period is the probability conditional on not having experienced the outcome up until that point in time. A similar mistake is made by gamblers who think that because a coin has come up tails four times in a row, the probability of heads on the next toss is better than 1/2. IT IS NOT.
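The conditional argument can be written compactly. The notation here is an assumption introduced for illustration: T is the time at which a person experiences the outcome, t_1 and t_2 are the ends of the first and second time periods, and S(t) is the cumulative probability of surviving past time t.

```latex
S(t_2) = P(T > t_2) = P(T > t_1) \times P(T > t_2 \mid T > t_1)
```

The second factor is the probability of surviving the second period among those who were still at risk when it began, which is what is calculated for each discrete piece of time.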

Example - Kaplan-Meier Estimates

Using the data from Follow-up Starting Times, Szklo and Nieto (Szklo, M., & Nieto, F. (2007). Epidemiology: Beyond the Basics (2nd ed.). Boston: Jones and Bartlett Publishers) produced the following cumulative survival table.


Cumulative survival is calculated by multiplying together the survival probabilities for the current and each prior failure time. Deaths occurred at 6 different times during follow-up, so there are 6 discrete pieces of time (D = death).
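The running product can be sketched in a few lines of Python. The counts below are illustrative assumptions chosen to match the figures quoted in the text (6 death times and a final cumulative survival of 0.18); they are not copied from the table, and the function name `cumulative_survival` is hypothetical.

```python
def cumulative_survival(intervals):
    """Return the cumulative survival after each death time.

    intervals: list of (number at risk, number of deaths) pairs,
    one pair per death time, in chronological order.
    """
    surviving = 1.0
    curve = []
    for at_risk, deaths in intervals:
        # Conditional probability of surviving this piece of time,
        # among those still at risk when it began.
        surviving *= (at_risk - deaths) / at_risk
        curve.append(surviving)
    return curve


# Illustrative (assumed) counts: 6 death times, one death at each.
intervals = [(10, 1), (8, 1), (7, 1), (5, 1), (3, 1), (2, 1)]
curve = cumulative_survival(intervals)
print([round(s, 3) for s in curve])
```

The number at risk drops between death times both because of deaths and because of persons lost to follow-up, which is how varying follow-up times enter the calculation.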

(Data: one-month follow-up; three-month follow-up.)

Why not calculate a probability of survival when the one person was lost at 2 months? Because the probability of survival for the 9 would be 9/9 = 1, and 1 times the previous cumulative survival leaves it unchanged.
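A small check makes the point concrete. The starting figure of 10 persons at risk with 1 death is an assumption for illustration; the 9 persons at risk at 2 months comes from the text. The helper name `survival_step` is hypothetical.

```python
def survival_step(previous_cumulative, at_risk, deaths):
    """Multiply the previous cumulative survival by this period's survival."""
    return previous_cumulative * ((at_risk - deaths) / at_risk)


# Assumed for illustration: 10 at risk and 1 death at the first death time.
after_first_death = survival_step(1.0, 10, 1)

# At 2 months one person is lost but nobody dies: 9/9 = 1.
at_loss_time = survival_step(after_first_death, 9, 0)

print(after_first_death, at_loss_time)
```

The loss still matters later: the person who was lost is no longer counted among those at risk at the next death time.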

Survival Probabilities

The cumulative probability cannot be calculated by multiplying together the event probabilities, because that product is the probability of the event repeating. The cumulative probability is calculated with the survival probabilities because it is only survival that happens repeatedly. If you used the probability of the event each time, you would be calculating a probability of repeated diagnoses, which is not what you want.

After multiplying together all of the individual survival probabilities to get the cumulative probability of survival of 0.18, the cumulative probability of death can be obtained by subtracting from 1: 1 – 0.18 = 0.82.
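The contrast between the two calculations can be seen numerically. The per-period survival probabilities below are illustrative assumptions chosen so that their product is 0.18, as quoted in the text; they are not copied from the table.

```python
from math import prod

# Assumed per-period survival probabilities (illustrative only).
survival_probs = [9 / 10, 7 / 8, 6 / 7, 4 / 5, 2 / 3, 1 / 2]

# Correct: multiply survival probabilities, then subtract from 1.
cumulative_survival = prod(survival_probs)
cumulative_death = 1 - cumulative_survival

# Incorrect: multiplying the event probabilities gives the probability
# of the event occurring in every period, which is not what is wanted.
event_probs = [1 - p for p in survival_probs]
wrong_answer = prod(event_probs)

print(round(cumulative_survival, 2), round(cumulative_death, 2), wrong_answer)
```

The product of the event probabilities is tiny, far from the 0.82 obtained by subtracting the cumulative survival from 1.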

Because cumulative incidence is a proportion, it has no time unit attached to it, so the time period over which it was calculated has to be stated alongside it.

Example - Kaplan-Meier Analysis

The following is a graph showing a Kaplan-Meier analysis of cumulative survival after breast cancer among patients grouped by whether they carry either the BRCA1 or the BRCA2 breast cancer gene mutation (N=58) versus patients without either mutation (N=979)

(Lee, J. S., Wacholder, S., Struewing, J. P., McAdams, M., Pee, D., Brody, L. C., et al. (1999). Survival after breast cancer in Ashkenazi Jewish BRCA1 and BRCA2 mutation carriers. J Natl Cancer Inst, 91(3), 259-263).

Notice that the lines are graphed in a stepwise fashion. Note also that the two curves lie on top of one another for about two years, but there is a suggestion that the mutation carriers have better survival beyond two years or so. To read cumulative survival for a group from the graph, pick a time point, such as 24 months, draw a line straight up to intersect the survival curve and then a horizontal line that intersects the y-axis. Where it intersects the y-axis is the estimate of the proportion surviving at 24 months of follow-up (about 44% in these data for either group).
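Reading a value off a stepwise curve amounts to finding the most recent step at or before the chosen time. The sketch below does this for a hypothetical curve; the step times and survival values are invented for illustration (with 0.44 at 24 months to echo the reading described above) and are not the data behind the published graph. The function name `survival_at` is also an assumption.

```python
from bisect import bisect_right


def survival_at(times, survival, t):
    """Look up cumulative survival at time t on a step curve.

    times: sorted times at which the curve steps down.
    survival: cumulative survival immediately after each step.
    """
    steps_passed = bisect_right(times, t)
    # Before the first step nobody has died, so survival is 1.
    return 1.0 if steps_passed == 0 else survival[steps_passed - 1]


# Hypothetical step curve (months, cumulative survival).
times = [6, 12, 18, 30]
survival = [0.80, 0.60, 0.44, 0.35]

print(survival_at(times, survival, 24))
```

Because the curve is flat between steps, the value at 24 months is the value set by the last step before it, here the step at 18 months.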

If the cumulative incidence of death had been plotted instead of cumulative survival (always an option), the graph would have started in the lower left-hand corner at 0 and moved up toward 1 (inverting the curve pictured).
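Inverting the curve is a one-line transformation. The survival values below are hypothetical and serve only to show the direction of the change.

```python
# Hypothetical cumulative survival at successive steps, starting at 1.
survival = [1.0, 0.80, 0.60, 0.44, 0.35]

# Cumulative incidence of death is 1 minus cumulative survival.
cumulative_incidence = [round(1 - s, 2) for s in survival]

print(cumulative_incidence)
```

The survival curve starts at 1 and steps down; the inverted curve starts at 0 and steps up by the same amounts.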

Comparing Two KM Curves

As you can see in the two Kaplan-Meier curves (below), the risk ratio would be different for different follow-up times.
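To see why the risk ratio depends on the follow-up time chosen, the sketch below compares two hypothetical step curves at several times. Both curves, and the function names `survival_at` and `risk_ratio`, are assumptions made up for illustration.

```python
from bisect import bisect_right


def survival_at(times, survival, t):
    """Cumulative survival at time t on a step curve (1 before the first step)."""
    steps_passed = bisect_right(times, t)
    return 1.0 if steps_passed == 0 else survival[steps_passed - 1]


def risk_ratio(curve_a, curve_b, t):
    """Ratio of cumulative incidence in group A to group B at time t.

    Returns None when group B has no events yet, since the ratio is undefined.
    """
    times_a, surv_a = curve_a
    times_b, surv_b = curve_b
    risk_a = 1 - survival_at(times_a, surv_a, t)
    risk_b = 1 - survival_at(times_b, surv_b, t)
    if risk_b == 0:
        return None
    return risk_a / risk_b


# Hypothetical curves: (step times in months, cumulative survival after each step).
group_a = ([6, 12, 24], [0.90, 0.70, 0.50])
group_b = ([6, 12, 24], [0.80, 0.70, 0.60])

for months in (6, 12, 24):
    print(months, risk_ratio(group_a, group_b, months))
```

With these made-up curves the ratio is below 1 at 6 months, equal to 1 at 12 months, and above 1 at 24 months, so a single risk ratio cannot describe the whole comparison.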


When a Kaplan-Meier analysis is presented in the medical literature, a p-value is usually given for a test of whether the two curves differ over their entire lengths (commonly the log-rank test), rather than at any single follow-up time.
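One commonly used test for comparing two curves over their whole length is the log-rank test. The text does not say which test a given paper uses, so the following is only a minimal sketch of that one option, written with the standard library. The data and the function name `logrank_test` are hypothetical; a real analysis would normally rely on an established statistics package.

```python
from math import erfc, sqrt


def logrank_test(group_a, group_b):
    """Return (chi-square statistic, p-value) for a two-group log-rank test.

    Each group is a list of (time, died) pairs, where died is True for a
    death and False for a person lost to follow-up (censored) at that time.
    """
    death_times = sorted({t for t, died in group_a + group_b if died})
    observed_a = expected_a = variance = 0.0
    for t in death_times:
        # Numbers still at risk just before time t in each group.
        n_a = sum(1 for time, _ in group_a if time >= t)
        n_b = sum(1 for time, _ in group_b if time >= t)
        # Deaths occurring exactly at time t.
        d_a = sum(1 for time, died in group_a if died and time == t)
        d_b = sum(1 for time, died in group_b if died and time == t)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi_square = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(chi_square / 2))
    return chi_square, p_value


# Hypothetical data: group A dies early, group B dies late.
early = [(1, True), (2, True), (3, True)]
late = [(4, True), (5, True), (6, True)]
print(logrank_test(early, late))
```

The test pools information from every death time, so the resulting p-value refers to the curves as a whole and not to the difference at one chosen month.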