## Slide 1: Comparing Incidence in Exposed and Unexposed

Why Use Incidence Rates?

1. To calculate incidence from population-based disease registries
2. To compare disease incidence in a cohort with a rate from the general population
3. To compare incidence from a time-varying exposure in persons while exposed and unexposed

• Research question: In a Medicaid database, is there an association between use of non-aspirin non-steroidal anti-inflammatory drugs (NSAIDs) and coronary artery disease (CAD)?

• How would you study the relationship between NSAID use and CAD?

This is a common type of problem in cohort studies that examine exposures which can change over time, such as smoking, diet, exercise, and medications. The database records the time periods when medications were being used and the time periods when they were not. Some persons will never use the medications and some will use them continuously, but there is a large group who either initiate or discontinue use during the time under study and another group who change in both directions. The same person can go on and off a medication multiple times. It is difficult to see how you would analyze such data using the cumulative incidence approach.

## Slide 2: Calculating Stratified Person-time Incidence Rates in Cohorts

• For persons followed in a cohort some potential risk factors may be fixed but some may be variable
• gender is fixed; taking medications or getting regular exercise are behaviors that can change over time
• Adding up person-time in an exposure category to get a denominator of time at risk is one way to deal with risk factors that change over time

For exposures that are time-varying, adding up the total person-time of exposure and non-exposure to form two denominators is a very useful way of dealing with the problem of changing exposure groups. The numerators for each denominator then become the number of events that occurred among persons at times when they were exposed and similarly for the events among the unexposed. This approach assumes that the exposure’s effect on the event is predominately during the time the exposure is occurring. It therefore makes sense for exposures whose association with the outcome quickly disappears when the exposure is removed. This assumption would not work very well for exposures which have lasting effects long after the actual exposure is removed. For an exposure like smoking with known long-term health risks, linking events to time periods when an individual was and was not smoking would probably assume too close a relationship between the time of exposure and the outcome.
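The bookkeeping described above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular study: the `records` rows and the `stratified_rates` helper are hypothetical, and each row is assumed to be one stretch of follow-up already labeled as exposed or unexposed.

```python
from collections import defaultdict

# Hypothetical follow-up records: (person_id, exposed?, person_years, events).
# One person may contribute several rows, some exposed and some unexposed.
records = [
    ("A", True, 2.0, 0),
    ("A", False, 3.5, 1),   # event occurred while unexposed
    ("B", False, 6.0, 0),
    ("C", True, 4.0, 0),
    ("C", False, 1.0, 0),
    ("C", True, 0.5, 1),    # event occurred while exposed
]

def stratified_rates(rows, per=1000):
    """Return {exposed: (events, person_years, rate per `per` person-years)}."""
    person_years = defaultdict(float)
    events = defaultdict(int)
    for _pid, exposed, py, n_events in rows:
        person_years[exposed] += py   # denominator: time at risk in this stratum
        events[exposed] += n_events   # numerator: events during that time
    return {
        key: (events[key], person_years[key], per * events[key] / person_years[key])
        for key in person_years
    }

rates = stratified_rates(records)
for exposed, (d, py, rate) in sorted(rates.items()):
    label = "exposed" if exposed else "unexposed"
    print(f"{label}: {d} events / {py} person-years = {rate:.1f} per 1000 person-years")
```

Note that persons A and C appear in both denominators, which is exactly what a fixed classification of persons cannot accommodate.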

## Slide 3: Analysis of Changing Exposure and Disease Incidence

• Tennessee Medicaid database, 1987-1998
• Use of NSAIDs could change over 11 years of study: same person could be in both using and non-using group at different times
• Could construct some fixed classification of persons as never, sometime, and frequent users and do cumulative incidence in each group.

Here is an example from a manuscript that analyzed Medicaid recipients in Tennessee over an 11-year period. In a prospective cohort with an exposure that can vary over time, subjects cannot be classified into groups by their amount of exposure at baseline because their future medication use cannot be known in advance. In a cohort analysis done retrospectively (such as this example) or at the end of follow-up in a prospective cohort, it is possible to classify individuals by some measure of total use, but such an approach would give a rather crude categorization (say, 2 or 3 groups) that does a poor job of distinguishing length of use, which may vary from months to years. Incidence rates during times of use and non-use account for time of exposure more accurately, may be more informative, and are easily calculated. The assumption is that any effect of NSAID use in lowering the incidence of cardiovascular events is immediate and occurs during use.

## Slide 4: Analysis of Changing Exposure with Person-time Rates

• Person-time totaled for using and not using NSAIDs; MI or CAD death outcome
• 181,441 person-years of use (persons who were new users of NSAIDs)
• 181,441 person-years of non-use (persons matched by age, sex, and calendar date)
• A person can contribute to the denominator both for use and non-use but only after a 365-day “wash out” period between use and non-use

- Ray, Lancet, 2002

The study endpoint was hospitalization for myocardial infarction or death from coronary artery disease. Follow-up was censored at death, reaching age 85, entry into a nursing home, life-threatening illness (other than CAD), or end of study. Subjects with no history of prior use of non-aspirin NSAIDs (NANSAIDs) who began use after having been enrolled in the database for 365 days (in order to have data on prior illnesses) were compared with subjects not using NANSAIDs. A subject who stopped using was eligible to be in the non-using group later but, to avoid any carryover effect, only after 365 days of non-use had elapsed.
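The wash-out rule can be illustrated with a small sketch. This is a deliberate simplification and not the study's actual algorithm (which also involved new-user criteria and matching): the function name `split_person_time`, its arguments, and the day counts are all hypothetical, and time before the first episode of use is simply ignored.

```python
def split_person_time(use_episodes, follow_up_end, washout=365):
    """Split one person's follow-up (in days) into use and eligible non-use.

    use_episodes: sorted, non-overlapping (start_day, stop_day) pairs of use,
    all ending on or before follow_up_end.
    Each gap after an episode counts as non-use only once `washout` days
    have elapsed; the wash-out days themselves go in neither denominator.
    """
    days_use = 0
    days_non_use = 0
    for i, (start, stop) in enumerate(use_episodes):
        days_use += stop - start
        # The gap runs until the next episode or the end of follow-up.
        if i + 1 < len(use_episodes):
            next_start = use_episodes[i + 1][0]
        else:
            next_start = follow_up_end
        eligible_from = stop + washout
        if next_start > eligible_from:
            days_non_use += next_start - eligible_from
    return days_use, days_non_use

# Hypothetical person: uses days 0-200 and 900-1000, followed to day 1500.
use_days, non_use_days = split_person_time([(0, 200), (900, 1000)], 1500)
print(use_days, non_use_days)
```

In this hypothetical example, 730 of the 1500 days fall into wash-out periods and are excluded from both denominators, which is the price paid for avoiding carryover effects.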

## Slide 5: Analysis of Changing Exposure with Person-Time Rates

• Rate for NSAID use = 12.02 per 1000 person-years
• Rate for non-use = 11.86 per 1000 person-years
• Rate ratio = 1.01
• Concluded no evidence that NSAIDs reduced risk of CHD events

- Ray, Lancet, 2002
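The rate ratio on this slide is simply the rate during use divided by the rate during non-use. A short check of the arithmetic, using only the two rates shown on the slide (the variable names are illustrative):

```python
rate_use = 12.02      # per 1000 person-years, from the slide
rate_non_use = 11.86  # per 1000 person-years, from the slide

# Both rates share the same units, so the units cancel in the ratio.
rate_ratio = rate_use / rate_non_use
print(f"rate ratio = {rate_ratio:.2f}")
```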

Ideally, this is a question that should be resolved by a clinical trial, but a clinical trial of this question may never be done. In the absence of a randomized trial, an observational cohort study is the second-best choice. Again, a prospective cohort with better measurements of all the potential confounders, in particular aspirin use, would be preferable, but getting the numbers required would mean following a very large cohort for a number of years, which is possible but very expensive. Analyzing existing data is less desirable, but it does provide an opportunity to assemble a cohort analysis on a large number of persons over many years at minimal expense. The question that remains is whether the investigators were able to get adequate control of confounders.

## Slide 6: Cumulative Incidence Versus Incidence Rate

http://twiki.library.ucsf.edu/twiki/bin/viewfile/CTSpedia/TICRDisOccurIIExposure?rev=1;filename=IncRate_Cumul.JPG

## Slide 7: Assumptions for Survival and Person-Time Analyses

http://twiki.library.ucsf.edu/twiki/bin/viewfile/CTSpedia/TICRDisOccurIIExposure?rev=1;filename=Survival.JPG

Note that the text here uses the headings “Survival Analysis” and “Person-time” for the cumulative incidence and incidence rate approaches to analyzing time-to-event data. This is the graphic from the text that gives the assumptions of both methods. The two methods share the same assumptions, with the exception of how the risk is calculated over intervals.

• Analysis of data from National Cancer Institute’s Follow-up of Diagnoses 1978-1998 (SEER program):
• Overall survival cohort method = 40% at 20 years
• Overall survival with period analysis allowing for temporal trend changes in survival in recent calendar periods = 51% at 20 years

- Brenner, The Lancet, Oct 12, 2002

The longer the follow-up period of an analysis, the greater the threat that changes in the underlying incidence rate of the outcome may be causing an estimate of cumulative incidence to be invalid. In this example, looking at cancer mortality in the nationwide NCI SEER program (Surveillance, Epidemiology, and End Results), the researchers estimated that improvements in survival in more recent years would have resulted in a 20-year cumulative incidence of 51% survival rather than the 40% that the registry data actually show.

## Slide 9: Improving Cancer Survival Times by Calendar Period

http://twiki.library.ucsf.edu/twiki/bin/viewfile/CTSpedia/TICRDisOccurIIExposure?rev=1;filename=SEER.JPG

- Brenner, The Lancet, Oct 12, 2002

This is an example of analyzing data by calendar time cohorts. The analysis of biases in the cumulative incidence estimates was performed by dividing the persons into 5-year cohorts based on the time period of their cancer diagnosis. It uses all cancer types in the SEER database. Done this way, each more recent 5-year cohort has a 5-year shorter total follow-up time, but a clear trend toward overall improved survival appears. The top dotted line for the 1998 cohort is an extrapolation out to 20 years, as there was no follow-up data on this group at the time of the analysis. So the estimate for the overall improvement in 20-year survival is the difference between the extrapolation of the 1998 cohort to 20 years and the actual 20-year follow-up on the 1978 cohort.

## Slide 10: Summary Points

• Person-time incidence rate or density is not the same thing as cumulative incidence and is not a proportion

• Person-time incidence rate can be calculated with individual or average population data
• Allows incidence estimates in large populations that are not completely enumerated
• Allows comparison with population reference rates from other data sources
• Allows accumulation of time at risk for different exposure strata
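The first summary point can be made concrete with a small sketch. It relies on an assumption not stated on the slide: under a constant rate in a closed cohort with no competing risks, cumulative incidence over time t equals 1 - exp(-rate × t). The function name and the numbers are illustrative only.

```python
import math

def cumulative_incidence(rate, time):
    """Risk over `time` given a constant rate (rate and time in matching units)."""
    return 1 - math.exp(-rate * time)

# A rate depends on the time unit chosen and can exceed 1;
# a cumulative incidence is a proportion and cannot.
rate_per_person_year = 2.0
rate_per_person_month = rate_per_person_year / 12

risk_one_year = cumulative_incidence(rate_per_person_year, 1)
risk_twelve_months = cumulative_incidence(rate_per_person_month, 12)

print(f"rate = {rate_per_person_year} per person-year "
      f"= {rate_per_person_month:.3f} per person-month")
print(f"1-year cumulative incidence = {risk_one_year:.3f}")
```

Changing the time unit changes the numerical value of the rate but leaves the one-year cumulative incidence unchanged, which is one way to see that the two measures are different quantities.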

## Slide 11: Problem Set - Disease Occurrence II - Incidence Rates

Q1: Lung cancer death rates

http://twiki.library.ucsf.edu/twiki/bin/viewfile/CTSpedia/TICRDisOccurIIExposure?rev=1;filename=Lung_cancer.JPG

Q1A - Calculate the lung cancer death rate per person-month for the following data using individual level data. The study began in 8/95, ended 5/01, and all subjects still in follow-up were administratively censored on that date. Dates are in mo/yr.

Q1B - Calculate the 95% confidence interval for the lung cancer death rate.

Q1C - Convert the rate to per 100 person-years.
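As a generic illustration of the steps Q1 asks for, here is a sketch on made-up data. The `subjects` list is hypothetical and is NOT the data in the figure, and the confidence interval uses one common approximation (a Wald interval on the log scale, with standard error 1/sqrt(deaths)); with very few events, exact Poisson limits are usually preferred, and the course may expect a different method.

```python
import math

def months_between(start, end):
    """Whole months from start to end, each given as (month, year)."""
    return (end[1] - start[1]) * 12 + (end[0] - start[0])

# Hypothetical subjects, NOT the data in the figure:
# (entry mo/yr, exit mo/yr, died of lung cancer?)
subjects = [
    ((8, 1995), (5, 2001), False),   # administratively censored at study end
    ((1, 1996), (3, 1998), True),
    ((6, 1997), (5, 2001), False),   # administratively censored at study end
    ((8, 1995), (11, 1999), True),
]

deaths = sum(1 for _entry, _exit, died in subjects if died)
person_months = sum(months_between(entry, exit_) for entry, exit_, _died in subjects)
rate = deaths / person_months        # deaths per person-month

# Approximate 95% CI on the log scale: SE(log rate) = 1 / sqrt(deaths).
factor = math.exp(1.96 / math.sqrt(deaths))
ci_low, ci_high = rate / factor, rate * factor

# 12 person-months = 1 person-year, then scale to 100 person-years.
per_100_person_years = rate * 12 * 100

print(f"{deaths} deaths / {person_months} person-months = {rate:.4f} per person-month")
print(f"approximate 95% CI: {ci_low:.4f} to {ci_high:.4f} per person-month")
print(f"= {per_100_person_years:.1f} per 100 person-years")
```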

Q2: The SEER cancer registry recorded 128 cases of gastric cancer in Alameda County in 2001. If you are given the information that the 2000 Census counted 1,385,000 persons living in Alameda County at the end of the year 2000 and also estimated from its Current Population Survey that the county population grew by 2% in 2001, can you calculate gastric cancer incidence in Alameda County for 2001? If yes, what was the incidence? If you believe more information is needed, what would you need to calculate incidence? (Answers: TICR Disease Occurrence II - Answers)

Q3: A researcher at Kaiser Permanente is interested in the effect of Plavix (clopidogrel; an inhibitor of platelet aggregation whose effects on bleeding time typically end within 5 days of discontinuation of the drug) use on the incidence of repeat stroke in persons who have suffered one stroke.

Q3A - Given the Kaiser patient database contains good information on the diagnosis of stroke in Kaiser members and also records prescription drug use, how would you go about estimating the incidence of second stroke in Plavix users and the incidence of second stroke in non-users? (Answers: TICR Disease Occurrence II - Answers)

Q3B - In a separate analysis, the researcher also wants to assess the relationship between oral steroid use (defined as equal to or greater than 5 mg of Prednisone daily or the equivalent dosage of another oral steroid medication) and second stroke. Discuss whether second stroke incidence in steroid users and non-users can be measured using the same method you proposed for Plavix users and non-users. Clinical note: Side effects from steroids can be long-lasting, well after the last dose of steroids has been used. (Answers: TICR Disease Occurrence II - Answers)

Serologic markers of Epstein-Barr virus infection and nasopharyngeal carcinoma in Taiwanese men. N Engl J Med. 2001:1877-82.

BACKGROUND: It is probable but unproven that Epstein-Barr virus (EBV) has a role in nasopharyngeal carcinoma. We determined whether antibodies against EBV are present before the development of nasopharyngeal carcinoma.

METHODS: A total of 9699 men were enrolled between 1984 and 1986. Blood samples were examined for IgA antibodies against EBV capsid antigen and neutralizing antibodies against EBV-specific DNase. During 131,981 person-years of follow-up, 22 pathologically confirmed new cases of nasopharyngeal carcinoma that were diagnosed more than one year after recruitment were ascertained through linkage with the National Cancer Registry of Taiwan.

RESULTS: The cumulative risk of nasopharyngeal carcinoma per 100,000 person-years was 11.2 for subjects who tested positive for neither serologic marker, 45.0 for those who had one marker, and 371.0 for those who had both markers. After adjustment for age and the presence or absence of a family history of nasopharyngeal carcinoma, the relative risk of nasopharyngeal carcinoma was 32.8 for subjects with both markers (95 percent confidence interval, 7.3 to 147.2; P<0.001) and 4.0 for subjects with one marker (95 percent confidence interval, 1.6 to 10.2; P=0.003), as compared with subjects with neither marker.

CONCLUSIONS: IgA antibodies against EBV capsid antigen and neutralizing antibodies against EBV DNase are predictive of nasopharyngeal carcinoma.

Q4A - What kind of incidence measure of nasopharyngeal carcinoma can be derived using the available data in the abstract – cumulative incidence or incidence rate? (Answers: TICR Disease Occurrence II - Answers)

Q4B - Do you agree with the authors’ terminology regarding incidence of nasopharyngeal carcinoma at the beginning of the results section? (Answers: TICR Disease Occurrence II - Answers)

Q5: Read the following text from:

Parity, reproductive factors, and the risk of pancreatic cancer in women.
Skinner HG, Michaud DS, Colditz GA, Giovannucci EL, Stampfer MJ, Willett WC, Fuchs CS. Cancer Epidemiol Biomarkers Prev. 2003:433-8.

We examined incidence rates of pancreatic cancer in a prospective cohort study of women. During follow-up (1976-1998), there were 2.4 million years of person time, and 243 cases of pancreatic cancer were identified.

Clinical note: Virtually all cases of pancreatic cancer get diagnosed prior to death, and, unfortunately, most cases die shortly after diagnosis.

Q5-A: From the information provided above, calculate the incidence of pancreatic cancer in this study population. (Answers: TICR Disease Occurrence II - Answers)

Q5-B: What other information about the study might you want to know in order to determine the validity of the estimate of the incidence of pancreatic cancer? (Answers: TICR Disease Occurrence II - Answers)

Q5-C: If you learned that the study had missed 47 cases of pancreatic cancer among women who had been lost to follow-up, bias would be a concern. What are you concerned could be biased and in which direction? (Answers: TICR Disease Occurrence II - Answers)

Q5-D: If you were the authors of this study, what else could you do to obtain the least biased estimate of the incidence of pancreatic cancer in this study population? (Answers: TICR Disease Occurrence II - Answers)

Q5-E: Do you ultimately think you could come up with an unbiased (or nearly unbiased!) estimate of pancreatic cancer incidence in this study population? (Answers: TICR Disease Occurrence II - Answers)