Title: UCSF - Introduction: Cohort Study Design

Slide 1: Human Subjects Studies

Study design begins with the unit of observation. Observations may be made on individuals, may be grouped summaries of measurements made on individuals (usually the mean or median value of a measurement), or may be measurements that apply to a whole group of individuals (such as air temperature or water supply). Any set of measurements on individuals can be converted to a group measure by taking the mean or another summary, but a group mean cannot be meaningfully converted back to individual measurements, because every person in the group would be assigned the same value.
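As a small illustration of the one-way conversion described above, the sketch below summarizes individual measurements into group means and then tries to go back. The group names and blood pressure values are invented for the example and do not come from any study.

```python
from statistics import mean, median

# Hypothetical individual-level measurements (e.g. systolic blood pressure, mmHg)
# for two invented groups; the numbers are illustrative only.
individual = {
    "group_a": [110, 120, 130, 140],
    "group_b": [100, 125, 125, 150],
}

# Individual -> group: summarizing is always possible.
group_mean = {name: mean(values) for name, values in individual.items()}
group_median = {name: median(values) for name, values in individual.items()}

# Group -> individual: the only option is to give every member the group value,
# which erases all variation between people in the same group.
imputed = {
    name: [group_mean[name]] * len(values)
    for name, values in individual.items()
}

print("group means:", group_mean)
print("original individuals in group_a:", individual["group_a"])
print("individuals rebuilt from the mean:", imputed["group_a"])
```

In this made-up data the two groups have the same mean even though their members differ, which is exactly the information that is lost when only the group measure is available.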

Individual measurements are the gold standard, but ecological studies, the common name for studies that use group measurements, have a role. Associations observed between group variables have often led to individual-level research. Some types of data are available only at the group level, and some variables, such as air quality, apply only to geographic areas.

The danger in looking at associations between variables at the group level is that the association may not hold at the individual level. This is known as the ecological fallacy.
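A toy example may make the ecological fallacy concrete. The sketch below uses invented numbers for three groups (the variables x and y, the group labels, and the helper function pearson are all assumptions made for illustration). It compares the correlation between group means with the correlation among individuals inside each group.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Invented data: x is a predictor and y an outcome measured on individuals.
# Within every group, people with higher x have lower y.
groups = {
    "A": {"x": [1, 2, 3], "y": [3, 2, 1]},
    "B": {"x": [4, 5, 6], "y": [6, 5, 4]},
    "C": {"x": [7, 8, 9], "y": [9, 8, 7]},
}

# Group-level (ecological) analysis: correlate the group means.
mean_x = [sum(g["x"]) / len(g["x"]) for g in groups.values()]
mean_y = [sum(g["y"]) / len(g["y"]) for g in groups.values()]
ecological_r = pearson(mean_x, mean_y)

# Individual-level analysis: correlate x and y within each group.
within_r = {name: pearson(g["x"], g["y"]) for name, g in groups.items()}

print("correlation of group means:", ecological_r)
print("correlation within each group:", within_r)
```

Here the group means are perfectly positively correlated, yet within every group the individual-level association runs in the opposite direction. Real data are rarely this extreme, but the example shows why a group-level association cannot be assumed to hold for individuals.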

Slide 2: Three Keys to Study Design

1 Identify the population that is the Study Base
2 Determine how the experience of the Study Base population will be sampled
3 Consider the timing of measurements relative to the time period of the experience of the Study Base

Our presentation of study design is based on understanding how the three main types of study relate to the concept of a study base. A study base, or reference population, is a defined population whose disease experience during some period of time is the source of the study data. Identifying the study base answers the question: What population gave rise to the disease diagnoses in the study? Understanding the study base concept provides the clearest guidance to understanding valid case-control design, the study design that is most often a cause of confusion.

Sampling is the second key element of study design. Sampling is the process by which individuals belonging to a larger target population are selected for study. Sampling is obvious in some study designs and less obvious in others, such as case-control designs, yet it is the key to understanding a properly designed case-control study.

Measurement of predictor variables and outcome variables is the third key component of study design. There is much confusion around applying the terms retrospective and prospective to study designs. If you focus on when the measurements were made in relation to when the disease outcome was measured or detected, you will avoid confusion about which came first. The timing of the measurements should be looked at separately from the timing of carrying out the study. A study may be carried out after the disease outcomes have occurred but use measurements that were made before they occurred.

Slide 3: Study Base

* The study base is the population who experience the disease outcomes you will observe in your study.
* In a cohort study, the study base is an explicitly defined cohort.
* In a cross-sectional study, the study base is a hypothetical cohort sampled at one point in time.
* In a case-control study, the study base is the cohort, either explicit or hypothetical, that gave rise to the cases.

In a cohort study, the study base is an explicitly defined group of individuals based on some set of characteristics at a given time, called time zero. This group of individuals is then followed forward in time. We will look at cross-sectional and case-control studies in the setting of a cohort because a cohort study explicitly defines its study base as the individuals who are recruited into the cohort. The members of the cohort are the population whose disease experience will be observed during the study follow-up period. Looking at cross-sectional and case-control studies in the setting of a cohort makes clear how they sample (or do not sample!) the cohort study base.
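The relationship between the three designs and the study base can be sketched in code. The example below is only a rough illustration under simplifying assumptions: the cohort is simulated, the field names (exposed, diagnosed_year), the 5-year follow-up, the year-3 snapshot, and the two-controls-per-case ratio are all hypothetical, and the control selection shown is just one simple choice among several used in practice.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical cohort (the study base): everyone enrolled at time zero.
# diagnosed_year is None if the person stayed disease-free for all 5 years.
cohort = [
    {
        "id": i,
        "exposed": rng.random() < 0.3,
        "diagnosed_year": rng.choice([1, 2, 3, 4, 5]) if rng.random() < 0.1 else None,
    }
    for i in range(1000)
]

# Cohort study: the whole study base is followed, so every member is analyzed.
cohort_cases = [p for p in cohort if p["diagnosed_year"] is not None]

# Cross-sectional study: a sample of the study base observed at one time point
# (here year 3); only disease already present at that time is seen.
snapshot = rng.sample(cohort, 200)
prevalent = [
    p for p in snapshot
    if p["diagnosed_year"] is not None and p["diagnosed_year"] <= 3
]

# Case-control study: take the cases that arose in the study base, then sample
# controls from members of the same study base who were not diagnosed.
cases = cohort_cases
non_cases = [p for p in cohort if p["diagnosed_year"] is None]
controls = rng.sample(non_cases, min(len(non_cases), 2 * len(cases)))

print("cohort size:", len(cohort))
print("cases during follow-up:", len(cases))
print("prevalent cases in the year-3 snapshot:", len(prevalent))
print("controls sampled from the study base:", len(controls))
```

The point of the sketch is that the cases and the controls are drawn from the same defined population, so the controls represent the study base that gave rise to the cases.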

By a hypothetical cohort we mean that a cohort study was not performed, but that some possible or hypothetical cohort of individuals among whom the disease diagnoses were made can be defined. That group of individuals could in theory have been enrolled in a cohort study. An example would be the members of the Kaiser Permanente HMO at some point in time. If all the members of Northern California Kaiser Permanente, say, had been enrolled in a cohort study of a particular disease five years ago, measurement of the disease outcome and possible predictors of disease during the past five years would provide cohort study data. Even outside a well defined population like an HMO, there is always a hypothetical cohort among whom the cases would have been diagnosed if they had been enrolled in a cohort study. For example, if some group of persons with the disease diagnosis during a five-year period had been enrolled in a cohort at the beginning of the five-year period along with a sample of their neighbors who were not diagnosed with the disease, they would constitute a cohort. In practice, it may be difficult to identify this kind of less well defined hypothetical cohort.