- Models dependence of the mean of continuous outcome on multiple predictors simultaneously
- By including multiple predictors we can try to
- control confounding of treatment effects by indication, risk factor effects by demographics, other covariates
- examine mediation of treatment, risk factor effects
- assess interaction of treatment effects or exposure with sex, race/ethnicity, genotype, other effect modifiers
- get at causal mechanisms in observational data
- also: account for stratified or multi-center design of RCT, increase precision of estimates

- Systematic:
- how does the average value of outcome y depend on values of the predictors?

- Random:
- at each observed value of the predictors, values of y are distributed about the predicted average
- assumed distribution of deviations underlies hypothesis tests, p-values, and confidence intervals

- In abstract terms, model written as
- E[y|x] = β0 + β1x1 + β2x2 + ··· + βpxp

- E[y|x]: expected or average value of y for a given set of predictors x = x1, x2, ··· , xp
- βj: change in average value of outcome y per unit increase in predictor xj, holding all other predictors constant
- β0 (the intercept): average value of the outcome y when all predictors = 0
- "Linear predictor" common to linear, logistic, Cox, and longitudinal models

- βj: change in average value of outcome y per unit increase in predictor xj, holding all other predictors constant
- Hold x2, ..., xp constant, and let x1 = k:
- E[y|x] = β0 + β1k + β2x2 + ··· + βpxp (1)
- Now increase x1 by one unit to k + 1:
- E[y|x] = β0 + β1(k + 1) + β2x2 + ··· + βpxp (2)
- Subtracting (1) from (2) gives β1, for every value of k as well as x2, ..., xp
- Note: assumes x1 does not interact with x2, ..., xp
- β0: average value of outcome y when all predictors = 0
- Let x1 = x2 = ··· = xp = 0. Then
- E[y|x] = β0 + β1·0 + β2·0 + ··· + βp·0 = β0
- Intercept: where the regression line meets the y-axis in single-predictor models

- Same as in single-predictor model
- For many continuous predictors like age, SBP, LDL, no one has value 0
- Solution: center them on their sample means, so new variable has value 0 for observations at the mean
- For binary predictors, 0 is the usual coding for the reference group, so not a problem for interpretation
- With centering, \xDF0 estimates expected value of y for participant at reference level of binary predictors, mean of centered continuous predictors
- Values and interpretation of other coefficients unaffected
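The effect of centering can be illustrated with a tiny single-predictor fit; the age and SBP values below are invented for illustration:

```python
from statistics import mean

# Toy data (hypothetical ages and SBP values)
x = [45, 50, 55, 60, 65, 70]
y = [118, 121, 125, 130, 133, 138]

def ols_simple(x, y):
    """Intercept and slope for single-predictor OLS."""
    xbar, ybar = mean(x), mean(y)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
         sum((xi - xbar) ** 2 for xi in x)
    return ybar - b1 * xbar, b1

b0, b1 = ols_simple(x, y)
xc = [xi - mean(x) for xi in x]   # centered predictor
b0c, b1c = ols_simple(xc, y)

print(b1, b1c)  # slopes identical
print(b0c)      # intercept now equals mean(y): expected SBP at the mean age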

- Same as in single-predictor model
- Rescaled variable Xrs = X/k
- Coefficient for Xrs interpretable as increase in mean of outcome for a k-unit increase in X
- If k = SD(X), coefficient for Xrs interpretable as increase in mean of outcome for a 1 SD increase in X
- β̂(Xrs) = k·β̂(X); SE(β̂) and 95% CI for β̂ also rescaled
- P-value for βj and the intercept coefficient unaffected
- Can accomplish the same thing using lincom
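A quick check of the rescaling identity β̂(Xrs) = k·β̂(X), using invented data and a pure-Python slope estimate:

```python
from statistics import mean, pstdev

# Invented data for illustration
x = [2.0, 4.0, 5.0, 7.0, 9.0, 12.0]
y = [3.1, 4.9, 6.2, 7.8, 9.5, 12.4]

def slope(xs, ys):
    """OLS slope for a single predictor."""
    xb, yb = mean(xs), mean(ys)
    return sum((a - xb) * (b - yb) for a, b in zip(xs, ys)) / \
           sum((a - xb) ** 2 for a in xs)

k = pstdev(x)                 # rescale by the SD of x
xrs = [a / k for a in x]      # Xrs = X / k
print(slope(xrs, y), k * slope(x, y))  # identical: the per-SD effect
```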

- Outcome yi varies from the average at xi by an amount εi
- ε represents unmeasured sources of variation, error
- As in single-predictor model, four assumptions about ε:

1. normally distributed

2. mean zero at every value of x

3. constant variance

4. statistically independent

- These assumptions underlie hypothesis tests, confidence intervals, p-values, also model checking

- No distributional assumptions (e.g., Normality)
- predictors can be continuous, discrete (e.g., counts), or categorical (dichotomous, nominal, ordinal)
- Linear regression works better if
- predictors are relatively variable
- there are no excessively "influential" points

- Assumed measured without error (otherwise "regression dilution bias" and residual confounding)

- Fitted value: ŷi = β̂0 + β̂1xi1 + ··· + β̂pxip
- estimated average or expected value of outcome y when x = xi, the predictor values for observation i
- now depends on multiple predictors instead of just one
- Residual: ri = yi − ŷi = ε̂i
- difference between data point and fitted value
- sample analogue of εi, used in checking model fit
- not obvious what "vertical" means with multiple predictors
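Fitted values and residuals for a toy single-predictor fit (data invented); a well-known consequence of OLS with an intercept is that the residuals sum to zero:

```python
from statistics import mean

# Invented data for illustration
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

# Closed-form OLS for a single predictor
xbar, ybar = mean(x), mean(y)
b1 = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / \
     sum((a - xbar) ** 2 for a in x)
b0 = ybar - b1 * xbar

yhat = [b0 + b1 * a for a in x]           # fitted values yhat_i
resid = [b - f for b, f in zip(y, yhat)]  # residuals r_i = y_i - yhat_i
print(sum(resid))  # numerically zero
```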

- Method for fitting linear regression models
- OLS finds values of regression coefficients which minimize residual sum of squares (RSS; i.e., sum of squared residuals)
- Good statistical properties: unbiased, efficient, easy to compute, but sensitive to outliers
- For normally distributed outcomes, OLS is equivalent to "maximum likelihood" (method used for logistic, Cox, some repeated measures, many other models)
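That OLS minimizes RSS can be sanity-checked by comparing the closed-form solution against nearby coefficient values (toy data, single predictor for simplicity):

```python
from statistics import mean

# Invented data for illustration
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

def rss(b0, b1):
    """Residual sum of squares for candidate coefficients."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Closed-form OLS solution
xbar, ybar = mean(x), mean(y)
b1_hat = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / \
         sum((a - xbar) ** 2 for a in x)
b0_hat = ybar - b1_hat * xbar

# Any nearby coefficient pair has a larger RSS
assert rss(b0_hat, b1_hat) < rss(b0_hat + 0.1, b1_hat)
assert rss(b0_hat, b1_hat) < rss(b0_hat, b1_hat - 0.1)
print(rss(b0_hat, b1_hat))
```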

- Upper left (ANOVA table)
- Total SS = Σi=1..n (yi − ȳ)²: variability of outcome about the sample average ȳ
- Total MS = Σi=1..n (yi − ȳ)²/(n − 1): sample variance of outcome y
- Model SS = Σi=1..n (ŷi − ȳ)²: variability of outcome accounted for by predictors included in model
- Model MS: numerator of model F-statistic
- Residual SS = Σi=1..n (yi − ŷi)²: residual variability not accounted for by predictors, what OLS minimizes
- Residual MS = Σi=1..n (yi − ŷi)²/(n − p): sample variance of residuals
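The ANOVA decomposition Total SS = Model SS + Residual SS can be verified on a toy fit (invented data, single predictor for brevity):

```python
from statistics import mean

# Invented data for illustration
x = [1, 2, 3, 4, 5, 6]
y = [2.3, 4.1, 5.8, 8.2, 9.9, 12.1]

# Closed-form OLS and fitted values
xbar, ybar = mean(x), mean(y)
b1 = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / \
     sum((a - xbar) ** 2 for a in x)
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * a for a in x]

tss = sum((b - ybar) ** 2 for b in y)                 # Total SS
mss = sum((f - ybar) ** 2 for f in yhat)              # Model SS
rss = sum((b - f) ** 2 for b, f in zip(y, yhat))      # Residual SS
print(tss, mss + rss)  # the two agree
```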

- Multipredictor linear regression is a tool for estimating how the average value of a continuous outcome depends on multiple predictors simultaneously
- Inferential machinery evaluates precision of estimates and whether sampling error can account for findings
- Coefficients generally interpretable as the change in the average value of the outcome per unit increase in the predictor, holding all other predictors constant
- Power helped by effect size, sample size, variability of predictor; hurt by correlation with other predictors, variability left unexplained

- Adjustment can account for some or all of the unadjusted association between a predictor and an outcome
- Controlling confounding is the primary reason for doing multipredictor regression
- Confounders must be associated with predictor and independently with outcome
- Only an association adjusted for confounders can be viewed as possibly causal

- Primary predictor and confounder are correlated:
- values of primary predictor larger in subgroup 2 than subgroup 1
- conversely, those with larger values of primary predictor more likely in subgroup 2

- Both continuous primary predictor and binary confounder independently predict higher values of outcome
- Unadjusted effect of primary predictor partly reflects effect of being in subgroup 2
- Adjustment for the confounder fixes the problem

- Unadjusted estimate for primary predictor (6.2)
- Estimates an observable trend in whole population
- Causal interpretation misleading in most contexts

- Adjusted estimate (3.3) may have a causal interpretation, because the effect of the confounder is not ignored
- Regression lines for subgroups 1 and 2:
- slopes estimate predictor/outcome association within each subgroup ("holding subgroup constant")
- assumed parallel (no interaction - same effect in both subgroups)
- When the primary predictor and confounder are positively correlated, and both predict higher (or lower) values of the outcome, the adjusted coefficient for the primary predictor is attenuated: that is, closer to zero than the unadjusted coefficient; in this case, still non-zero and significant
- Typical pattern for confounding
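The attenuation pattern can be reproduced in a small noise-free example (all numbers invented): the unadjusted slope mixes in the subgroup effect, while the pooled within-subgroup slope of the parallel-lines model recovers the true coefficient:

```python
from statistics import mean

# Hypothetical data: subgroup 2 (g=1) has larger x and higher y
x = [1, 2, 3, 4, 5, 6, 7, 8]
g = [0, 0, 0, 0, 1, 1, 1, 1]
# true model (no noise, for clarity): y = 1 + 0.5*x + 3*g
y = [1 + 0.5 * xi + 3 * gi for xi, gi in zip(x, g)]

def slope(xs, ys):
    xb, yb = mean(xs), mean(ys)
    return sum((a - xb) * (b - yb) for a, b in zip(xs, ys)) / \
           sum((a - xb) ** 2 for a in xs)

# Unadjusted slope lumps the subgroup effect into the x effect
print(slope(x, y))  # larger than the true 0.5

# Adjusted (pooled within-subgroup) slope, as in the parallel-lines model
num = den = 0.0
for grp in (0, 1):
    xs = [a for a, gi in zip(x, g) if gi == grp]
    ys = [b for b, gi in zip(y, g) if gi == grp]
    xb, yb = mean(xs), mean(ys)
    num += sum((a - xb) * (b - yb) for a, b in zip(xs, ys))
    den += sum((a - xb) ** 2 for a in xs)
print(num / den)  # recovers the true within-subgroup slope 0.5
```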

- Confounding can also "mask" an independent association
- Example: needlestick injuries and HIV seroconversion
- overall, AZT prophylaxis does not predict seroconversion, but
- use of AZT associated with severity of injury
- severity of injury predicts seroconversion
- protective effect of AZT unmasked after controlling for severity of injury

- Positively correlated, with opposite effects on outcome:
- Example: injury severity, AZT, and seroconversion
- Negatively correlated, with similar effects on outcome:
- Example: average BMI decreases with age in HERS cohort, but both predict increased SBP

- Average BMI decreases with age in HERS cohort, but both predict increased SBP
- Adjustment for age increases BMI slope estimate from .21 to .30 mmHg per kg/m2
- Negative confounding is not all that uncommon
- Implications for predictor selection: univariate screening, "forward" selection procedures may miss some negatively confounded predictors

- Were all important confounders adjusted for?
- Were they measured accurately?
- Were their effects modeled adequately?
- modeled non-linearities in response to continuous predictors (Session 6)
- no omitted interactions (Session 5)
- no gross extrapolations
- Modeling difficulties used to argue for propensity scores

- Confounders must be associated with predictor and independently with outcome
- Unadjusted, adjusted coefficients estimate different things
- Unadjusted association may be partly or completely explained or, conversely, unmasked after adjustment
- Regression controls for confounding by jointly modeling effects of predictor and confounders (VGSM Sect. 4.4)
- Bigger samples don't help, except by making it easier to adjust
- Controlling for covariates is easy enough, but residual confounding is difficult to rule out

- Confounders are thought to cause the primary predictor, or are correlates of such a cause
- In contrast, mediators are on the causal pathway from primary predictor to the outcome
- In models, mediation and confounding behave alike and must be distinguished on substantive grounds
- Example: to what extent is effect of BMI on SBP mediated by its effects on glucose levels?

- Use a series of models to show that:
- primary predictor independently predicts mediator
- mediator predicts outcome independently of primary predictor
- adjustment for mediator attenuates estimate for primary predictor

- The models:
- regress mediator on predictor and confounders
- regress outcome on predictor and confounders
- regress outcome on predictor, mediator, and confounders

- Interpretation of coefficient estimates for primary predictor:
- before adjustment for mediator: overall effect
- after adjustment: effect, if any, via pathways other than the mediator

- Assess mediation by difference between coefficients for primary predictor before and after adjustment for mediator
- Hypothesis tests, CIs for difference and proportion of effect explained a bit harder (see book for references)
- Example: is association of BMI with SBP mediated by glucose levels?

- BMI independently predicts higher glucose: 1.7 mg/dL (95% CI 1.4-1.9) for each kg/m2 increase in BMI
- A 10 mg/dL increase in glucose levels is independently associated with higher SBP: 0.5 mmHg (95% CI 0.3-0.7)
- Overall BMI effect: before adjustment for glucose levels, each additional kg/m2 predicts an increase of 0.25 mmHg (95% CI 0.12-0.38) in average SBP
- Direct BMI effect via other pathways: after adjustment for glucose levels, each kg/m2 predicts an increase of only 0.16 mmHg (95% CI 0.03-0.30)
- Degree of attenuation (PTE): glucose levels explain (0.25 − 0.16)/0.25 × 100 ≈ 36% of the effect of BMI on SBP
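The proportion-of-treatment-effect (PTE) arithmetic, using the rounded coefficients quoted above:

```python
# Coefficients for BMI from the example in the text (mmHg per kg/m^2)
overall = 0.25   # before adjustment for glucose
direct = 0.16    # after adjustment for glucose

# Proportion of the overall effect explained by the mediator
pte = (overall - direct) / overall * 100
print(round(pte))  # 36 (with these rounded coefficients)
```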

- An observational analysis even when the primary predictor is treatment in an RCT; must control for confounding of mediator effects
- Evidence for mediation potentially stronger in longitudinal data
- but when predictor is both a mediator and a confounder, fancier methods required: e.g., "marginal structural models"

- "Negative" mediation is possible: glitazones, weight, bone loss; HT, statin use, CHD events

- TZDs cause bone loss in mouse models.
- In HABC, TZD use not associated with bone loss, after controlling for confounders by indication
- TZDs also cause weight gain, which is protective against bone loss
- TZDs do predict bone loss after controlling for weight gain: adverse effect emerges after controlling for beneficial effect via weight gain
- In HERS, statin use differentially increased in placebo group, and controlling for this makes HT look a bit protective

- Regression coefficients change when either a confounder or a mediator is added to the model; which is which depends on how you draw the causal arrows (statistics not informative)
- Negative mediation is possible
- Must control for confounders of mediator
- Estimated independent effect of primary predictor
- before adjustment for mediator: overall effect
- after adjustment: direct effect via other pathways

(assuming both models adjust for confounders)

- Positive continuous variables commonly log-transformed
- outcomes: normalize and equalize variance
- predictors: get rid of non-linearity, interaction
- more about this in session 6

- Both log-10 (HIV viral load) and natural log transformations used
- How does this affect interpretation of regression coefficients?

- For natural-log or log-10 transformed predictor xj, β̂j estimates the increase in the mean of the outcome for each 1-log increase in log-transformed xj - equivalently a 2.7-fold or 10-fold increase in untransformed value of xj
- β̂j·ln(1 + k/100) estimates the change in the mean of the outcome for each k% increase in untransformed xj
- Note: p-value for test of βj = 0 unaffected by choice of k
- Use β̂j·log10(1 + k/100) if xj is log10-transformed
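For example, the k% formulas evaluate as follows (the coefficient values are made up for illustration):

```python
import math

# Hypothetical coefficient for a natural-log-transformed predictor
bj = 2.0   # change in mean outcome per 1-unit increase in ln(x_j)

# Effect on the mean outcome of a 10% increase in untransformed x_j
k = 10
print(bj * math.log(1 + k / 100))     # about 0.19

# Same idea if x_j were log10-transformed, with coefficient bj10
bj10 = 2.0
print(bj10 * math.log10(1 + k / 100))
```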

- Use nlcom to get interpretable estimates with confidence interval (lincom does not allow log() as argument)