# Multiple Predictor Linear Model


## Slide 1: Multiple predictor linear regression

• Models dependence of the mean of a continuous outcome on multiple predictors simultaneously
• By including multiple predictors we can try to
  • control confounding of treatment effects by indication, risk factor effects by demographics, other covariates
  • examine mediation of treatment, risk factor effects
  • assess interaction of treatment effects or exposure with sex, race/ethnicity, genotype, other effect modifiers
  • get at causal mechanisms in observational data
  • also: account for stratified or multi-center design of RCT, increase precision of estimates

## Slide 2: Components of the Linear Model

• Systematic:
  • how does the average value of outcome y depend on values of the predictors?
• Random:
  • at each observed value of the predictors, values of y are distributed about the predicted average
  • assumed distribution of deviations underlies hypothesis tests, p-values, and confidence intervals

## Slide 3: Systematic part of the model

• In abstract terms, model written as
  • E[y|x] = β0 + β1x1 + β2x2 + ··· + βpxp
• E[y|x]: expected or average value of y for a given set of predictors x = x1, x2, ..., xp
• βj: change in average value of outcome y per unit increase in predictor xj, holding all other predictors constant
• β0 (the intercept): average value of the outcome y when all predictors = 0
• "Linear predictor" common to linear, logistic, Cox, and longitudinal models
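
The linear predictor is just a dot product. A minimal sketch with invented coefficients (β0 = 80, β1 = 0.5, β2 = 1.2 are illustrative values, not from the lecture's data):

```python
import numpy as np

# Illustrative coefficients (not from the lecture's data)
beta = np.array([80.0, 0.5, 1.2])   # beta0 (intercept), beta1, beta2

# Predictor vector with a leading 1 so the intercept is picked up
x = np.array([1.0, 10.0, 2.0])      # x1 = 10, x2 = 2

# E[y|x] = beta0 + beta1*x1 + beta2*x2
expected_y = x @ beta
print(expected_y)  # 80 + 0.5*10 + 1.2*2 = 87.4
```

The same dot-product form carries over unchanged to the logistic, Cox, and longitudinal models mentioned above; only the link between the linear predictor and the outcome changes.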

## Slide 4: Interpretation of regression coefficients

• βj: change in average value of outcome y per unit increase in predictor xj, holding all other predictors constant
• Hold x2, ..., xp constant, and let x1 = k:
  • E[y|x] = β0 + β1k + β2x2 + ··· + βpxp   (1)
• Now increase x1 by one unit to k + 1:
  • E[y|x] = β0 + β1(k + 1) + β2x2 + ··· + βpxp   (2)
• Subtracting (1) from (2) gives β1, for every value of k as well as x2, ..., xp
• Note: assumes x1 does not interact with x2, ..., xp
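
The subtraction argument in (1) and (2) can be checked numerically; the coefficients here are arbitrary illustrative values:

```python
import numpy as np

beta = np.array([80.0, 0.5, 1.2])   # illustrative beta0, beta1, beta2
x_k  = np.array([1.0, 10.0, 2.0])   # x1 = k = 10, x2 held at 2
x_k1 = np.array([1.0, 11.0, 2.0])   # x1 = k + 1, x2 unchanged

# (2) minus (1): everything cancels except beta1
diff = x_k1 @ beta - x_k @ beta
print(diff)  # beta1 = 0.5, for any k and any fixed x2
```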

## Slide 5: Interpretation of regression coefficients

• β0: average value of outcome y when all predictors = 0
• Let x1 = x2 = ··· = xp = 0. Then E[y|x] = β0 + β1x1 + β2x2 + ··· + βpxp = β0
• Intercept: where the regression line meets the y-axis in single-predictor models

## Slide 6: Review: centering predictors

• Same as in single-predictor model
• For many continuous predictors like age, SBP, LDL, no one has value 0
• Solution: center them on their sample means, so new variable has value 0 for observations at the mean
• For binary predictors, 0 is the usual coding for the reference group, so not a problem for interpretation
• With centering, \xDF0 estimates expected value of y for participant at reference level of binary predictors, mean of centered continuous predictors
• Values and interpretation of other coefficients unaffected
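
A quick simulated check (simulated data, not the lecture's cohort) that centering moves the intercept to the prediction at the mean age without touching the slope:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
age = rng.uniform(40, 80, n)                  # nobody has age 0
sbp = 100 + 0.6 * age + rng.normal(0, 5, n)   # simulated outcome

def ols(x, y):
    """OLS fit of y on an intercept plus one predictor."""
    X = np.column_stack([np.ones(len(y)), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

b_raw = ols(age, sbp)
b_ctr = ols(age - age.mean(), sbp)

print(b_raw[1], b_ctr[1])   # identical slopes
print(b_ctr[0])             # intercept = fitted outcome at the mean age
```

Because OLS residuals sum to zero when an intercept is included, the centered intercept equals the sample mean of the outcome.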

## Slide 7: Review: rescaling predictors

• Same as in single-predictor model
• Rescaled variable Xrs = X/k
• Coefficient for Xrs interpretable as increase in mean of outcome for a k-unit increase in X
• If k = SD(X), coefficient for Xrs interpretable as increase in mean of outcome for a 1 SD increase in X
• β̂(Xrs) = k·β̂(X); SE(β̂) and 95% CI for β̂ also rescaled
• P-value for β, intercept coefficient unaffected
• Can accomplish the same thing using lincom
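
A simulated sanity check of the rescaling rule β̂(Xrs) = k·β̂(X), here with k = SD(X) (data and coefficients are invented):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.normal(50, 10, n)
y = 3 + 0.4 * x + rng.normal(0, 2, n)   # simulated data

def slope(x, y):
    X = np.column_stack([np.ones(len(y)), x])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

k = x.std()                # rescale by the sample SD
b_x  = slope(x, y)         # change in mean outcome per 1-unit increase in x
b_rs = slope(x / k, y)     # change in mean outcome per 1-SD increase in x

print(np.isclose(b_rs, k * b_x))  # True
```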

## Slide 8: Random part of the model

yi = E[y|xi] + εi

• Outcome yi varies from the average at xi by an amount εi
• ε represents unmeasured sources of variation, error
• As in the single-predictor model, four assumptions about ε:
  1. Normally distributed
  2. mean zero at every value of x
  3. constant variance
  4. statistically independent
• These assumptions underlie hypothesis tests, confidence intervals, p-values, also model checking

## Slide 9: Assumptions about the predictors

• No distributional assumptions (e.g. Normality)
  • predictors can be continuous, discrete (e.g. counts), categorical (dichotomous, nominal, ordinal)
• Linear regression works better if
  • predictors are relatively variable
  • there are no excessively "influential" points
• Assumed measured without error (otherwise "regression dilution bias" and residual confounding)

## Slide 10: Update of two details

• Fitted value: ŷi = β̂0 + β̂1xi1 + ··· + β̂pxip - estimated average or expected value of outcome y when x = xi, the predictor values for observation i
  • now depends on multiple predictors instead of just one
• Residual: ri = yi − ŷi = ε̂i
  • difference between data point and fitted value
  • sample analogue of εi, used in checking model fit
  • not obvious what "vertical" means with multiple predictors

## Slide 11: Ordinary least squares (OLS)

• Method for fitting linear regression models
• OLS finds values of regression coefficients which minimize the residual sum of squares (RSS; i.e. sum of squared residuals)
• Good statistical properties: unbiased, efficient, easy to compute, but sensitive to outliers
• For normally distributed outcomes, OLS is equivalent to "maximum likelihood" (method used for logistic, Cox, some repeated measures, and many other models)
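
The minimizing property can be demonstrated directly on simulated data: the least-squares solution has a smaller RSS than any other coefficient vector (here an arbitrarily perturbed copy of it):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])     # intercept + 2 predictors
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(0, 1, n)       # simulated outcome

# OLS via least squares: minimizes the residual sum of squares
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
rss = np.sum((y - X @ beta_hat) ** 2)

# Any other coefficient vector gives a larger RSS
rss_other = np.sum((y - X @ (beta_hat + 0.1)) ** 2)
print(rss < rss_other)  # True
```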

## Slide 12: Multi-predictor linear model for glucose


• Upper left (ANOVA table); sums run over observations i = 1, ..., n
  • Total SS = Σ(yi − ȳ)²: variability of outcome yi about the sample average ȳ
  • Total MS = Σ(yi − ȳ)²/(n − 1): sample variance of outcome y
  • Model SS = Σ(ŷi − ȳ)²: variability of outcome accounted for by predictors included in model
  • Model MS: numerator of model F-statistic
  • Residual SS = Σ(yi − ŷi)²: residual variability not accounted for by predictors, what OLS minimizes
  • Residual MS = Σ(yi − ŷi)²/(n − p): sample variance of residuals
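
The ANOVA decomposition Total SS = Model SS + Residual SS can be verified on simulated data (here p counts the intercept plus two predictors, matching the n − p divisor above):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 120, 3   # p = intercept + 2 predictors
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.array([5.0, 1.0, -2.0]) + rng.normal(0, 2, n)   # simulated outcome

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
yhat = X @ beta

tss = np.sum((y - y.mean()) ** 2)      # Total SS
mss = np.sum((yhat - y.mean()) ** 2)   # Model SS
rss = np.sum((y - yhat) ** 2)          # Residual SS

print(np.isclose(tss, mss + rss))      # True: Total SS = Model SS + Residual SS
print(rss / (n - p))                   # Residual MS
```

The decomposition holds exactly whenever the model includes an intercept.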

## Slide 13: Interpreting Stata regression output


## Slide 14: Summary of model

• Multipredictor linear regression is a tool for estimating how the average value of a continuous outcome depends on multiple predictors simultaneously
• Inferential machinery evaluates precision of estimates and whether sampling error can account for findings
• Coefficients generally interpretable as the change in the average value of the outcome per unit increase in the predictor, holding all other predictors constant
• Power helped by effect size, sample size, variability of predictor; hurt by correlation with other predictors, variability left unexplained

## Slide 15: Confounding

• Confounding can account for some or all of the unadjusted association between a predictor and an outcome
• Controlling confounding is the primary reason for doing multi-predictor regression
• Confounders must be associated with the predictor and independently with the outcome
• Only an association adjusted for confounders can be viewed as possibly causal

## Slide 18: Primary predictor, confounder, and outcome


• Primary predictor and confounder are correlated:
  • values of primary predictor larger in subgroup 2 than subgroup 1
  • conversely, those with larger values of primary predictor more likely in subgroup 2
• Both the continuous primary predictor and the binary confounder independently predict higher values of outcome
• Unadjusted effect of primary predictor partly reflects effect of being in subgroup 2
• Adjustment for the confounder fixes the problem

## Slide 19: Interpretation of results

• Estimates an observable trend in the whole population
• Causal interpretation misleading in most contexts
• Adjusted estimate (3.3) may have a causal interpretation, because the effect of the confounder is not ignored
• Regression lines for subgroups 1 and 2:
  • slopes estimate predictor/outcome association within each subgroup ("holding subgroup constant")
  • assumed parallel (no interaction - same effect in both subgroups)
• Behavior of regression coefficients for this case:
  • when the primary predictor and confounder are positively correlated, and both predict higher (or lower) values of the outcome, the adjusted coefficient for the primary predictor is attenuated: that is, closer to zero than the unadjusted coefficient; in this case, still non-zero and significant
  • typical pattern for confounding

## Slide 20: Another case: so-called negative confounding

• Confounding can also "mask" an independent association
• Example: needlestick injuries and HIV-seroconversion
  • overall, AZT prophylaxis does not predict seroconversion, but
    • use of AZT associated with severity of injury
    • severity of injury predicts seroconversion
  • protective effect of AZT unmasked after controlling for severity of injury

## Slide 21: Negative confounding: two scenarios

Negative confounding may arise between predictors that are
1. Positively correlated, with opposite effects on outcome:
   Example: injury severity, AZT, and seroconversion
2. Negatively correlated, with similar effects on outcome:
   Example: average BMI decreases with age in HERS cohort, but both predict increased SBP

## Slide 22: Summary: negative confounding

• Average BMI decreases with age in HERS cohort, but both predict increased SBP
• Adjustment for age increases BMI slope estimate from .21 to .30 mmHg per kg/m2
• Negative confounding is not all that uncommon
• Implications for predictor selection: univariate screening, "forward" selection procedures may miss some negatively confounded predictors
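
A simulation in the spirit of the BMI/age example: the true BMI and age effects (0.30 and 0.14) and all other numbers below are invented to roughly reproduce the .21 → .30 pattern, and are not the HERS estimates:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5000
age = rng.normal(67, 7, n)
bmi = 45 - 0.25 * age + rng.normal(0, 4, n)                # BMI falls with age
sbp = 80 + 0.30 * bmi + 0.14 * age + rng.normal(0, 5, n)   # both raise SBP

def coef(y, *xs):
    X = np.column_stack([np.ones(len(y))] + list(xs))
    return np.linalg.lstsq(X, y, rcond=None)[0]

unadj = coef(sbp, bmi)[1]        # BMI alone: effect partly masked by age
adj   = coef(sbp, bmi, age)[1]   # adjusting for age recovers the larger slope
print(unadj, adj)                # adjusted slope exceeds unadjusted slope
```

This also illustrates the predictor-selection point: a univariate screen on BMI alone would understate its association with SBP.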

## Slide 23: Confounding is difficult to rule out

• Were all important confounders adjusted for?
• Were they measured accurately?
• Were their effects modeled adequately?
  • modeled non-linearities in response to continuous predictors (Session 6)
  • no omitted interactions (Session 5)
  • no gross extrapolations
• Modeling difficulties used to argue for propensity scores

## Slide 24: Summary

• Confounders must be associated with predictor and independently with outcome
• Regression controls for confounding by jointly modeling effects of predictor and confounders (VGSM Sect. 4.4)
• Bigger samples don't help, except by making it easier to adjust
• Controlling for covariates is easy enough, but residual confounding is difficult to rule out

## Slide 25: Causal diagrams: mediation

• Confounders are thought to cause the primary predictor, or are correlates of such a cause
• In contrast, mediators are on the causal pathway from primary predictor to the outcome
• In models, mediation and confounding behave alike and must be distinguished on substantive grounds
• Example: to what extent is effect of BMI on SBP mediated by its effects on glucose levels?

## Slide 26: Examining mediation

• Use a series of models to show that:
  • primary predictor independently predicts mediator
  • mediator predicts outcome independently of primary predictor
  • adjustment for mediator attenuates estimate for primary predictor
• The models:
  • regress mediator on predictor and confounders
  • regress outcome on predictor and confounders
  • regress outcome on predictor, mediator, and confounders
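
The three-model recipe can be sketched on simulated data. All coefficients below are invented for illustration (the 1.7 glucose-per-BMI slope echoes the BMI/glucose/SBP example, but the rest is made up), and confounders are omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5000
bmi = rng.normal(27, 4, n)                                    # primary predictor
glucose = 60 + 1.7 * bmi + rng.normal(0, 10, n)               # mediator depends on BMI
sbp = 100 + 0.5 * glucose + 0.3 * bmi + rng.normal(0, 8, n)   # invented direct + mediated effects

def coef(y, *xs):
    X = np.column_stack([np.ones(len(y))] + list(xs))
    return np.linalg.lstsq(X, y, rcond=None)[0]

a = coef(glucose, bmi)[1]             # model 1: predictor predicts the mediator
overall = coef(sbp, bmi)[1]           # model 2: overall effect of the predictor
direct  = coef(sbp, bmi, glucose)[1]  # model 3: effect via other pathways

print(a, overall, direct)  # adjustment for the mediator attenuates the BMI slope
```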

## Slide 27: Mediation

• Interpretation of coefficient estimates for primary predictor:
  • before adjustment for mediator: overall effect
  • after adjustment: effect, if any, via pathways other than the mediator
• Assess mediation by difference between coefficients for primary predictor before and after adjustment for mediator
• Hypothesis tests, CIs for difference and proportion of effect explained a bit harder (see book for references)
• Example: is association of BMI with SBP mediated by glucose levels?

## Slide 28: Mediation of BMI by glucose levels

• BMI independently predicts higher glucose: 1.7 mg/dL (95% CI 1.4-1.9) for each kg/m2 increase in BMI
• A 10 mg/dL increase in glucose levels is independently associated with higher SBP: 0.5 mmHg (95% CI 0.3-0.7)
• Overall BMI effect: before adjustment for glucose levels, each additional kg/m2 predicts an increase of .25 mmHg (95% CI 0.12-0.38) in average SBP
• Direct BMI effect via other pathways: after adjustment for glucose levels, each kg/m2 predicts an increase of only .16 mmHg (95% CI 0.03-0.30)
• Degree of attenuation (PTE): glucose levels explain (.25 − .16)/.25 × 100 = 36% of the effect of BMI on SBP (using the rounded estimates shown)
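
The PTE arithmetic with the rounded coefficients shown above (rounding of the displayed coefficients explains any small discrepancy with a PTE computed from the unrounded estimates):

```python
overall, adjusted = 0.25, 0.16   # rounded BMI slopes before/after adjustment
pte = (overall - adjusted) / overall * 100
print(round(pte, 1))   # percent of BMI effect on SBP explained by glucose
```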

## Slide 29: Mediation issues

• An observational analysis even when the primary predictor is treatment in an RCT; must control for confounding of mediator effects
• Evidence for mediation potentially stronger in longitudinal data
• but when predictor is both a mediator and a confounder, fancier methods required: e.g., "marginal structural models"
• "Negative" mediation is possible: glitazones, weight, bone loss; HT, statin use, CHD events

## Slide 30: Negative mediation

• TZDs cause bone loss in mouse models.
• In HABC, TZD use not associated with bone loss, after controlling for confounding by indication
• TZDs also cause weight gain, which is protective against bone loss
• TZDs do predict bone loss, after controlling for weight gain: adverse effect emerges after controlling for beneficial effect via weight gain
• In HERS, statin use differentially increased in placebo group, and controlling for this makes HT look a bit protective

## Slide 31: Summary: mediation

• Regression coefficients change when either a confounder or a mediator is added to the model; which is which depends on how you draw the causal arrows (statistics not informative)
• Negative mediation is possible
• Must control for confounders of mediator
• Estimated independent effect of primary predictor:
  • before adjustment for mediator: overall effect
  • after adjustment: direct effect via other pathways

## Slide 32: Interpreting results for log-transformed variables

• Positive continuous variables commonly log-transformed
  • outcomes: normalize and equalize variance
  • predictors: get rid of non-linearity, interaction