Section 9C - Logistic Regression and Propensity Scores
Propensity
In a randomized trial, we know the probability (“propensity”) that a person with a given set of covariates or risk factors (age, gender, ...) will be in group A or B. With 50:50 allocation, the probability of being in A (or B) is 50% whatever the covariate values. Everyone has the SAME propensity and, on average, the covariates are the same in A and B. These two facts are linked.
Controlling for confounding
In comparing group A to B in a non-randomized study, one may have confounding, as the risk factors are not necessarily balanced between the two groups.
One option to control for confounding is to include all the potential covariates in a multivariate model. If there are only a few covariates, another option is to make strata. Within a stratum, there would be no association between treatment (A or B) and covariates. For example, if gender and smoking were the only risk factors, one could compare A to B in male smokers, female smokers, male non smokers and female non smokers.
However, if we knew the probability of each person being assigned to treatment A (= 1 - probability of assignment to B), one can show that if one stratifies or matches on this probability (this propensity), the average values of the covariates within each stratum (or each match) are (at least) roughly the same between the two treatments! That is, it is not necessary to use all the covariates directly to make (too many) strata or matches.
Among those with the same propensity, the treated and comparison groups have the same (or a very similar) distribution of covariate values.
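The balancing idea can be checked with a small exact calculation. The sketch below uses a made-up population (the counts and the shared propensity of 0.30 are assumptions for illustration, not data from these notes) in which two different covariate patterns happen to have the same propensity. The expected covariate mix then comes out identical in the two treatment groups.

```python
from fractions import Fraction

# Hypothetical population: (covariate pattern, count, propensity for treatment A).
# Two different covariate patterns share the same propensity of 0.30.
cells = [
    ("male smoker", 200, Fraction(3, 10)),
    ("female non smoker", 100, Fraction(3, 10)),
]

def expected_share(cells, label, treated):
    """Expected share of `label` among treated (A) or comparison (B) subjects."""
    def expected_n(count, p):
        # Expected number in A is count*p; in B it is count*(1 - p).
        return count * (p if treated else 1 - p)

    total = sum(expected_n(n, p) for _, n, p in cells)
    target = sum(expected_n(n, p) for name, n, p in cells if name == label)
    return target / total

share_a = expected_share(cells, "male smoker", treated=True)
share_b = expected_share(cells, "male smoker", treated=False)
print(share_a, share_b)
```

Because the propensity is constant within the stratum, it cancels from numerator and denominator, so the share of each covariate pattern is the same in A and in B.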
Example
| smoker | Non smoker |
male | 30% | 53% |
female | 12% | 85% |
Percent choosing treatment A by covariates
Stratifying on propensity (percent on treatment A) creates four strata.
Logistic for estimating propensity
While we do not know the probability of assignment to A (and B), we can model it using logistic regression. Here the “outcome” is treatment group (A or B) and the potential confounders are the predictors. We can then use the logit score to “summarize” all the covariates in a single score, and either make strata or matches using this score or use it as a single continuous covariate. We do not have to be concerned about whether this model for A or B is “correct” or has any meaning, as long as the strata made in this way produce balance.
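As a minimal sketch of "treatment group as the outcome", the code below fits a one-covariate logistic regression by plain gradient ascent, using only the standard library. The data are hypothetical (10 non coffee drinkers with 3 on NEW, 10 coffee drinkers with 7 on NEW), and the fitting routine `fit_logistic` is an illustrative helper, not the software used for the results in these notes. In practice one would use a statistics package.

```python
import math

# Hypothetical data: x = 1 if coffee drinker else 0; z = 1 if on the NEW treatment.
data = [(0, 1)] * 3 + [(0, 0)] * 7 + [(1, 1)] * 7 + [(1, 0)] * 3

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def fit_logistic(data, lr=0.5, n_iter=5000):
    """Fit logit P(z=1 | x) = b0 + b1*x by gradient ascent on the mean log likelihood."""
    b0 = b1 = 0.0
    n = len(data)
    for _ in range(n_iter):
        g0 = g1 = 0.0
        for x, z in data:
            resid = z - sigmoid(b0 + b1 * x)  # observed minus fitted probability
            g0 += resid
            g1 += resid * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

b0, b1 = fit_logistic(data)
# Estimated propensity for each covariate value.
propensity = {x: sigmoid(b0 + b1 * x) for x in (0, 1)}
print(round(b0, 3), round(b1, 3), propensity)
```

With a single binary predictor the fitted propensities simply reproduce the observed proportions on NEW in each covariate group, which is a convenient check that the fit behaves as expected.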
Example: tooth whitening
Is a new treatment for "whiter teeth" better than the standard treatment? Sample of n=350 people.
t test - comparing mean gray scale scores (high is bad)
Unadjusted scores - observational study
This is not a randomized trial
Group | n | mean | SD | SEM |
STD | 208 | 39.45 | 24.1 | 1.67 |
NEW | 142 | 42.51 | 20.8 | 1.75 |
Mean difference | | 3.06 | | 2.49 |
t=-1.23, p=0.219
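The unadjusted comparison can be reproduced from the summary statistics. The notes do not say which form of the t test was used; the sketch below assumes the pooled-variance (equal variance) version, which gives a standard error close to the reported 2.49. The small discrepancy is consistent with rounding of the SDs.

```python
import math

# Summary statistics from the unadjusted table.
n_std, mean_std, sd_std = 208, 39.45, 24.1
n_new, mean_new, sd_new = 142, 42.51, 20.8

diff = mean_new - mean_std  # NEW minus STD

# Pooled variance (assumption: equal-variance two-sample t test).
pooled_var = ((n_std - 1) * sd_std**2 + (n_new - 1) * sd_new**2) / (n_std + n_new - 2)
se_diff = math.sqrt(pooled_var * (1 / n_std + 1 / n_new))
t_stat = diff / se_diff

print(round(diff, 2), round(se_diff, 2), round(t_stat, 2))
```

The sign of t depends only on the order of subtraction (STD minus NEW gives the negative value shown in the table).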
Covariate comparison - not the same
| STD, n=208 | | | NEW, n=142 | | | |
variable | mean | SD | SEM | mean | SD | SEM | p value |
age | 22.36 | 6.47 | 0.45 | 24.4 | 6.33 | 0.53 | 0.004 |
Sugar use | 6.10 | 3.08 | 0.21 | 5.84 | 3.06 | 0.26 | 0.435 |
| | | | | |
| STD, n=208 | | NEW, n=142 | | |
variable | PCT | SE | PCT | SE | p value |
Male | 28.4% | 3.1% | 47.2% | 4.2% | 0.0003 |
Floss | 28.9% | 3.1% | 35.9% | 4.0% | 0.1629 |
Yearly clean | 31.7% | 3.2% | 32.4% | 3.9% | 0.8960 |
drink coffee | 42.3% | 3.4% | 74.7% | 3.7% | <0.0001 |
drink tea | 30.8% | 3.2% | 62.7% | 4.1% | <0.0001 |
use mouthwash | 22.1% | 2.9% | 25.4% | 3.7% | 0.4827 |
Logistic model for “new tx”-propensity
variable | Log OR | SE | p value |
Intercept | -1.798 | 0.5417 | 0.0009 |
Age | 0.0214 | 0.0196 | 0.2744 |
Male | 0.3898 | 0.2559 | 0.1277 |
Floss | 0.3280 | 0.2601 | 0.2073 |
Yearly clean | -0.0543 | 0.2556 | 0.8319 |
Sugar use | -0.0401 | 0.0393 | 0.3078 |
Coffee | 0.9042 | 0.2767 | 0.0011 |
Tea | 0.8681 | 0.2570 | 0.0007 |
mouthwash | -0.1009 | 0.2844 | 0.7228 |
Logit Score = -1.798 + 0.0214 Age + 0.3898 Male + 0.3280 Floss – 0.0543 Y clean – 0.0401 Sugar + 0.9042 coffee + 0.8681 tea – 0.1009 mouthwash
Propensity (“new tx”) = exp(score) / [1 + exp(score)] = P(x)
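To make the two formulas concrete, the sketch below evaluates the logit score and the propensity for one person. The coefficients are the log odds ratios from the table above; the person's covariate values are hypothetical, and the coding (age in years, 0/1 indicators, sugar use in its recorded units) is an assumption.

```python
import math

# Log odds ratios from the fitted propensity model.
COEF = {
    "intercept": -1.798, "age": 0.0214, "male": 0.3898, "floss": 0.3280,
    "yearly_clean": -0.0543, "sugar": -0.0401, "coffee": 0.9042,
    "tea": 0.8681, "mouthwash": -0.1009,
}

def logit_score(person):
    """Linear predictor: intercept plus sum of coefficient * covariate value."""
    return COEF["intercept"] + sum(COEF[name] * value for name, value in person.items())

def propensity(score):
    """P(x) = exp(score) / [1 + exp(score)]."""
    return math.exp(score) / (1.0 + math.exp(score))

# Hypothetical person: 22 year old male who drinks coffee and tea.
person = {"age": 22, "male": 1, "floss": 0, "yearly_clean": 0,
          "sugar": 6, "coffee": 1, "tea": 1, "mouthwash": 0}

score = logit_score(person)
p = propensity(score)
print(round(score, 4), round(p, 3))
```

For this person the score is about 0.594 and the propensity about 0.64, so he would fall in the highest of the propensity strata.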
Make strata (or could match)
stratum | score | STD n | NEW n | total n |
1 | 0.0-0.2 | 83 | 4 | 87 |
2 | 0.2-0.4 | 49 | 39 | 88 |
3 | 0.4-0.6 | 38 | 50 | 88 |
4 | 0.6-1.0 | 38 | 49 | 87 |
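Assigning each person to a stratum is a simple lookup against the cutpoints. The sketch below uses the cutpoints 0.2, 0.4 and 0.6 from the table. How a propensity exactly equal to a cutpoint is handled is not stated in the notes; here it is assumed to go to the higher stratum.

```python
from bisect import bisect_right

# Upper boundaries between strata 1|2, 2|3 and 3|4 (from the table above).
CUTPOINTS = [0.2, 0.4, 0.6]

def assign_stratum(p):
    """Return the stratum (1 to 4) for propensity p; boundary values go up (assumption)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("propensity must be between 0 and 1")
    return bisect_right(CUTPOINTS, p) + 1

for p in (0.10, 0.30, 0.50, 0.90):
    print(p, assign_stratum(p))
```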
Covariate compare by propensity strata
| | mean age | | |
tx | stratum 1 | stratum 2 | stratum 3 | stratum 4 |
STD | 18.0 | 24.8 | 25.5 | 25.6 |
NEW | 25.2 | 23.5 | 23.7 | 25.8 |
p value | 0.0668 | 0.2648 | 0.1696 | 0.8743 |
| | | | |
| | mean sugar use | | |
tx | stratum 1 | stratum 2 | stratum 3 | stratum 4 |
STD | 6.55 | 5.63 | 6.05 | 5.76 |
NEW | 7.62 | 6.66 | 5.55 | 5.33 |
p value | 0.4616 | 0.1587 | 0.3865 | 0.5455 |
| | | | |
| | pct male | | |
tx | stratum 1 | stratum 2 | stratum 3 | stratum 4 |
STD | 3.6% | 24.5% | 44.7% | 71.1% |
NEW | 0.0% | 30.8% | 46.0% | 65.3% |
p value | 0.078 | 0.514 | 0.906 | 0.566 |
| | | | |
| | pct who floss | | |
tx | stratum 1 | stratum 2 | stratum 3 | stratum 4 |
STD | 20.5% | 34.7% | 26.3% | 42.1% |
NEW | 25.0% | 23.1% | 30.0% | 53.1% |
p value | 0.838 | 0.225 | 0.702 | 0.307 |
Covariate compare by propensity strata
| | pct yearly tooth clean | | |
tx | stratum 1 | stratum 2 | stratum 3 | stratum 4 |
STD | 26.5% | 40.8% | 34.2% | 28.9% |
NEW | 75.0% | 25.6% | 32.0% | 34.7% |
p value | 0.070 | 0.126 | 0.827 | 0.566 |
| | | | |
| | pct drink coffee | | |
tx | stratum 1 | stratum 2 | stratum 3 | stratum 4 |
STD | 0.0% | 34.7% | 86.8% | 100.0% |
NEW | 0.0% | 46.2% | 78.0% | 100.0% |
p value | 1.000 | 0.274 | 0.271 | 1.000 |
| | | | |
| | pct drink tea | | |
tx | stratum 1 | stratum 2 | stratum 3 | stratum 4 |
STD | 0.0% | 8.2% | 57.9% | 100.0% |
NEW | 0.0% | 25.6% | 60.0% | 100.0% |
p value | 1.000 | 0.040 | 0.842 | 1.000 |
| | | | |
| | pct use mouthwash | | |
tx | stratum 1 | stratum 2 | stratum 3 | stratum 4 |
STD | 19.3% | 14.3% | 28.9% | 31.6% |
NEW | 50.0% | 25.6% | 16.0% | 32.7% |
p value | 0.226 | 0.186 | 0.150 | 0.915 |
Gray scale means by propensity strata (quartiles)
| STD | STD | NEW | NEW | | | | |
stratum | n | mean | n | mean | total n | mean difference | p value | score |
1 | 83 | 21.3 | 4 | 27.5 | 87 | 6.2 | 0.5304 | 0-0.2 |
2 | 49 | 43.9 | 39 | 36.9 | 88 | -7.0 | 0.0915 | 0.2-0.4 |
3 | 38 | 53.9 | 50 | 40.6 | 88 | -13.3 | 0.0014 | 0.4-0.6 |
4 | 38 | 58.9 | 49 | 50.2 | 87 | -8.7 | 0.0358 | 0.6+ |
total n | 208 | | 142 | | 350 | | | |
adjusted mean | | 44.5 | | 38.8 | | -5.7 | 0.06 | |
unadjusted mean | | 39.4 | | 42.5 | | 3.1 | 0.21 | |
adj mean, strata 2,3,4 | | 52.2 | | 42.5 | | -9.7 | | |
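The adjusted means in the table are consistent with weighting each stratum mean by the stratum's total n (STD plus NEW), that is, direct standardization to the combined sample. The sketch below reproduces them from the stratum rows. The weighting scheme is inferred from the numbers, not stated in the notes.

```python
# (n STD, mean STD, n NEW, mean NEW) for strata 1 to 4, from the table above.
strata = [
    (83, 21.3, 4, 27.5),
    (49, 43.9, 39, 36.9),
    (38, 53.9, 50, 40.6),
    (38, 58.9, 49, 50.2),
]

def adjusted_means(strata):
    """Weight each stratum mean by the stratum's total n (STD + NEW)."""
    total = sum(n_s + n_n for n_s, _, n_n, _ in strata)
    adj_std = sum((n_s + n_n) * m_s for n_s, m_s, n_n, _ in strata) / total
    adj_new = sum((n_s + n_n) * m_n for n_s, _, n_n, m_n in strata) / total
    return adj_std, adj_new

adj_std, adj_new = adjusted_means(strata)          # all four strata
adj_std_234, adj_new_234 = adjusted_means(strata[1:])  # strata 2, 3, 4 only

print(round(adj_std, 1), round(adj_new, 1), round(adj_new - adj_std, 1))
print(round(adj_std_234, 1), round(adj_new_234, 1), round(adj_new_234 - adj_std_234, 1))
```

Dropping stratum 1, where only 4 people chose NEW, moves the adjusted difference from about -5.7 to about -9.7.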
Propensity score as continuous covariate - Regression on gray scale
variable | Regression coefficient | SE | p value |
Intercept | 52.57 | 1.69 | < 0.0001 |
New tx | -9.77 | 2.31 | < 0.0001 |
Logit score | 17.56 | 1.43 | < 0.0001 |
New tx * logit score | -7.94 | 2.76 | 0.0042 |
R square = 0.328, SDe = 18.8
Q- If the propensity score is a good proxy for the 8 covariates, what should happen if any or all of the 8 covariates are added to the above model?
Propensity score as continuous covariate
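As the propensity to choose the NEW treatment increases, the mean difference between the two treatments grows in magnitude: the interaction coefficient is -7.94, so NEW scores lower than STD by more at higher logit scores.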
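With the interaction term in the model, the estimated treatment difference (NEW minus STD) at a given logit score is the "New tx" coefficient plus the interaction coefficient times the score. A minimal sketch using the coefficients from the regression table; the logit scores plugged in are arbitrary illustration values.

```python
# Coefficients from the regression of gray scale on treatment and logit score.
B_TX = -9.77           # New tx
B_INTERACTION = -7.94  # New tx * logit score

def tx_effect(logit_score):
    """Estimated mean difference (NEW minus STD) at a given logit score."""
    return B_TX + B_INTERACTION * logit_score

for s in (-1.0, 0.0, 1.0):
    print(s, round(tx_effect(s), 2))
```

At a logit score of 0 the estimated difference is -9.77, and it becomes more negative as the score rises.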
Matching
The propensity score can be used to MATCH.
Treated and comparison subjects with the same propensity score P(x) will have (about) the same distribution of covariate values (x).
Compute the difference in scores between each treated subject and each untreated comparison subject, and match the pair with the smallest score difference.
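One simple way to carry out such matching is greedy nearest-neighbor matching without replacement, sketched below. This is only one of several matching schemes and is not necessarily the one intended in these notes. The subject IDs and scores are hypothetical, and the optional `caliper` (a maximum allowed score difference) is an added assumption.

```python
def greedy_match(treated, comparison, caliper=None):
    """Pair each treated subject with the closest unused comparison subject.

    treated, comparison: dicts mapping subject id -> propensity score.
    caliper: optional maximum score difference for an acceptable match.
    """
    available = dict(comparison)
    pairs = []
    # Process treated subjects in order of increasing score.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        gap = abs(available[c_id] - t_score)
        if caliper is None or gap <= caliper:
            pairs.append((t_id, c_id, gap))
            del available[c_id]  # match without replacement
    return pairs

# Hypothetical propensity scores.
treated = {"T1": 0.30, "T2": 0.55, "T3": 0.80}
comparison = {"C1": 0.28, "C2": 0.33, "C3": 0.52, "C4": 0.10}

pairs = greedy_match(treated, comparison, caliper=0.1)
for t_id, c_id, gap in pairs:
    print(t_id, c_id, round(gap, 2))
```

Here T3 is left unmatched because no remaining comparison subject has a score within the caliper, which is the matching analogue of lack of overlap.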
Propensity weights
Logistic regression on the treatment gives: logit score -> odds -> P(x) = propensity score
As an alternative to matching, one can weight observations by propensity weights.
If treatment group A: wt = 1/P(x)
If comparison group B: wt = 1/[1 - P(x)]
The weighted distribution of the covariates will be (approximately) the same in the treated and untreated groups.
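The effect of the weights can be seen in a small exact example. The counts and propensities below are hypothetical and chosen so that the numbers in each group equal their expected values. Unweighted, the percent male differs between A and B; after weighting, it is the same.

```python
def propensity_weight(in_treatment_a, p):
    """wt = 1/P(x) for treatment A, 1/[1 - P(x)] for comparison B."""
    if not 0.0 < p < 1.0:
        raise ValueError("propensity must be strictly between 0 and 1")
    return 1.0 / p if in_treatment_a else 1.0 / (1.0 - p)

# Hypothetical cells: (sex, treatment, count, propensity P(x) for treatment A).
groups = [
    ("male", "A", 25, 0.25), ("male", "B", 75, 0.25),
    ("female", "A", 50, 0.50), ("female", "B", 50, 0.50),
]

def pct_male(treatment, weighted):
    """Proportion male in one treatment group, with or without propensity weights."""
    num = den = 0.0
    for sex, tx, count, p in groups:
        if tx != treatment:
            continue
        w = propensity_weight(tx == "A", p) if weighted else 1.0
        den += w * count
        if sex == "male":
            num += w * count
    return num / den

print(pct_male("A", False), pct_male("B", False))  # unweighted: unequal
print(pct_male("A", True), pct_male("B", True))    # weighted: equal
```

Each weighted group is brought back to the covariate mix of the whole sample (50% male in this example).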
Advantages of propensity score
1. Reduces all the covariates to one dimension
2. Easy to check if the two groups being compared overlap on the score (i.e. on the covariates)
3. Does not extrapolate beyond the range of the data (unlike linear regression)
4. Robust: it does not matter if the model for the propensity score is incorrectly specified, as long as the covariates are the same in the strata or matches made by the score.
Does not require linearity or additivity (no interactions) to be true.
Disadvantages
Can only compare two groups (the method can be modified for more)
Does not directly assess the effects of the covariates on the outcome
Can check propensity score overlap between the two groups
Lack of overlap indicates that some subjects in one group have covariate values that are completely absent in the other group.
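A basic overlap check compares the range of propensity scores in the two groups and flags subjects outside the common range. The sketch below uses hypothetical scores; the helper names `common_support` and `outside_support` are illustrative. In practice one would also look at histograms of the score by group.

```python
def common_support(scores_a, scores_b):
    """Range of propensity scores that is covered by both groups."""
    low = max(min(scores_a), min(scores_b))
    high = min(max(scores_a), max(scores_b))
    return low, high

def outside_support(scores, low, high):
    """Scores that have no counterpart range in the other group."""
    return [s for s in scores if s < low or s > high]

# Hypothetical propensity scores for the two treatment groups.
std_scores = [0.05, 0.10, 0.30, 0.55]
new_scores = [0.25, 0.40, 0.60, 0.85]

low, high = common_support(std_scores, new_scores)
print(low, high)
print(outside_support(std_scores, low, high))
print(outside_support(new_scores, low, high))
```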
Regression adjustment - not propensity: Y = gray scale
Term | Estimate | Std Error | p value |
Intercept | -35.88 | 1.20 | 0.0000 |
Tx A | -7.71 | 0.60 | 0.0000 |
age | 2.98 | 0.04 | 0.0000 |
male | 4.84 | 0.62 | 0.0000 |
floss | -4.78 | 0.60 | 0.0000 |
clean | -9.71 | 0.59 | 0.0000 |
sugar | 1.16 | 0.09 | 0.0000 |
coffee | 7.73 | 0.66 | 0.0000 |
tea | 6.49 | 0.63 | 0.0000 |
mwash | -2.62 | 0.65 | 0.0001 |