Notes_Fixed_Effects

Some notes on whether I should group mean center the independent variable in fixed effects regression modelling. Much of what I write draws on two references I've found useful: this Paul Allison article (http://pauldallison.com/downloads/Allison.SM90.pdf) and this Stata FAQ on interpreting the intercept of a fixed effects model (http://www.stata.com/support/faqs/stat/xtreg2.html).

Let us imagine a world in which an individual's outcome is a function of several independent variables:

(Eq1)

Y_i = \beta_1(X_i) + Z_j + e_i

In this notation, X_i varies between observations within individual j, and Z_j is constant for each individual. To put a face on it, let's say Y_i is the score of an answer given on Stack Overflow, X_i is the user's current reputation, and Z_j is the "tag effect" (i.e. some tags have more views, and so answers in those tags tend to get higher scores simply due to more exposure to potential upvotes). For simplicity I am assuming Z_j does not vary within a user's posts (i.e. they always post answers in the same tag), but can vary between users.

This individual level equation would then imply an aggregate equation of the form;

(Eq2)

\bar{Y_j} = \beta_1(\bar{X_j}) + Z_j + \bar{e_j}

From this, one can then obtain \beta_1, without observing Z_j, by subtracting the aggregate equation from the individual equation:

(Eq3)

(Y_i - \bar{Y_j}) = \beta_1(X_i - \bar{X_j}) + (Z_j - Z_j) + (e_i - \bar{e_j})

As far as I can tell, this is the reason why the fixed effects model is preferred: we can control for any time-invariant effects on our outcome without needing to observe them. Now, in my original question, I asked if it is inappropriate to not mean center the independent variable, X_i, by its group average, \bar{X_j}. I believe it is inappropriate, and here is why.
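As a rough numerical check of the claim that \beta_1 can be recovered without observing Z_j, here is a minimal simulation sketch using only the Python standard library. Everything in it is an illustrative assumption rather than anything taken from the references: the helper names (`slope`, `group_center`, `simulate`), the values \beta_1 = 2 and \lambda = 1, the standard normal draws, and in particular the choice to let X_i shift with Z_j so that the unobserved effect actually matters.

```python
import random
from statistics import fmean


def slope(x, y):
    """OLS slope of y on a single regressor x (intercept included)."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def group_center(values, groups):
    """Subtract each group's mean from that group's observations."""
    sums, counts = {}, {}
    for v, g in zip(values, groups):
        sums[g] = sums.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def simulate(n_groups=500, n_per_group=20, beta_1=2.0, lam=1.0, seed=0):
    """Draw data from Eq1. Assumption: X_i is shifted by lam * Z_j."""
    rng = random.Random(seed)
    x, y, groups = [], [], []
    for j in range(n_groups):
        z_j = rng.gauss(0, 1)  # unobserved, constant within individual j
        for _ in range(n_per_group):
            x_i = lam * z_j + rng.gauss(0, 1)
            y_i = beta_1 * x_i + z_j + rng.gauss(0, 1)
            x.append(x_i)
            y.append(y_i)
            groups.append(j)
    return x, y, groups


# Note that z_j is never returned: the estimators only see x, y and group ids.
x, y, groups = simulate()

pooled = slope(x, y)  # ignores the grouping entirely
within = slope(group_center(x, groups), group_center(y, groups))  # Eq3

print(f"pooled slope (Z_j ignored):      {pooled:.3f}")
print(f"within slope (both centered):    {within:.3f}")
```

Under these assumed settings the pooled slope should sit well above 2, since it picks up part of Z_j, while the slope from the centered regression in (Eq3) should land close to the true \beta_1 = 2.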

Using the same notation as above, let's imagine X_i is itself correlated with Z_j (if it weren't, there would be no need for the fixed effects model to begin with, since \beta_1 would be identifiable from a pooled regression). To make it simple, I specify X_i itself as a causal outcome of Z_j:

(Eq4)

X_i = \lambda(Z_j) + \mu_i

(\mu in this notation is just an error term). Now again, this would imply an aggregate equation of;

(Eq5)

\bar{X_j} = \lambda(Z_j) + \bar{\mu_j}

Hence, if I fail to subtract \bar{X_j} from X_i, the omitted term -\beta_1(\bar{X_j}) will be absorbed by the error term in equation (Eq3):

(Eq6)

(Y_i - \bar{Y_j}) = \beta_1(X_i) + (Z_j - Z_j) + (-\beta_1(\bar{X_j}) + e_i - \bar{e_j})

Making the simplifying assumption that e_i and \mu_i average to zero within each individual (so \bar{e_j} = \bar{\mu_j} = 0), and substituting the right hand side of equation (Eq5) for \bar{X_j}, one then has:

(Eq7)

(Y_i - \bar{Y_j}) = \beta_1(X_i) + (-\beta_1\lambda(Z_j) + e_i)

It is simple to see that without subtracting \bar{X_j} from X_i, X_i is still correlated with the error term (and hence I have not controlled for the unobserved Z_j's at all!). I'm not quite sure what the repercussions are of the negative sign in front of the Z_j in the error term, but regardless it appears to me that it only makes sense to mean center X_i as well as the dependent variable.
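To see what centering only the dependent variable does in practice, here is a minimal simulation sketch using only the Python standard library. The helper names (`slope`, `group_center`, `simulate`), the values \beta_1 = 2 and \lambda = 1, and the standard normal draws are all assumptions made for illustration. One algebraic fact the sketch relies on: in-sample, the slope from regressing the centered outcome on the uncentered X_i equals the fully centered slope multiplied by the within-individual share of the variation in X_i, so the estimate is pulled toward zero.

```python
import random
from statistics import fmean


def slope(x, y):
    """OLS slope of y on a single regressor x (intercept included)."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def group_center(values, groups):
    """Subtract each group's mean from that group's observations."""
    sums, counts = {}, {}
    for v, g in zip(values, groups):
        sums[g] = sums.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def simulate(n_groups=500, n_per_group=20, beta_1=2.0, lam=1.0, seed=0):
    """Draw data from Eq1 and Eq4 (all parameter values are assumed)."""
    rng = random.Random(seed)
    x, y, groups = [], [], []
    for j in range(n_groups):
        z_j = rng.gauss(0, 1)  # unobserved, constant within individual j
        for _ in range(n_per_group):
            x_i = lam * z_j + rng.gauss(0, 1)           # Eq4
            y_i = beta_1 * x_i + z_j + rng.gauss(0, 1)  # Eq1
            x.append(x_i)
            y.append(y_i)
            groups.append(j)
    return x, y, groups


x, y, groups = simulate()
x_c = group_center(x, groups)
y_c = group_center(y, groups)

both = slope(x_c, y_c)  # Eq3: outcome and X_i both centered
y_only = slope(x, y_c)  # only the outcome centered, X_i left as is

# Share of the total variation in X_i that lies within individuals.
x_mean = fmean(x)
within_share = sum(v * v for v in x_c) / sum((v - x_mean) ** 2 for v in x)

print(f"both centered:        {both:.3f}")
print(f"only Y centered:      {y_only:.3f}")
print(f"both * within share:  {both * within_share:.3f}")
```

With the assumed variances, about half of the variation in X_i is within individuals, so the slope with only Y centered should come out at roughly half of \beta_1, while the fully centered slope stays near 2. The last two printed numbers should agree up to floating point error.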