HLM for Survey Data

Posted by Elizabeth S, community karma 67

I'm considering a Hierarchical Linear Model for a paper I'm writing and have a couple of questions. I'm not altogether clear about which effects should be specified as fixed and which should be specified as random. Most of the examples I've found assume the data are experimental or quasi-experimental, but this is not true for my data. The data are survey-based, with respondents clustered within neighborhoods. Could someone clarify these terms using examples that someone in the social sciences would understand?

over 12 years ago

3 Comments

Julia Burdick-Will, community karma 67

If you need more specifics about building models with clustered data, the following textbooks are good, relatively easy-to-read references with a lot of observational social science examples.

Fitzmaurice, Laird, and Ware. 2004. Applied Longitudinal Analysis. - This gives a general overview of modeling longitudinal and clustered data and some examples with a few different software programs. 

Raudenbush and Bryk. 2002. Hierarchical Linear Models: Applications and Data Analysis Methods. - This focuses exclusively on applications using the HLM program, but also has good descriptions of models using different types of data.

over 12 years ago
Will Hauser, community karma 227

Hi Elizabeth,

As you probably know, HLM stems from ANOVA, a statistical method used to evaluate data collected using experimental research designs. Thus many examples assume a basic understanding of ANOVA and the related terminology. This is obviously not true for those of us in the social sciences, where experimental designs are typically impractical, impossible, or unethical.

In these experimental designs the researcher is usually most interested in the effect of the stimulus which is set at predetermined levels (e.g. no dose, 1 dose, 2 doses, etc.). These effects are then set as "fixed" because the researcher is only interested in the net effect of those stimulus dosages at those predetermined levels. The effects specified as random are those that are typically seen as confounding variables that need to be statistically controlled (age or sex for example).

Ironically, in the social sciences the situation is very often the exact opposite. Let's begin by discussing exactly what the hierarchical method accomplishes. As you know, the technique accounts for the clustering of cases within larger units. In your case it is people nested within neighborhoods. It could be observations nested within people. Or it could be students nested within schools and schools nested within districts. There are very likely unobserved factors that tend to make observations similar within each cluster. One thing that HLM accomplishes is that it accounts for this cluster-based similarity in the computation of standard errors. The point estimates don't change much relative to OLS regression, but the standard errors are usually larger; in other words, OLS regression tends to understate standard errors when the data are clustered.
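To make that concrete, here is a minimal sketch in Python (statsmodels) comparing OLS with a random-intercept model on clustered survey data. The file name and variable names (outcome, poverty, neighborhood) are assumptions for illustration, not anything from your actual data.

```python
# Hypothetical sketch: OLS vs. a random-intercept (two-level) model.
# Assumed columns: outcome (level 1), poverty (level 1), neighborhood (cluster id).
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("survey.csv")

# OLS ignores the clustering of respondents within neighborhoods
ols_fit = smf.ols("outcome ~ poverty", data=df).fit()

# Random-intercept model: respondents nested within neighborhoods
ri_fit = smf.mixedlm("outcome ~ poverty", data=df,
                     groups=df["neighborhood"]).fit()

# The slope estimates are usually very close, but the mixed-model standard
# error is typically larger because it accounts for within-cluster similarity.
print(ols_fit.params["poverty"], ols_fit.bse["poverty"])
print(ri_fit.params["poverty"], ri_fit.bse["poverty"])
```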

But this is only the tip of the iceberg. The really great thing about HLM is that it can simultaneously estimate a separate regression equation for each cluster. Let's return to the school example for a second. It may be that the effect of being poor is different in one school than it is in another. Maybe being poor is *only* a hindrance in a school full of wealthy kids. With a variable specified as fixed you get the effect for that variable under the assumption that it is the same in all schools. When it is specified as random, the effect of that variable can be different in each school. In some schools it may affect the outcome and in others it may not. It may have a big effect in one school and a small one in another. The reported effect is then an average effect across all schools, weighted by the size of each cluster (the point estimate is more precise in larger clusters). I should also note here that an effect can only be specified as random when it is nested within a larger aggregation. In a 2-level model only level 1 effects can be random. In a 3-level model both level 1 and level 2 effects may be specified as random.
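In software terms, the difference is simply whether the predictor's slope appears in the random part of the model. Here is a hypothetical sketch in the same Python/statsmodels style (achievement, poor, and school are assumed variable names for the school example):

```python
# Hypothetical sketch: fixed slope vs. random slope for a student-level
# predictor, students nested within schools. Variable names are assumed.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("students.csv")  # assumed columns: achievement, poor, school

# Fixed slope: one common effect of "poor" shared by every school
fixed_slope = smf.mixedlm("achievement ~ poor", data=df,
                          groups=df["school"]).fit()

# Random slope: the effect of "poor" is allowed to vary from school to school
random_slope = smf.mixedlm("achievement ~ poor", data=df,
                           groups=df["school"],
                           re_formula="~poor").fit()

# The random-slope model reports the average effect plus its variance across schools
print(random_slope.summary())
```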

The next step is that this variation in the effect of student-level poverty can then be modeled as a consequence of level 2 (school-level) factors (e.g. proportion of wealthy students, number of students, availability of technology/computers/internet). This is sometimes referred to as "slopes as outcomes" regression, meaning that the variation in the slope of the level 1 random effects is predicted by the model.
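In practice the slopes-as-outcomes idea is usually implemented as a cross-level interaction between the level 1 predictor and the level 2 predictor. A hypothetical continuation of the sketch above (prop_wealthy is an assumed school-level variable merged onto each student row):

```python
# Hypothetical sketch: "slopes as outcomes" via a cross-level interaction.
slopes_as_outcomes = smf.mixedlm(
    "achievement ~ poor * prop_wealthy",  # main effects plus poor x prop_wealthy
    data=df,
    groups=df["school"],
    re_formula="~poor",  # the slope of poor still varies across schools
).fit()

# The interaction term tells you whether the poverty slope is systematically
# steeper or flatter in schools with more wealthy students.
print(slopes_as_outcomes.summary())
```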

So you have to think critically about what level 1 variables should have homogeneous effects across all of the level 2 units; these should be specified as fixed. Those effects that you think will be different within each level 2 unit should be specified as random.

I would also add that random effects are particularly taxing on the model. Odds are that if you specify all effects as random then the model will not converge. So choose wisely. I would recommend starting with a baseline model that has no predictors. From this you can get an estimate of the amount of variation in the outcome at each level. If there is not a significant amount of variation at level 2, then a hierarchical framework isn't really even necessary. Next add in your level 1 predictors; this is a random intercept model because only the intercept is random and all slopes are fixed. Note how the amount of variation at level 2 has probably gotten smaller. This is because you have accounted for compositional effects - it may be that schools are only different because they have different populations of students. Next specify your level 2 variables. Then specify some effects as random. Lastly, create some cross-level interaction terms (level 1 variable specified as a random effect X level 2 variable), put them into the model, and see if they account for any of the variation in the random effect. At each step you can do a chi-square test or look at some other model fit criterion to see if the changes to the model are improving fit.
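A hypothetical sketch of that build-up strategy with the same assumed school data: the intraclass correlation from the empty model tells you how much of the outcome variation sits at level 2, and a likelihood-ratio test (on models fit by ML rather than REML) compares successive models.

```python
# Hypothetical sketch: the step-by-step build-up described above.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("students.csv")  # assumed columns: achievement, poor, female, prop_wealthy, school

# Step 1: empty ("null") model - no predictors, just a random intercept
null = smf.mixedlm("achievement ~ 1", data=df, groups=df["school"]).fit()
between = null.cov_re.iloc[0, 0]    # level-2 (between-school) variance
within = null.scale                 # level-1 (residual) variance
icc = between / (between + within)  # share of variation at level 2

# Step 2: add level-1 predictors (random-intercept model); the level-2
# variance usually shrinks because composition is now accounted for.
m1 = smf.mixedlm("achievement ~ poor + female", data=df,
                 groups=df["school"]).fit(reml=False)

# Step 3: add level-2 predictors, then random slopes and cross-level
# interactions, checking fit at each step.
m2 = smf.mixedlm("achievement ~ poor + female + prop_wealthy", data=df,
                 groups=df["school"]).fit(reml=False)
lr_stat = 2 * (m2.llf - m1.llf)     # compare to a chi-square with 1 df
print(icc, lr_stat)
```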

Lastly, I would remind you to be sensitive to how you go about mean centering your level 1 variables. Choosing group or grand mean centering will have an effect on the point estimates you observe. There's a great article by Enders and Tofighi in the June 2007 issue of Psychological Methods that covers this issue if you are not yet comfortable with it.
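For reference, here is a minimal sketch of the two centering options in pandas (ses is an assumed level 1 predictor); which one is appropriate depends on your research question, as Enders and Tofighi discuss.

```python
# Hypothetical sketch: grand-mean vs. group-mean centering of a level-1 predictor.
import pandas as pd

df = pd.read_csv("students.csv")  # assumed columns: ses, school

# Grand-mean centering: subtract the overall sample mean
df["ses_grand"] = df["ses"] - df["ses"].mean()

# Group-mean centering: subtract each school's own mean
df["ses_group"] = df["ses"] - df.groupby("school")["ses"].transform("mean")
```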

Caveat: I am neither statistician nor economist. Consider this a jumping off point rather than a definitive work on the subject.

over 12 years ago
Will: any HLM references you would recommend, specifically for crafting the model?
Brian Cody – over 12 years ago
Raudenbush and Bryk is precisely what I read as an introduction to this topic. Conceptually it is OK, although the examples are geared toward someone working in education - so they may or may not translate well to your particular field. The bigger problem, the way I see it, is that you should choose a book that has concrete examples (i.e. sample datasets) and syntax for the particular statistical package you intend to use. So Raudenbush and Bryk is good if you plan to use HLM for the analysis. Not so much if you are a SAS user. I work almost exclusively within the Stata environment and found the Rabe-Hesketh and Skrondal text satisfactory. I would recommend it for Stata users. I've also read some, but not all, of West, Welch, and Galecki's text on linear mixed models. It has examples for Stata, SAS, R, SPSS, and HLM, so it might be a good choice for someone who uses multiple programs or who hasn't decided what program to use just yet. Picking texts is difficult. You want something that has examples that make sense within the context of your academic field, but at the same time you want a text that will actually show you how to do x, y, and z in the particular software package you plan to use. And there are also the standard concerns about how well-organized, thoughtful, conceptually clear, and up-to-date the text is. For me it's always a gamble, which is not so great given the typically excessive costs of most stats books.
Will Hauser – over 12 years ago
Michael Bishop, community karma 49

In addition to Raudenbush and Bryk, I would recommend Gelman and Hill's book.  More info here: http://www.stat.columbia.edu/~gelman/arm/  It uses R.

over 12 years ago