Accounting for random slopes using cluster robust standard errors

Jan 6, 2026

Including random slopes (when warranted) in multilevel models is necessary to avoid Type I errors resulting from underestimated standard errors. We typically investigate the presence of random slopes (RS) using a likelihood ratio test that compares two competing models. Random slopes are often included when we have cross-level interactions. If a random intercept (RI) model is used when a random slope model is warranted, the regression coefficients will be essentially the same, but the standard errors will be incorrect (typically too small).

We wrote an article in the Journal of Experimental Education showing that a simpler RI model, together with cluster-robust standard errors (CRSEs), can be used to approximate the (fixed effect) results of an RS model. The CRSEs adjust the standard errors appropriately and can also serve as a diagnostic when model-based standard errors are compared with CRSEs.

The abstract of the article reads:

Although random intercept (RI) multilevel models (MLMs) are commonly used, the inclusion of random slopes, when warranted, is necessary to avoid Type I errors for variables that randomly vary by group. However, instead of explicitly modeling the random slope, an alternative approach could be to use a more parsimonious RI model together with cluster-robust standard errors (CRSEs). Although the traditionally used CRSEs (CR0) can still underestimate standard errors when only a few clusters are present, we investigate a variant (i.e., the CR2) that has been shown to be effective with a limited number of clusters. Results of a Monte Carlo simulation show that using a RI model together with the CR2 can effectively account for violations of homoscedasticity, resulting in acceptable coverage probability rates for all conditions tested. However, when used with a limited number of clusters, a properly specified random slope model has more power to detect effects for level-2 predictors and cross-level interaction (CLI) terms.

An implication of this is that RI models are easier (and faster) to fit. Another implication is that the often-used SEM software lavaan cannot (at the time of writing) fit random slope models but can use traditional sandwich standard errors (via the estimator = 'MLR' option). However, if an RI model is fit together with CRSEs, the results should be similar to those of an RS model.
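
For readers working with lmer rather than lavaan, here is a minimal sketch of the RI-plus-CRSE approach on a single simulated dataset. It assumes the clubSandwich package (not used in the simulation below), which provides the small-sample CR2 estimator discussed in the abstract; the data generating process mirrors the one in the next section.

library(lmerTest)
library(clubSandwich)

set.seed(1234)
n2 <- 50; n1 <- 10 #50 clusters of 10
id <- rep(1:n2, each = n1)
w1 <- rep(rnorm(n2), each = n1) #level-2 predictor
x1 <- rnorm(n2 * n1) #level-1 predictor
u0 <- rep(rnorm(n2), each = n1) #random intercept
u1 <- rep(rnorm(n2, 0, sqrt(.2)), each = n1) #random slope
y <- .25 * w1 + .5 * x1 + .25 * x1 * w1 + u0 + u1 * x1 + rnorm(n2 * n1)
dat <- data.frame(id, w1, x1, y)

ri <- lmer(y ~ w1 * x1 + (1 | id), data = dat) #RI model only
coef_test(ri, vcov = "CR2") #CR2 CRSEs with Satterthwaite df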

We show an example of this with a small simulation using 1,000 replications. We have just one level-2 predictor, one level-1 predictor, and a cross-level interaction; the data generating process includes a random slope. NOTE: when modeling this using lavaan, the interaction variable has to be created separately and added as a predictor.

Simulation

We use a small simulation of 50 clusters with 10 individuals per cluster.

library(lmerTest)
library(lavaan)

n2 <- 50 #no of level 2 units
n1 <- 10 #no of units at level 1 per cluster
tot <- n2 * n1
reps <- 1000 #replications

## two-level model to be estimated using lavaan; the cross-level
## interaction enters as the precomputed variable int
ms <- '
  level: 1
  y ~ x1 + int
  level: 2
  y ~ w1
'

## creating containers
cfs1 <- cfs2 <- cfs3 <- cfs4 <- 
  ses1 <- ses2 <- ses3 <- ses4 <- matrix(NA, ncol = 4, nrow = reps)
icc <- numeric()

## begin simulation
set.seed(20260106) #for replicability

for (i in 1:reps){

  # generate the data
  w1 <- rep(rnorm(n2), each = n1) #level-2 predictor
  e2 <- rep(rnorm(n2), each = n1) #level-2 (intercept) error
  x1 <- rnorm(tot) #level-1 predictor
  e1 <- rnorm(tot) #level-1 error
  e2s <- rep(rnorm(n2, 0, sqrt(.2)), each = n1) #random slope
  id <- rep(1:n2, each = n1) #cluster id
  
  y <- w1 * .25 + x1 * .5 + .25 * x1 * w1 + e2 + e1 + e2s * x1
  int <- x1 * w1 #cross level interaction for lavaan
  dat <- data.frame(id, w1, x1, y, int)
  
  m0 <- lmer(y ~ 1 + (1|id), data = dat) #null model
  m1 <- lmer(y ~ w1 + x1 + w1 * x1 + (1|id), data = dat) #ri
  m2 <- lmer(y ~ w1 + x1 + w1 * x1 + (x1|id), data = dat) #rs
  
  ## using lavaan
  l1 <- sem(model = ms, data = dat, cluster = 'id') #default ML; same as RI
  l2 <- sem(model = ms, data = dat, cluster = 'id', estimator = 'MLR') #robust
  ## saving the output
  res1 <- summary(m1)$coef
  res2 <- summary(m2)$coef
  
  icc[i] <- as.numeric(performance::icc(m0)[1])   
  
  # rows 13, 10, 1, 2 = intercept, w1, x1, int; cols 6, 7 = est, se
  cfs3[i, ] <- parameterestimates(l2)[c(13, 10, 1, 2), 6]
  ses3[i, ] <- parameterestimates(l2)[c(13, 10, 1, 2), 7]
  
  cfs1[i, ] <- res1[ ,1]
  cfs2[i, ] <- res2[ ,1]
  ses1[i, ] <- res1[ ,2]
  ses2[i, ] <- res2[ ,2]
  
  cfs4[i, ] <- parameterestimates(l1)[c(13, 10, 1, 2), 6]
  ses4[i, ] <- parameterestimates(l1)[c(13, 10, 1, 2), 7] 
  
  if(i %% 50 == 0) cat(i,'..') #my basic progress counter

}
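
As an aside, the likelihood ratio test mentioned at the outset can be illustrated with the RI and RS fits from the final replication (refit = FALSE keeps the REML fits; because the slope variance is tested on its boundary, the p value is conservative):

anova(m1, m2, refit = FALSE) #LRT comparing RI vs. RS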

Analysis of Results

mean(icc)
[1] 0.4087556
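
As a rough check, this agrees with the data generating process: the between-cluster variance is 1 (from e2) plus .25^2 = .0625 (from w1), while the within-cluster variance is 1 (from e1) + .5^2 (from x1) + .25^2 (from the interaction) + .2 (from the random slope) = 1.5125, giving an expected ICC of about 1.0625 / 2.575 ≈ .41.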

The regression coefficient estimates are all similar to each other:

data.frame(
  cfs = row.names(res1),
  pop = c(0, .25, .5, .25), #population values
  ri = colMeans(cfs1),
  sem.ri = colMeans(cfs4),
  rs = colMeans(cfs2),
  sem.mlr = colMeans(cfs3)
)
          cfs  pop          ri      sem.ri          rs     sem.mlr
1 (Intercept) 0.00 0.002910729 0.002909412 0.002132698 0.002909412
2          w1 0.25 0.250323074 0.250323007 0.250687997 0.250323007
3          x1 0.50 0.502794531 0.502809691 0.503064188 0.502809691
4       w1:x1 0.25 0.250094542 0.250097290 0.249843687 0.250097290

We save the empirical standard errors (the ‘true’ values, i.e., the standard deviations of the coefficient estimates across replications) so we can compute the relative bias of the estimated standard errors.

tv1 <- apply(cfs1, 2, sd)
tv2 <- apply(cfs2, 2, sd)
tv3 <- apply(cfs3, 2, sd)
tv4 <- apply(cfs4, 2, sd)

# standard error estimates
data.frame(
  cfs = row.names(res1),
  ri = colMeans(ses1),
  sem.ri = colMeans(ses4),
  rs = colMeans(ses2),
  sem.mlr = colMeans(ses3)
)
          cfs         ri     sem.ri         rs    sem.mlr
1 (Intercept) 0.15036244 0.14728788 0.14952362 0.14710271
2          w1 0.15280281 0.14967951 0.15193503 0.14488835
3          x1 0.05189142 0.05178148 0.08144460 0.08199301
4       w1:x1 0.05286951 0.05275631 0.08274414 0.08065533

From the output, we can see that the standard errors from the RI model and the SEM model (also an RI specification) are similar to each other, as are those from the RS model and the SEM model with robust standard errors; the latter pair is noticeably larger for the level-1 predictor (x1) and the cross-level interaction.

We can then compute the relative bias in the standard errors: the mean estimated standard error minus the empirical standard error, divided by the empirical standard error.

bias <- data.frame(
  cfs = row.names(res1),
  ri = (colMeans(ses1) - tv1) / tv1,
  sem.ri = (colMeans(ses4) - tv4) / tv4,
  rs = (colMeans(ses2) - tv2) / tv2,
  sem.mlr = (colMeans(ses3) - tv3) / tv3
)
bias
          cfs          ri        sem.ri           rs     sem.mlr
1 (Intercept) -0.02566800 -0.0455899754 -0.030157970 -0.04678992
2          w1  0.02142401  0.0005514581  0.020163448 -0.03147565
3          x1 -0.40783262 -0.4090621826 -0.032105729 -0.06428387
4       w1:x1 -0.38857941 -0.3899013250  0.003460016 -0.06726389

Visually, we can see that the standard errors for both RI models (using lmer and sem) are underestimated by roughly 40% for the level-1 predictor and the cross-level interaction. The RS model and the RI model with robust standard errors (estimated using lavaan) show far less bias and are close to each other.

library(tidyr)
library(ggplot2)
tall <- pivot_longer(bias, ri:sem.mlr)
ggplot(tall, aes(x = cfs, y = value, col = name)) +
  geom_jitter(width = .15) +
  labs(x = "Coefficient", y = 'Standard error bias', col = '') +
  theme_bw()

So, if the fixed effects are primarily of interest, RI models are acceptable as long as CRSEs are used (when appropriate). The CRSEs account for heterogeneity of unknown form. NOTE that there can be instances (depending on your data) where cluster-robust inference can fail. In any case, this approach is helpful if researchers want to use latent variable modeling software (e.g., lavaan) in R to fit their multilevel models.
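
As a simple version of the diagnostic mentioned earlier, the model-based and cluster-robust standard errors can be compared side by side. A sketch using the clubSandwich package and the RI model (m1) from the last replication:

mb <- summary(m1)$coef[, 2] #model-based SEs
cr2 <- clubSandwich::coef_test(m1, vcov = "CR2")$SE #CR2 CRSEs
round(mb / cr2, 2) #ratios well below 1 suggest an unmodeled random slope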

References:

Huang, F. L., & Zhang, B. (2025). Accounting for random slopes using cluster-robust standard errors in multilevel models. The Journal of Experimental Education, 1–20. https://doi.org/10.1080/00220973.2025.2565180