Spatial Regression

HES 505 Fall 2025: Session 26

Matt Williamson

Motivation

  • Last class: Moran’s I showed evidence of spatial autocorrelation
  • Ordinary linear models assume independent residuals
  • But in spatial data, nearby units often influence one another
  • Today: What happens when we don’t account for spatial structure, and what SAR/SEM models do instead

When LM Fails

Linear model (LM):

\[ y = X\beta + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I) \]

If the errors are actually spatially autocorrelated:

  • Coefficient estimates remain unbiased (if X is exogenous)
  • Standard errors are wrong → misleading p-values
  • Residuals remain autocorrelated → violates the independence assumption
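
A short derivation shows why the standard errors fail. It assumes the true error covariance is \(\sigma^2 \Omega\), where \(\Omega\) is a symbol introduced here for illustration and is not part of the LM above. The OLS estimator is unchanged, but its sampling variance becomes:

\[ \operatorname{Var}(\hat{\beta}) = \sigma^2 (X^\top X)^{-1} X^\top \Omega X (X^\top X)^{-1} \]

The LM reports \(\hat{\sigma}^2 (X^\top X)^{-1}\) instead. The two expressions coincide when \(\Omega = I\). With positive spatial autocorrelation the reported standard errors are often too small, so p-values look more convincing than they should.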

Two Types of Spatial Dependence

  1. Spatial Lag (SAR/LAG) — outcomes spill over
  2. Spatial Error (SEM) — unobserved spatial process

SAR (Spatial Lag) Model

\[ y = \rho W y + X\beta + \varepsilon \]

  • \(\rho\): strength of spillover/interaction
  • Direct effect: X → y
  • Indirect effect: neighbors’ X → y
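
The direct and indirect effects come from the reduced form \(y = (I - \rho W)^{-1}(X\beta + \varepsilon)\). The sketch below is illustrative only: it assumes five units on a line, a row-standardized W, and made-up values \(\rho = 0.5\) and \(\beta = 2\). The helper names (`line_weights`, `sar_multiplier`, `sar_effects`) are hypothetical, and the inverse is approximated by the series \(I + \rho W + \rho^2 W^2 + \dots\), which converges here because \(|\rho| < 1\).

```python
# Sketch: average direct and indirect effects in a SAR model (pure Python).
# All values below (n = 5, rho = 0.5, beta = 2.0) are assumptions for illustration.

def line_weights(n):
    """Row-standardized weights for n units on a line; neighbors are adjacent units."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        nbrs = [j for j in (i - 1, i + 1) if 0 <= j < n]
        for j in nbrs:
            W[i][j] = 1.0 / len(nbrs)
    return W

def matmul(A, B):
    """Product of two square matrices stored as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sar_multiplier(W, rho, terms=200):
    """Approximate (I - rho W)^{-1} by summing rho^k W^k for k = 0..terms-1."""
    n = len(W)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # W^0
    total = [[0.0] * n for _ in range(n)]
    for k in range(terms):
        for i in range(n):
            for j in range(n):
                total[i][j] += (rho ** k) * power[i][j]
        power = matmul(power, W)
    return total

def sar_effects(W, rho, beta):
    """Return (direct, indirect, total) effects averaged over units."""
    n = len(W)
    M = sar_multiplier(W, rho)
    direct = beta * sum(M[i][i] for i in range(n)) / n   # own X on own y
    total = beta * sum(sum(row) for row in M) / n        # own X on everyone's y
    return direct, total - direct, total

direct, indirect, total = sar_effects(line_weights(5), rho=0.5, beta=2.0)
print(f"direct={direct:.3f} indirect={indirect:.3f} total={total:.3f}")
```

With a row-standardized W the total effect equals \(\beta / (1 - \rho)\), and the direct effect exceeds \(\beta\) because a change in a unit's X feeds back to it through its neighbors. This is why the raw coefficient understates the effect of X in a SAR model.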

SEM (Spatial Error) Model

\[ y = X\beta + u, \qquad u = \lambda W u + \varepsilon \]

  • Spatial structure is in the errors, not y
  • \(\lambda\) describes strength of spatial autocorrelation in the unobserved process
  • Coefficients \(\beta\) retain their LM meaning
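
Solving the error equation for u makes both points explicit. This is a sketch that assumes \(I - \lambda W\) is invertible and \(\varepsilon \sim N(0, \sigma^2 I)\), as in the LM:

\[ u = (I - \lambda W)^{-1}\varepsilon, \qquad E[y \mid X] = X\beta, \qquad \operatorname{Var}(y \mid X) = \sigma^2 \left[ (I - \lambda W)^\top (I - \lambda W) \right]^{-1} \]

The mean of y is the same as in the LM, so \(\beta\) keeps its usual interpretation. Only the covariance changes, and it reduces to \(\sigma^2 I\) when \(\lambda = 0\).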

SAR vs SEM: How to Tell Them Apart?

  • SAR: outcomes influence neighbors (spillover)
  • SEM: residuals autocorrelated due to unobserved spatial process
  • Use theory + residual diagnostics (Moran’s I)
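
The residual diagnostic can be sketched in a few lines. The example below computes Moran's I for two made-up residual vectors on six units arranged on a line. The residuals are invented for illustration, not taken from a fitted model, and the helper names are hypothetical. A real analysis would also need a significance test, which this sketch omits.

```python
# Sketch: Moran's I for a vector of residuals (pure Python).
# The weights layout and residual values are assumptions for illustration.

def line_weights(n):
    """Row-standardized weights for n units on a line; neighbors are adjacent units."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        nbrs = [j for j in (i - 1, i + 1) if 0 <= j < n]
        for j in nbrs:
            W[i][j] = 1.0 / len(nbrs)
    return W

def morans_i(values, W):
    """Moran's I = (n / S0) * (z' W z) / (z' z), with z the mean-centered values."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in W)  # sum of all weights
    num = sum(W[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(zi * zi for zi in z)
    return (n / s0) * (num / den)

W = line_weights(6)
clustered = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]    # similar residuals sit together
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]  # neighbors always differ

i_clustered = morans_i(clustered, W)
i_alternating = morans_i(alternating, W)
print(f"clustered: {i_clustered:.3f}  alternating: {i_alternating:.3f}")
```

The clustered residuals give a positive Moran's I and the alternating residuals a negative one. A clearly positive value in the residuals of a fitted LM is the signal that a SAR or SEM specification is worth considering. Choosing between the two still relies on theory about whether outcomes spill over or an unobserved process is shared.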

What Students Should Learn

  • LM ignores spatial dependence → unreliable inference
  • SAR: interactive spillover model → interpret direct and indirect effects, not raw \(\beta\)
  • SEM: correlated errors → \(\beta\) interpretation same as LM
  • Both help resolve residual autocorrelation
  • Model selection guided by theory + diagnostics