Spatial Regression
HES 505 Fall 2025: Session 26
Motivation
- Last class: Moran’s I showed evidence of spatial autocorrelation
- Ordinary linear models assume independent residuals
- But in spatial data, nearby units often influence one another
- Today: What happens when we don’t account for spatial structure, and what SAR/SEM models do instead
When LM Fails
Linear model (LM):
\[ y = X\beta + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I) \]
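- Coefficient estimates remain unbiased if the dependence sits only in the errors and X is exogenous; omitting a true spatial lag of y biases them
- Standard errors are wrong (typically too small under positive autocorrelation) → misleading p-values
- Residuals remain autocorrelated → violates the independence assumption \(\varepsilon \sim N(0, \sigma^2 I)\)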
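A minimal sketch of this failure, using simulated data rather than anything from the course. The ring-shaped neighbor structure, the helper names `ring_weights` and `morans_i`, the seed, and all parameter values are assumptions chosen for illustration. It fits an ordinary linear model to data whose errors are spatially autocorrelated and then computes Moran's I on the residuals.

```python
import numpy as np

def ring_weights(n):
    """Row-standardized weights: each unit's neighbors are the two adjacent units on a ring."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, (i - 1) % n] = 0.5
        W[i, (i + 1) % n] = 0.5
    return W

def morans_i(z, W):
    """Moran's I of the vector z under the weights matrix W."""
    z = z - z.mean()
    return (len(z) / W.sum()) * (z @ W @ z) / (z @ z)

rng = np.random.default_rng(505)  # arbitrary seed (assumption)
n = 400
W = ring_weights(n)
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

# Spatially autocorrelated errors: u = lam * W u + eps, so u = (I - lam W)^{-1} eps
lam = 0.8
u = np.linalg.solve(np.eye(n) - lam * W, rng.normal(size=n))
y = X @ np.array([1.0, 2.0]) + u

# Ordinary least squares fit that ignores the spatial structure
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat

moran_resid = morans_i(resid, W)
print("OLS coefficients:", beta_hat)
print("Moran's I of residuals:", moran_resid)
```

Under these assumed settings the slope estimate should stay near its true value of 2 while the residual Moran's I is clearly positive, which is the pattern the bullets describe.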
Two Types of Spatial Dependence
- Spatial Lag (SAR/LAG): outcomes in neighboring units spill over into one another
- Spatial Error (SEM): an unobserved spatial process makes the errors correlated
SAR (Spatial Lag) Model
\[ y = \rho W y + X\beta + \varepsilon \]
- \(\rho\): strength of spillover/interaction; \(W\) is the spatial weights matrix that defines who counts as a neighbor
- Direct effect: X → y
- Indirect effect: neighbors’ X → y
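Solving the SAR equation for y gives \(y = (I - \rho W)^{-1}(X\beta + \varepsilon)\), so a change in one unit's X reaches its neighbors through the multiplier \((I - \rho W)^{-1}\). The sketch below computes the commonly used average direct, indirect, and total effects for a single covariate. The six-unit ring and the values of `rho` and `beta` are assumptions for illustration, not estimates from any data.

```python
import numpy as np

n = 6
# Row-standardized ring weights (assumed neighbor structure)
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = 0.5
    W[i, (i + 1) % n] = 0.5

rho, beta = 0.5, 2.0  # assumed values for illustration

# Impact matrix: S[i, j] is the change in y_i per unit change in x_j
S = beta * np.linalg.inv(np.eye(n) - rho * W)

direct = np.trace(S) / n    # average own-unit effect (includes feedback through neighbors)
total = S.sum() / n         # average effect of changing x everywhere
indirect = total - direct   # average spillover from other units

print("direct:", direct, "indirect:", indirect, "total:", total)
```

With row-standardized weights the average total effect equals \(\beta / (1 - \rho)\), here 4, and the direct effect exceeds the raw coefficient because of feedback through neighbors. This is why the SAR coefficient itself is not read as a marginal effect.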
SEM (Spatial Error) Model
\[ y = X\beta + u, \qquad u = \lambda W u + \varepsilon \]
- Spatial structure is in the errors, not y
- \(\lambda\) describes strength of spatial autocorrelation in the unobserved process
- Coefficients \(\beta\) retain their LM meaning
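To make the SEM structure concrete, the sketch below simulates errors from \(u = \lambda W u + \varepsilon\), shows that the implied error covariance links neighbors, and recovers \(\beta\) by regressing spatially filtered data. It assumes \(\lambda\) is known, which is not the case in practice (it is usually estimated, for example by maximum likelihood). The ring weights, seed, and parameter values are likewise assumptions.

```python
import numpy as np

rng = np.random.default_rng(26)  # arbitrary seed (assumption)
n = 300
# Row-standardized ring weights (assumed neighbor structure)
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = 0.5
    W[i, (i + 1) % n] = 0.5

lam = 0.7                      # assumed spatial error parameter
A = np.eye(n) - lam * W        # spatial filter (I - lam W)

x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
u = np.linalg.solve(A, rng.normal(size=n))   # u = (I - lam W)^{-1} eps
y = X @ np.array([1.0, 2.0]) + u

# Implied error covariance with sigma^2 = 1: Cov(u) = (A'A)^{-1}
cov_u = np.linalg.inv(A.T @ A)
print("Cov(u) between units 0 and 1:", cov_u[0, 1])

# Filtering both sides gives A y = A X beta + eps with independent errors,
# so least squares on the filtered data is GLS for beta
beta_gls, *_ = np.linalg.lstsq(A @ X, A @ y, rcond=None)
print("Filtered (GLS) coefficients:", beta_gls)
```

Because the spatial structure enters only through the errors, the filtered regression targets the same \(\beta\) as the ordinary linear model; what changes is the error covariance used for inference.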
SAR vs SEM: How to Tell Them Apart?
- SAR: outcomes influence neighbors (spillover)
- SEM: residuals autocorrelated due to unobserved spatial process
- Use theory + residual diagnostics (Moran’s I)
What Students Should Learn
- LM ignores spatial dependence → unreliable inference
- SAR: interactive spillover model → interpret direct and indirect effects, not raw \(\beta\)
- SEM: correlated errors → \(\beta\) interpretation same as LM
- Both help resolve residual autocorrelation
- Model selection guided by theory + diagnostics