Spatial Regression

HES 505 Fall 2025: Session 26

Matt Williamson

Motivation

Last class: Moran’s I showed evidence of spatial autocorrelation
Ordinary linear models assume independent residuals
But in spatial data, nearby units often influence one another
Today: What happens when we don’t account for spatial structure, and what SAR/SEM models do instead

When LM Fails

Linear model (LM):

\[ y = X\beta + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I) \]

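Coefficient estimates remain unbiased if X is exogenous and the dependence is confined to the errors; an omitted spatial lag of y biases them
Standard errors are wrong (typically too small under positive autocorrelation) → misleading p-values
Residuals remain autocorrelated → violates the independence assumption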
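A minimal simulation sketch of the points above, assuming a 20 x 20 grid with row-standardized rook neighbors and an error process with spatial parameter 0.8. The grid size, seed, parameter values, and function names are illustrative assumptions, not part of the course materials.

```python
import numpy as np

def rook_weights(nrow, ncol):
    """Row-standardized rook-contiguity weights for an nrow x ncol grid."""
    n = nrow * ncol
    W = np.zeros((n, n))
    for r in range(nrow):
        for c in range(ncol):
            i = r * ncol + c
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < nrow and 0 <= cc < ncol:
                    W[i, rr * ncol + cc] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def morans_i(z, W):
    """Moran's I of vector z under weights matrix W."""
    z = z - z.mean()
    return (len(z) / W.sum()) * (z @ W @ z) / (z @ z)

rng = np.random.default_rng(505)          # arbitrary seed
W = rook_weights(20, 20)
n = W.shape[0]

# Spatially autocorrelated errors: u = lam * W u + eps
lam = 0.8                                  # assumed value for illustration
eps = rng.normal(size=n)
u = np.linalg.solve(np.eye(n) - lam * W, eps)

# True model: y = 1 + 2x + u, then fit an ordinary LM that ignores W
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + u
X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat

resid_moran = morans_i(resid, W)
print("slope estimate:", beta_hat[1])
print("Moran's I of LM residuals:", resid_moran)
```

In this setup the slope estimate should land near its true value of 2, while the LM residuals keep a clearly positive Moran's I. The point estimate is fine; the independence assumption behind its standard error is not.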

Two Types of Spatial Dependence

Spatial Lag (SAR/LAG) — outcomes spill over
Spatial Error (SEM) — unobserved spatial process

SAR (Spatial Lag) Model

\[ y = \rho W y + X\beta + \varepsilon \]

\(\rho\): strength of spillover/interaction
Direct effect: X → y
Indirect effect: neighbors’ X → y
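Solving the SAR equation for y gives \(y = (I - \rho W)^{-1}(X\beta + \varepsilon)\), so a change in X at one unit reaches every unit through \((I - \rho W)^{-1}\). The sketch below computes average direct, indirect, and total effects from that matrix for a hypothetical four-unit line of neighbors. The weights, \(\rho = 0.5\), \(\beta = 2\), and the function name `sar_impacts` are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical 4-unit line (1-2-3-4), row-standardized
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
W = A / A.sum(axis=1, keepdims=True)

def sar_impacts(W, rho, beta_k):
    """Average direct, indirect, and total effects of one covariate in a SAR model."""
    n = W.shape[0]
    # S[i, j] = effect on y_i of a one-unit change in x_j
    S = np.linalg.inv(np.eye(n) - rho * W) * beta_k
    direct = np.trace(S) / n       # own-unit effects (diagonal)
    total = S.sum() / n            # average row sum
    indirect = total - direct      # spillover through neighbors
    return direct, indirect, total

direct, indirect, total = sar_impacts(W, rho=0.5, beta_k=2.0)
print(direct, indirect, total)
```

With row-standardized W the average total effect equals \(\beta / (1 - \rho)\), here 4, and the direct effect exceeds the raw coefficient of 2 because of feedback through neighbors.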

SEM (Spatial Error) Model

\[ y = X\beta + u, \qquad u = \lambda W u + \varepsilon \]

Spatial structure is in the errors, not y
\(\lambda\) describes strength of spatial autocorrelation in the unobserved process
Coefficients \(\beta\) retain their LM meaning
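A short derivation of why the coefficients keep their LM meaning, assuming \(I - \lambda W\) is invertible (true for row-standardized W with \(|\lambda| < 1\)) and \(\varepsilon \sim N(0, \sigma^2 I)\) as in the LM:

\[ u = (I - \lambda W)^{-1}\varepsilon \quad\Rightarrow\quad y = X\beta + (I - \lambda W)^{-1}\varepsilon \]

\[ E[y \mid X] = X\beta, \qquad \operatorname{Var}(u) = \sigma^2 \left[ (I - \lambda W)^\top (I - \lambda W) \right]^{-1} \]

The conditional mean is the same as in the LM; only the error covariance changes, from \(\sigma^2 I\) to a matrix with nonzero off-diagonal terms.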

SAR vs SEM: How to Tell Them Apart?

SAR: outcomes influence neighbors (spillover)
SEM: residuals autocorrelated due to unobserved spatial process
Use theory + residual diagnostics (Moran’s I)
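One way to run the residual diagnostic is a permutation test of Moran's I, sketched below. It assumes a ring of 50 units with two neighbors each and a made-up smooth residual pattern; the function names and settings are hypothetical. Permuting regression residuals is only an approximate test, and a significant result indicates leftover spatial structure without saying whether SAR or SEM is the better description.

```python
import numpy as np

def ring_weights(n):
    """Row-standardized weights: each unit's neighbors are the units on either side."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, (i - 1) % n] = 0.5
        W[i, (i + 1) % n] = 0.5
    return W

def morans_i(z, W):
    """Moran's I of vector z under weights matrix W."""
    z = z - z.mean()
    return (len(z) / W.sum()) * (z @ W @ z) / (z @ z)

def moran_permutation_test(resid, W, n_perm=999, seed=0):
    """One-sided pseudo p-value for positive autocorrelation in resid."""
    rng = np.random.default_rng(seed)
    observed = morans_i(resid, W)
    # Shuffle residuals across locations to break any spatial pattern
    sims = np.array([morans_i(rng.permutation(resid), W) for _ in range(n_perm)])
    p_value = (1 + np.sum(sims >= observed)) / (n_perm + 1)
    return observed, p_value

n = 50
W = ring_weights(n)
resid = np.sin(2 * np.pi * np.arange(n) / n)   # hypothetical smooth residuals
observed, p_value = moran_permutation_test(resid, W)
print("Moran's I:", observed, "pseudo p-value:", p_value)
```

For this smooth pattern Moran's I is close to 1 and the pseudo p-value is small, which is the signature of residuals that still carry spatial structure.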

What Students Should Learn

LM ignores spatial dependence → unreliable inference
SAR: interactive spillover model → interpret direct and indirect effects, not raw \(\beta\)
SEM: correlated errors → \(\beta\) interpretation same as LM
Both help resolve residual autocorrelation
Model selection guided by theory + diagnostics