Spatial Autocorrelation I

HES 505 Fall 2025: Session 25

Matt Williamson

Objectives

By the end of today you should be able to:

Use the spdep package to identify the neighbors of a given polygon based on proximity, distance, and minimum number
Understand the underlying mechanics of Moran’s I and calculate it for various neighbors
Distinguish between global and local measures of spatial autocorrelation
Visualize neighbors and clusters

Revisiting Spatial Autocorrelation

The World Is Not Random

From Manuel Gimond

Kriging Revisited

Assumes autocorrelation to facilitate prediction
Semivariogram describes how autocorrelation changes with distance
But what about statistical inference?

Spatial Autocorrelation and Inference

Attributes (features) are often non-randomly distributed
Especially true with aggregated data
Affects estimates of precision (and significance)

Global Moran’s I

\[ I = \frac{n}{W} \cdot \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}(x_i - \bar{x})(x_j - \bar{x})}{\sum_{i=1}^{n}(x_i - \bar{x})^2} \]

where \(W = \sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\)

A first-order estimate of spatial autocorrelation
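
To make the formula concrete, here is a minimal sketch in plain Python that evaluates it term by term. The function name `morans_i`, the four-unit layout, and the values are assumptions made up for illustration; they are not part of spdep or of the course data.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and an n x n weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]                # (x_i - x_bar)
    total_w = sum(sum(row) for row in weights)      # W, the sum of all w_ij
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )                                               # numerator double sum
    ss = sum(d * d for d in dev)                    # denominator sum of squares
    return (n / total_w) * (cross / ss)


# Hypothetical layout: four units in a row, each adjacent pair are neighbors.
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

i_trend = morans_i([1.0, 2.0, 3.0, 4.0], w)   # similar values sit together
i_alt = morans_i([1.0, 4.0, 1.0, 4.0], w)     # neighbors always differ

print(i_trend, i_alt)
```

The smooth trend gives a positive value and the alternating pattern gives a negative one, which matches the usual reading of the sign of \(I\).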

Moran’s I

…but How Do We Calculate \(w_{ij}\)?

How do we define \(I(d)\) for areal data?
What about \(w_{ij}\)?
We can use spdep for that!!

Finding Neighbors

Queen, rook, (and bishop) cases impose neighbors by contiguity
Weights are calculated as \(1 / \text{number of neighbors}\) (row standardization)
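
The contiguity rules can be sketched on a regular grid, where rook neighbors share an edge and queen neighbors share an edge or a corner. This is an illustration only: `grid_neighbors` and `row_standardize` are hypothetical helpers, not spdep functions. For real polygons, spdep's `poly2nb()` and `nb2listw()` play the corresponding roles.

```python
def grid_neighbors(nrow, ncol, queen=True):
    """Return {cell_id: [neighbor ids]} for a regular grid of cells."""
    neighbors = {}
    for r in range(nrow):
        for c in range(ncol):
            found = []
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue              # a cell is not its own neighbor
                    if not queen and dr != 0 and dc != 0:
                        continue              # rook: ignore corner-only contact
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < nrow and 0 <= cc < ncol:
                        found.append(rr * ncol + cc)
            neighbors[r * ncol + c] = found
    return neighbors


def row_standardize(neighbors):
    """Weight each neighbor of unit i as 1 / (number of neighbors of i)."""
    n = len(neighbors)
    w = [[0.0] * n for _ in range(n)]
    for i, nbrs in neighbors.items():
        for j in nbrs:
            w[i][j] = 1.0 / len(nbrs)
    return w


rook = grid_neighbors(3, 3, queen=False)
queen = grid_neighbors(3, 3, queen=True)
w_rook = row_standardize(rook)

print("center cell, rook:", rook[4])
print("center cell, queen:", queen[4])
print("corner cell weights (rook):", w_rook[0])
```

Because every row of the standardized matrix sums to 1, the weighted sum over a unit's neighbors is simply the mean of its neighbors' values.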

More Formally

\[ E[I] = \frac{-1}{n-1} \]

When \(n\) is large, \(E[I]\) approaches 0
But large relative to what?
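
One way to see where \(-1/(n-1)\) comes from is to compute Moran’s I for every possible rearrangement of a small dataset and average the results. The sketch below does this for five hypothetical values on a hypothetical row of units; the helper name and the numbers are assumptions for illustration.

```python
from itertools import permutations


def morans_i(values, weights):
    """Global Moran's I for a list of values and an n x n weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_w = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    return (n / total_w) * (cross / sum(d * d for d in dev))


# Hypothetical layout: five units in a row, adjacent units are neighbors.
n_units = 5
w = [
    [1 if abs(i - j) == 1 else 0 for j in range(n_units)]
    for i in range(n_units)
]
x = [3.0, 7.0, 1.0, 9.0, 4.0]   # made-up measurements

# Moran's I for every one of the 5! = 120 ways to assign values to units.
all_i = [morans_i(list(p), w) for p in permutations(x)]
mean_i = sum(all_i) / len(all_i)
expected = -1.0 / (n_units - 1)

print(mean_i, expected)
```

The average over all rearrangements matches \(-1/(n-1)\), so a slightly negative \(I\) is what "no spatial pattern" looks like in a small sample.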

Testing for Spatial Autocorrelation

\(I\) can be estimated as the slope of a regression of the lagged average (the mean of each unit’s neighbors) on the measurement itself

\[ \mu_{lag} = \beta \times measurement \]

where \(\beta = I\) when the weights are row-standardized
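
The link between the regression slope and \(I\) can be checked numerically. The sketch below assumes row-standardized weights and an ordinary least squares fit with an intercept; the helper names, the six-unit row layout, and the values are all hypothetical.

```python
def line_weights(n):
    """Row-standardized weights for n units in a row (hypothetical layout)."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        nbrs = [j for j in (i - 1, i + 1) if 0 <= j < n]
        for j in nbrs:
            w[i][j] = 1.0 / len(nbrs)
    return w


def spatial_lag(values, weights):
    """Weighted average of each unit's neighbors."""
    return [sum(w_ij * v for w_ij, v in zip(row, values)) for row in weights]


def ols_slope(x, y):
    """Slope of a least squares line y = a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def morans_i(values, weights):
    """Global Moran's I computed directly from the formula."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_w = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    return (n / total_w) * (cross / sum(d * d for d in dev))


x = [2.0, 3.0, 5.0, 4.0, 8.0, 9.0]   # made-up measurements
w = line_weights(len(x))

slope = ols_slope(x, spatial_lag(x, w))
direct = morans_i(x, w)
print(slope, direct)
```

The two numbers agree up to floating-point error. With weights that are not row-standardized, the slope and \(I\) differ by the factor \(n/W\).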

Comparing observed to expected

We can generate the expected distribution of Moran’s I coefficients under a null hypothesis of no spatial autocorrelation
We use permutation inside a loop to generate simulated values of Moran’s I
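
The permutation loop can be sketched as follows: shuffle the values across locations, recompute Moran’s I each time, and compare the observed value with the simulated ones. This is an illustration under assumed inputs, not spdep code (spdep provides `moran.mc()` for this purpose). The function names, the seed, the 999 simulations, and the ten-unit row layout are assumptions.

```python
import random


def morans_i(values, weights):
    """Global Moran's I for a list of values and an n x n weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_w = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    return (n / total_w) * (cross / sum(d * d for d in dev))


def permutation_test(values, weights, n_sim=999, seed=42):
    """One-sided pseudo p-value for positive spatial autocorrelation."""
    rng = random.Random(seed)            # fixed seed so the run is repeatable
    observed = morans_i(values, weights)
    shuffled = list(values)
    sims = []
    for _ in range(n_sim):
        rng.shuffle(shuffled)            # break any link between value and place
        sims.append(morans_i(shuffled, weights))
    n_extreme = sum(1 for s in sims if s >= observed)
    # Count the observed value as one of the possible outcomes.
    p_value = (n_extreme + 1) / (n_sim + 1)
    return observed, sims, p_value


# Hypothetical data: ten units in a row with a steady trend in the values.
n_units = 10
w = [
    [1 if abs(i - j) == 1 else 0 for j in range(n_units)]
    for i in range(n_units)
]
x = [float(v) for v in range(1, n_units + 1)]

observed, sims, p_value = permutation_test(x, w)
print("observed I:", observed)
print("pseudo p-value:", p_value)
```

Adding 1 to both the numerator and the denominator keeps the pseudo p-value from ever being exactly zero, since its smallest possible value is \(1/(n_{sim}+1)\).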

Significance testing

Pseudo p-value (based on permutations)
Analytically (sensitive to deviations from assumptions)
Using Monte Carlo

Local Indicators of Spatial Autocorrelation

Sometimes we want to know about second-order autocorrelation
Is there clustering around different locations (but maybe not everywhere)?

Local Moran’s I

\[ I_i = \frac{(x_i - \bar{x})}{\sum_{k=1}^{n}(x_k - \bar{x})^2/n} \sum_{j=1}^{n} w_{ij}(x_j - \bar{x}) \]

Can identify areas where clustering is greater than expected
The sum of the local values is proportional to Global Moran’s I
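
As a minimal sketch of the local statistic, the code below evaluates \(I_i\) for each unit and checks how the local values relate to the global one. The helper names, the six-unit row layout, and the values are hypothetical; spdep's `localmoran()` is the function to use on real data.

```python
def line_weights(n):
    """Row-standardized weights for n units in a row (hypothetical layout)."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        nbrs = [j for j in (i - 1, i + 1) if 0 <= j < n]
        for j in nbrs:
            w[i][j] = 1.0 / len(nbrs)
    return w


def local_morans_i(values, weights):
    """Local Moran's I_i for every unit, following the formula above."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    m2 = sum(d * d for d in dev) / n          # variance term in the denominator
    return [
        (dev[i] / m2) * sum(weights[i][j] * dev[j] for j in range(n))
        for i in range(n)
    ]


def morans_i(values, weights):
    """Global Moran's I for comparison."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_w = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    return (n / total_w) * (cross / sum(d * d for d in dev))


x = [10.0, 9.0, 8.0, 1.0, 5.0, 2.0]   # made-up values: high cluster on the left
w = line_weights(len(x))

local_i = local_morans_i(x, w)
global_i = morans_i(x, w)
total_w = sum(sum(row) for row in w)

print("local values:", local_i)
print("mean of local values:", sum(local_i) / len(local_i))
print("global I:", global_i)
```

With this formula the local values sum to \(W \times I\). Under row standardization \(W = n\), so their mean equals the global statistic.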

Autocorrelation and Inference

Interpolation assumes autocorrelation
Inference is interested in mechanisms
Tests for mechanisms get confused by autocorrelation
Diagnosing autocorrelation helps plan next steps (Weds)