Statistical Modeling I

HES 505 Fall 2025: Session 23

Matt Williamson

Objectives

By the end of today you should be able to:

  • Define the likelihood function and its relationship to statistical inference.

  • Recognize key assumptions of statistical models and how spatial data may challenge those assumptions.

  • Simulate fake data with known relationships.

  • Fit simple linear and generalized linear models to spatial data.

Inference for First Order Properties

\[ \begin{equation} z(\mathbf{x}) = \mu(\mathbf{x}) + \epsilon(\mathbf{x}) \end{equation} \]

  • Often we actually care about the “drivers” of \(\mu(\mathbf{x})\)

  • Inference, not prediction

  • “Which spatial attributes drive \(z(\mathbf{x})\)?”

  • “Does \(\mu(\mathbf{x})\) increase or decrease with changes in a particular variable?”

Using regression to estimate \(\mu\)

\[ \begin{equation} z(\mathbf{s}) \sim \text{Distr}(\mu, \sigma)\\ \mu = w_0 + \sum_{i=1}^{m}w_iX_i(\mathbf{s}) \end{equation} \]

  • When \(z(\mathbf{s})\) is binary → logistic regression

  • When \(z(\mathbf{s})\) is continuous → linear (or, for positive values, Gamma) regression

  • When \(z(\mathbf{s})\) is a count → Poisson regression

  • Assumptions about \(\epsilon\) matter!!

Common Regression Forms

\[ \begin{aligned} y &\sim N(\mu, \sigma) &&\quad\text{(linear regression)}\\ \mu &= w_0 + \sum_{i=1}^{m}w_iX_i(\mathbf{s})\\ \\ y &\sim \text{Bern}(p) &&\quad\text{(logistic regression)}\\ \text{logit}(p) &= w_0 + \sum_{i=1}^{m}w_iX_i(\mathbf{s})\\ \\ y &\sim \text{Poisson}(\lambda) &&\quad\text{(Poisson regression)}\\ \log(\lambda) &= w_0 + \sum_{i=1}^{m}w_iX_i(\mathbf{s}) \end{aligned} \]

Key components

  • Distributional assumptions: the likelihood

  • \(w_i\) is the ‘spatial weight’, equivalent to \(\beta\) in a typical regression

  • The link function maps the linear predictor onto the appropriate support (see the sketch below)
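
A minimal Python sketch (assuming numpy and scipy are available) of how each inverse link maps the same unbounded linear predictor onto its distribution's support; the values of eta are arbitrary illustrations.

```python
import numpy as np
from scipy.special import expit  # inverse logit: 1 / (1 + exp(-x))

# An unbounded linear predictor (values chosen only for illustration)
eta = np.linspace(-4, 4, 9)

mu = eta                 # identity link: mean can be any real number
p = expit(eta)           # logit link: probabilities confined to (0, 1)
lam = np.exp(eta)        # log link: Poisson rates confined to (0, inf)

print(np.round(p, 3))    # e.g. 0.018 ... 0.5 ... 0.982
print(np.round(lam, 3))  # strictly positive
```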

Estimating parameters

  • You measure \(y\)

  • Your model expresses your hypothesis about the rules governing \(y\)

  • You need estimates of \(w_i\) to complete the rule (a simulation sketch follows)
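
A hedged simulation sketch in Python, assuming numpy and statsmodels; the weights in w_true are invented for illustration. It generates fake data from a known rule and checks that fitting recovers the weights, mirroring the "simulate fake data" objective.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)

# Known rule (weights invented for illustration): w0 = 1.5, w1 = 2.0, w2 = -0.5
n = 500
X = rng.normal(size=(n, 2))
w_true = np.array([1.5, 2.0, -0.5])
y = w_true[0] + X @ w_true[1:] + rng.normal(scale=1.0, size=n)

# Fit ordinary least squares and compare the estimates to the truth
fit = sm.OLS(y, sm.add_constant(X)).fit()
print(fit.params)  # should land close to [1.5, 2.0, -0.5]
```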

Estimating parameters via the likelihood function

\[ p(y_i \mid w, \sigma) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{(y_i - w_iX_i(\mathbf{s}))^2}{2\sigma^2}\right)\\ p(\mathbf{y}\mid w, \sigma) = \prod_{i=1}^n \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{(y_i - w_iX_i(\mathbf{s}))^2}{2\sigma^2}\right) \]
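
A sketch of maximum likelihood in practice, assuming scipy is available: write the negative log of the product above and minimize it numerically. The simulated intercept, slope, and noise level are illustrative choices.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)

# Fake data from known values (intercept 0.5, slope 3.0, sigma 0.8; illustrative)
n = 200
x = rng.uniform(0, 1, size=n)
y = 0.5 + 3.0 * x + rng.normal(scale=0.8, size=n)

def neg_log_lik(theta):
    """Negative log of the product of normal densities above."""
    w0, w1, log_sigma = theta
    sigma = np.exp(log_sigma)  # optimize on the log scale to keep sigma > 0
    return -np.sum(norm.logpdf(y, loc=w0 + w1 * x, scale=sigma))

mle = minimize(neg_log_lik, x0=np.zeros(3))
print(mle.x[0], mle.x[1], np.exp(mle.x[2]))  # approximately 0.5, 3.0, 0.8
```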

Estimating parameters via the likelihood function

  • Maximum likelihood: treats \(w_i\) as a fixed, unknown constant; uncertainty is calculated after estimation

  • Bayesian: treats \(w_i\) as a random quantity that varies within the constraints of our prior; uncertainty is built into the estimation process

  • MLE: “What’s the best dart throw?”

  • Bayesian: “Given where darts tend to land, what is the distribution of possible aiming points?” (contrasted in the sketch below)
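
One way to see the contrast is a grid approximation for a single slope, sketched below under assumed values (true slope 2.0, known \(\sigma = 1\), and a weak normal prior): the MLE is the single best 'dart throw', while the Bayesian answer is a whole distribution.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

# Single-predictor data with a known slope of 2.0 and known sigma = 1.0
x = rng.normal(size=100)
y = 2.0 * x + rng.normal(size=100)

w_grid = np.linspace(0, 4, 401)
log_lik = np.array([norm.logpdf(y, loc=w * x, scale=1.0).sum() for w in w_grid])

# MLE: the single best 'dart throw'
print("MLE:", w_grid[np.argmax(log_lik)])

# Bayesian: weight the likelihood by a prior, then normalize over the grid
log_post = log_lik + norm.logpdf(w_grid, loc=0.0, scale=10.0)  # weak prior
post = np.exp(log_post - log_post.max())
post /= post.sum()
print("Posterior mean:", np.sum(w_grid * post))
```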

Deriving model assumptions from the likelihood

  • The shape of the likelihood function indicates how the residuals are assumed to behave

  • The product \(\prod\) signals an assumption of independent observations

  • Moments of the distribution add further assumptions (e.g., a Poisson mean equals its variance; see the check below)
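
A quick simulation check of that Poisson assumption, using illustrative rates: mean and variance agree under a fixed rate, but variance exceeds the mean once the rate itself varies (overdispersion).

```python
import numpy as np

rng = np.random.default_rng(7)

# Fixed rate: Poisson mean and variance should agree (rate of 4 is illustrative)
counts = rng.poisson(lam=4.0, size=10_000)
print(counts.mean(), counts.var())  # both near 4.0

# Rate that varies from draw to draw: variance now exceeds the mean
rates = rng.gamma(shape=2.0, scale=2.0, size=10_000)  # mean 4, but noisy
over = rng.poisson(lam=rates)
print(over.mean(), over.var())  # overdispersion violates the assumption
```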

Logistic Regression and Distribution Models

Why do we create distribution models?

  • Identify important correlations between predictors and the occurrence of an event

  • Generate maps of the ‘range’ or ‘niche’ of events

  • Understand spatial patterns of event co-occurrence

  • Forecast changes in event distributions

General analysis situation

(Figure from Long)

  • Spatially referenced locations of events \((\mathbf{y})\) sampled from the study extent

  • A matrix of predictors \((\mathbf{X})\) that can be assigned to each event based on spatial location

Goal: Estimate the probability of occurrence of events across unsampled regions of the study area based on correlations with predictors

Modeling Presence-Absence Data

  • Random or systematic sample of the study region

  • The presence (or absence) of the event is recorded for each point

  • Hypothesized predictors of occurrence are measured (or extracted) at each point

Logistic regression

  • We can model favorability as the probability of occurrence using logistic regression

  • A link function maps the linear predictor \((\mathbf{x_i}'\beta + \alpha)\) onto the support (0, 1) for probabilities

  • Estimates of \(\beta\) can then be used to generate ‘wall-to-wall’ spatial predictions (see the sketch after the equation below)

\[ \begin{equation} y_{i} \sim \text{Bern}(p_i)\\ \text{link}(p_i) = \mathbf{x_i}'\beta + \alpha \end{equation} \]
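
A compact end-to-end sketch, assuming numpy, scipy, and statsmodels; the 'elevation' surface, sample size, and coefficients (\(\alpha = -1\), \(\beta = 2.5\)) are all invented for illustration. It samples presence/absence points, fits the Bernoulli GLM above, and applies the fitted rule to every cell.

```python
import numpy as np
import statsmodels.api as sm
from scipy.special import expit

rng = np.random.default_rng(3)

# Fake landscape: one predictor surface on a 50 x 50 grid (purely illustrative)
gx, gy = np.meshgrid(np.linspace(0, 1, 50), np.linspace(0, 1, 50))
elev = np.sin(3 * gx) + gy

# Sample points; record presence/absence from a known rule (alpha=-1, beta=2.5)
n = 300
rows = rng.integers(0, 50, size=n)
cols = rng.integers(0, 50, size=n)
x_samp = elev[rows, cols]
y = rng.binomial(1, expit(-1.0 + 2.5 * x_samp))

# Fit the Bernoulli GLM with a logit link
fit = sm.GLM(y, sm.add_constant(x_samp), family=sm.families.Binomial()).fit()
print(fit.params)  # roughly [-1.0, 2.5]

# 'Wall-to-wall' prediction: apply the fitted rule to every grid cell
p_map = expit(fit.params[0] + fit.params[1] * elev)
print(p_map.shape)  # a (50, 50) probability surface
```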

(Figure from Mendoza)

Comparison with Machine Learning

  • Statistical models describe a generative process: how did the data get here?

  • Machine learning models ask a different question: how well can we predict new observations?

  • The two reflect different underlying modes of prediction