Statistical Modelling I
Content for Monday, November 10, 2025
Last class we looked at tree-based machine-learning models for classifying data (i.e., presence/absence, categorization) or estimating the pseudo-regression relationships between variables and measured outcomes. Those approaches rely on learning the distributions necessary for the likelihood from the data themselves (rather than from a theoretical probability distribution). Today we’ll look at alternative approaches using traditional statistical models.
Resources
Bigger Picture
Logistic regression: a brief primer by (Stoltzfus 2011) is a nice introduction to logistic regression.
Is my species distribution model fit for purpose? Matching data and models to applications by (Guillera-Arroita et al. 2015) is an excellent, concise description of the relations between data collection, statistical models, and inference.
Predicting species distributions for conservation decisions by (Guisan et al. 2013) is a foundational paper describing some of the challenges with making conservation decisions based on the outcomes of species distribution models.
Technical details
Statistical Models from the terra package documentation has some step-by-step examples of using terra with base R to fit statistical models.
Statistical Learning from (Lovelace et al. 2019) has an extended example of fitting a variety of modeling approaches to data and evaluating their performance.
Objectives
By the end of today you should be able to:
Define the likelihood function and its relationship to statistical inference.
Recognize key assumptions of statistical models and how spatial data may challenge those assumptions.
Simulate fake data with known relationships.
Fit simple linear and generalized linear models to spatial data.
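As a preview of the last two objectives, here is a minimal sketch in base R, not the code used in class. It simulates presence/absence data from a logistic relationship we choose ourselves and then checks whether `glm()` recovers it. The covariate name `elevation`, the sample size, and the parameter values `beta0` and `beta1` are all assumptions made up for illustration, and the data are not spatial.

```r
# Illustrative sketch only: variable names and parameter values are assumptions.
set.seed(42)                      # make the simulation reproducible

n <- 1000                         # number of hypothetical sample sites
elevation <- rnorm(n)             # standardized covariate (mean 0, sd 1)

# "True" parameters that we choose, so we know what the model should recover
beta0 <- -0.5                     # intercept on the logit scale
beta1 <- 1.5                      # slope on the logit scale

# Linear predictor -> probability of presence via the inverse logit
p_presence <- plogis(beta0 + beta1 * elevation)

# Draw presence (1) / absence (0) outcomes from a Bernoulli distribution
presence <- rbinom(n, size = 1, prob = p_presence)

sim_data <- data.frame(presence = presence, elevation = elevation)

# Fit a logistic regression (a GLM with binomial errors and a logit link)
fit <- glm(presence ~ elevation, family = binomial(link = "logit"),
           data = sim_data)

# Compare the estimates with the values used to simulate the data
estimates <- coef(fit)
print(round(cbind(truth = c(beta0, beta1), estimate = estimates), 2))

# Log-likelihood of the fitted model; glm() picks coefficients that maximize it
print(logLik(fit))
```

Because we set the true parameters ourselves, any gap between `truth` and `estimate` reflects sampling noise rather than a misspecified model. Rerunning with a smaller `n` shows how quickly that noise grows.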