Machine Learning Models: Trees

Content for Thursday, October 30, 2025

In the previous set of lectures, we thought about the explanation of spatial patterns through a fairly simple lens: the best explanation for variation in values is a function of distance. As such, the best prediction of a new value takes into account the measurements and some function of distance. In deterministic methods, we assume that the measurements we have should simply be smoothed based on distance. In probabilistic methods, we allow for the idea that the underlying mean of the process is unknown (and can vary) and then exploit spatial covariance to account more meaningfully for the relationship between distance and the mean of the process (allowing for potential second-order effects). It is often the case, however, that we want to know more about this unknown mean value (which factors are most important, how they affect the process, etc.). This might be because we are more interested in inference about those factors than in complete predictions. In this case, we might use statistical-learning models to take advantage of the data we have in a way that is computationally efficient. We’ll talk about some of the simpler methods for doing that today: tree-based methods.
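To make the contrast above concrete, here is a minimal Python sketch, not taken from the lecture materials. It sets a deterministic, distance-only prediction (inverse distance weighting) beside the single split that a regression tree would make on a covariate. The function names `idw_predict` and `best_split`, the four sample locations, and the elevation and temperature values are all hypothetical and chosen only for illustration.

```python
import math


def idw_predict(points, values, target, power=2.0):
    """Inverse-distance-weighted prediction at `target` (distance is the only input)."""
    num = 0.0
    den = 0.0
    for (x, y), v in zip(points, values):
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return v  # target coincides with a measurement: return it unchanged
        w = 1.0 / d ** power
        num += w * v
        den += w
    return num / den


def best_split(covariate, response):
    """Find the covariate threshold that minimizes the summed squared error
    around the two group means, i.e. one split of a regression tree."""

    def sse(vals):
        m = sum(vals) / len(vals)
        return sum((v - m) ** 2 for v in vals)

    best = None
    levels = sorted(set(covariate))
    # Candidate thresholds are midpoints between adjacent covariate values.
    for lo, hi in zip(levels, levels[1:]):
        t = (lo + hi) / 2.0
        left = [r for c, r in zip(covariate, response) if c <= t]
        right = [r for c, r in zip(covariate, response) if c > t]
        score = sse(left) + sse(right)
        if best is None or score < best["sse"]:
            best = {
                "threshold": t,
                "sse": score,
                "left_mean": sum(left) / len(left),
                "right_mean": sum(right) / len(right),
            }
    return best


# Hypothetical data: four sample sites with an elevation covariate (m)
# and a measured temperature (degrees C). These values are made up.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
elevation = [100.0, 120.0, 300.0, 320.0]
temperature = [20.0, 19.0, 12.0, 11.0]

# Distance-only prediction at the center of the four sites.
idw_value = idw_predict(points, temperature, (0.5, 0.5))

# Tree-style split: which elevation threshold best separates temperatures?
split = best_split(elevation, temperature)

print("IDW prediction at center:", idw_value)
print("Best elevation split:", split)
```

With these made-up values the center point is equally far from all four sites, so the distance-only prediction is just their average and says nothing about why temperature varies. The split instead falls at an elevation of 210 m and reports a different mean temperature on each side, which is the kind of statement about a factor that tree-based methods are built to provide.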

Resources

Objectives

By the end of today you should be able to:

  • Differentiate supervised from unsupervised classification

  • Recognize the linkage between statistical learning models and interpolation

  • Define the key elements of tree-based classifiers

  • Articulate the differences in statistical learning for spatial data


References

Cutler, D. R., T. C. Edwards Jr., K. H. Beard, A. Cutler, K. T. Hess, J. Gibson, and J. J. Lawler. 2007. Random forests for classification in ecology. Ecology 88:2783–2792.
James, G., D. Witten, T. Hastie, and R. Tibshirani. 2021. Classification. Pages 129–195 in An introduction to statistical learning: With applications in R. Springer US, New York, NY.
Pebesma, E., and R. Bivand. 2023. Spatial data science: With applications in R. Chapman and Hall/CRC, Boca Raton.