Meeting Your Data
Content for Wednesday, September 3, 2025
Now that you know a bit about the “why” and “how” of this course, it’s time to start actually working with data. We’ll spend a bit of time thinking about the “nature” of data, the qualities of “good” data, and the responsibilities of data creators and users. We’ll also practice some generic approaches for getting your own data into R and accessing data via APIs and functions.
Readings
Setting the Stage
Geographies of conservation II: Technology, surveillance and conservation by algorithm by Adams (2019) provides a critical perspective on the role of new data-sensing technology in the environment.
The ethics of big data as a public good: Which public? Whose good? by Taylor (2016) highlights some of the difficulties that arise when “big data” is largely owned and created by private companies.
A Survey of Data Quality Requirements That Matter in ML Development Pipelines by Priestley et al. (2023) provides a practical discussion of the attributes of “good” data, particularly in the context of applied machine learning. Although the article is a bit broader in focus, many of the four dimensions of data quality it describes are directly relevant to the discussions in the other articles.
Data justice and biodiversity conservation by Pritchard et al. (2022) provides an accessible introduction to the concept of data justice and frameworks available for achieving it.
Technical Details
The “Wrangle” section of R for Data Science by Wickham (2016) provides the logic behind the tidyverse approach to data import and manipulation.
The “Data in R” section of Introduction to R by Douglas et al. (2022) gives an important overview of the different data types in R and how they are represented as objects within the R environment.
Objectives
By the end of today, you should be able to:
Recognize the role that data selection and documentation play in reproducible workflows
Summarize key debates surrounding the role of (spatial) data in solving environmental problems
Describe FAIR and CARE principles for data and their relationship to existing debates
Read data into your R environment
Inspect the data and summarize it using tables and simple plots
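As a minimal sketch of the last two objectives, the example below reads a small CSV file into R, inspects it, and summarizes it with a table and a simple plot. The file, column names, and values are hypothetical and exist only for illustration. The sketch uses base R so it runs without extra packages; the tidyverse readings cover `readr::read_csv()` as an alternative for the import step.

```r
# Hypothetical example data, written to a temporary file so the
# sketch is self-contained (replace csv_path with your own file).
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("site,habitat,species_count",
    "A,forest,12",
    "B,wetland,7",
    "C,forest,15",
    "D,grassland,4"),
  csv_path
)

# Read the file into a data frame.
sites <- read.csv(csv_path, stringsAsFactors = FALSE)

# Inspect the data: column types, first rows, and a numeric summary.
str(sites)
head(sites)
summary(sites$species_count)

# Summarize with tables: number of sites per habitat,
# and mean species count per habitat.
habitat_counts <- table(sites$habitat)
print(habitat_counts)

mean_by_habitat <- aggregate(species_count ~ habitat, data = sites, FUN = mean)
print(mean_by_habitat)

# Summarize with a simple plot: sites per habitat.
barplot(
  habitat_counts,
  xlab = "Habitat",
  ylab = "Number of sites",
  main = "Sites per habitat (hypothetical data)"
)
```

With your own data, the only line that needs to change at first is the path passed to `read.csv()`; the inspection steps (`str()`, `head()`, `summary()`) are a reasonable first check that the import worked as expected.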
Slides
The slides for today’s lesson are available online as an HTML file.