```{r}
#| eval: false
## load the packages necessary
library(tidyverse)
## read in the data
landmarks_csv <- read_csv("/Users/mattwilliamson/Google Drive/My Drive/TEACHING/Intro_Spatial_Data_R/Data/2023/assignment01/landmarks_ID.csv")
## How many in each feature class
table(landmarks_csv$MTFCC)
```
Reproducible scripts
Comments explain what the code is doing
Operations are ordered logically
Only relevant commands are presented
Useful object and function names
Script runs without errors (on your machine and someone else’s)
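The checklist above can be sketched as a short script. This is a minimal illustration, not the course's required layout: the data set is made up so the example runs on any machine, `tempdir()` stands in for a project folder, and the object names (`landmarks_path`, `feature_class_counts`) are hypothetical.

```r
## Sketch of a portable script: comments, logical order, descriptive names.
## ASSUMPTION: the tiny data set below is invented so the example runs anywhere;
## the MTFCC values are illustrative only.

## 1. Set up: build paths from a project folder, not one user's home directory
project_dir <- tempdir()  # stand-in for your project root
landmarks_path <- file.path(project_dir, "landmarks_demo.csv")

## 2. Create the demo data (in real work this file would already exist)
landmarks_demo <- data.frame(
  name  = c("School A", "Park B", "School C"),
  MTFCC = c("K2543", "K2180", "K2543")
)
write.csv(landmarks_demo, landmarks_path, row.names = FALSE)

## 3. Read the data back in
landmarks <- read.csv(landmarks_path)

## 4. Summarise: how many landmarks are in each feature class?
feature_class_counts <- table(landmarks$MTFCC)
print(feature_class_counts)
```

Because every path is built with `file.path()` from a single root, moving the project to another machine should only require changing `project_dir`.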
Flipping the script
Toward Efficient Reproducible Workflows
Scripts can document what you did, but not why you did it!
Scripts separate your analysis products from your report/manuscript
What is literate programming?
Documentation containing code (not vice versa!)
Direct connection between code and explanation
Convey meaning to humans rather than telling the computer what to do!
Why literate programming?
Your analysis scripts are computer software
Integrate math, figures, code, and narrative in one place
Explaining something helps you learn it
Introducing Quarto
What is Quarto?
End-to-End process between data and report
Explicit linkage between each step (including iteration)
Each step involves trials and choices
What is Quarto?
A multi-language platform for developing reproducible documents
A ‘lab notebook’ for your analyses
Allows transparent, reproducible scientific reports and presentations
Key components
Metadata and global options: YAML
Text, figures, and tables: Markdown and LaTeX
Code: knitr (or Jupyter if you’re into that sort of thing)
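To see the three components side by side, the sketch below writes a minimal Quarto file from R. The title, heading, and chunk contents are placeholders invented for illustration, and the file goes to a temporary folder so nothing in your project is touched.

```r
## Sketch: write a minimal Quarto document to a temporary file.
## ASSUMPTION: the title, text, and chunk contents are placeholders.
fence <- strrep("`", 3)  # three backticks, used to open and close a code chunk

qmd_lines <- c(
  "---",                               # YAML: metadata and global options
  "title: \"Landmark counts\"",
  "format: html",
  "---",
  "",
  "## Feature classes",                # Markdown: headers and narrative text
  "",
  "We count the landmarks in each feature class.",
  "",
  paste0(fence, "{r}"),                # Code: run by knitr at render time
  "counts <- table(c(\"K2543\", \"K2180\", \"K2543\"))",
  "counts",
  fence
)

qmd_path <- file.path(tempdir(), "demo.qmd")
writeLines(qmd_lines, qmd_path)

## Optional, and only if the Quarto CLI and the quarto R package are installed:
## quarto::quarto_render(qmd_path)
```

Opening `demo.qmd` in a text editor shows the YAML header between the two `---` lines, the Markdown text, and the R chunk in that order.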
For this class…
We’ll use headers to outline the analysis
We’ll use code chunks for small, self-contained operations
We’ll create our own functions for repeated operations
We’ll render (knit) our work into a standalone, readable document
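As a small illustration of writing a function for a repeated operation, the sketch below wraps a count-by-category step so it can be reused on different columns. The function name `count_by_category` and the demo data are hypothetical, not part of the course materials.

```r
## Hypothetical helper: count rows per category, most common first.
count_by_category <- function(data, column) {
  ## Fail early with a clear error if the inputs are not what we expect
  stopifnot(is.data.frame(data), column %in% names(data))
  sort(table(data[[column]]), decreasing = TRUE)
}

## ASSUMPTION: made-up demo data, for illustration only
landmarks_demo <- data.frame(
  MTFCC  = c("K2543", "K2180", "K2543", "K2543"),
  county = c("Ada", "Ada", "Boise", "Ada")
)

## Reuse the same function instead of copying and pasting the code
class_counts  <- count_by_category(landmarks_demo, "MTFCC")
county_counts <- count_by_category(landmarks_demo, "county")
```

If the counting logic ever needs to change (for example, to drop missing values), it only has to change in one place.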
Version control, reproducibility, and sanity
Version control in general
Track changes without version explosion (via git)
Create specific snapshots of a project to facilitate experimentation (via commits and branches)
Create centralized backups and ease collaboration (via GitHub)
Version control and reproducibility
Documenting changes to code, manuscripts, and figures increases the transparency of the scientific process
Collaboration with other programmers is easier and less risky
Automates the sharing of code and original data
Version control and sanity
Commit early, commit often
Use sensible messages to remind yourself where you were
Make sure you always have the most up-to-date version
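The snapshot-and-branch idea can be tried safely in a throwaway repository. The sketch below drives git from R with `system2()`; it assumes git is installed and on your PATH, and the repository location, file name, user details, branch name, and commit messages are all invented for illustration. In day-to-day work you would more likely use the terminal or your editor's git tools.

```r
## Sketch: make two commits, one on an experimental branch, in a demo repository.
## ASSUMPTIONS: git is installed and on your PATH; every name below is made up.
repo <- tempfile("demo-repo-")
dir.create(repo)

## Hypothetical helper: run one git command inside the demo repository
run_git <- function(...) {
  system2(
    "git",
    c("-C", shQuote(repo),
      "-c", "user.name=Demo",
      "-c", "user.email=demo@example.com",
      ...),
    stdout = TRUE
  )
}

run_git("init")

## First snapshot: commit early, with a message that says what changed
writeLines("x <- 1", file.path(repo, "analysis.R"))
run_git("add", "analysis.R")
run_git("commit", "-m", shQuote("Add first analysis script"))

## Experiment on a branch so the original snapshot stays untouched
run_git("checkout", "-b", "try-new-step")
writeLines(c("x <- 1", "y <- x + 1"), file.path(repo, "analysis.R"))
run_git("commit", "-a", "-m", shQuote("Add second step to analysis"))

## One line per commit, newest first
commit_log <- run_git("log", "--oneline")
print(commit_log)
```

Sharing these commits through GitHub would additionally require adding a remote and pushing, which is left out here because it depends on your account and repository.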