25 Reproducibility
How to regenerate data and render this book.
25.1 Prerequisites
- Raw Excel exports live in data/ (usage logs, assignment logs, internal consultation reports).
- Name canonicalization lives in data/config/name_mapping.csv. If missing or stale, regenerate with the scripts below.
- Required R packages: dplyr, ggplot2, lubridate, readxl, stringr, knitr, kableExtra, plus optional analytics packages (igraph, ggraph, tidygraph, DescTools, forecast, pROC, Kendall, moments). Install with install.packages() or pin via renv.
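The prerequisites above can be checked before a rebuild. The following is a minimal sketch in base R, not part of the project's scripts: `missing_packages()` and `mapping_status()` are hypothetical helpers, and treating the mapping as "stale" when it is older than the newest Excel export in data/ is an assumption (the file's age is only a rough proxy for whether new names have appeared).

```r
# Required packages as listed in the prerequisites above.
required_pkgs <- c("dplyr", "ggplot2", "lubridate", "readxl", "stringr",
                   "knitr", "kableExtra")

# Hypothetical helper: return the packages that cannot be loaded.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Hypothetical helper: classify the name mapping as "missing", "stale",
# or "current". ASSUMPTION: "stale" means older than the newest raw
# Excel export found directly in raw_dir.
mapping_status <- function(mapping_path, raw_dir) {
  if (!file.exists(mapping_path)) return("missing")
  raw_files <- list.files(raw_dir, pattern = "\\.xlsx?$",
                          full.names = TRUE, ignore.case = TRUE)
  if (length(raw_files) == 0) return("current")
  newest_raw <- max(file.mtime(raw_files))
  if (file.mtime(mapping_path) < newest_raw) "stale" else "current"
}

not_installed <- missing_packages(required_pkgs)
if (length(not_installed) > 0) {
  message("Install first: ", paste(not_installed, collapse = ", "))
}
message("Name mapping is ",
        mapping_status("data/config/name_mapping.csv", "data"))
```

If the status is "missing" or "stale", the mapping scripts in Section 25.2 are the ones to run.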
25.2 Steps to Rebuild
- Regenerate processed data:
#| label: load-script
#| eval: false
source("R/process_data.R")
- Refresh the name mapping (only when new names appear):
#| eval: false
source("R/generate_mapping.R") # builds/updates data/config/name_mapping.csv
source("R/update_mapping.R") # optional helper to review changes
- Render the book:
quarto render .
25.3 Environment Management
- Prefer reproducible environments (e.g., renv::init() followed by renv::snapshot(), or pak::lockfile_create()) so package versions are pinned.
- If renders need to be stable across days, replace date: today in _quarto.yml with a fixed date string.
- Store secrets outside the repo; only anonymized pathologist codes are written to data/processed/pathologist_codes.rds.
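Replacing date: today with a fixed date can be scripted. This is a sketch in base R under stated assumptions: `pin_quarto_date()` is a hypothetical helper, it assumes the date entry sits on a single line of the form `date: today` (optionally quoted), and the date string passed in is the caller's choice.

```r
# Hypothetical helper: rewrite `date: today` in a Quarto config file to a
# fixed, quoted date string. Returns (invisibly) the number of lines changed.
# ASSUMPTION: the entry is a single line such as `  date: today`.
pin_quarto_date <- function(path, fixed_date) {
  lines <- readLines(path, warn = FALSE)
  pattern <- "^(\\s*date:\\s*)[\"']?today[\"']?\\s*$"
  hits <- grepl(pattern, lines)
  # Keep the original indentation and key; swap only the value.
  lines[hits] <- sub(pattern, paste0("\\1\"", fixed_date, "\""), lines[hits])
  writeLines(lines, path)
  invisible(sum(hits))
}

# Example call (the date shown is a placeholder, not a project value):
# pin_quarto_date("_quarto.yml", "2024-06-30")
```

Running the helper a second time changes nothing, because no `date: today` line remains to match.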