Week 2 Notes - Course Introduction

Published: September 15, 2025

Key Concepts Learned

  • Visualization matters: summary statistics can obscure patterns; outliers may represent critical communities.
  • Ethical responsibility: report margins of error (MOE), show uncertainty, and avoid misleading visuals.
  • Grammar of Graphics (ggplot2):
    Data → Aesthetics → Geometries → Visuals, with additional layers like scales, themes, and annotations.
  • Exploratory Data Analysis (EDA): a detective process of exploring distributions, relationships, anomalies, and reliability before modeling.
  • Data Joins: left, right, inner, and full joins for integrating datasets in policy analysis.
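
To make the Data → Aesthetics → Geometries sequence concrete, here is a minimal sketch of a layered plot. It uses the built-in mtcars dataset as a stand-in for a real policy dataset; the title, palette, and axis labels are illustrative choices, not anything prescribed in class.

```r
library(ggplot2)

# mtcars ships with R; it stands in here for a real policy dataset.
p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +  # data + aesthetics
  geom_point(size = 2, alpha = 0.8) +                              # geometry
  scale_color_brewer(palette = "Dark2", name = "Cylinders") +      # scale
  labs(title = "Fuel economy falls as weight rises",
       x = "Weight (1,000 lbs)", y = "Miles per gallon") +         # annotations
  theme_minimal()                                                  # theme

print(p)
```

Each line after the first adds one layer with +, so a layer can be swapped or removed without touching the rest of the plot.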

Coding Techniques

  • ggplot2 basics:
    • Structure: ggplot(data, aes(x = var1, y = var2)) + geom_*() + further layers (scales, labels, themes).
    • Aesthetic mappings: x, y, color, fill, size, shape, alpha.
    • Adding layers with + for customization.
  • EDA tools: histograms and boxplots to inspect distributions and spot outliers; scatterplots to examine relationships between variables.
  • Data reliability: compute the coefficient of variation (CV = standard error ÷ estimate) and use CV thresholds to categorize estimates as reliable, somewhat reliable, or unreliable.
  • dplyr joins: left_join(), right_join(), inner_join(), full_join().
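
The join and reliability steps above can be sketched together as follows. Everything in the example is hypothetical: the two small tables, their column names (geoid, estimate, moe, county), and the 12% and 40% CV cutoffs, which are a commonly used convention rather than necessarily the thresholds from class. The standard error is derived as MOE / 1.645, which assumes the margins of error are reported at the 90% confidence level, as in ACS data.

```r
library(dplyr)

# Hypothetical county estimates with 90% margins of error (ACS-style).
income <- tibble(
  geoid    = c("001", "003", "005"),
  estimate = c(52000, 41000, 38000),
  moe      = c(2500, 9000, 30000)
)

# Hypothetical lookup table; county "005" is deliberately missing.
names_tbl <- tibble(
  geoid  = c("001", "003", "007"),
  county = c("Adams", "Baker", "Clark")
)

# left_join() keeps every row of `income`; unmatched rows get NA for county.
joined <- left_join(income, names_tbl, by = "geoid")

# CV (%) = standard error / estimate * 100, with SE = MOE / 1.645.
# The 12% and 40% cutoffs are assumed, not taken from the course.
classified <- joined %>%
  mutate(
    cv = (moe / 1.645) / estimate * 100,
    reliability = case_when(
      cv < 12  ~ "reliable",
      cv <= 40 ~ "somewhat reliable",
      TRUE     ~ "unreliable"
    )
  )

print(classified)

# inner_join() keeps only geoids found in both tables;
# full_join() keeps every geoid from either table.
both   <- inner_join(income, names_tbl, by = "geoid")
either <- full_join(income, names_tbl, by = "geoid")
```

Comparing the row counts of joined, both, and either is a quick way to see which records each join type keeps or drops.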

Questions & Challenges

  • How to best visualize uncertainty so that policymakers actually understand and act on it.
  • Practice needed in customizing ggplot themes for clear and professional presentation.
  • Still clarifying when to collapse categories or aggregate geographies to reduce statistical uncertainty.
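
One common starting point for the uncertainty question is to draw each estimate with an error bar spanning estimate ± MOE. The sketch below is only an illustration under assumed data: the county names, estimates, and margins of error are made up, and error bars are one option among several rather than a settled answer.

```r
library(ggplot2)

# Hypothetical estimates and 90% margins of error, for illustration only.
est <- data.frame(
  county   = c("Adams", "Baker", "Clark"),
  estimate = c(52000, 41000, 38000),
  moe      = c(2500, 9000, 30000)
)

p_moe <- ggplot(est, aes(x = county, y = estimate)) +
  # The bar spans the interval estimate - MOE to estimate + MOE.
  geom_errorbar(aes(ymin = estimate - moe, ymax = estimate + moe), width = 0.2) +
  geom_point(size = 3) +
  labs(title = "Estimates with margins of error",
       x = NULL, y = "Estimate") +
  theme_minimal()

print(p_moe)
```

Wide bars make it visually obvious which estimates should not be compared or ranked against each other.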

Connections to Policy

  • Poor visualization (e.g., ignoring MOEs) can directly misinform policy and harm vulnerable communities.
  • Reporting reliability aligns with the AICP Code of Ethics: transparency is both a technical best practice and an ethical obligation.
  • Visualizations tailored for different audiences (legislators, community groups, media) can shift how data drives decisions.

Reflection

  • The most interesting insight was Anscombe’s Quartet: four datasets with nearly identical summary statistics but radically different patterns when plotted, which shows why graphs are necessary.
  • I’ll apply this by always starting analysis with EDA and incorporating data quality checks into my visuals.
  • This week reinforced that visualization is not just aesthetic — it is a tool for ethical communication in public policy.
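
The Anscombe’s Quartet point can be checked directly, since the anscombe dataset ships with base R (columns x1 to x4 and y1 to y4). This is a minimal sketch: it computes the same summary statistics for each of the four sets, then plots them side by side with base graphics.

```r
# Summary statistics for each of the four x/y pairs in the built-in dataset.
stats <- sapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  c(mean_x = mean(x), mean_y = mean(y),
    var_x = var(x), var_y = var(y), cor_xy = cor(x, y))
})
colnames(stats) <- paste("set", 1:4)
print(round(stats, 2))

# Plot the four sets in a 2 x 2 grid to compare their shapes.
op <- par(mfrow = c(2, 2))
for (i in 1:4) {
  plot(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]],
       xlab = paste0("x", i), ylab = paste0("y", i), main = paste("Set", i))
}
par(op)  # restore the previous plotting layout
```

The table columns come out almost the same across the four sets, while the plots look nothing alike, which is the reason to start with EDA before trusting summaries.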