Week 9 Notes - Critical Perspectives on Predictive Policing

Published

November 3, 2025

Key Concepts Learned

Part 1: The Seductive Promise of Predictive Policing
Part 2: The Dirty Data Problem
- Definition
  - Traditional definition (data mining): Missing data, Incorrect data, Non-standardized formats
  - Extended definition: “Data derived from or influenced by corrupt, biased, and unlawful practices, including data that has been intentionally manipulated or ‘juked,’ as well as data that is distorted by individual and societal biases.”
- Forms of Dirty Data
  - Fabricated/Manipulated Data
  - Systematically Biased Data
  - Missing/Incomplete Data
  - Proxy Problems
Part 3: Technical Fixes Can’t Solve Social Problems
Part 4: Consequences and Harms
Part 5: Can Reform Work?
- Consent Decrees
  - Training on constitutional policing
  - Early intervention systems for problem officers
  - Revised use-of-force policies
  - Community oversight
  - Data collection improvements
Part 6: A Framework for Critical Evaluation
- Questions to Ask About Any Predictive Policing System
  - 1. Data Provenance
  - 2. Variable Selection
  - 3. Validation
  - 4. Deployment
  - 5. Transparency & Accountability
  - 6. Alternatives
Technical Foundations
- Modeling workflow (highly important)
- The Core Logic: “Broken Windows Theory”
- 1. Local Spatial Autocorrelation
- 2. Count Regression Fundamentals
  - Problems when modeling counts with ordinary linear regression (overdispersion is common!)
    - Counts cannot be negative
    - Counts are discrete, not continuous
    - Count data is skewed, so errors are not normal
    - Counts often have variance ≠ mean
  - The Poisson Distribution: appropriate for count data
    - Key property: Mean = Variance = λ
  - Poisson Regression Model
    - Log link: ensures λ_i > 0 (expected counts can't be negative)
    - Linear relationship on log scale: log(λ_i) = β0 + β1·X1_i + ...
  - Interpreting Poisson Coefficients
    - On log scale: β1 = change in log(expected count) per unit increase in X1
    - On count scale (exponentiate): exp(β1) = multiplicative effect on expected count; exp(β1) - 1 is the proportional change
  - Check for overdispersion: Dispersion = Residual Deviance / Residual Degrees of Freedom
    - If ≈ 1: Poisson is fine
    - If > 1: Overdispersion
    - If > 2-3: Serious overdispersion → use Negative Binomial
  - Negative Binomial Regression
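
The coefficient interpretation and the overdispersion check above can be sketched in a few lines. This is a minimal illustration in Python, not the course's R workflow. The function names, the example numbers, and the cutoffs (`tolerance=0.2` for "≈ 1", `serious=2.0` as the low end of the "2-3" range) are assumptions made for the sketch, not values from the lecture.

```python
import math


def multiplicative_effect(beta):
    """exp(beta): factor applied to the expected count per one-unit increase in X."""
    return math.exp(beta)


def percent_change(beta):
    """(exp(beta) - 1) * 100: percent change in the expected count per unit of X."""
    return (math.exp(beta) - 1.0) * 100.0


def dispersion_ratio(residual_deviance, df_residual):
    """Residual deviance divided by residual degrees of freedom."""
    return residual_deviance / df_residual


def describe_dispersion(ratio, tolerance=0.2, serious=2.0):
    """Map a dispersion ratio to the rule of thumb in the notes.

    The tolerance and serious cutoffs are hypothetical choices for this sketch.
    Ratios well below 1 (underdispersion) are not treated separately here.
    """
    if ratio > serious:
        return "serious overdispersion: consider Negative Binomial"
    if ratio > 1.0 + tolerance:
        return "overdispersion"
    return "Poisson looks fine"


# Hypothetical fitted values, for illustration only.
beta_1 = 0.10
ratio = dispersion_ratio(residual_deviance=250.0, df_residual=100)

print(round(multiplicative_effect(beta_1), 3))
print(round(percent_change(beta_1), 1))
print(ratio, describe_dispersion(ratio))
```

With a real model, the residual deviance and residual degrees of freedom would come from the fitted Poisson model's summary rather than being typed in by hand.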

Coding Techniques

[New R functions or approaches]
[Quarto features learned]

Questions & Challenges

What I didn’t fully understand
- the basic workflow: pull-commit-push
Areas needing more practice
- remember the essential dplyr functions

Connections to Policy

[How this week’s content applies to real policy work]

Reflection

[What was most interesting]
[How I’ll apply this knowledge]
