Key Concepts Learned
- Predictive policing is promoted as more efficient and objective than traditional methods, but both claims depend on the quality of the underlying data.
- “Dirty data” includes biased, incomplete, fabricated, or misleading data.
- Over-policing in certain neighborhoods creates feedback loops: heavier patrols generate more recorded incidents, which are then used to justify still heavier patrols, so the data reflect policing patterns rather than true crime patterns.
- Different generations of predictive policing tools (hotspot maps, risk terrain modeling, machine learning) all risk reproducing structural bias.
- Technical improvements alone cannot overcome flawed data or deep social inequities.
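The feedback loop described above can be sketched with a deterministic toy model (my own illustration, not from the course materials; the areas, rates, and patrol counts are all hypothetical):

```python
# Toy model: two areas with the SAME underlying crime rate. Patrols are
# allocated in proportion to cumulative *recorded* incidents, so an initial
# imbalance in the records persists indefinitely even though nothing about
# the true rates differs.

TRUE_RATE = 0.05        # identical true incident rate per patrol-shift in both areas
TOTAL_PATROLS = 20

# Seed the loop with a small historical imbalance in the records.
recorded = {"A": 1.0, "B": 2.0}

for _ in range(100):
    total = recorded["A"] + recorded["B"]
    for area in recorded:
        patrols = TOTAL_PATROLS * recorded[area] / total
        # Incidents observed scale with patrol presence,
        # not with any difference in the true rate.
        recorded[area] += patrols * TRUE_RATE

ratio = recorded["B"] / recorded["A"]
print(f"recorded B/A ratio: {ratio:.2f}")  # stays ~2.0: data mirror patrols, not crime
```

The point of the sketch is that no amount of algorithmic sophistication downstream can recover the true (equal) rates, because the disparity is baked into how the records are generated.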
Coding Techniques
- This week focused on conceptual and ethical issues rather than new coding techniques.
- Examples highlighted how biased data can lead to misleading model outputs, even with sophisticated algorithms.
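Although no new techniques were introduced this week, the biased-data point can be illustrated with a short simulation (my own sketch; the groups, rates, and observation probabilities are hypothetical):

```python
import random

random.seed(42)

# The same true offense rate in two groups, but group "B" is observed twice
# as often, so recorded data over-represent it.
TRUE_RATE = 0.10
OBSERVATION_PROB = {"A": 0.25, "B": 0.50}   # over-policing doubles B's visibility

# Each person: (group, whether they actually offended).
population = [(group, random.random() < TRUE_RATE)
              for group in ("A", "B") for _ in range(10_000)]

# "Dirty" dataset: an offense only enters the records if it was observed.
recorded_rate = {}
for group in ("A", "B"):
    members = [offended for g, offended in population if g == group]
    observed = [o and (random.random() < OBSERVATION_PROB[group]) for o in members]
    recorded_rate[group] = sum(observed) / len(observed)

print(recorded_rate)  # B's recorded rate is ~2x A's despite equal true rates
```

Any model trained on these records, however sophisticated, will learn that group B is "higher risk", because the labels encode where police looked rather than who offended.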
Questions & Challenges
- How can analysts detect when data are biased or incomplete?
- Can predictive systems be designed to avoid reinforcing inequality, or is this mainly a social and institutional problem?
- How should accuracy be balanced against fairness, accountability, and community trust?
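One partial answer to the first question is to compare agency records against an independent benchmark (e.g., a victimization survey) and flag groups whose recorded rates diverge sharply. A minimal sketch, with a hypothetical threshold and made-up rates:

```python
def flag_suspect_groups(recorded, benchmark, max_ratio=1.5):
    """Return groups whose recorded rate exceeds the benchmark rate by max_ratio."""
    return [g for g in recorded
            if benchmark.get(g) and recorded[g] / benchmark[g] > max_ratio]

recorded_rates  = {"A": 0.025, "B": 0.050, "C": 0.030}  # from agency records
benchmark_rates = {"A": 0.028, "B": 0.026, "C": 0.029}  # independent survey

print(flag_suspect_groups(recorded_rates, benchmark_rates))  # ['B']
```

A divergence flag like this cannot prove bias, but it tells analysts where to investigate data provenance before feeding the records into a model.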
Connections to Policy
- Predictive policing influences real-world deployment of law enforcement resources.
- Dirty data can lead to unjustified surveillance or enforcement in marginalized communities.
- Policymakers must understand data provenance and the ethical risks before adopting algorithmic tools.
- Transparency and oversight are critical for maintaining public trust.
Reflection
- This week emphasized that analytical rigor is not enough—context and ethics matter.
- Models built on biased data can cause real harm even when they appear accurate.
- Going forward, I will be more critical about where my data come from and whose experiences they represent or exclude.