Week 10 Notes – Logistic Regression for Binary Outcomes

Published: November 10, 2025

Key Concepts Learned

Logistic regression is used when the outcome is binary (yes/no).
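Linear regression is inappropriate for binary outcomes because its predictions can fall outside the 0–1 range and its assumptions of normally distributed, constant-variance errors do not hold.
Logistic regression models the probability of the outcome by passing a linear combination of the predictors through the logistic function, which keeps predictions between 0 and 1.
Coefficients are interpreted on the log-odds scale; exponentiating a coefficient gives the odds ratio for a one-unit increase in that predictor, holding the others constant.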
A probability threshold (such as 0.5) converts predicted probabilities into binary decisions.
A confusion matrix cross-tabulates predicted and actual classes at a given threshold; metrics such as sensitivity, specificity, and precision are computed from its cells.
ROC curves and AUC evaluate overall discrimination across all possible thresholds.
A model may perform differently across demographic groups, even when overall accuracy is high.
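
To make the log-odds, odds-ratio, and probability scales concrete, here is a minimal sketch on simulated data. The variable names (x, y, fit) and the true coefficients are illustrative assumptions, not values from the course materials.

```r
# Simulated data; every name and number here is an illustrative assumption.
set.seed(1)
n <- 500
x <- rnorm(n)

# Assumed true model: log-odds = -0.5 + 1.2 * x.
# plogis() is the logistic function 1 / (1 + exp(-z)).
p_true <- plogis(-0.5 + 1.2 * x)
y <- rbinom(n, size = 1, prob = p_true)

fit <- glm(y ~ x, family = "binomial")

log_odds_coef <- coef(fit)           # coefficients on the log-odds scale
odds_ratios   <- exp(log_odds_coef)  # exponentiated coefficients

# Predicted probability at x = 1, computed by hand from the log-odds...
z <- log_odds_coef[["(Intercept)"]] + log_odds_coef[["x"]] * 1
p_hand <- 1 / (1 + exp(-z))

# ...and the same quantity from predict() on the probability scale.
p_pred <- predict(fit, newdata = data.frame(x = 1), type = "response")

print(odds_ratios)
print(c(by_hand = p_hand, from_predict = unname(p_pred)))
```

The two probabilities agree because type = "response" applies the logistic function to the linear predictor, which is exactly what the by-hand calculation does.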

Coding Techniques

Fit logistic regression using glm(..., family = "binomial").
Use predict(model, type = "response") to obtain predicted probabilities; the default, type = "link", returns log-odds instead.
Create confusion matrices to calculate metrics such as sensitivity and specificity.
Loop through multiple thresholds to evaluate different tradeoffs.
Use pROC::roc() to create ROC curves and compute AUC.
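
The steps listed above can be strung together roughly as follows. This is a sketch on simulated data, not the course's own code: the data frame dat, its columns (age, prior, outcome), and the helper metrics_at() are hypothetical names introduced for illustration. The ROC step is skipped if the pROC package is not installed.

```r
# Simulated data; column names and coefficients are illustrative assumptions.
set.seed(42)
n <- 1000
dat <- data.frame(age = rnorm(n, 35, 10), prior = rpois(n, 1.5))
dat$outcome <- rbinom(n, 1, plogis(-1 + 0.4 * dat$prior - 0.03 * (dat$age - 35)))

# 1. Fit the model and get predicted probabilities.
model <- glm(outcome ~ age + prior, data = dat, family = "binomial")
dat$prob <- predict(model, type = "response")

# 2. Hypothetical helper: confusion matrix and metrics at one threshold.
metrics_at <- function(prob, actual, threshold) {
  # Fixing the factor levels keeps the table 2 x 2 even if one class is never predicted.
  pred <- factor(as.integer(prob >= threshold), levels = c(0, 1))
  act  <- factor(actual, levels = c(0, 1))
  cm <- table(predicted = pred, actual = act)
  tp <- cm["1", "1"]; fp <- cm["1", "0"]
  fn <- cm["0", "1"]; tn <- cm["0", "0"]
  data.frame(
    threshold   = threshold,
    sensitivity = tp / (tp + fn),  # true positive rate
    specificity = tn / (tn + fp),  # true negative rate
    precision   = tp / (tp + fp)   # NaN if nothing is predicted positive
  )
}

# 3. Loop over thresholds to see the tradeoff.
thresholds <- seq(0.1, 0.9, by = 0.1)
results <- do.call(rbind, lapply(thresholds, function(t) {
  metrics_at(dat$prob, dat$outcome, t)
}))
print(results)

# 4. ROC curve and AUC, if pROC is available.
if (requireNamespace("pROC", quietly = TRUE)) {
  roc_obj <- pROC::roc(response = dat$outcome, predictor = dat$prob,
                       levels = c(0, 1), direction = "<", quiet = TRUE)
  print(pROC::auc(roc_obj))
}
```

Raising the threshold can only lower sensitivity and raise specificity, which is the tradeoff the results table makes visible.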

Questions & Challenges

How to choose the optimal threshold for real-world decision making.
Interpreting odds ratios correctly when predictors are on different scales.
Understanding how different threshold choices influence false positives and false negatives.
Assessing equity and subgroup performance in a structured way.
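
One possible way to structure the subgroup question is to compute the same error rates separately for each group at a fixed threshold. The sketch below uses simulated data; the group labels "A" and "B", the score predictor, and the 0.5 threshold are all assumptions made for illustration.

```r
# Simulated data; group labels, predictor, and coefficients are assumptions.
set.seed(7)
n <- 2000
dat <- data.frame(
  group = sample(c("A", "B"), n, replace = TRUE),
  score = rnorm(n)
)
# Group B is given a higher base rate that the model below does not see.
dat$outcome <- rbinom(n, 1, plogis(-0.5 + 1.0 * dat$score + 0.5 * (dat$group == "B")))

model <- glm(outcome ~ score, data = dat, family = "binomial")
dat$pred <- as.integer(predict(model, type = "response") >= 0.5)

# Same metrics, computed within each group.
rates_by_group <- do.call(rbind, lapply(split(dat, dat$group), function(d) {
  data.frame(
    group               = d$group[1],
    n                   = nrow(d),
    false_positive_rate = mean(d$pred[d$outcome == 0] == 1),
    false_negative_rate = mean(d$pred[d$outcome == 1] == 0),
    accuracy            = mean(d$pred == d$outcome)
  )
}))
print(rates_by_group)
```

Comparing the rows side by side shows whether one group carries more false positives or false negatives than the other, something a single overall accuracy figure would hide.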

Connections to Policy

Logistic regression is widely used for risk assessment in criminal justice, health, and public services.
Threshold decisions determine who receives interventions or additional scrutiny.
Policymakers must consider the different costs of false positives versus false negatives.
Fairness analysis is essential to ensure that models do not disproportionately burden certain groups.
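
To illustrate how unequal error costs can drive the threshold, the sketch below assigns hypothetical costs to each error type and picks the threshold with the lowest total cost. The cost values (1 for a false positive, 5 for a false negative) are placeholders, not recommendations; in practice they would come from the policy context.

```r
# Simulated data; all names and numbers are illustrative assumptions.
set.seed(99)
n <- 1000
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-1 + 1.5 * x))
prob <- predict(glm(y ~ x, family = "binomial"), type = "response")

# Hypothetical costs: a missed case is treated as five times worse
# than an unnecessary intervention.
cost_fp <- 1
cost_fn <- 5

thresholds <- seq(0.05, 0.95, by = 0.05)
total_cost <- sapply(thresholds, function(t) {
  pred <- prob >= t
  fp <- sum(pred & y == 0)    # flagged, but outcome did not occur
  fn <- sum(!pred & y == 1)   # not flagged, but outcome occurred
  cost_fp * fp + cost_fn * fn
})

best <- thresholds[which.min(total_cost)]
print(data.frame(threshold = thresholds, total_cost = total_cost))
print(best)
```

When false negatives are costlier, the cost-minimizing threshold falls below 0.5; for well-calibrated probabilities the theoretical optimum is cost_fp / (cost_fp + cost_fn), about 0.17 with these placeholder costs.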

Reflection

Logistic regression reframed prediction for me as a probability and decision problem, not just a statistical fit.
Threshold selection is fundamentally a policy choice, not a purely statistical one.
I will be more intentional about documenting threshold decisions and considering equity impacts in future analyses.