MUSA 5080 Final Project
2025-12-08
SEPTA carries thousands of passengers daily. But for Transit Police, a critical question remains unanswered:
Does high ridership bring safety (“Eyes on the Street”) or does it attract crime (“Potential Targets”)?
Help SEPTA Transit Police transition from reactive to predictive deployment.
We identify “High-Risk Anomalies”—stops where environmental context and timing converge to create danger.
SEPTA Ridership (Summer 2025) (OpenDataPhilly)
Aggregated by Stop
Key Feature: Weekday vs. Weekend split
Role: Measures “Exposure”
Crime Incidents (OpenDataPhilly)
Robbery, Assault, Theft
Metric: Total Count (Corrected for days)
(OpenDataPhilly)
(ACS Data)
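The "Corrected for days" note on the crime metric can be read as a per-day normalization, since a week has five weekdays but only two weekend days. The sketch below illustrates that reading only. The function name `daily_rate`, the 13-week window, and the counts are hypothetical and are not taken from the project data.

```python
def daily_rate(total_count: int, n_days: int) -> float:
    """Convert a raw incident count into incidents per day."""
    if n_days <= 0:
        raise ValueError("n_days must be positive")
    return total_count / n_days


# Hypothetical study window: 13 weeks (an assumption, not the project's window).
weeks = 13
weekday_days = 5 * weeks  # 65 weekdays
weekend_days = 2 * weeks  # 26 weekend days

# Hypothetical raw counts for a single stop.
weekday_rate = daily_rate(130, weekday_days)
weekend_rate = daily_rate(78, weekend_days)

print(f"weekday: {weekday_rate:.2f} incidents/day")
print(f"weekend: {weekend_rate:.2f} incidents/day")
```

In this made-up case the stop has the higher raw count on weekdays but the higher per-day rate on weekends, which is the distortion a day correction is meant to remove.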

We suspect ridership plays competing roles in public safety. A simple regression averages these effects and can mask both.
Method: Split the data into weekday and weekend observations, flag the weekend rows with a Weekend indicator, and use the Ridership × Weekend interaction term to isolate how the impact of ridership shifts between the two.
The Model Specification:
\[\text{Crime} = \beta_0 + \underbrace{\beta_1\,\text{Ridership}}_{\text{Guardian Effect}} + \beta_2\,\text{Weekend} + \underbrace{\beta_3\,(\text{Ridership} \times \text{Weekend})}_{\text{Target Effect (The Shift)}}\]
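One way to read the specification, assuming Weekend is coded as a 0/1 indicator (the slide does not state the coding), is to differentiate with respect to ridership. The slope then depends on the day type:

\[
\frac{\partial\,\text{Crime}}{\partial\,\text{Ridership}} = \beta_1 + \beta_3 \cdot \text{Weekend} =
\begin{cases}
\beta_1 & \text{weekday } (\text{Weekend} = 0) \\
\beta_1 + \beta_3 & \text{weekend } (\text{Weekend} = 1)
\end{cases}
\]

Under this reading, \(\beta_1\) is the ridership slope on weekdays and \(\beta_3\) is how much that slope shifts on weekends, which is why the interaction term is the one labeled "The Shift".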


Model Performance Improves with Each Layer
| Model | MAE (Error Count) | RMSE | MAE Improvement vs. Model 1 |
|---|---|---|---|
| 1. Ridership Only | 17.268 | 28.877 | - |
| 2. + Interaction | 17.209 | 28.868 | + 0.3% |
| 3. + Env & Demo | 12.777 | 21.039 | + 26% |
| 4. + Fixed | 11.409 | 18.343 | + 33.9% |
| 5. + Temporal Lag | 7.453 | 11.629 | + 56.8% |
| 6. Refined | 7.475 | 11.662 | + 56.7% |
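The Improvement column appears to be the percentage reduction in MAE relative to Model 1. The sketch below recomputes it from the MAE values in the table under that assumption, and shows how MAE and RMSE are defined. The helper names and the toy `actual`/`predicted` lists are illustrative and are not project data.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def improvement_pct(baseline, model):
    """Percentage reduction in error relative to a baseline."""
    return 100 * (baseline - model) / baseline


# MAE values copied from the table above.
mae_by_model = {
    "1. Ridership Only": 17.268,
    "2. + Interaction": 17.209,
    "3. + Env & Demo": 12.777,
    "4. + Fixed": 11.409,
    "5. + Temporal Lag": 7.453,
    "6. Refined": 7.475,
}

baseline = mae_by_model["1. Ridership Only"]
improvements = {
    name: round(improvement_pct(baseline, value), 1)
    for name, value in mae_by_model.items()
}

# Toy example (made-up counts) showing the two metrics themselves.
actual = [3, 0, 5, 2]
predicted = [2, 1, 5, 4]
toy_mae = mae(actual, predicted)
toy_rmse = rmse(actual, predicted)

for name, pct in improvements.items():
    print(f"{name}: {pct:+.1f}%")
```

The recomputed percentages match the table to one decimal place, which supports reading the column as MAE improvement over the ridership-only baseline.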
Conclusion
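Context matters. Moving from the ridership-only baseline to the refined model with environmental variables and temporal lags cut MAE from 17.27 to 7.48, a 56.7% reduction. The ridership interaction on its own improved MAE by only 0.3%.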

Spreading officers evenly across all high-ridership stops wastes resources on safe stations while leaving true “High-Risk Anomalies” unguarded.
We must distinguish between these scenarios to redeploy effectively.
Team:
Xinyuan Cui | Yuqing Yang | Jinyang Xu