Question anomalies - investigate outliers and unusual patterns
Document limitations - prepare honest communication about data quality
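One possible way to start investigating outliers is the 1.5 × IQR rule, sketched below in base R. The rule is a swapped-in illustration, not something these notes prescribe, and the vector `values` is hypothetical, made-up data.

```r
# Hypothetical values, made up for illustration only
values <- c(10, 12, 11, 13, 12, 95)

# Quartiles and interquartile range
q <- quantile(values, c(0.25, 0.75), names = FALSE)
iqr <- q[2] - q[1]

# Flag anything more than 1.5 * IQR beyond the quartiles
lower_fence <- q[1] - 1.5 * iqr
upper_fence <- q[2] + 1.5 * iqr
is_outlier <- values < lower_fence | values > upper_fence

# Inspect the flagged values before deciding what to do with them
values[is_outlier]
```

A flagged value is a prompt to look closer, not an automatic deletion: it may be a real extreme, a data entry error, or an unreliable estimate.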
Combining datasets through spatial joins
left_join() - Keep all rows from left dataset
right_join() - Keep all rows from right dataset
inner_join() - Keep only rows that match in both
full_join() - Keep all rows from both datasets
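The difference between the four joins is easiest to see on two small tables. This is a minimal sketch assuming dplyr is installed; the tables `left_tbl` and `right_tbl`, the key column `id`, and all values are hypothetical.

```r
library(dplyr)

# Hypothetical toy tables sharing the key column `id`
left_tbl  <- data.frame(id = c("A", "B", "C"), pop = c(1, 2, 3))
right_tbl <- data.frame(id = c("B", "C", "D"), income = c(20, 30, 40))

# Keep all rows from left_tbl; unmatched rows get NA for income
left_result  <- left_join(left_tbl, right_tbl, by = "id")

# Keep all rows from right_tbl; unmatched rows get NA for pop
right_result <- right_join(left_tbl, right_tbl, by = "id")

# Keep only ids present in both tables
inner_result <- inner_join(left_tbl, right_tbl, by = "id")

# Keep every id from either table
full_result  <- full_join(left_tbl, right_tbl, by = "id")

# Row counts make the difference visible
c(left = nrow(left_result), right = nrow(right_result),
  inner = nrow(inner_result), full = nrow(full_result))
```

Comparing row counts before and after a join is a quick check that no rows were dropped or duplicated unexpectedly.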
Coding Techniques
Formatting Charts: Grammar of Graphics
Data → Aesthetics → Geometries → Visual
# Formatting Charts
ggplot(demo_data) +
  aes(x = total_popE, y = median_incomeE) +
  geom_point(alpha = 0.7) +
  geom_smooth(method = "lm", se = TRUE) +
  labs(
    title = "Income vs Population in Pennsylvania Counties",
    subtitle = "2018-2022 ACS 5-Year Estimates",
    x = "Total Population",
    y = "Median Household Income ($)",
    caption = "Source: U.S. Census Bureau ACS"
  ) +
  theme_minimal() +
  scale_y_continuous(labels = scales::dollar) +
  scale_x_continuous(labels = scales::comma)