COURSE INTRODUCTION

WEEK 1 NOTES

Author

Tess Vu

Published

September 8, 2025

KEY CONCEPTS LEARNED

  • Main Concepts
    • Data science skills and public policy knowledge are often used in tandem. Policy work inherently involves massive amounts of quantitative and qualitative data, so data science concepts and tools are important for cleaning, wrangling, and making that data easier to understand, both for people who are familiar with the technical and political intricacies behind it and especially for people who are not. Communicating this crossover is crucial for informing constituents and government representatives about existing or proposed local, state, and/or federal legislation.
    • Reproducible research is important, which means consistent and concise documentation, relevant file and variable naming, and pushing updates with descriptive commit messages. A little goes a long way: commit small changes frequently to avoid large commits that take time to review and debug.
    • Applying technical skills to public issues means not only making details as transparent as possible, but also using accessible, open-source software and other tools that many communities and individuals can use and build on.
  • Technical Skills
    • Familiarity with the syntax and semantics of R and Markdown.
    • Keeping track of folder paths for organization.
    • Consistent notation in script files for documentation.
    • Crafting relevant and readable variable names for external audiences.
    • Repetition is the mother of learning for programming languages.

CODING TECHNIQUES

  • “group_by()” and “summarize()” go hand in hand: “group_by()” splits a table into groups, and “summarize()” computes statistics for each group, such as averages, medians, and standard deviations.
  • “select()” and “filter()” operate on columns and rows, respectively; ArcGIS uses these terms the other way around.
  • “mutate()” creates new variables: it adds new columns to a data frame or modifies existing ones.
  • Quarto combines multiple languages in one readable document, so familiarity with using Markdown to format websites and communicate visualizations and data is important. Don’t shy away from experimenting with bold, italics, embedded links and images, etc. to refine the portfolio design.
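
The dplyr verbs listed above can be sketched together in one short pipeline. This is a minimal illustration, not code from the course: the “tracts” data frame, its column names, and its values are all made up, and it assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical toy data: every name and value here is invented for illustration.
tracts <- data.frame(
  county     = c("A", "A", "B", "B", "B"),
  population = c(2500, 3000, 500, 4000, 6000),
  income     = c(40000, 60000, 50000, 70000, 90000),
  notes      = c("", "", "", "", "")
)

# select() keeps columns; filter() keeps rows that meet a condition.
large_tracts <- tracts %>%
  select(county, population, income) %>%
  filter(population >= 2000)

# mutate() adds a new column (or overwrites an existing one).
large_tracts <- large_tracts %>%
  mutate(income_thousands = income / 1000)

# group_by() + summarize() returns one row of statistics per group.
county_summary <- large_tracts %>%
  group_by(county) %>%
  summarize(
    n_tracts   = n(),
    avg_income = mean(income),
    med_income = median(income),
    sd_income  = sd(income)
  )

print(county_summary)
```

The summary columns are given names different from the source column (“avg_income” rather than “income”) so that later expressions inside “summarize()” still refer to the original values.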

QUESTIONS & CHALLENGES

  • What happens in the background when rendering in Quarto, and why can’t the website be deployed just from the GitHub website? Is local rendering needed to create the “docs” folder? There was an error saying the folder didn’t exist when trying to publish the portfolio from the website alone.
  • Need more practice understanding local and GitHub file structures and how they relate to one another, as well as R functions. There’s a basic understanding of coding syntax and semantics, but what complexities occur in the background with data storage and its abstractness?

CONNECTIONS TO POLICY

  • There’s great potential for using predictive analytics to provide useful information and turn it into public-serving action, but these tools can also be purposefully or unintentionally misused to exacerbate biases and existing social stratification, as with predictive analytics for crime and policing.
  • Data science is useful for cleaning raw data, which will inevitably be collected in massive quantities given the size of cities and their populations.

REFLECTION

  • The most interesting part was adjusting and troubleshooting conflicts and errors encountered with GitHub, R, and Markdown, which is integral to real-world work with data science tools.
  • Using GitHub individually builds the skills needed to collaborate on it with others. These skills are often applied in groups because of the collective nature of public and urban policy, and even solo work will inevitably be shared with an audience.
  • Trial and error is central to learning programming languages and holistically understanding how different data, files, and functions tie in and work together. When the notes page would not publish and display, much of the troubleshooting involved adjusting file locations and syntax in the .qmd and .yml files.