MUSA 5080 Notes #1
Week 1: Introduction to R and dplyr
Date: 09/08/2025
Git & GitHub
1. Git
Version control system that tracks changes in files
- “Track changes” for code projects
- Time machine for your work
- Collaboration tool for teams
2. GitHub
Cloud hosting for Git repositories
- Backup your work in the cloud
- Share projects with others
- Deploy websites (like our portfolios)
- Collaborate on code projects
3. Key GitHub Concepts
- Repository (repo): Folder containing your project files
- Commit: Snapshot of your work at a point in time
- Push: Send your changes to GitHub cloud
- Pull: Get latest changes from GitHub cloud
Markdown Basics
1. Text Formatting
**Bold text**
*Italic text*
***Bold and italic***
`code text`
~~Strikethrough~~
2. Headers
# Main Header
## Section Header
### Subsection Header
3. Lists
## Unordered List
- Item 1
- Item 2
  - Sub-item A
  - Sub-item B
## Ordered List
1. First item
2. Second item
3. Third item
4. Links and Images
[Link text](https://example.com)
[Link to another page](about.qmd)

Basic R
1. Why Use Tibbles?
# Traditional data frame
class(data)
# Convert to tibble
car_data <- as_tibble(data)
class(car_data)
- Prints only the first 10 rows by default
- Displays each column's type alongside its name
- Shows only the columns that fit on the screen
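A minimal sketch of the difference, assuming the tibble package is installed. It uses R's built-in `mtcars` data set as a stand-in, since the course's car sales file is not needed to see the printing behavior:

```r
library(tibble)

# mtcars ships with base R; it is a stand-in, not the course data
df <- mtcars
class(df)    # "data.frame"

# Convert to a tibble (row names are dropped by default)
tb <- as_tibble(df)
class(tb)    # "tbl_df" "tbl" "data.frame"

# Printing df would dump all 32 rows to the console.
# Printing tb shows the first 10 rows, the column types,
# and only as many columns as fit on the screen.
print(tb)
```

Because a tibble still carries the `"data.frame"` class, functions written for ordinary data frames generally accept it as well.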
2. dplyr
library(tidyverse)
# Load car sales data
car_data <- read_csv("data/car_sales_data.csv")
# Basic exploration
glimpse(car_data)
names(car_data)
# The power of pipes - read as "then"
car_summary <- car_data %>%
  filter(`Year of manufacture` >= 2020) %>%         # Recent models only
  select(Manufacturer, Model, Price, Mileage) %>%   # Key variables
  mutate(price_k = Price / 1000) %>%                # Convert to thousands
  filter(Mileage < 50000) %>%                       # Low mileage cars
  group_by(Manufacturer) %>%                        # Group by brand
  summarize(                                        # Calculate statistics
    avg_price = mean(price_k, na.rm = TRUE),
    count = n()
  )
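To see what each step of the pipeline above does without the CSV file, here is a sketch that runs the same verbs on a tiny hand-made table. The column names match the notes, but `toy_cars` and every value in it are invented for illustration and are not from the course data:

```r
library(dplyr)
library(tibble)

# Hypothetical rows, made up for this example only
toy_cars <- tibble(
  Manufacturer          = c("Toyota", "Toyota", "Toyota", "Ford", "Ford", "BMW"),
  Model                 = c("Yaris", "Prius", "RAV4", "Focus", "Fiesta", "X3"),
  `Year of manufacture` = c(2021, 2022, 2015, 2020, 2021, 2019),
  Price                 = c(18000, 26000, 15000, 16000, 14000, 30000),
  Mileage               = c(12000, 30000, 90000, 45000, 60000, 20000)
)

# Same pipeline as in the notes, applied to the toy table
toy_summary <- toy_cars %>%
  filter(`Year of manufacture` >= 2020) %>%         # drops RAV4 and X3
  select(Manufacturer, Model, Price, Mileage) %>%
  mutate(price_k = Price / 1000) %>%
  filter(Mileage < 50000) %>%                       # drops Fiesta
  group_by(Manufacturer) %>%
  summarize(
    avg_price = mean(price_k, na.rm = TRUE),
    count = n()
  )

print(toy_summary)

# The pipe only changes how the call is written:
# x %>% f(y) passes x as the first argument, like f(x, y)
piped  <- toy_cars %>% filter(Mileage < 50000)
nested <- filter(toy_cars, Mileage < 50000)
all.equal(piped, nested)
```

With these made-up rows, two manufacturers survive the filters: Ford with one car averaging 16 (thousand) and Toyota with two cars averaging 22 (thousand).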
Summary
This week I got familiar with several widely used tools (Git, GitHub, and Markdown) that are useful well beyond the classroom, and I learned the basics of R and the core dplyr functions. I need to practice these functions soon so that they stick.