Week 6 gave you the tidyverse fundamentals. Week 7 is where you become productive: ggplot2 for charts that rival anything you'll see in a published report, stringr for the messy text columns that show up in every real dataset, and lubridate for the date arithmetic that used to be painful. By the end of this week, you will be writing R code you would be happy to show in an interview.
What You'll Learn This Week
ggplot2 Fundamentals
The grammar of graphics in one page
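Every ggplot2 chart is built from the same parts: a dataset, an aesthetic mapping (aes(): which column → which visual property), one or more geometric layers (geom_*), optional scales, facets, and a theme. Each + adds another of these components.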
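A minimal sketch of those parts in one chart, which also sets a title and axis labels. It assumes the mpg dataset that ships with ggplot2 as stand-in data; the column choices and label text are illustrative only.

```r
library(ggplot2)

# mpg ships with ggplot2; it is used here only as stand-in data.
p <- ggplot(mpg, aes(x = displ, y = hwy, colour = class)) +  # data + aesthetic mapping
  geom_point(alpha = 0.7) +                                   # geometric layer
  scale_colour_viridis_d() +                                  # optional scale
  labs(                                                       # title and labels
    title = "Highway mileage falls as engine size grows",
    x = "Engine displacement (litres)",
    y = "Highway miles per gallon",
    colour = "Vehicle class"
  ) +
  theme_minimal()                                             # theme

print(p)
```

Because the plot is an ordinary object, you can store it in a variable, add further components later, and print it only when you want to see it.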
Always set titles and labels
Faceting, Themes and Palettes
Same chart, many lenses
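One way to read "same chart, many lenses" is to keep a single base plot and split it into panels. The sketch below assumes the mpg dataset from ggplot2 as stand-in data; the faceting variables are chosen only for illustration.

```r
library(ggplot2)

# One base chart, reused for every variant below.
base <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(alpha = 0.6)

# One panel per vehicle class, wrapped into rows of four.
by_class <- base + facet_wrap(~ class, ncol = 4)

# A grid: drive type in rows, model year in columns.
by_drv_year <- base + facet_grid(drv ~ year)

# Let each panel pick its own y-axis range when groups differ a lot in scale.
free_y <- base + facet_wrap(~ class, scales = "free_y")

# Themes apply to all panels at once; strip.text styles the panel headers.
themed <- by_class +
  theme_minimal(base_size = 11) +
  theme(strip.text = element_text(face = "bold"))

print(themed)
```

Shared axes (the default) make panels directly comparable; free scales trade that comparability for detail inside each panel.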
Pick a palette on purpose
| Palette type | When to use |
|---|---|
| Sequential (Blues) | Ordered numeric data |
| Diverging (RdBu) | Data with a meaningful zero |
| Qualitative (Set2) | Unordered categories |
| viridis | Colour-blind friendly; prints well in grayscale |
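As a rough illustration of the table, each palette type maps to a scale function in ggplot2. The sketch assumes the mpg and faithfuld datasets that ship with ggplot2; the small change data frame is made up purely to show a diverging scale.

```r
library(ggplot2)

# Qualitative: unordered categories get distinct, unranked colours.
qualitative <- ggplot(mpg, aes(x = class, fill = class)) +
  geom_bar() +
  scale_fill_brewer(palette = "Set2")

# Sequential: ordered numeric data; direction = 1 makes larger values darker.
heat <- ggplot(faithfuld, aes(x = waiting, y = eruptions, fill = density)) +
  geom_raster()
sequential <- heat + scale_fill_distiller(palette = "Blues", direction = 1)

# viridis: same data, colour-blind friendly continuous scale.
viridis <- heat + scale_fill_viridis_c()

# Diverging: made-up percentage changes around zero (illustrative values only).
change <- data.frame(
  region = c("North", "South", "East", "West"),
  pct_change = c(-4, 2.5, 0.5, -1)
)
diverging <- ggplot(change, aes(x = region, y = pct_change, fill = pct_change)) +
  geom_col() +
  # Symmetric limits keep zero at the neutral middle colour.
  scale_fill_distiller(palette = "RdBu", limits = c(-5, 5))
```

For discrete data, scale_fill_viridis_d() is the viridis counterpart; swap fill for colour in the function names when the mapping is on colour.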
String Manipulation with stringr
Cleaning the world's messy text data
Regex in 30 seconds
- . - any character; * - zero or more
- \d - digit; \w - word char; \s - whitespace
- [abc] - any of a/b/c; [^abc] - none of
- ^ - start; $ - end; () - capture group
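The tokens above can be tried out with a few stringr calls. This is a small sketch on made-up product strings (the values in raw are hypothetical). Note that inside an R string each backslash must be doubled, so \d is written "\\d".

```r
library(stringr)

# Hypothetical messy text column.
raw <- c("  ACME  widget 12cm ", "Bolt-Co hammer", "acme  Drill 240V", NA)

# Trim the ends, collapse repeated spaces, normalise case.
clean <- str_to_lower(str_squish(raw))

# ^ anchors the pattern at the start of the string.
starts_with_acme <- str_detect(clean, "^acme")

# \d+ means one or more digits; the first match per string is returned.
digits <- str_extract(clean, "\\d+")

# () is a capture group; column 2 of str_match() holds the first group.
first_word <- str_match(clean, "^(\\w+)")[, 2]

# \s+ means one or more whitespace characters.
snake <- str_replace_all(clean, "\\s+", "_")
```

Missing values stay missing through every step, so there is no need to filter them out before cleaning.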
Dates with lubridate
Parsing and arithmetic that finally make sense
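A short sketch of both halves, parsing and arithmetic, on a single made-up date. The date is chosen to show the month-end case that usually causes trouble.

```r
library(lubridate)

# Parsing: the function name states the order of the components.
d1 <- ymd("2024-01-31")
d2 <- dmy("31/01/2024")
d3 <- mdy("01/31/2024")   # all three are the same Date

# Adding days is always well defined.
plus_30_days <- d1 + days(30)            # 2024-03-01

# Adding a month to 31 January lands on a date that does not exist, so the result is NA.
naive_next_month <- d1 + months(1)       # NA

# %m+% rolls back to the last valid day of the month instead.
safe_next_month <- d1 %m+% months(1)     # 2024-02-29

# Rounding down is handy for monthly grouping.
month_start <- floor_date(d1, "month")   # 2024-01-01

# Differences between dates, expressed in days.
gap_days <- as.numeric(ymd("2024-03-01") - d1, units = "days")  # 30
```

For monthly aggregation, floor_date() on the date column followed by a grouped summary is usually all you need.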
Advanced dplyr
across(), case_when() and friends
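A minimal sketch of the two named functions, assuming a tiny made-up sales table (the column names and tier thresholds are hypothetical). across() applies one function to several columns; case_when() replaces nested ifelse() calls.

```r
library(dplyr)

# Made-up data for illustration only.
sales <- tibble(
  region  = c("North", "North", "South", "South"),
  revenue = c(1200, 800, 450, NA),
  units   = c(30, 20, 15, 5)
)

# across(): the same summary for several columns in one call.
by_region <- sales |>
  group_by(region) |>
  summarise(
    across(c(revenue, units), \(x) sum(x, na.rm = TRUE)),
    .groups = "drop"
  )

# case_when(): conditions are checked top to bottom; the first match wins.
tiered <- by_region |>
  mutate(
    tier = case_when(
      revenue >= 1000 ~ "high",
      revenue >= 400  ~ "medium",
      TRUE            ~ "low"    # fallback; newer dplyr also accepts .default = "low"
    )
  )

print(tiered)
```

Order matters in case_when(): put the most specific condition first, because later conditions only see rows that earlier ones did not match.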
Project Hygiene
Code that won't embarrass you in six months
A clean project layout
- data/ - raw and processed CSVs (gitignore raw)
- R/ - helper functions (utils.R, plots.R)
- scripts/ - pipelines: 01_clean.R, 02_eda.R, 03_model.R
- output/ - generated plots and tables
- README.md - how to run, in five lines or fewer
Use here::here() for paths and renv for package versions. Future you and your colleagues will thank you.
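A small sketch of how those two tools are typically used, assuming the layout above. The output file name is hypothetical, and the renv calls are shown as comments because they change your project and are run once from the console rather than inside a script.

```r
library(here)

# here() builds paths from the project root, so scripts work from any
# working directory and on any operating system.
raw_path  <- here("data", "week7-retail-sales.csv")
plot_path <- here("output", "revenue_by_region.png")  # hypothetical file name

# Reading the file would then look like this (left commented out here):
# sales <- readr::read_csv(raw_path)

# renv, run from the console:
# renv::init()      # create a project library and a renv.lock file
# renv::snapshot()  # record the package versions currently in use
# renv::restore()   # reinstall those versions on another machine
```

Committing renv.lock alongside your scripts is what lets a colleague reproduce the same package versions.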
Project: Complete R Analytics Report
Deliver a single R Markdown report on week7-retail-sales.csv (provided in /data/):
- Three faceted ggplot2 charts of monthly revenue by region.
- A correlation heat-map of numeric columns.
- A clean-text section: extract product brands from a free-text description column using regex.
- A time-series chart with a 7-day rolling mean.
- A short executive summary and three recommendations.
The Rmd should knit to HTML without warnings.
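For the rolling-mean item in the list above, one possible approach is a trailing 7-day average with base R's stats::filter(). The sketch uses a made-up daily series and assumes one row per day, sorted by date, with no gaps; the real file may need aggregating to daily totals first.

```r
# Made-up daily series for illustration only.
daily <- data.frame(
  date    = seq(as.Date("2024-01-01"), by = "day", length.out = 10),
  revenue = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
)

# Equal weights of 1/7 over the current day and the six days before it.
# sides = 1 uses past values only, so the first six results are NA.
# The stats:: prefix avoids a clash with dplyr::filter().
daily$roll7 <- as.numeric(
  stats::filter(daily$revenue, rep(1 / 7, 7), sides = 1)
)

print(daily)
```

When plotting, drawing the raw series faintly and the rolling mean on top makes the smoothing easy to judge.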