Week 7: Advanced R & Data Visualization

ggplot2, stringr, lubridate - the production-grade R toolkit

Duration: 7 Days | Level: Intermediate | Prereq: Week 6

Week 6 gave you the tidyverse fundamentals. Week 7 is where you become productive: ggplot2 for charts that rival anything you'll see in a published report, stringr for the messy text columns that show up in every real dataset, and lubridate for the date arithmetic that used to be painful. By the end of this week, you will be writing R code you would be happy to show in an interview.

Day 1

ggplot2 Fundamentals

The grammar of graphics in one page

A ggplot2 plot is built from components: a dataset, an aesthetic mapping (aes(): which column → which visual property), one or more geometric layers (geom_*), and optional scales, facets, and a theme. Each + adds one more component to the plot.
library(ggplot2)

ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()     # scatter
ggplot(mtcars, aes(x = wt, y = mpg)) + geom_smooth()    # trend line
ggplot(mpg, aes(x = class)) + geom_bar()                # bar (count)
ggplot(mpg, aes(x = displ)) + geom_histogram(bins = 30)
ggplot(mpg, aes(x = class, y = hwy)) + geom_boxplot()
ggplot(economics, aes(x = date, y = unemploy)) + geom_line()
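
To see how components stack, here is a minimal sketch that puts two of the geoms above into one plot of the built-in mtcars data. The object name p, the colour, and the choice of a straight-line smoother are illustrative assumptions, not part of the course material.

```r
library(ggplot2)

# Data and aesthetic mapping are declared once and inherited by every layer
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(colour = "#0077B6") +                           # layer 1: raw observations
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +  # layer 2: linear trend
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon")

print(p)
```

Because the mapping lives in ggplot(), both layers share the same x and y. Move aes() inside a single geom if only that layer should use it.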

Always set titles and labels

ggplot(orders_by_month, aes(month, revenue)) +
  geom_col(fill = "#0077B6") +
  labs(title = "Monthly revenue, 2026",
       subtitle = "Source: order_items",
       x = NULL,
       y = "Revenue (INR)",
       caption = "Created with R + ggplot2") +
  theme_minimal(base_size = 12)
Day 2

Faceting, Themes and Palettes

Same chart, many lenses

ggplot(sales, aes(month, revenue, group = region)) +
  geom_line(colour = "#00416A", linewidth = 0.7) +
  facet_wrap(~ region, ncol = 3, scales = "free_y") +
  theme_minimal(base_size = 11) +
  theme(strip.background = element_rect(fill = "#E3F2FD", colour = NA),
        strip.text = element_text(face = "bold"))
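
The sales table above is not shipped with R, so here is a runnable sketch of the same faceting pattern using economics_long, a dataset bundled with ggplot2. Swapping in that dataset and the name p_facet are assumptions made for illustration.

```r
library(ggplot2)

# economics_long holds several series in long form: date, variable, value
p_facet <- ggplot(economics_long, aes(x = date, y = value)) +
  geom_line(colour = "#00416A", linewidth = 0.7) +
  facet_wrap(~ variable, ncol = 3, scales = "free_y") +  # one panel per series
  theme_minimal(base_size = 11) +
  theme(strip.text = element_text(face = "bold"))

print(p_facet)
```

scales = "free_y" matters here because the series sit on very different scales; with a shared y-axis the smaller ones would flatten into a line at the bottom.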

Pick a palette on purpose

Palette type          When to use
Sequential (Blues)    Ordered numeric data
Diverging (RdBu)      Data that diverges around a meaningful midpoint, such as zero
Qualitative (Set2)    Unordered categories
viridis               Colour-blind friendly; prints well in grayscale
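
As a rough sketch of how those palette choices translate into code, the plots below apply a qualitative, a sequential, and a viridis scale to the built-in mpg data. The object names and the variables chosen for each plot are assumptions for illustration.

```r
library(ggplot2)

# Qualitative palette: vehicle class is an unordered category
p_qual <- ggplot(mpg, aes(x = class, fill = class)) +
  geom_bar() +
  scale_fill_brewer(palette = "Set2")

# Sequential palette: engine displacement is ordered numeric data
p_seq <- ggplot(mpg, aes(x = cty, y = hwy, colour = displ)) +
  geom_point() +
  scale_colour_distiller(palette = "Blues", direction = 1)

# viridis: colour-blind friendly alternative for the same continuous variable
p_vir <- ggplot(mpg, aes(x = cty, y = hwy, colour = displ)) +
  geom_point() +
  scale_colour_viridis_c()
```

The brewer scales are for discrete variables and the distiller scales interpolate the same palettes for continuous ones, so the palette name stays the same while the scale function changes.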
Day 3

String Manipulation with stringr

Cleaning the world's messy text data

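library(stringr)

str_to_upper("hello")                    # "HELLO"
str_length("data analytics")             # 14
str_trim(" whitespace ")                 # "whitespace"
str_detect(emails, "@gmail\\.com")       # TRUE where the address contains @gmail.com
str_replace_all(phones, "[^0-9]", "")    # keep digits only
str_extract(text, "[A-Z]{2,}")           # first run of two or more capital letters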
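
The calls above refer to emails, phones and text without defining them. Here is a small worked sketch with made-up sample values (all three vectors are hypothetical) so you can run the same functions and inspect the results.

```r
library(stringr)

# Hypothetical sample data standing in for real columns
emails <- c("asha@gmail.com", "ravi@example.org", "meera@gmail.com")
phones <- c("+91 98765-43210", "(022) 2345 6789")
text   <- "Shipped via DHL to the NYC warehouse"

is_gmail     <- str_detect(emails, "@gmail\\.com")       # TRUE FALSE TRUE
clean_phones <- str_replace_all(phones, "[^0-9]", "")    # "919876543210" "02223456789"
first_caps   <- str_extract(text, "[A-Z]{2,}")           # "DHL"
all_caps     <- str_extract_all(text, "[A-Z]{2,}")[[1]]  # "DHL" "NYC"
trimmed      <- str_trim("  whitespace  ")               # "whitespace"
```

str_extract() returns only the first match per string; str_extract_all() returns every match as a list with one element per input string.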

Regex in 30 seconds

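  • . - any single character (except a newline by default); * - zero or more of the preceding token.
  • \d - digit; \w - word character; \s - whitespace. Inside an R string, double the backslash: "\\d".
  • [abc] - any one of a, b, or c; [^abc] - any one character except a, b, or c.
  • ^ - start of the string; $ - end of the string; () - capture group.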
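Pro tip: build and debug regex on regex101.com, which explains every token in plain English. stringr uses the ICU regex engine, which regex101 does not list, so pick the PCRE2 flavour as a close approximation and re-test the final pattern in R.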
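
To tie the tokens above to stringr, here is a short sketch that validates and splits an order reference. The reference format and the refs vector are hypothetical, chosen only to exercise anchors, \d, capture groups and \s.

```r
library(stringr)

# Hypothetical order references for illustration
refs <- c("ORD-2026-0042", "ord-17", "ORD-2026-ABCD", "REF 2026 0042")

# ^ and $ anchor the whole string; \\d{4} means exactly four digits
valid <- str_detect(refs, "^ORD-\\d{4}-\\d{4}$")   # TRUE FALSE FALSE FALSE

# () capture groups: column 1 is the full match, then one column per group
parts  <- str_match(refs[1], "^ORD-(\\d{4})-(\\d+)$")
year   <- parts[, 2]   # "2026"
seq_no <- parts[, 3]   # "0042"

# \\s+ matches a run of whitespace, including tabs
squashed <- str_replace_all("a  b \t c", "\\s+", " ")   # "a b c"
```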
Day 4

Dates with lubridate

Parsing and arithmetic that finally make sense

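library(lubridate)
library(dplyr)   # for %>% and mutate()

ymd("2026-05-16")                # year-month-day
dmy("16/05/2026")                # day/month/year
ymd_hms("2026-05-16 14:23:11")   # date-time, UTC by default

today() + days(7)                             # one week from today
today() %--% ymd("2026-12-31") / months(1)    # months from today to 2026-12-31

orders %>%
  mutate(
    month      = floor_date(order_date, "month"),
    weekday    = wday(order_date, label = TRUE),
    is_weekend = wday(order_date) %in% c(1, 7)   # 1 = Sunday, 7 = Saturday with the default week start
  )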
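
Here is a small worked sketch of the same functions on fixed dates, so the results do not depend on when you run it. The order_date values are hypothetical, and week_start is passed explicitly so the weekend check does not rely on a session option.

```r
library(lubridate)

# Hypothetical order dates: a Thursday, a Saturday and a Sunday
order_date <- ymd(c("2026-05-14", "2026-05-16", "2026-05-17"))

month_start <- floor_date(order_date, "month")   # all 2026-05-01

# week_start = 7 numbers the days from Sunday (1) to Saturday (7)
is_weekend <- wday(order_date, week_start = 7) %in% c(1, 7)   # FALSE TRUE TRUE

# Equivalent check numbered from Monday: Saturday is 6, Sunday is 7
is_weekend_alt <- wday(order_date, week_start = 1) >= 6

# Different input formats parse to the same Date
same_day <- dmy("16/05/2026") == ymd("2026-05-16")   # TRUE

# Calendar-aware arithmetic
due         <- ymd("2026-05-16") + days(7)   # 2026-05-23
months_left <- (ymd("2026-05-16") %--% ymd("2026-12-16")) / months(1)
```

The %--% operator builds an interval between two dates; dividing by months(1) counts calendar months rather than assuming a fixed number of days per month.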
Day 5

Advanced dplyr

across(), case_when() and friends

library(dplyr)

# Apply the same function to many columns
# (passing na.rm through across(...) is deprecated since dplyr 1.1.0; \(x) needs R 4.1+)
df %>%
  summarise(across(where(is.numeric), \(x) mean(x, na.rm = TRUE)))

# Multi-branch logic: the first matching condition wins
orders %>%
  mutate(
    tier = case_when(
      lifetime_value >= 50000 ~ "platinum",
      lifetime_value >= 10000 ~ "gold",
      lifetime_value >= 2000  ~ "silver",
      TRUE                    ~ "bronze"
    )
  )

# Window functions in dplyr
orders %>%
  group_by(customer_id) %>%
  arrange(order_date) %>%
  mutate(order_rank = row_number(),
         days_since_prev = as.numeric(order_date - lag(order_date)))
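
To make the three patterns concrete, here is a sketch on a tiny, made-up orders table. The column values and the result names (num_means, tiered, ranked) are hypothetical and exist only for illustration.

```r
library(dplyr)

# Hypothetical orders table for illustration
orders <- tibble(
  customer_id    = c(1, 1, 2, 2, 2),
  order_date     = as.Date(c("2026-01-05", "2026-01-20", "2026-02-01",
                             "2026-02-03", "2026-03-01")),
  lifetime_value = c(60000, 60000, 1500, 1500, 1500),
  amount         = c(100, NA, 250, 50, 75)
)

# across(): an anonymous function keeps na.rm explicit (R 4.1+ syntax)
num_means <- orders %>%
  summarise(across(c(lifetime_value, amount), \(x) mean(x, na.rm = TRUE)))

# case_when(): the first matching condition wins, so order branches high to low
tiered <- orders %>%
  mutate(tier = case_when(
    lifetime_value >= 50000 ~ "platinum",
    lifetime_value >= 10000 ~ "gold",
    lifetime_value >= 2000  ~ "silver",
    TRUE                    ~ "bronze"
  ))

# Window functions: rank and gap are computed within each customer
ranked <- orders %>%
  group_by(customer_id) %>%
  arrange(order_date, .by_group = TRUE) %>%
  mutate(order_rank      = row_number(),
         days_since_prev = as.numeric(order_date - lag(order_date))) %>%
  ungroup()
```

The first order for each customer has no previous order, so days_since_prev is NA there. Ending with ungroup() avoids surprising grouped behaviour in later steps.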
Day 6

Project Hygiene

Code that won't embarrass you in six months

A clean project layout

  • data/ - raw and processed CSVs (gitignore raw)
  • R/ - helper functions (utils.R, plots.R)
  • scripts/ - pipelines: 01_clean.R, 02_eda.R, 03_model.R
  • output/ - generated plots and tables
  • README.md - how to run, in five lines or fewer
Use here::here() for paths and renv for package versions. Future you and your colleagues will thank you.
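
A minimal sketch of what that looks like in a script, assuming the folder layout listed above. The output file name is hypothetical, and the renv calls are left as comments because each one changes project state.

```r
library(here)

# here() builds paths from the project root, so scripts work from any working directory
csv_path  <- here("data", "week7-retail-sales.csv")
plot_path <- here("output", "monthly_revenue.png")   # hypothetical file name

# Typical renv workflow, run from the project root:
# renv::init()      # create a project-local library and renv.lock
# renv::snapshot()  # record the package versions currently in use
# renv::restore()   # reinstall the recorded versions on another machine
```

Compared with setwd() and hard-coded absolute paths, this keeps the project portable: a colleague can clone it, run renv::restore(), and execute the scripts unchanged.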
Day 7

Project: Complete R Analytics Report

Deliver a single R Markdown report on week7-retail-sales.csv (provided in /data/):

  1. Three faceted ggplot2 charts of monthly revenue by region.
  2. A correlation heat-map of numeric columns.
  3. A clean-text section: extract product brands from a free-text description column using regex.
  4. A time-series chart with a 7-day rolling mean.
  5. A short executive summary and three recommendations.

The Rmd should knit to HTML without warnings.
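
For items 2 and 4, here is one possible starting point: a trailing 7-day mean computed with base R and a correlation heat-map drawn with geom_tile(). The daily data frame is simulated and mtcars stands in for the numeric columns, because the column names in week7-retail-sales.csv are not given here; treat every name below as an assumption.

```r
library(ggplot2)

# Simulated daily revenue; replace with the real columns from the CSV
set.seed(42)
daily <- data.frame(
  date    = seq(as.Date("2026-01-01"), by = "day", length.out = 60),
  revenue = round(runif(60, min = 800, max = 1200))
)

# Trailing 7-day mean: each value averages the current day and the six before it.
# sides = 1 uses past values only, so the first six entries are NA.
daily$revenue_7d <- as.numeric(
  stats::filter(daily$revenue, rep(1 / 7, 7), sides = 1)
)

p_roll <- ggplot(daily, aes(x = date)) +
  geom_line(aes(y = revenue), colour = "grey70") +
  geom_line(aes(y = revenue_7d), colour = "#0077B6", linewidth = 0.9, na.rm = TRUE) +
  labs(x = NULL, y = "Revenue", title = "Daily revenue with 7-day rolling mean")

# Correlation heat-map: reshape the correlation matrix to long form for geom_tile()
num_cols  <- mtcars[, c("mpg", "wt", "hp", "disp")]
corr_long <- as.data.frame(as.table(cor(num_cols)))
names(corr_long) <- c("var_x", "var_y", "r")

p_corr <- ggplot(corr_long, aes(var_x, var_y, fill = r)) +
  geom_tile() +
  scale_fill_distiller(palette = "RdBu", limits = c(-1, 1))
```

Passing na.rm = TRUE to the second geom_line() drops the leading NA values silently, which helps with the requirement to knit without warnings.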

Coming up: Week 8 - R Shiny Interactive Dashboards

The skill that sets you apart - turn your R analyses into live, interactive web applications.
