Themes, axes, annotations

Lecture 10

Dr. Mine Çetinkaya-Rundel

Duke University
STA 313 - Spring 2023

Warm up

Announcements

  • My office hours next week - Tuesday 8-9pm
  • Proposal regrades pending, but you can keep working

Setup

# load packages
library(countdown)
library(tidyverse)
library(palmerpenguins)
library(fs)
library(lubridate)
library(scales)
library(openintro)
library(colorspace)
library(glue)

# set theme for ggplot2
ggplot2::theme_set(ggplot2::theme_minimal(base_size = 14))

# set width of code output
options(width = 65)

# set figure parameters for knitr
knitr::opts_chunk$set(
  fig.width = 7, # 7" width
  fig.asp = 0.618, # the golden ratio
  fig.retina = 3, # dpi multiplier for displaying HTML output on retina
  fig.align = "center", # center align figures
  dpi = 300 # higher dpi, sharper image
)

Themes

Complete themes

p <- ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point()

p + theme_gray() + labs(title = "Gray")
p + theme_void() + labs(title = "Void")
p + theme_dark() + labs(title = "Dark")

Themes from ggthemes

library(ggthemes)

p + theme_fivethirtyeight() + labs(title = "FiveThirtyEight")
p + theme_economist() + labs(title = "Economist")
p + theme_wsj() + labs(title = "Wall Street Journal")

Themes and color scales from ggthemes

p + 
  aes(color = species) +
  scale_color_wsj() +
  theme_wsj() + 
  labs(title = "Wall Street Journal")

Duke theme!

Warning

This package is a work in progress. Feedback and issues welcome! See https://aidangildea.github.io/duke/ for more info.

Modifying theme elements

p + 
  labs(title = "Palmer penguins") +
  theme(
    plot.title = element_text(color = "red", face = "bold", family = "Comic Sans MS"),
    plot.background = element_rect(color = "red", fill = "mistyrose")
  )

Axes

Axis breaks

How can the following figure be improved with custom breaks in axes, if at all?

Context matters

pac_plot +
  scale_x_continuous(breaks = seq(from = 2000, to = 2022, by = 2))

Conciseness matters

pac_plot +
  scale_x_continuous(breaks = seq(2000, 2022, 4))

Precision matters

pac_plot +
  scale_x_continuous(breaks = seq(2000, 2022, 4)) +
  labs(x = "Election year")
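The calls above differ only in the step passed to seq(). A quick way to see what each one puts on the axis is to print the break vectors on their own. This is a minimal base R sketch; the variable names are illustrative and not part of the slides.

```r
# Breaks every 2 years: 12 tick labels, which can crowd the x-axis
breaks_by_2 <- seq(from = 2000, to = 2022, by = 2)
breaks_by_2

# Breaks every 4 years: 6 tick labels, ending at 2020
breaks_by_4 <- seq(from = 2000, to = 2022, by = 4)
breaks_by_4
```

Note that with a step of 4 the last break is 2020, not 2022, because seq() never goes past `to`.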

Annotation

Why annotate?

Video recap

Discuss the main takeaways from each of the videos assigned for today with your neighbor. Add one takeaway per video to the Slack thread.

geom_text()

Can be useful when individual observations are identifiable, but can also get overwhelming…

How would you improve this visualization? Discuss with your neighbor and add your ideas to the Slack thread.

ggplot(state_stats, aes(x = homeownership, y = pop2010)) + 
  geom_point()

ggplot(state_stats, aes(x = homeownership, y = pop2010)) + 
  geom_text(aes(label = abbr))

Now go ahead and improve it!

Open ae-09 and implement the improvements you came up with in Part 1.
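One possible direction, offered only as a sketch and not as the intended solution: keep a point for every state but label just a handful of them. The cutoff of five states, the gray point color, and the axis labels below are arbitrary choices made for illustration.

```r
library(ggplot2)
library(dplyr)
library(openintro)

# Label only the five most populous states (the cutoff of 5 is arbitrary)
top_states <- state_stats |>
  slice_max(pop2010, n = 5, with_ties = FALSE)

p <- ggplot(state_stats, aes(x = homeownership, y = pop2010)) +
  # all states as muted points
  geom_point(color = "gray50") +
  # text labels only for the selected states, nudged above their points
  geom_text(data = top_states, aes(label = abbr), vjust = -0.7) +
  # print populations with thousands separators instead of scientific notation
  scale_y_continuous(labels = scales::label_comma()) +
  labs(x = "Homeownership", y = "Population, 2010")

p
```

Other selection rules (for example, states with unusually low homeownership) would work the same way: filter the data passed to geom_text() and leave geom_point() untouched.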

Revisit Durham AQI

Recreate the following visualization in Part 2 of ae-09. This picks up where you left off in ae-08.

All of the data doesn’t tell a story

Highlighting in ggplot2

We have (at least) two options:

  1. Native ggplot2 – use layers

  2. gghighlight: https://yutannihilation.github.io/gghighlight/articles/gghighlight.html

Data: SF AQI

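# list the CSV files in the data folder and read them into one data frame
sf_files <- fs::dir_ls(here::here("data/san-francisco"))
sf <- read_csv(sf_files, na = c(".", ""))

# clean column names, parse dates, sort by date, keep date and AQI only
sf <- sf |>
  janitor::clean_names() |>
  mutate(date = mdy(date)) |>
  arrange(date) |>
  select(date, aqi_value)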

sf
# A tibble: 2,557 × 2
   date       aqi_value
   <date>         <dbl>
 1 2016-01-01        32
 2 2016-01-02        37
 3 2016-01-03        45
 4 2016-01-04        33
 5 2016-01-05        27
 6 2016-01-06        39
 7 2016-01-07        39
 8 2016-01-08        31
 9 2016-01-09        20
10 2016-01-10        20
# … with 2,547 more rows

Data prep

sf <- sf |>
  mutate(
    year = year(date),
    day_of_year = yday(date)
  )
# check
sf |>
  filter(day_of_year < 3)
# A tibble: 14 × 4
   date       aqi_value  year day_of_year
   <date>         <dbl> <dbl>       <dbl>
 1 2016-01-01        32  2016           1
 2 2016-01-02        37  2016           2
 3 2017-01-01        55  2017           1
 4 2017-01-02        36  2017           2
 5 2018-01-01        87  2018           1
 6 2018-01-02        95  2018           2
 7 2019-01-01        33  2019           1
 8 2019-01-02        50  2019           2
 9 2020-01-01        53  2020           1
10 2020-01-02        43  2020           2
11 2021-01-01        79  2021           1
12 2021-01-02        57  2021           2
13 2022-01-01        53  2022           1
14 2022-01-02        55  2022           2
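To see what the two new columns hold, it can help to run year() and yday() on a few dates directly. The dates below are made up for illustration and are not taken from the SF data.

```r
library(lubridate)

# Made-up example dates, not taken from the SF data
dates <- ymd(c("2016-03-01", "2017-03-01", "2016-12-31", "2017-12-31"))

year(dates)
# 2016 2017 2016 2017

yday(dates)
# 61 60 366 365
```

Because 2016 is a leap year, the same calendar date after February falls one day later in day_of_year than it does in 2017. For a plot that overlays years this shift is small, but it is worth knowing about.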

Plot AQI over years

ggplot(sf, aes(x = day_of_year, y = aqi_value, group = year)) +
  geom_line()

Plot AQI over years

ggplot(sf, aes(x = day_of_year, y = aqi_value, group = year, color = year)) +
  geom_line()

Plot AQI over years

ggplot(sf, aes(x = day_of_year, y = aqi_value, group = year, color = factor(year))) +
  geom_line()

Highlight 2016

ggplot(sf, aes(x = day_of_year, y = aqi_value, group = year)) +
  geom_line(color = "gray") +
  geom_line(data = sf |> filter(year == 2016), color = "red") +
  labs(
    title = "AQI levels in SF in 2016",
    subtitle = "Versus all years 2016 - 2022",
    x = "Day of year", y = "AQI value"
    )

Highlight 2017

ggplot(sf, aes(x = day_of_year, y = aqi_value, group = year)) +
  geom_line(color = "gray") +
  geom_line(data = sf |> filter(year == 2017), color = "red") +
  labs(
    title = "AQI levels in SF in 2017",
    subtitle = "Versus all years 2016 - 2022",
    x = "Day of year", y = "AQI value"
    )

Highlight 2018

ggplot(sf, aes(x = day_of_year, y = aqi_value, group = year)) +
  geom_line(color = "gray") +
  geom_line(data = sf |> filter(year == 2018), color = "red") +
  labs(
    title = "AQI levels in SF in 2018",
    subtitle = "Versus all years 2016 - 2022",
    x = "Day of year", y = "AQI value"
    )

Highlight any year

year_to_highlight <- 2018

ggplot(sf, aes(x = day_of_year, y = aqi_value, group = year)) +
  geom_line(color = "gray") +
  geom_line(data = sf |> filter(year == year_to_highlight), color = "red") +
  labs(
    title = glue("AQI levels in SF in {year_to_highlight}"),
    subtitle = "Versus all years 2016 - 2022",
    x = "Day of year", y = "AQI value"
    )
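The same idea can be taken one step further by wrapping the plot in a function, so that highlighting a different year is a single call. This is a sketch under assumptions: highlight_year() is a hypothetical helper name, and toy_sf is a small made-up data frame that stands in for the real sf data so the example runs on its own.

```r
library(ggplot2)
library(dplyr)
library(glue)

# Hypothetical helper: draw all years in gray and one chosen year in red
highlight_year <- function(data, year_to_highlight) {
  # derive the subtitle range from the data instead of hard-coding it
  year_range <- range(data$year)

  ggplot(data, aes(x = day_of_year, y = aqi_value, group = year)) +
    geom_line(color = "gray") +
    geom_line(data = data |> filter(year == year_to_highlight), color = "red") +
    labs(
      title = glue("AQI levels in SF in {year_to_highlight}"),
      subtitle = glue("Versus all years {year_range[1]} - {year_range[2]}"),
      x = "Day of year", y = "AQI value"
    )
}

# Made-up toy data with the same column names as `sf` (values are invented)
toy_sf <- data.frame(
  year = rep(c(2016, 2017, 2018), each = 3),
  day_of_year = rep(1:3, times = 3),
  aqi_value = c(32, 37, 45, 55, 36, 40, 87, 95, 60)
)

p <- highlight_year(toy_sf, 2018)
p
```

With the real data, the call would be highlight_year(sf, 2018), assuming sf has the year and day_of_year columns created in the data prep step.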

Highlight with gghighlight

Time permitting!

Highlight years using gghighlight instead in Part 3 of ae-09.
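As a starting point, here is a minimal sketch of how gghighlight() might replace the two-layer approach. It is an illustration and not the ae-09 solution. toy_sf is a small made-up data frame standing in for sf, and wrapping the condition in all() is a choice made here so that the predicate yields one TRUE or FALSE per line.

```r
library(ggplot2)
library(gghighlight)

# Made-up toy data with the same column names as `sf` (values are invented)
toy_sf <- data.frame(
  year = rep(c(2016, 2017, 2018), each = 3),
  day_of_year = rep(1:3, times = 3),
  aqi_value = c(32, 37, 45, 55, 36, 40, 87, 95, 60)
)

year_to_highlight <- 2018

p <- ggplot(toy_sf, aes(x = day_of_year, y = aqi_value, group = year)) +
  geom_line(color = "red") +
  # keep the line whose year matches; the other lines are grayed out
  gghighlight(all(year == year_to_highlight), use_direct_label = FALSE) +
  labs(x = "Day of year", y = "AQI value")

p
```

Here a single geom_line() layer is enough: gghighlight() takes care of drawing the unhighlighted lines in a muted color.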