BTS 510 Lab 5

set.seed(12345)
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(Stat2Data)
theme_set(theme_classic(base_size = 16))

1 Learning objectives

  • Advanced plotting
    • Color, size, opacity
    • Annotations
    • Changing themes, axis labels, re-ordering categories
    • Complex and combined plots

2 Data

  • ICU data from the Stat2Data package
    • ID: Patient ID code
    • Survive: 1 = patient survived to discharge or 0 = patient died
    • Age: Age (in years)
    • AgeGroup: 1 = young (under 50), 2 = middle (50-69), 3 = old (70+)
    • Sex: 1 = female or 0 = male
    • Infection: 1 = infection suspected or 0 = no infection
    • SysBP: Systolic blood pressure (in mm of Hg)
    • Pulse: Heart rate (beats per minute)
    • Emergency: 1 = emergency admission or 0 = elective admission
data(ICU)
  • Convert the categorical variables (coded as 0/1 or 1/2/3) to factor variables, as in the lecture
    • as.factor() function
ICU$Survive <- as.factor(ICU$Survive)
ICU$AgeGroup <- as.factor(ICU$AgeGroup)
ICU$Sex <- as.factor(ICU$Sex)
ICU$Infection <- as.factor(ICU$Infection)
ICU$Emergency <- as.factor(ICU$Emergency)
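
The five conversions above can also be written as a single step. This is a minimal sketch, assuming a small made-up data frame (`toy`, not the real ICU data) with the same 0/1 coding; `toy` and `toy_factors` are hypothetical names used only for illustration.

```r
library(dplyr)

# Hypothetical toy data with the same 0/1 coding as the ICU variables
toy <- data.frame(
  Survive   = c(1, 0, 1, 1),
  Sex       = c(0, 1, 1, 0),
  Emergency = c(1, 1, 0, 1),
  SysBP     = c(120, 90, 140, 110)
)

# Convert several columns to factors at once; SysBP stays numeric
toy_factors <- toy %>%
  mutate(across(c(Survive, Sex, Emergency), as.factor))

str(toy_factors)
```

With the real data, the same pattern would presumably apply to `ICU` with the column names listed above.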

3 Tasks

  1. Make boxplots of blood pressure for emergency and non-emergency admits. One of the groups has more extreme values (both high and low) than the other. Annotate the plot with rectangles and text to describe this. Modify the X and Y axes to have useful labels and categories (i.e., not just 0 and 1 for Emergency).
ggplot(data = ICU, aes(x = Emergency, y = SysBP)) +
    geom_boxplot() +
    annotate("rect", xmin = 1.75, xmax = 2.25, ymin = 190, ymax = 260, alpha = 0.3) +
    annotate("rect", xmin = 1.75, xmax = 2.25, ymin = 30, ymax = 60, alpha = 0.3) +
    annotate("text", x = 1.75, y = 220, label = "High outliers") +
    annotate("text", x = 1.75, y = 45, label = "Low outliers") +
    scale_x_discrete(labels = c("Non-emergency admit", "Emergency admit")) +
    labs(x = "Admission type",
         y = "Systolic blood pressure (mmHg)")
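
An alternative to relabeling on the scale is to relabel the factor levels themselves, so the readable names carry through to every plot and table. A minimal sketch, assuming made-up data (`toy`) and a hypothetical new column `Admit`; `fct_recode()` comes from forcats, which is loaded with the tidyverse.

```r
library(forcats)
library(ggplot2)

# Hypothetical toy data; values are invented for illustration
toy <- data.frame(
  Emergency = factor(c(0, 1, 1, 0, 1, 1)),
  SysBP     = c(120, 90, 200, 130, 60, 150)
)

# fct_recode() takes new_name = "old_level" pairs and keeps the level order
toy$Admit <- fct_recode(toy$Emergency,
                        "Non-emergency admit" = "0",
                        "Emergency admit"     = "1")

p <- ggplot(data = toy, aes(x = Admit, y = SysBP)) +
    geom_boxplot() +
    labs(x = "Admission type",
         y = "Systolic blood pressure (mmHg)")
p
```

Recoding the data is convenient when the same labels are needed in several plots; `scale_x_discrete(labels = ...)` is the lighter choice for a one-off figure.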

  2. Make a scatterplot of blood pressure (Y) vs pulse (X). Define a color palette with a few colors you like and that are distinguishable from one another. Use the palette to change the color of the points based on whether the patients survived or not. Change the axis labels and the legend label.
color_blind_friendly <- c("#56B4E9", "#000000", "#E69F00", "#009E73")
ggplot(data = ICU, aes(x = Pulse, y = SysBP, color = Survive)) +
    geom_point() +
    labs(x = "Pulse rate (beats per minute)",
         y = "Systolic blood pressure (mmHg)",
         color = "Did they survive?") +
    scale_color_manual(values = color_blind_friendly, 
                       labels = c("No", "Yes"))
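
The palette above has four colors but `Survive` has only two levels, so only the first two are used, in level order. To make the mapping explicit, the values and labels can be named by factor level. A minimal sketch on made-up data; `toy` and `survival_colors` are hypothetical names.

```r
library(ggplot2)

# Named palette: each name is a level of Survive, so the mapping
# does not depend on the order of the vector
survival_colors <- c("0" = "#E69F00", "1" = "#56B4E9")

# Hypothetical toy data; values are invented for illustration
toy <- data.frame(
  Pulse   = c(70, 95, 110, 82, 130),
  SysBP   = c(120, 100, 85, 140, 90),
  Survive = factor(c(1, 0, 1, 1, 0))
)

p <- ggplot(data = toy, aes(x = Pulse, y = SysBP, color = Survive)) +
    geom_point() +
    scale_color_manual(values = survival_colors,
                       labels = c("0" = "No", "1" = "Yes")) +
    labs(color = "Did they survive?")
p
```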

  3. Make a barplot of the number of patients in each age category. Change the color of the outline and fill for the bars to something nice. Make the outlines size 2. Look at the built-in themes and change the theme to something besides the default theme.
ggplot(data = ICU, aes(AgeGroup)) +
    geom_bar(color = "grey30", 
             fill = "forestgreen", 
             linewidth = 2) +
    theme_dark()
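
The bars above are still labeled 1, 2, and 3. As a possible extension, the age ranges from the data description can be put on the axis with named labels. A minimal sketch on made-up data; `toy` and `age_labels` are hypothetical names, and the counts are invented.

```r
library(ggplot2)

# Hypothetical toy data using the same 1/2/3 coding as AgeGroup
toy <- data.frame(AgeGroup = factor(c(1, 2, 2, 3, 3, 3)))

# Labels taken from the AgeGroup definitions in the Data section
age_labels <- c("1" = "Under 50", "2" = "50-69", "3" = "70+")

p <- ggplot(data = toy, aes(AgeGroup)) +
    geom_bar(color = "grey30",
             fill = "forestgreen",
             linewidth = 2) +
    scale_x_discrete(labels = age_labels) +
    labs(x = "Age group", y = "Number of patients") +
    theme_dark()
p

# geom_bar() counts rows per level; table() gives the same counts
table(toy$AgeGroup)
```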