BTS 510 Lab 4

set.seed(12345)
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.2     ✔ tibble    3.3.0
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.4     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(Stat2Data)
theme_set(theme_classic(base_size = 16))
cbCSPalette <- c("#76777a", "#3697aa", "#dc1e34", "#009E73", 
               "#F0E442", "#0072B2", "#E69F00", "#CC79A7")
CS_red <- "#dc1e34"
CS_grey <- "#76777a"
CS_cer <- "#3697aa"

1 Learning objectives

  • Select an appropriate plot for the variable type
  • Create plots in ggplot2

2 Data

  • ICU data from the Stat2Data package
    • ID: Patient ID code
    • Survive: 1 = patient survived to discharge or 0 = patient died
    • Age: Age (in years)
    • AgeGroup: 1 = young (under 50), 2 = middle (50-69), 3 = old (70+)
    • Sex: 1 = female or 0 = male
    • Infection: 1 = infection suspected or 0 = no infection
    • SysBP: Systolic blood pressure (in mm Hg)
    • Pulse: Heart rate (beats per minute)
    • Emergency: 1 = emergency admission or 0 = elective admission
  • Convert the categorical variables to factors, as in the lecture
    • as.factor() function
library(Stat2Data)
data(ICU)
ICU$Survive <- as.factor(ICU$Survive)
ICU$Sex <- as.factor(ICU$Sex)
ICU$Infection <- as.factor(ICU$Infection)
ICU$Emergency <- as.factor(ICU$Emergency)
ICU$AgeGroup <- as.factor(ICU$AgeGroup)
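
The five conversions above can also be written as a single step. The following is a minimal sketch, not the lecture's method: it uses a small made-up data frame called `toy` (hypothetical, standing in for ICU) and the helper vector `cat_vars` (also a name chosen here for illustration).

```r
# Hypothetical toy data with the same 0/1 coding as the ICU variables
toy <- data.frame(
  Survive   = c(1, 0, 1, 1),
  Emergency = c(0, 1, 1, 0),
  SysBP     = c(120, 96, 142, 110)
)

# Names of the columns that hold category codes rather than measurements
cat_vars <- c("Survive", "Emergency")

# Apply as.factor() to each of those columns and write the results back
toy[cat_vars] <- lapply(toy[cat_vars], as.factor)

# Survive and Emergency are now factors; SysBP is still numeric
str(toy)
```

With the real data, `toy` would be replaced by `ICU` and `cat_vars` would list Survive, Sex, Infection, Emergency, and AgeGroup.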

3 Tasks

  1. Make a histogram of blood pressure. Make the bars grey with a black outline. Add vertical lines at the standard cutoffs (https://newsroom.heart.org/news/high-blood-pressure-redefined-for-first-time-in-14-years-130-is-the-new-high) of 120, 130, and 140. Make those lines green, yellow, and red, respectively.
min(ICU$SysBP)
[1] 36
max(ICU$SysBP)
[1] 256
ggplot(data = ICU,
       aes(x = SysBP)) +
  geom_histogram(binwidth = 10, color = "black", fill = "grey") +
  geom_vline(xintercept = 120, color = "green", linewidth = 1) +
  geom_vline(xintercept = 130, color = "yellow", linewidth = 1) +
  geom_vline(xintercept = 140, color = "red", linewidth = 1)
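
As an alternative to three separate geom_vline() calls, the cutoffs and their colors can be kept in a small data frame and drawn in one layer. This is only a sketch under stated assumptions: `toy` is made-up data standing in for ICU, and `cutoffs`, `bp`, and `line_col` are names invented here. scale_color_identity() tells ggplot2 to use the color names in the data as they are.

```r
library(ggplot2)

# Hypothetical stand-in for the ICU blood pressure column
toy <- data.frame(
  SysBP = c(96, 104, 118, 122, 126, 131, 135, 138, 144, 152, 170, 190)
)

# One row per cutoff, with the color each line should have
cutoffs <- data.frame(
  bp       = c(120, 130, 140),
  line_col = c("green", "yellow", "red")
)

p <- ggplot(data = toy, aes(x = SysBP)) +
  geom_histogram(binwidth = 10, color = "black", fill = "grey") +
  # A single layer draws all three lines, one per row of cutoffs
  geom_vline(data = cutoffs,
             aes(xintercept = bp, color = line_col),
             linewidth = 1) +
  # Use the color names stored in line_col directly, with no legend
  scale_color_identity()

p
```

Adding or changing a cutoff then means editing one row of `cutoffs` rather than adding another layer.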

  2. Make dotplots of blood pressure for emergency vs elective admission patients. Try different numbers of bins or binwidths.
ggplot(data = ICU,
       aes(x = Emergency, y = SysBP)) +
  geom_dotplot(method = "histodot",
               binaxis = "y", 
               stackdir = "center")
Bin width defaults to 1/30 of the range of the data. Pick better value with
`binwidth`.

ggplot(data = ICU,
       aes(x = Emergency, y = SysBP)) +
  geom_dotplot(binwidth = 5,
               method = "histodot",
               binaxis = "y", 
               stackdir = "center")
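
To see what the default in the message above amounts to, the arithmetic can be checked by hand. This sketch uses the minimum and maximum of SysBP reported earlier (36 and 256); the variable names are chosen here for illustration.

```r
# Minimum and maximum of SysBP, as reported by min() and max() above
bp_min <- 36
bp_max <- 256

# The message says the default bin width is 1/30 of the range of the data
default_binwidth <- (bp_max - bp_min) / 30
default_binwidth  # 220 / 30, about 7.33

# The second plot uses binwidth = 5, which is narrower than the default,
# so the dots are smaller and more stacks fit along the blood pressure axis
chosen_binwidth <- 5
chosen_binwidth < default_binwidth
```

Values above roughly 7.3 give coarser, larger dots than the default; values below it give finer ones.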

  3. Make a scatterplot of blood pressure (Y) vs age (X). Add a straight line to the plot. Does it look like blood pressure increases, decreases, or is relatively stable across ages?
ggplot(data = ICU,
       aes(x = Age, y = SysBP)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)
`geom_smooth()` using formula = 'y ~ x'

ggplot(data = ICU,
       aes(x = Age, y = SysBP)) +
  geom_point() +
  geom_smooth(se = FALSE)
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'
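
The straight line drawn by geom_smooth(method = "lm") is an ordinary least-squares fit of y on x, so its slope can be read off with lm() to back up the visual impression. The sketch below is an illustration only: `toy` is made-up data (hypothetical, not the ICU values), built so that blood pressure rises by exactly 5 for every 20 years of age.

```r
# Hypothetical toy data: SysBP rises by 5 for every 20 years of Age
toy <- data.frame(
  Age   = c(20, 40, 60, 80),
  SysBP = c(120, 125, 130, 135)
)

# Same model that geom_smooth(method = "lm") fits: y ~ x
fit <- lm(SysBP ~ Age, data = toy)

# Slope: change in blood pressure per additional year of age
slope <- unname(coef(fit)["Age"])
slope  # 0.25 for this toy data

# Positive slope suggests an increase, negative a decrease,
# and a slope near zero suggests blood pressure is roughly stable
```

Replacing `toy` with `ICU` gives the slope of the line in the first scatterplot; the loess curve in the second plot shows whether a straight line is a reasonable summary.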