BTS 510 Lab 9

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
set.seed(12345)
theme_set(theme_classic(base_size = 16))

1 Learning objectives

  • Interpret tests comparing two unrelated samples
  • Summarize data using contingency tables
  • Describe different study designs for contingency tables

2 Data

  • Pulse dataset from the Stat2Data package
    • A dataset with n = 232 observations on the following 7 variables.
      • Active: Pulse rate (beats per minute) after exercise
      • Rest: Resting pulse rate (beats per minute)
      • Smoke: 1=smoker or 0=nonsmoker
      • Sex: 1=female or 0=male
      • Exercise: Typical hours of exercise (per week)
      • Hgt: Height (in inches)
      • Wgt: Weight (in pounds)

3 Tasks

  • Make plots of variables as needed (e.g., to assess assumptions)
  • Conduct a z-test, t-test, and Welch’s t-test
    • What is/are your conclusion(s) based on the tests?
    • Are the assumptions met?
      • e.g., large enough sample to justify a z-test using the sample variance
      • e.g., equal variances in both groups
    • Which test seems the best choice? (Don’t make this decision based on what is significant – here or elsewhere)
      • Do you think a non-parametric test might be a good option?
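
One way to set up the three tests is sketched below. Base R has no built-in two-sample z-test, so z_test_two_sample() is a hypothetical helper written only for illustration, and x and y are simulated stand-ins, not the Pulse data. Note that the z statistic computed with sample variances is the same number as Welch's t statistic; only the reference distribution (normal versus t) differs.

```r
# Hypothetical helper: two-sample z-test that plugs in the sample variances.
# Only reasonable when both groups are large enough for that substitution.
z_test_two_sample <- function(x, y, alternative = "two.sided") {
  se <- sqrt(var(x) / length(x) + var(y) / length(y))
  z <- (mean(x) - mean(y)) / se
  p <- switch(alternative,
    two.sided = 2 * pnorm(-abs(z)),
    greater   = pnorm(z, lower.tail = FALSE),
    less      = pnorm(z),
    stop("alternative must be 'two.sided', 'greater', or 'less'")
  )
  list(statistic = z, p.value = p)
}

# Simulated stand-in data (NOT the Pulse data): unequal group sizes and SDs
set.seed(12345)
x <- rnorm(40, mean = 95, sd = 20)
y <- rnorm(150, mean = 90, sd = 15)

z_res      <- z_test_two_sample(x, y)
pooled_res <- t.test(x, y, var.equal = TRUE)  # t-test with pooled variance
welch_res  <- t.test(x, y)                    # Welch's t-test (R's default)

# Rough look at the equal-variance assumption
c(var_x = var(x), var_y = var(y), ratio = var(x) / var(y))

# Non-parametric option: Wilcoxon rank-sum (Mann-Whitney) test
wilcox_res <- wilcox.test(x, y)
```

Comparing pooled_res and welch_res side by side shows how much the equal-variance assumption matters for a given pair of groups; when the group sizes and variances both differ, the two can disagree noticeably.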

3.1 Some useful code

  • To split the dataset into Smoke = 0 and Smoke = 1
    • There are other ways to do this, so you don’t need to use this code
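# Load the Pulse dataset from the Stat2Data package
library(Stat2Data)
data(Pulse)
library(tidyverse)
# Keep only smokers (Smoke == 1)
Pulse_smoke <- Pulse %>% filter(Smoke == 1)
# Keep only non-smokers (Smoke == 0)
Pulse_nosmoke <- Pulse %>% filter(Smoke == 0)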
head(Pulse_smoke)
  Active Rest Smoke Sex Exercise Hgt Wgt
1     82   68     1   0        3  70 225
2     86   68     1   0        2  73 195
3     87   72     1   0        2  70 173
4    102   77     1   0        2  72 200
5     80   67     1   1        2  65 133
6     99   78     1   0        3  71 165
head(Pulse_nosmoke)
  Active Rest Smoke Sex Exercise Hgt Wgt
1     97   78     0   1        1  63 119
2     88   62     0   0        3  72 175
3    106   74     0   0        3  72 170
4     78   63     0   1        3  67 125
5    109   65     0   0        3  74 188
6     66   43     0   1        3  67 140
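
As an illustration of the "other ways" to split the data, the sketch below uses base R only. The data frame toy is a hypothetical five-row stand-in with the same column names as Pulse, so the block runs without the Stat2Data package; the same calls should work on Pulse itself.

```r
# Hypothetical toy data frame with the same column names as Pulse
toy <- data.frame(
  Active = c(82, 97, 86, 88, 106),
  Smoke  = c(1, 0, 1, 0, 0)
)

# Base R subsetting with a logical condition
toy_smoke   <- toy[toy$Smoke == 1, ]
toy_nosmoke <- subset(toy, Smoke == 0)

# split() returns a named list with one data frame per Smoke value
by_smoke <- split(toy, toy$Smoke)
by_smoke[["1"]]$Active  # active pulse rates for the smokers
```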
  • For a directional (one-sided) test, set the alternative argument of the test function (e.g., t.test()) to match H_1
    • Use alternative = "greater" if H_1: \mu_1 > \mu_2
    • Use alternative = "less" if H_1: \mu_1 < \mu_2
    • Here \mu_1 is the mean for the first-entered group (x)
    • The order in which you enter the groups (x vs y) doesn’t matter, as long as you set up the directional hypothesis to match
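
The point about group order can be checked directly. The sketch below assumes the tests are run with base R's t.test() and uses simulated stand-in vectors (smokers and nonsmokers are hypothetical names, not the Pulse data). Swapping x and y while flipping the direction tests the same hypothesis and gives the same p-value.

```r
# Simulated stand-in data (NOT the Pulse data)
set.seed(12345)
smokers    <- rnorm(30, mean = 95, sd = 15)
nonsmokers <- rnorm(60, mean = 90, sd = 15)

# H_1: mu_smokers > mu_nonsmokers, with smokers entered first (x)
p_greater <- t.test(smokers, nonsmokers, alternative = "greater")$p.value

# Same hypothesis with the groups swapped: the direction flips to "less"
p_less <- t.test(nonsmokers, smokers, alternative = "less")$p.value

# Compare the two p-values; both calls test the same hypothesis
all.equal(p_greater, p_less)
```

If the formula interface is used instead (e.g., t.test(Active ~ Smoke, data = Pulse)), the first group is the first level of the grouping variable, so with 0/1 coding the Smoke = 0 group plays the role of x.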

3.2 Active pulse rate

  • Is active pulse rate higher among smokers than non-smokers?

3.3 Weight

  • Do smokers weigh less than non-smokers?

3.4 Exercise

  • Do smokers and non-smokers exercise the same amount?