So far, we’ve been making plots for their own sake
But graphics are also useful for both:
Determining statistical significance
Presenting statistical significance
Are we good at telling if, e.g., two variables are significantly correlated?
Not so much…
http://guessthecorrelation.com/
But we do a better job of comparing relationships
Alpha (usually .05) is the type I error rate
This means two things:
Probability of an extreme value given the null hypothesis is true
How extreme a value has to be to reject the null hypothesis
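A quick numeric illustration of both meanings, using a two-sided z test in base R:
qnorm(0.975)              # how extreme z must be to reject: ~1.96
2 * pnorm(-qnorm(0.975))  # probability of a value at least this extreme under the null: .05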
Imagine generating data from the null hypothesis (in the population)
Do this several times (say, 19) and make a plot of each
Compare the now 20 versions of the plot
A significant effect will stand out compared to the null effect
Just as a significant effect produces an extreme test statistic, it will look visually extreme among the null plots
# three datasets with no relationship between x and y
x <- rnorm(100, 0, 1)
y <- rnorm(100, 0, 1)
data1 <- data.frame(x, y)
x <- rnorm(100, 0, 1)
y <- rnorm(100, 0, 1)
data2 <- data.frame(x, y)
x <- rnorm(100, 0, 1)
y <- rnorm(100, 0, 1)
data3 <- data.frame(x, y)
# this one has a relationship
x <- rnorm(100, 0, 1)
e <- rnorm(100, 0, 0.5)
y <- 0.4*x + e
data4 <- data.frame(x, y)
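To view these as a (small) line-up by hand, one option is to stack the datasets and facet; a minimal sketch, assuming ggplot2 and dplyr are loaded (a real line-up would also shuffle the panel order so you don't know which is which):
library(ggplot2)
library(dplyr)
# label and stack the four datasets, then plot them side by side
all_data <- bind_rows(list(data1, data2, data3, data4), .id = "panel")
ggplot(all_data, aes(x = x, y = y)) +
  geom_point() +
  facet_wrap(~ panel)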
In practice, you would want more null plots
With 19 nulls plus the real data (20 plots), picking the real plot by chance alone has probability 1/20 = .05, matching the usual alpha
With only 3 nulls (4 plots), the equivalent p-value is 1/4 = .25
You could build the line-up by hand, as above
There is also a package, nullabor, that creates line-ups automatically:
https://cran.r-project.org/web/packages/nullabor/vignettes/nullabor.html
It also includes a rorschach() function that creates all-null line-ups, so you can calibrate your eye to what real null effects look like (since we're not very good at this)
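A minimal sketch of the nullabor workflow, following the vignette linked above (data4 is the dataset with a real effect from earlier):
library(nullabor)
library(ggplot2)
# embed the real data among 19 nulls; null_permute("y") shuffles y,
# breaking any x-y relationship in the null panels
d <- lineup(null_permute("y"), true = data4, n = 20)
ggplot(d, aes(x = x, y = y)) +
  geom_point() +
  facet_wrap(~ .sample)
# lineup() prints an encrypted decrypt() call; running it reveals
# which panel holds the real data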
The title of this article is incredibly vague, but it lays out the logic of the line-up procedure:
Theoretical background for the line-up method for a variety of tests:
A presentation about the line-up technique:
Range of plausible values for the population value we are estimating
We have a certain degree of confidence (e.g., 95%) in this range
Do not say that the CI has a 95% chance of including the population value
The population value is what it is, it does not vary
Our sample varies (not every sample is perfectly representative of the population; this is sampling variability), so our estimate varies from sample to sample
Can be derived in a variety of different ways: theoretical, bootstrapping, Monte Carlo simulation, etc.
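For example, a percentile bootstrap CI for a mean takes only a few lines (a sketch with made-up data):
set.seed(1)
x <- rnorm(30, mean = 50, sd = 10)  # made-up sample
# resample with replacement many times; the middle 95% of the
# bootstrap means is the percentile CI
boot_means <- replicate(2000, mean(sample(x, replace = TRUE)))
quantile(boot_means, c(0.025, 0.975))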
APA Task Force on Statistical Inference
https://www.apa.org/science/leadership/bsa/statistical/
Report confidence intervals in addition to estimates
CIs are consistent with p-values / statistical testing
Provide information about precision of the estimates
What are we doing in our plots?
We usually do not include CIs in plots, but we should
Sometimes we include other indicators of variability (e.g., standard errors or standard deviations), but CIs are the ones that support inference
Geoff Cumming has made a career out of this
Using confidence intervals in plots (instead of standard error or standard deviation) allows you to make (rough) statistical inference based on the plot
For independent groups
95% confidence intervals that just touch correspond to p ≈ .01
95% confidence intervals that overlap by about half the average margin of error correspond to p ≈ .05
total_N <- 50
set.seed(12345)
group <- rbinom(total_N, 1, 0.5)  # random 0/1 group assignment
#group
set.seed(13579)
outcome <- 50 + 1 * group + rnorm(total_N, 0, 1.5)  # true group difference = 1
#outcome
group_data <- data.frame(group, outcome)
#group_data
model1 <- lm(data = group_data, outcome ~ group)
#summary(model1)
rawdata <-
  ggplot(data = group_data,
         aes(x = as.factor(group), y = outcome)) +
  geom_jitter() +
  geom_boxplot(alpha = 0.2)
rawdata
summary_data <- group_data %>%
  group_by(group) %>%
  summarize(group_n = n(),
            out_mean = mean(outcome),
            out_sd = sd(outcome),
            error = qt(0.975, df = group_n - 1) * out_sd / sqrt(group_n))
summary_data
## # A tibble: 2 × 5
## group group_n out_mean out_sd error
## <int> <int> <dbl> <dbl> <dbl>
## 1 0 27 49.9 1.51 0.599
## 2 1 23 50.9 1.59 0.688
CIerror <-
  ggplot(data = summary_data,
         aes(x = as.factor(group), y = out_mean)) +
  geom_point(size = 6) +
  geom_errorbar(data = summary_data,
                aes(x = as.factor(group),
                    ymin = out_mean - error,
                    ymax = out_mean + error)) +
  geom_jitter(data = group_data,
              aes(x = as.factor(group), y = outcome),
              color = "blue", alpha = 0.5)
CIerror
p-value for the group effect
summary(model1)$coefficients["group", "Pr(>|t|)"]
## [1] 0.03028818
The geom_smooth() function draws the fitted line with a smoothed 95% (by default) CI around the predicted values
It estimates predicted values across the observed range of X
At each value of X, a CI is computed, and these are smoothed together into a band
Comparing two lines, each with its own CI band
Where the CIs overlap about halfway, the lines differ at roughly p < .05
Where the CIs do not overlap at all, the lines differ at roughly p < .01
Most useful for interactions, but not the only place
Keep in mind that 1) the CIs vary across X and 2) they always get wider near the ends of the data
data("FirstYearGPA")
#glimpse(FirstYearGPA)
hsgpa_x_white <-
ggplot(data = FirstYearGPA,
aes(x = HSGPA, y = GPA,
color = as.factor(White))) +
geom_point() +
geom_smooth(method = lm)
hsgpa_x_white
## `geom_smooth()` using formula 'y ~ x'
Things are very simple for independent groups
What about non-independence?
For these types of models, the effect of interest is no longer the simple between-groups mean difference, so the usual between-subjects CIs can be misleading
There are two somewhat similar methods
Method 1 (Loftus & Masson, 1994): use the \(MS_{error}\) from the repeated-measures ANOVA to build the error bars (see the sketch below)
The CI is \(\pm t_{crit} \times \sqrt{MS_{error} / n}\), where \(n\) is the number of subjects and \(t_{crit}\) uses \(df_{error}\)
A single CI width is used across all conditions, which assumes perfect homogeneity of variance
Use the highest-order interaction, such as subject x condition, as the error term
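A sketch of this calculation with base R's aov(), assuming long-format data in a hypothetical rm_data with factor columns subject and condition, a numeric outcome, and a complete, balanced design:
# fit the RM ANOVA; the subject:condition stratum holds MS_error
fit <- aov(outcome ~ condition + Error(subject/condition), data = rm_data)
tab <- summary(fit)[["Error: subject:condition"]][[1]]
ms_error <- tab[nrow(tab), "Mean Sq"]  # residual row
df_error <- tab[nrow(tab), "Df"]
n <- length(unique(rm_data$subject))   # subjects per condition
# half-width of the 95% CI, applied to every condition mean
qt(0.975, df_error) * sqrt(ms_error / n)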
Method 2 (Cousineau, 2005): there is substantial variability across subjects (which is modeled in RM ANOVA or a mixed model)
"Normalize" observations by subtracting each subject's deviation from the grand mean: \(X_{ij} - (\bar{X}_{person\,i} - \bar{X}_{grand})\)
This removes the between-subjects variability
Use the normalized data to calculate CIs as you would for a between-subjects design (see the sketch below)
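A sketch of the normalization with dplyr, using the same hypothetical rm_data:
library(dplyr)
grand_mean <- mean(rm_data$outcome)
# subtract each subject's deviation from the grand mean
normed <- rm_data %>%
  group_by(subject) %>%
  mutate(outcome_norm = outcome - mean(outcome) + grand_mean) %>%
  ungroup()
# per-condition CIs from the normalized scores, as in the
# between-subjects case
normed %>%
  group_by(condition) %>%
  summarize(m = mean(outcome_norm),
            ci = qt(0.975, n() - 1) * sd(outcome_norm) / sqrt(n()))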
Cumming, G., & Finch, S. (2005). Inference by eye: confidence intervals and how to read pictures of data. American Psychologist, 60(2), 170.
Cumming, G. (2007). Inference by eye: Pictures of confidence intervals and thinking about levels of confidence. Teaching Statistics, 29(3), 89-93.
Cumming, G. (2009). Inference by eye: reading the overlap of independent confidence intervals. Statistics in medicine, 28(2), 205-220.
Loftus, G. R., & Masson, M. E. (1994). Using confidence intervals in within-subject designs. Psychonomic bulletin & review, 1(4), 476-490.
Masson, M. E., & Loftus, G. R. (2003). Using confidence intervals for graphically based data interpretation. Canadian Journal of Experimental Psychology/Revue canadienne de psychologie expérimentale, 57(3), 203.
Cousineau, D. (2005). Confidence intervals in within-subject designs: A simpler solution to Loftus and Masson’s method. Tutorials in quantitative methods for psychology, 1(1), 42-45.