Stefany Coxe
April 8, 2022
Quantitative Psychology
Similar to biostatistics
Overlap with psychometrics, educational measurement
Teach applied statistics courses
Graduate level
Multivariate, Longitudinal, Categorical, Graphics
Work on grants as statistician / methodologist
Psychology and CCF
Public Health, Education, Engineering
Analyze data for a lot of people
Create tools to make this easier for me and others
Does learning a second language depend on the languages you learn?
How can we improve the parent-child relationship in bereaved families?
Should kids with ADHD start with medication or behavioral treatment?
Can internet-delivered anxiety treatment for children be as effective?
How do adolescents’ brains and behavior change together over time?
Many fields are experiencing a replication crisis
Previously established findings cannot be replicated
Medicine, economics, psychology, neuroscience, others
Poor statistical design
Small sample size
Less powerful study designs: between-subjects designs
Inappropriate or unethical analysis decisions
Cherry picking only significant results
Selectively including predictors and covariates
“p-hacking”
Design
Power analysis: required for most grants
Better design: within-subjects designs, covariates, reliable measures
Analysis
De-emphasize p-values in favor of confidence and credible intervals
Statistical and scientific ethics training
Open science practices: data, code, review
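The design points above (power analysis, within-subjects designs) can be illustrated with a rough calculation. This is a minimal sketch using a normal approximation, not the procedure used for any grant in this talk; the effect size `d`, the sample sizes, and the within-person correlation `rho` are all hypothetical inputs.

```python
from statistics import NormalDist

def power_two_sided(noncentrality, alpha=0.05):
    """Approximate power of a two-sided z-test for a given noncentrality."""
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)
    return (1 - z.cdf(crit - noncentrality)) + z.cdf(-crit - noncentrality)

def power_between(d, n_per_group, alpha=0.05):
    # Two independent groups: the standardized difference has SE sqrt(2 / n).
    return power_two_sided(d * (n_per_group / 2) ** 0.5, alpha)

def power_within(d, n, rho, alpha=0.05):
    # Paired design: difference scores have variance 2 * (1 - rho),
    # so a positive correlation shrinks the SE.
    return power_two_sided(d * (n / (2 * (1 - rho))) ** 0.5, alpha)

# Hypothetical inputs: medium effect, 64 people per condition, rho = 0.5.
between = power_between(0.5, 64)
within = power_within(0.5, 64, 0.5)
print(round(between, 2), round(within, 2))
```

With the same number of people per condition, the within-subjects design has higher approximate power whenever `rho` is positive, which is the reason it is listed as the better design.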
Cumulative science
Combine or aggregate information from multiple studies
Don’t rely on a single study
Two broad and commonly used approaches:
Meta-analysis: combines summary statistics from multiple studies
Integrative data analysis: pools raw data from multiple studies
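To make the contrast concrete, here is a minimal sketch of the summary-statistic route: a fixed-effect, inverse-variance meta-analysis. The effect estimates and standard errors are hypothetical, and a fixed-effect model is only one of several ways to combine summary statistics. Integrative data analysis would instead stack the participant-level rows and fit one model to the pooled data.

```python
def fixed_effect_meta(estimates, standard_errors):
    """Inverse-variance weighted average of study-level effect estimates."""
    weights = [1 / se ** 2 for se in standard_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = (1 / sum(weights)) ** 0.5
    return pooled, pooled_se

# Hypothetical effects from three studies (not results from this project).
effects = [0.20, 0.35, 0.10]
ses = [0.10, 0.15, 0.08]
pooled, pooled_se = fixed_effect_meta(effects, ses)
print(round(pooled, 3), round(pooled_se, 3))
```

More precise studies (smaller standard errors) get more weight, and the pooled standard error is smaller than that of any single study.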
Methodological framework that allows for the simultaneous analysis of data from multiple studies
Not a specific analysis approach
Involves several steps, including
Combining datasets
Coding study characteristics
Harmonizing measures
Conducting the statistical analysis
Secondary data analysis: no new data collection
Larger sample, so
Increased statistical power
Increased sample heterogeneity
Increased frequency of low base-rate behaviors
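The first two steps (combining datasets and coding study characteristics) can be sketched as follows. This is an illustration only: the record fields, study labels, and the choice of dummy coding with the first study as reference are assumptions, not the project's actual data pipeline.

```python
def combine_studies(studies):
    """Stack participant-level records and add dummy-coded study indicators.

    `studies` maps a study label to a list of dict records.
    The first label (in sorted order) is the reference study.
    """
    labels = sorted(studies)
    pooled = []
    for label in labels:
        for record in studies[label]:
            row = dict(record)          # copy so the input is not modified
            row["study"] = label
            for other in labels[1:]:    # one indicator per non-reference study
                row[f"study_{other}"] = int(label == other)
            pooled.append(row)
    return pooled

# Hypothetical records from two studies.
example = {
    "1": [{"id": "a", "age": 14}, {"id": "b", "age": 15}],
    "2": [{"id": "c", "age": 13}],
}
pooled_rows = combine_studies(example)
print(len(pooled_rows), pooled_rows[-1])
```

The study indicators are what later allow model parameters to differ by study.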
The ADHD Teen Integrative Data Analysis Longitudinal dataset
(TIDAL is the dataset's acronym.)
R03 from National Institute of Mental Health (NIMH)
Coxe and Sibley: joint co-PIs
Combined dataset is available at NIMH Data Archive: https://nda.nih.gov/
Four published articles, one accepted, one revision under review
Sibley, M. H., & Coxe, S. J. (2020). The ADHD teen integrative data analysis longitudinal (TIDAL) dataset: background, methodology, and aims. BMC Psychiatry, 20, 359. https://doi.org/10.1186/s12888-020-02734-6
Four studies of psychosocial (behavioral) interventions for adolescents with ADHD
854 total participants (128, 325, 123, 278)
Five different treatment conditions: STAND, STAND-G, STP-A, usual care, no treatment
Three measurements: baseline, post-treatment, follow-up
Varied in terms of:
time of year (summer vs school year)
setting (university clinic vs community clinic vs school)
clinician type (school staff, community mental health, trainees)
treatment duration (8 to 10 weeks)
exact time frame of measurements (4 to 6 months from baseline to post, 10 to 12 months from baseline to follow-up)
Some measures were common across studies, but others were not
Adolescent, parent, and teacher reports
Most measured at multiple time points
When you have different studies
Different measures / versions / reporters of the same construct
Similar items but different time frames / number of response options / response option labels
(Also consider how items function differently across demographics: sex, age, race / ethnicity)
Need a single measure that is common across all studies
Create a commensurate measure
Only possible using the raw data
Common measures created using moderated nonlinear factor analysis (MNLFA; Bauer, 2017)
Factor analysis: latent variable representing the construct of interest
Nonlinear: responses are binary, modeled using logistic regression
Moderated: parts of the model vary depending on the study / measure / demographics
A version of invariance testing / differential item functioning (DIF) testing
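The "moderated" part can be sketched as an item response function whose threshold shifts by study. This is a simplified illustration, not the MNLFA model fitted in this project: the parameter names, the values, and the choice to moderate only the threshold are assumptions.

```python
import math

def endorsement_probability(eta, study, loading, threshold, threshold_shift):
    """P(item = 1) under a logistic item model with a study-moderated threshold.

    eta: latent factor score (e.g., depression)
    threshold_shift: dict mapping study label to a shift in the item threshold;
    studies not listed use the baseline threshold.
    """
    tau = threshold + threshold_shift.get(study, 0.0)
    return 1 / (1 + math.exp(-(loading * eta - tau)))

# Hypothetical item: harder to endorse in study 3 than in the reference study.
shifts = {"3": 0.8}
p_ref = endorsement_probability(0.0, "1", loading=1.5, threshold=0.5, threshold_shift=shifts)
p_s3 = endorsement_probability(0.0, "3", loading=1.5, threshold=0.5, threshold_shift=shifts)
print(round(p_ref, 3), round(p_s3, 3))
```

Two people with the same latent score but in different studies then have different endorsement probabilities for that item, which is what DIF testing looks for.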
Two measures harmonized in this project
Parent depression: different measures across studies
Adolescent ADHD symptoms: different versions of the same measure across studies
Focus on differences across studies
Factor analysis (FA)
Continuous or categorical items
Item thresholds (or means / intercepts)
Item loadings
Item response theory (IRT)
Categorical items only
Item difficulty: how “difficult” it is to endorse an item
Item discrimination: how well an item discriminates between similar individuals
Different parameterization
Discrimination = loading / 1.7
Difficulty = threshold / loading
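The two conversions above can be written as a small helper. This is a sketch of exactly those two formulas; the loading and threshold values are hypothetical, and the function name is made up for illustration.

```python
SCALING_CONSTANT = 1.7  # the 1.7 in "discrimination = loading / 1.7"

def fa_to_irt(loading, threshold):
    """Convert factor-analysis item parameters to IRT item parameters."""
    discrimination = loading / SCALING_CONSTANT
    difficulty = threshold / loading
    return discrimination, difficulty

# Hypothetical item: same loading, higher threshold in a second study.
a_ref, b_ref = fa_to_irt(loading=1.7, threshold=0.85)
a_alt, b_alt = fa_to_irt(loading=1.7, threshold=1.70)
print(a_ref, b_ref, b_alt)
```

Because the loading is held fixed here, the higher threshold translates directly into a higher difficulty, meaning the item is endorsed less often at the same level of the latent variable.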
SCL-90R: Symptom Checklist 90 Revised (Study 1 and 2)
90 items across 9 dimensions, including depression, anxiety, OCD
Responses: 0 (not at all) to 4 (extremely)
Time frame: 7 days
PHQ-9: Patient Health Questionnaire 9 (Study 3)
9 items corresponding to the 9 symptoms for major depressive disorder from DSM
Responses: 0 (not at all) to 3 (very often)
Time frame: 2 weeks
WHOQOL-BREF: World Health Organization Quality of Life Questionnaire, brief version (Study 4)
26 items about quality of life and health
Responses: 1 to 5 (direction varied)
Time frame: 2 weeks
Symptom | SCL-90R item(s) | PHQ-9 item | WHOQOL-BREF item |
---|---|---|---|
Decreased interest or pleasure in activities | 32 | 1 | 5 |
Depressed mood (sad, empty, hopeless) | 20, 30, 54 | 2 | 26 |
Changes in sleep patterns | 44, 64, 66 | 3 | 16 |
Fatigue or loss of energy | 14 | 4 | 10 |
Change in weight and / or appetite | 19 | 5 | - |
Feelings of worthlessness or excessive guilt | 79 | 6 | - |
Diminished ability to concentrate | 55 | 7 | 7 |
Psychomotor agitation | - | 8 | - |
Thoughts of death and / or suicide | 15, 56 | 9 | - |
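One way to turn the item mapping above into a common binary indicator per symptom is sketched below. The cutoffs, the "any item endorsed" rule for symptoms with several items, and the reverse-scoring rule are all assumptions made for illustration; the talk does not state the recoding rules that were actually used.

```python
def symptom_present(responses, cutoff, reverse_scored=False, scale_min=1, scale_max=5):
    """Return 1 if any item response meets the cutoff, 0 if none do,
    and None if every response is missing.

    reverse_scored: flip the scale first (for items where a high score
    means fewer problems); scale_min / scale_max are only used in that case.
    """
    observed = [r for r in responses if r is not None]
    if not observed:
        return None
    if reverse_scored:
        observed = [scale_min + scale_max - r for r in observed]
    return int(any(r >= cutoff for r in observed))

# Hypothetical responses for "depressed mood".
scl = symptom_present([0, 2, 0], cutoff=1)                 # three 0-4 items
phq = symptom_present([0], cutoff=1)                       # one 0-3 item
qol = symptom_present([2], cutoff=4, reverse_scored=True)  # one 1-5 item, flipped
print(scl, phq, qol)
```

Symptoms with no item in a given measure (the "-" cells) stay missing for that study rather than being coded 0.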
This was not a linear process
Data consolidation and recoding
Series of models and model comparisons
Estimation problems in many models
Dropped all covariate effects on loadings and factor variance
Several phone calls with our consultant
Extremely collaborative process
Final model produces estimated factor scores for each person
Covariate effect on factor mean: differences between studies on mean depression
Study 2 parents were less depressed than study 1 parents (0.40 SD)
Study 4 parents were more depressed than study 1 parents (1.19 SD)
Covariate effect on item thresholds: different item endorsement rates across studies
Compared to study 1…
Study 3 had lower difficulty for trouble concentrating, suicidal thoughts, sleep problems
Study 3 had higher difficulty for feelings of worthlessness, fatigue, lack of interest
Study 4 had lower difficulty for fatigue
Study 4 had higher difficulty for depressed mood, trouble concentrating
Because discrimination = loading / 1.7 and difficulty = threshold / loading, differences in difficulty across studies correspond to differential endorsement across studies
Differences across studies
Overall mean score and item endorsement rates / difficulty
Study is partially confounded with measure
Final model produces estimated factor scores for each person
Based on which study they were in: captured variability due to study
Common measure across all studies
Use those scores in other models
Conduct an analysis on the pooled dataset
Some analyses required substantial sample sizes
Instead of several (possibly underpowered) analyses
Zhao, X., Coxe, S., Sibley, M. H., Zulauf-McCurdy, C., & Pettit, J. W. (revision under review). Harmonizing Depression Measures Across Studies: A Tutorial for Data Harmonization. Prevention Science.
IDA is a framework to combine data and improve replicability
Even if the measures are different
Even if you don’t have the same measures at all
Requires a collaborative approach
Difficult and sophisticated analysis
Subject matter knowledge
Stop relying on small samples and underpowered studies
Use data that are already available
Save time, save money
Let’s do better science!
Co-PI: Margaret H. Sibley
Consultant: Patrick Curran
Co-authors: Stephen P. Becker, Michael C. Meinzer, Mark A. Stein, Matthew J. Valente, Xin (Alisa) Zhao, Courtney Zulauf-McCurdy
Funding: NIMH R03 MH116397
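Study 3 (compared to study 1), lower difficulty: trouble concentrating, suicidal thoughts, sleep problems
Study 3 (compared to study 1), higher difficulty: worthlessness, fatigue, lack of interest
Study 4 (compared to study 1), lower difficulty: fatigue
Study 4 (compared to study 1), higher difficulty: depressed mood, trouble concentrating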
Changes to question wording in DSM-IV (1994) vs DSM-5 (2013)
DSM-IV: Often loses things necessary for tasks and activities
DSM-5: Often loses things necessary for tasks and activities (e.g. school materials, pencils, books, tools, wallets, keys, paperwork, eyeglasses, mobile telephones)
Two domains: inattention and hyperactivity
Same 18 symptoms (9 in each domain) in both versions
Inattention: careless, inattentive, doesn’t listen, doesn’t follow instructions, not organized, can’t sustain mental effort, loses things, distracted, forgetful
Hyperactivity: fidgets, leaves seat, runs and climbs, can’t be quiet, on the go, talks a lot, blurts out, trouble waiting their turn, interrupts
Separate models for each domain, combined in the final model
Up to three observations per person
Non-independence
Model building based on random sample of 1 observation per person
Final model includes all observations
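Drawing one random observation per person, as described above, can be sketched like this. The row layout, the `person_id` key, and the fixed seed are assumptions for illustration.

```python
import random

def sample_one_per_person(rows, id_key="person_id", seed=0):
    """Pick one observation at random for each person.

    rows: list of dict records in long format (one row per person per wave).
    A seeded generator keeps the model-building sample reproducible.
    """
    rng = random.Random(seed)
    by_person = {}
    for row in rows:
        by_person.setdefault(row[id_key], []).append(row)
    return [rng.choice(observations) for observations in by_person.values()]

# Hypothetical long-format data: up to three waves per person.
long_rows = [
    {"person_id": 1, "wave": "baseline"},
    {"person_id": 1, "wave": "post-treatment"},
    {"person_id": 1, "wave": "follow-up"},
    {"person_id": 2, "wave": "baseline"},
]
calibration = sample_one_per_person(long_rows)
print(len(calibration))
```

The resulting sample has independent observations, so it can be used for model building before the final model is applied to all observations.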
Covariate effect on factor means: differences between studies on mean ADHD symptoms
Inattention: Study 2 adolescents had lower inattention than study 1
Hyperactivity: Study 3 adolescents had lower hyperactivity than study 1
Covariate effect on item thresholds: different item endorsement rates across studies
Compared to study 1…
Inattention: Study 4 had lower difficulty for careless and higher difficulty for inattentive
Hyperactivity: Study 4 had lower difficulty for talks a lot and higher difficulty for interrupts
Coxe, S., & Sibley, M. H. (2021). Harmonizing DSM-IV and DSM-5 versions of ADHD “A Criteria”: An item response theory analysis. Assessment. https://doi.org/10.1177/10731911211061299
Latent profile analysis (LPA)
Identify latent (unobserved) groups of individuals that are relatively homogeneous, based on a set of indicators
What do the adolescents look like when they show up for ADHD treatment, in terms of problems?
Demographic differences
Requires a large sample (200+ but more is better!)
Unable to complete this analysis in individual studies
Samples ranged from 123 to 325
Coxe, S., Sibley, M. H., & Becker, S. P. (2021). Presenting problem profiles for adolescents with ADHD: differences by sex, age, race, and family adversity. Child and Adolescent Mental Health, 26(3), 228-237. https://doi.org/10.1111/camh.12441