Eigenvectors and eigenvalues are the solutions to homogeneous equations
Maximize a function while also imposing some constraints
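One way to connect these two statements, sketched for the first component under the usual unit-length constraint (the weight vector \(\underline{a}_1\), the multiplier \(\lambda\), and the correlation matrix \(\textbf{R}_{XX}\) are notation assumed here):
Maximize \(\underline{a}_1'\textbf{R}_{XX}\underline{a}_1\) subject to \(\underline{a}_1'\underline{a}_1 = 1\)
\(L = \underline{a}_1'\textbf{R}_{XX}\underline{a}_1 - \lambda(\underline{a}_1'\underline{a}_1 - 1)\)
\(\frac{\partial L}{\partial \underline{a}_1} = 2\textbf{R}_{XX}\underline{a}_1 - 2\lambda\underline{a}_1 = \underline{0} \;\Rightarrow\; (\textbf{R}_{XX} - \lambda\textbf{I})\underline{a}_1 = \underline{0}\)
The last line is a homogeneous system: it has a nonzero solution only when \(|\textbf{R}_{XX} - \lambda\textbf{I}| = 0\), so \(\lambda\) is an eigenvalue and \(\underline{a}_1\) its eigenvector. The maximized variance is \(\underline{a}_1'\textbf{R}_{XX}\underline{a}_1 = \lambda\).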
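Loadings:

| | PC1 | PC2 |
---|---|---|
x1 | 0.739 | -0.425 |
x2 | 0.779 | -0.468 |
x3 | 0.488 | -0.623 |
x4 | 0.552 | 0.577 |
x5 | 0.546 | 0.714 |
x6 | 0.514 | 0.534 |

| | PC1 | PC2 |
---|---|---|
SS loadings | 2.257 | 1.914 |
Proportion Var | 0.376 | 0.319 |
Cumulative Var | 0.376 | 0.695 |

Communalities:

x1 | x2 | x3 | x4 | x5 | x6 |
---|---|---|---|---|---|
0.7261219 | 0.8252297 | 0.6261259 | 0.6372285 | 0.8070047 | 0.5491167 |
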
Loadings tell us how items are correlated with components
Communalities tell us how much variance in the items is explained by the components we kept
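A minimal sketch of how the summary rows and communalities follow from the loadings, assuming Python with NumPy (the slides' own output comes from R). The loadings are copied from the output above; the variable names are illustrative only.

```python
import numpy as np

# Loadings for the two retained components, copied from the output above
# (rows are x1..x6, columns are PC1 and PC2).
loadings = np.array([
    [0.739, -0.425],
    [0.779, -0.468],
    [0.488, -0.623],
    [0.552,  0.577],
    [0.546,  0.714],
    [0.514,  0.534],
])

# Communality of an item: sum of its squared loadings across the kept components.
communalities = (loadings ** 2).sum(axis=1)

# SS loadings of a component: sum of squared loadings down its column.
ss_loadings = (loadings ** 2).sum(axis=0)

# With standardized items the total variance equals the number of items,
# so the proportion of variance is SS loadings divided by p.
proportion_var = ss_loadings / loadings.shape[0]
cumulative_var = np.cumsum(proportion_var)

print(np.round(communalities, 3))
print(np.round(ss_loadings, 3))
print(np.round(proportion_var, 3))
print(np.round(cumulative_var, 3))
```

Because the loadings are entered to three decimals, the results agree with the printed values only up to rounding.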
But where did the \(Y\)s / components even come from?
\(\textbf{R}_{XX} =\begin{bmatrix} 1 & r_{X_1X_2} & r_{X_1X_3} & r_{X_1X_4} & r_{X_1X_5} & r_{X_1X_6}\\ r_{X_2X_1} & 1 & r_{X_2X_3} & r_{X_2X_4} & r_{X_2X_5} & r_{X_2X_6}\\ r_{X_3X_1} &r_{X_3X_2} & 1 & r_{X_3X_4} & r_{X_3X_5} & r_{X_3X_6}\\ r_{X_4X_1} & r_{X_4X_2} & r_{X_4X_3} & 1 & r_{X_4X_5} & r_{X_4X_6}\\ r_{X_5X_1} & r_{X_5X_2} & r_{X_5X_3} & r_{X_5X_4} & 1 & r_{X_5X_6}\\ r_{X_6X_1} & r_{X_6X_2} & r_{X_6X_3} & r_{X_6X_4} & r_{X_6X_5} & 1\\ \end{bmatrix}\)
| | x1 | x2 | x3 | x4 | x5 | x6 |
---|---|---|---|---|---|---|
x1 | 1.0000 | 0.7041 | 0.4157 | 0.1406 | 0.1058 | 0.0814 |
x2 | 0.7041 | 1.0000 | 0.5428 | 0.1963 | 0.0538 | 0.1087 |
x3 | 0.4157 | 0.5428 | 1.0000 | -0.1208 | -0.1177 | 0.0276 |
x4 | 0.1406 | 0.1963 | -0.1208 | 1.0000 | 0.6027 | 0.3249 |
x5 | 0.1058 | 0.0538 | -0.1177 | 0.6027 | 1.0000 | 0.5651 |
x6 | 0.0814 | 0.1087 | 0.0276 | 0.3249 | 0.5651 | 1.0000 |
[1] 2.2566146 1.9142128 0.7510163 0.4963613 0.3482518 0.2335431
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] -0.4917076 0.3070964 0.2458906 0.54136266 -0.3027305 0.4676901
[2,] -0.5183409 0.3381863 0.1510558 0.05919316 0.3956537 -0.6588544
[3,] -0.3247308 0.4503119 -0.4335595 -0.64643283 -0.2133135 0.2010403
[4,] -0.3674707 -0.4167786 0.5131075 -0.44899215 0.3249147 0.3475890
[5,] -0.3632801 -0.5157586 -0.0382021 -0.05343548 -0.6660796 -0.3924843
[6,] -0.3421828 -0.3857845 -0.6811809 0.28477700 0.3963310 0.1785989
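The eigenvalues and eigenvectors above can be reproduced from the correlation matrix. This is a sketch in Python with NumPy rather than the R call that produced the output; the correlations are the four-decimal values from the table, so results match only to about that precision, and eigenvector signs are arbitrary.

```python
import numpy as np

# Correlation matrix R_XX, copied from the table (rounded to 4 decimals).
R = np.array([
    [1.0000, 0.7041,  0.4157,  0.1406,  0.1058, 0.0814],
    [0.7041, 1.0000,  0.5428,  0.1963,  0.0538, 0.1087],
    [0.4157, 0.5428,  1.0000, -0.1208, -0.1177, 0.0276],
    [0.1406, 0.1963, -0.1208,  1.0000,  0.6027, 0.3249],
    [0.1058, 0.0538, -0.1177,  0.6027,  1.0000, 0.5651],
    [0.0814, 0.1087,  0.0276,  0.3249,  0.5651, 1.0000],
])

# eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(R)

# Reorder so the largest eigenvalue (first component) comes first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Loadings: each eigenvector column scaled by the square root of its eigenvalue.
loadings = eigenvectors * np.sqrt(eigenvalues)

print(np.round(eigenvalues, 4))
print(np.round(np.abs(loadings[:, :2]), 3))
```

The eigenvalues sum to 6, the number of variables, because the diagonal of a correlation matrix is all ones.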
\(\textbf{A} =\begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} & a_{15} & a_{16}\\ a_{21} & a_{22} & a_{23} & a_{24} & a_{25} & a_{26}\\ a_{31} & a_{32} & a_{33} & a_{34} & a_{35} & a_{36}\\ a_{41} & a_{42} & a_{43} & a_{44} & a_{45} & a_{46}\\ a_{51} & a_{52} & a_{53} & a_{54} & a_{55} & a_{56}\\ a_{61} & a_{62} & a_{63} & a_{64} & a_{65} & a_{66}\\ \end{bmatrix}\)
\(\begin{matrix}\textbf{Y} \\(n,r)\end{matrix} = \begin{matrix}\textbf{X} \\(n,p)\end{matrix}\begin{matrix}\textbf{A} \\(p,r)\end{matrix}\)
Each \(Y\) variable is a linear combination of the \(X\)s, with weights taken from the corresponding column of \(\textbf{A}\)
First Y variable: \(\underline{Y}_1 = a_{11}\underline{X}_1 + a_{21}\underline{X}_2 + a_{31}\underline{X}_3 + a_{41}\underline{X}_4 + a_{51}\underline{X}_5 + a_{61}\underline{X}_6\)
Second Y variable: \(\underline{Y}_2 = a_{12}\underline{X}_1 + a_{22}\underline{X}_2 + a_{32}\underline{X}_3 + a_{42}\underline{X}_4 + a_{52}\underline{X}_5 + a_{62}\underline{X}_6\)
Looks like a regression, but note that it’s not \(\hat{Y}\) and there’s no \(+ e\)
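A sketch of \(\textbf{Y} = \textbf{X}\textbf{A}\) in Python with NumPy, assuming standardized \(X\)s and \(\textbf{A}\) holding the eigenvectors of the correlation matrix. The data are simulated and purely hypothetical (n = 200, p = 6); they are not the data behind the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: n = 200 cases on p = 6 correlated variables.
n, p = 200, 6
X_raw = rng.normal(size=(n, p)) @ rng.normal(size=(p, p))

# Standardize the columns so that X'X / (n - 1) is the correlation matrix.
X = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0, ddof=1)
R = X.T @ X / (n - 1)

# Columns of A are the eigenvectors of R, ordered by decreasing eigenvalue.
eigenvalues, A = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, A = eigenvalues[order], A[:, order]

# Y = X A: every column of Y is a weighted sum of the X columns.
Y = X @ A

# First component written out term by term: a_11 X_1 + a_21 X_2 + ... + a_61 X_6.
Y1 = sum(A[j, 0] * X[:, j] for j in range(p))

# The components are uncorrelated and their variances are the eigenvalues.
cov_Y = np.cov(Y, rowvar=False)
print(np.round(cov_Y, 3))
print(np.round(eigenvalues, 3))
```

There is no error term because each \(Y\) is computed exactly from the \(X\)s; nothing is being predicted.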
\(\begin{matrix}\textbf{X} \\(n,p)\end{matrix} = \begin{matrix}\textbf{Y} \\(n,r)\end{matrix}\begin{matrix}\textbf{B} \\(r,p)\end{matrix}\)
\(\textbf{B} =\begin{bmatrix} b_{11} & b_{12} & b_{13} & b_{14} & b_{15} & b_{16}\\ b_{21} & b_{22} & b_{23} & b_{24} & b_{25} & b_{26}\\ b_{31} & b_{32} & b_{33} & b_{34} & b_{35} & b_{36}\\ b_{41} & b_{42} & b_{43} & b_{44} & b_{45} & b_{46}\\ b_{51} & b_{52} & b_{53} & b_{54} & b_{55} & b_{56}\\ b_{61} & b_{62} & b_{63} & b_{64} & b_{65} & b_{66}\\ \end{bmatrix}\)
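A sketch of going back from components to variables, in Python with NumPy. It relies on the fact that the eigenvector matrix \(\textbf{A}\) is orthogonal, so \(\textbf{B} = \textbf{A}'\) when all components are kept (r = p); with fewer components, the first r rows of \(\textbf{B}\) give an approximation. The simulated data and the choice r = 2 are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical standardized data: n = 200 cases, p = 6 variables.
n, p = 200, 6
X_raw = rng.normal(size=(n, p)) @ rng.normal(size=(p, p))
X = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0, ddof=1)

# Eigenvectors of the correlation matrix, largest eigenvalue first.
eigenvalues, A = np.linalg.eigh(np.corrcoef(X, rowvar=False))
order = np.argsort(eigenvalues)[::-1]
eigenvalues, A = eigenvalues[order], A[:, order]

Y = X @ A      # component scores, (n, p)
B = A.T        # A is orthogonal, so its inverse is its transpose

# Keeping all p components reproduces X exactly.
X_full = Y @ B

# Keeping only r components gives the best rank-r approximation of X.
r = 2
X_approx = Y[:, :r] @ B[:r, :]

# The squared reconstruction error equals (n - 1) times the dropped eigenvalues.
error = ((X - X_approx) ** 2).sum()
print(np.allclose(X_full, X))
print(error, (n - 1) * eigenvalues[r:].sum())
```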
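Simulation-based method
Generate random correlation matrices with the same \(p\) and \(n\) as the data
Keep the components whose observed eigenvalues are larger than the eigenvalues from the random matrices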
Parallel analysis suggests that the number of factors = NA and the number of components = 2
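A rough sketch of the idea behind parallel analysis for components, in Python with NumPy. It is not the routine that produced the output above: the sample size `n = 100` is a hypothetical stand-in (the slides do not state it), and comparing against the 95th percentile of the simulated eigenvalues is one common variant.

```python
import numpy as np

rng = np.random.default_rng(42)

# Observed eigenvalues of the correlation matrix, copied from the earlier output.
observed = np.array([2.2566146, 1.9142128, 0.7510163,
                     0.4963613, 0.3482518, 0.2335431])
p = observed.size
n = 100        # hypothetical sample size, not given in the slides
n_sims = 500   # number of random data sets

# Eigenvalues of correlation matrices from uncorrelated normal data
# with the same n and p.
simulated = np.empty((n_sims, p))
for s in range(n_sims):
    Z = rng.normal(size=(n, p))
    R_random = np.corrcoef(Z, rowvar=False)
    simulated[s] = np.linalg.eigvalsh(R_random)[::-1]  # descending order

# Reference eigenvalue for each position: 95th percentile across simulations.
threshold = np.percentile(simulated, 95, axis=0)

# Count leading components whose observed eigenvalue beats the reference.
exceeds = observed > threshold
n_components = p if exceeds.all() else int(np.argmin(exceeds))

print(np.round(threshold, 3))
print(n_components)
```

Under these assumptions the first two observed eigenvalues clear the random reference values and the third does not, which is consistent with the two components suggested above.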
Number of factors
Call: vss(x = x, n = n, rotate = rotate, diagonal = diagonal, fm = fm,
n.obs = n.obs, plot = FALSE, title = title, use = use, cor = cor)
VSS complexity 1 achieves a maximimum of 0.6 with 3 factors
VSS complexity 2 achieves a maximimum of 0.87 with 5 factors
The Velicer MAP achieves a minimum of 0.12 with 2 factors
Empirical BIC achieves a minimum of -14.87 with 2 factors
Sample Size adjusted BIC achieves a minimum of 1.77 with 2 factors
Statistics by number of factors
vss1 vss2 map dof chisq prob sqresid fit RMSEA BIC SABIC complex
1 0.47 0.00 0.20 9 9.3e+01 4.1e-16 5.2 0.47 0.305 52 79.9 1.0
2 0.48 0.84 0.12 4 7.6e+00 1.1e-01 1.5 0.84 0.094 -11 1.8 1.8
3 0.60 0.85 0.23 0 8.3e-01 NA 1.1 0.88 NA NA NA 2.0
4 0.59 0.87 0.43 -3 6.2e-09 NA 0.9 0.91 NA NA NA 2.3
5 0.58 0.87 1.00 -5 0.0e+00 NA 0.8 0.92 NA NA NA 2.3
6 0.58 0.87 NA -6 0.0e+00 NA 0.8 0.92 NA NA NA 2.3
eChisq SRMR eCRMS eBIC
1 1.6e+02 2.3e-01 0.300 121
2 3.6e+00 3.4e-02 0.067 -15
3 1.8e-01 7.8e-03 NA NA
4 1.0e-09 5.8e-07 NA NA
5 5.4e-16 4.2e-10 NA NA
6 5.4e-16 4.2e-10 NA NA
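A sketch of the Velicer MAP criterion from the `map` column above, in Python with NumPy. It assumes the principal-components form of the criterion: remove the first m components from the correlation matrix, rescale the residual to partial correlations, and average the squared off-diagonal values. The correlation matrix is the rounded one from the earlier table, so values match the output only approximately.

```python
import numpy as np

# Correlation matrix R_XX, copied from the table (rounded to 4 decimals).
R = np.array([
    [1.0000, 0.7041,  0.4157,  0.1406,  0.1058, 0.0814],
    [0.7041, 1.0000,  0.5428,  0.1963,  0.0538, 0.1087],
    [0.4157, 0.5428,  1.0000, -0.1208, -0.1177, 0.0276],
    [0.1406, 0.1963, -0.1208,  1.0000,  0.6027, 0.3249],
    [0.1058, 0.0538, -0.1177,  0.6027,  1.0000, 0.5651],
    [0.0814, 0.1087,  0.0276,  0.3249,  0.5651, 1.0000],
])
p = R.shape[0]

# Principal component loadings, largest eigenvalue first.
eigenvalues, vectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, vectors = eigenvalues[order], vectors[:, order]
loadings = vectors * np.sqrt(eigenvalues)

map_values = {}
for m in range(1, p):
    L = loadings[:, :m]
    residual = R - L @ L.T                     # what is left after m components
    d = 1.0 / np.sqrt(np.diag(residual))
    partial = residual * np.outer(d, d)        # rescale to partial correlations
    off_diag_sq = (partial ** 2).sum() - p     # drop the unit diagonal
    map_values[m] = off_diag_sq / (p * (p - 1))

# The suggested number of components minimizes the average squared partial correlation.
best_m = min(map_values, key=map_values.get)

for m, value in map_values.items():
    print(m, round(value, 2))
print(best_m)
```

With five components removed, the residual has rank one, so every partial correlation is plus or minus one and the criterion reaches 1, as in the `map` column.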