Statistics - Correlation Matrix

About

From a raw matrix to a correlation matrix.

A correlation matrix is a special matrix used in statistics: a square, symmetric matrix whose entries are the correlation coefficients between pairs of variables.

Steps

Raw Matrix

3 columns (3 variables), 8 rows (8 individuals)

<MATH> A_{ij} = \begin{bmatrix} 1 & 3 & 4 \\ 2 & 5 & 4 \\ 0 & 0 & 1 \\ 3 & 2 & 3 \\ 1 & 0 & 5 \\ 4 & 4 & 3 \\ 4 & 5 & 2 \\ 3 & 2 & 3 \\ \end{bmatrix} </MATH>
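The raw matrix above can be entered as a NumPy array (a minimal sketch, assuming NumPy is available; variable names are illustrative):

```python
import numpy as np

# Raw data matrix A: 8 rows (individuals) x 3 columns (variables)
A = np.array([[1, 3, 4], [2, 5, 4], [0, 0, 1], [3, 2, 3],
              [1, 0, 5], [4, 4, 3], [4, 5, 2], [3, 2, 3]], dtype=float)

print(A.shape)  # (8, 3)
```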

Sum Matrix

<MATH> S_{1j} = 1_{1i} . A_{ij} = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ \end{bmatrix} . \begin{bmatrix} 1 & 3 & 4 \\ 2 & 5 & 4 \\ 0 & 0 & 1 \\ 3 & 2 & 3 \\ 1 & 0 & 5 \\ 4 & 4 & 3 \\ 4 & 5 & 2 \\ 3 & 2 & 3 \\ \end{bmatrix} = \begin{bmatrix} 18 & 21 & 25 \\ \end{bmatrix} </MATH>
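The sum step can be sketched the same way: multiplying a 1×8 row vector of ones by A sums each column.

```python
import numpy as np

A = np.array([[1, 3, 4], [2, 5, 4], [0, 0, 1], [3, 2, 3],
              [1, 0, 5], [4, 4, 3], [4, 5, 2], [3, 2, 3]], dtype=float)

# 1x8 row vector of ones times A gives the column sums
ones_row = np.ones((1, A.shape[0]))
S = ones_row @ A
print(S)  # [[18. 21. 25.]]
```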

Mean Vector

<MATH> Mv_{1j} = S_{1j} . N^{-1} = \begin{bmatrix} 18 & 21 & 25 \\ \end{bmatrix} . 8^{-1} = \begin{bmatrix} 2.25 & 2.62 & 3.12 \\ \end{bmatrix} </MATH>

where N is the number of rows (here N = 8). The exact means are 2.25, 2.625, and 3.125; the values shown are rounded to two decimals.
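A sketch of the mean-vector step (the exact means are 2.25, 2.625, and 3.125, which the text rounds):

```python
import numpy as np

A = np.array([[1, 3, 4], [2, 5, 4], [0, 0, 1], [3, 2, 3],
              [1, 0, 5], [4, 4, 3], [4, 5, 2], [3, 2, 3]], dtype=float)

N = A.shape[0]              # number of rows
S = np.ones((1, N)) @ A     # column sums
Mv = S / N                  # same as S . N^{-1}
# Exact means: 2.25, 2.625, 3.125
```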

Mean Matrix

<MATH> Mm_{ij} = 1_{i1} . Mv_{1j} = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ \end{bmatrix} . \begin{bmatrix} 2.25 & 2.62 & 3.12 \\ \end{bmatrix} = \begin{bmatrix} 2.25 & 2.62 & 3.12 \\ 2.25 & 2.62 & 3.12 \\ 2.25 & 2.62 & 3.12 \\ 2.25 & 2.62 & 3.12 \\ 2.25 & 2.62 & 3.12 \\ 2.25 & 2.62 & 3.12 \\ 2.25 & 2.62 & 3.12 \\ 2.25 & 2.62 & 3.12 \\ \end{bmatrix} </MATH>
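The mean matrix is an outer product: a column of ones times the mean row vector repeats the means down every row (same assumed setup as above).

```python
import numpy as np

A = np.array([[1, 3, 4], [2, 5, 4], [0, 0, 1], [3, 2, 3],
              [1, 0, 5], [4, 4, 3], [4, 5, 2], [3, 2, 3]], dtype=float)

N = A.shape[0]
Mv = (np.ones((1, N)) @ A) / N   # 1x3 mean vector
Mm = np.ones((N, 1)) @ Mv        # 8x3: every row is a copy of Mv
```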

Deviation Score Matrix

<MATH> D_{ij} = A_{ij}-Mm_{ij} = \begin{bmatrix} 1 & 3 & 4 \\ 2 & 5 & 4 \\ 0 & 0 & 1 \\ 3 & 2 & 3 \\ 1 & 0 & 5 \\ 4 & 4 & 3 \\ 4 & 5 & 2 \\ 3 & 2 & 3 \\ \end{bmatrix} - \begin{bmatrix} 2.25 & 2.62 & 3.12 \\ 2.25 & 2.62 & 3.12 \\ 2.25 & 2.62 & 3.12 \\ 2.25 & 2.62 & 3.12 \\ 2.25 & 2.62 & 3.12 \\ 2.25 & 2.62 & 3.12 \\ 2.25 & 2.62 & 3.12 \\ 2.25 & 2.62 & 3.12 \\ \end{bmatrix} = \begin{bmatrix} \begin{array}{rrr} -1.25 & 0.38 & 0.88 \\ -0.25 & 2.38 & 0.88 \\ -2.25 & -2.62 & -2.12 \\ 0.75 & -0.62 & -0.12 \\ -1.25 & -2.62 & 1.88 \\ 1.75 & 1.38 & -0.12 \\ 1.75 & 2.38 & -1.12 \\ 0.75 & -0.62 & -0.12 \\ \end{array} \end{bmatrix} </MATH>
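The deviation scores are just the raw matrix minus the mean matrix; as a sanity check, each column of D sums to (numerically) zero.

```python
import numpy as np

A = np.array([[1, 3, 4], [2, 5, 4], [0, 0, 1], [3, 2, 3],
              [1, 0, 5], [4, 4, 3], [4, 5, 2], [3, 2, 3]], dtype=float)

N = A.shape[0]
Mm = np.ones((N, 1)) @ ((np.ones((1, N)) @ A) / N)
D = A - Mm
# Each column of the deviation matrix sums to zero
print(D.sum(axis=0))
```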

SS and SP Product

Sum of squares (SS) of the deviation scores and sum of cross products (SP)

<MATH> \begin{array}{rrccc} S_{jj} & = & {D_{ij}}^T & . & D_{ij} \\ S_{jj} & = & \begin{bmatrix} \begin{array}{rrrrrrrr} -1.25 & -0.25 & -2.25 & 0.75 & -1.25 & 1.75 & 1.75 & 0.75 \\ 0.38 & 2.38 & -2.62 & -0.62 & -2.62 & 1.38 & 2.38 & -0.62 \\ 0.88 & 0.88 & -2.12 & -0.12 & 1.88 & -0.12 & -1.12 & -0.12 \\ \end{array} \end{bmatrix} & . & \begin{bmatrix} \begin{array}{rrr} -1.25 & 0.38 & 0.88 \\ -0.25 & 2.38 & 0.88 \\ -2.25 & -2.62 & -2.12 \\ 0.75 & -0.62 & -0.12 \\ -1.25 & -2.62 & 1.88 \\ 1.75 & 1.38 & -0.12 \\ 1.75 & 2.38 & -1.12 \\ 0.75 & -0.62 & -0.12 \\ \end{array} \end{bmatrix} \\ S_{jj} & = & \begin{bmatrix} \begin{array}{rrr} 15.5 & 13.75 & -1.25 \\ 13.75 & 27.88 & 0.38 \\ -1.25 & 0.38 & 10.88 \\ \end{array} \end{bmatrix} & & \\ \end{array} </MATH>

where the diagonal entries of S_{jj} are the sums of squares (SS) of each variable and the off-diagonal entries are the sums of cross products (SP) between pairs of variables. Note that the transpose must come first: {D_{ij}}^T is 3x8 and D_{ij} is 8x3, so the product is the required 3x3 matrix.
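The SS/SP product can be verified numerically using exact, unrounded deviation scores:

```python
import numpy as np

A = np.array([[1, 3, 4], [2, 5, 4], [0, 0, 1], [3, 2, 3],
              [1, 0, 5], [4, 4, 3], [4, 5, 2], [3, 2, 3]], dtype=float)

D = A - A.mean(axis=0)   # deviation scores with exact (unrounded) means
S_jj = D.T @ D           # 3x3: diagonal = SS, off-diagonal = SP
# Exact result:
# [[ 15.5   13.75  -1.25 ]
#  [ 13.75  27.875  0.375]
#  [ -1.25   0.375 10.875]]
```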

Variance-Covariance Matrix

<MATH> VCoV_{jj} = S_{jj}. N^{-1} </MATH>
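Dividing by N gives the population (biased) variances and covariances; NumPy's `np.cov` with `bias=True` uses the same divisor, which makes a convenient cross-check:

```python
import numpy as np

A = np.array([[1, 3, 4], [2, 5, 4], [0, 0, 1], [3, 2, 3],
              [1, 0, 5], [4, 4, 3], [4, 5, 2], [3, 2, 3]], dtype=float)

N = A.shape[0]
D = A - A.mean(axis=0)
S_jj = D.T @ D

VCoV = S_jj / N  # population variance-covariance matrix
assert np.allclose(VCoV, np.cov(A.T, bias=True))  # matches NumPy's N divisor
```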

Standard Deviation Matrix

<MATH> SD_{jj} = Diag(VCoV_{jj})^{\frac{1}{2}} </MATH>
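The standard-deviation matrix keeps only the diagonal of the variance-covariance matrix and takes its square root (same assumed setup as above):

```python
import numpy as np

A = np.array([[1, 3, 4], [2, 5, 4], [0, 0, 1], [3, 2, 3],
              [1, 0, 5], [4, 4, 3], [4, 5, 2], [3, 2, 3]], dtype=float)

D = A - A.mean(axis=0)
VCoV = (D.T @ D) / A.shape[0]

# Diagonal matrix of standard deviations: Diag(VCoV)^(1/2)
SD = np.diag(np.sqrt(np.diag(VCoV)))
```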

Correlation matrix

Each entry of R_{jj} is the correlation coefficient between a pair of variables; the diagonal entries equal 1.

<MATH> R_{jj} = {SD_{jj}}^{-1}. VCoV_{jj}.{SD_{jj}}^{-1} </MATH>
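Pre- and post-multiplying by the inverse standard-deviation matrix rescales covariances into correlations; the result should match `np.corrcoef`:

```python
import numpy as np

A = np.array([[1, 3, 4], [2, 5, 4], [0, 0, 1], [3, 2, 3],
              [1, 0, 5], [4, 4, 3], [4, 5, 2], [3, 2, 3]], dtype=float)

D = A - A.mean(axis=0)
VCoV = (D.T @ D) / A.shape[0]
SD = np.diag(np.sqrt(np.diag(VCoV)))

SD_inv = np.linalg.inv(SD)
R = SD_inv @ VCoV @ SD_inv
assert np.allclose(R, np.corrcoef(A.T))   # matches NumPy's correlation matrix
assert np.allclose(np.diag(R), 1.0)       # unit diagonal
```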

Multiple regression coefficients are estimated simultaneously using matrix algebra. In matrix form, the formula is <MATH> B = (X^{T}X)^{-1}X^{T}Y </MATH>. The matrix inversion is required to isolate B on one side of the equation.
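As a sketch of the normal equations, here is a hypothetical example reusing this page's data: the first two variables plus an intercept predict the third (the choice of predictors and response is illustrative, not from the original text):

```python
import numpy as np

A = np.array([[1, 3, 4], [2, 5, 4], [0, 0, 1], [3, 2, 3],
              [1, 0, 5], [4, 4, 3], [4, 5, 2], [3, 2, 3]], dtype=float)

# Hypothetical design: intercept column plus the first two variables
X = np.column_stack([np.ones(A.shape[0]), A[:, :2]])
Y = A[:, 2]

# Normal equations: B = (X'X)^{-1} X'Y
B = np.linalg.inv(X.T @ X) @ (X.T @ Y)

# Numerically, solving the linear system is preferred over an explicit inverse
B_alt = np.linalg.solve(X.T @ X, X.T @ Y)
assert np.allclose(B, B_alt)
```

The least-squares residuals Y - XB are orthogonal to the columns of X, which is a useful correctness check.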