Statistics - Correlation (Coefficient analysis)

About

Correlation is a statistical analysis used to measure and describe the relationship between two variables.

The correlation coefficient is a statistic that ranges from -1 to +1.
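As a minimal sketch of how the coefficient behaves at its limits, the following computes the Pearson product-moment coefficient by hand. The choice of Pearson's r, the function name `pearson_r` and the sample data are assumptions made for illustration; they are not taken from the original text.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of the deviations from the means.
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / math.sqrt(ss_x * ss_y)

x = [1, 2, 3, 4, 5]                        # illustrative data
r_up = pearson_r(x, [2, 4, 6, 8, 10])      # perfect positive relationship
r_down = pearson_r(x, [10, 8, 6, 4, 2])    # perfect negative relationship
r_mixed = pearson_r(x, [2, 1, 4, 3, 5])    # imperfect positive relationship

print(r_up, r_down, r_mixed)
```

The first two cases sit at the ends of the range (+1 and -1), while the third falls in between.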

Regression Line

Correlation is used:

If two variables, X and Y, are correlated, then a regression can be done to predict scores on Y from the scores on X.

Correlation describes the relationship between two variables, whereas regression provides an equation (with two or more variables) that is used to predict scores on an outcome variable.
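The link between the two can be sketched for the simple (one-predictor) case, where the least-squares slope equals r multiplied by the ratio of the standard deviations. The data and the helper `predict` are hypothetical and serve only as an illustration.

```python
import statistics

x = [1, 2, 3, 4, 5]   # illustrative predictor scores
y = [2, 1, 4, 3, 5]   # illustrative outcome scores

mean_x, mean_y = statistics.mean(x), statistics.mean(y)
sd_x, sd_y = statistics.stdev(x), statistics.stdev(y)   # sample standard deviations

# Pearson r: sum of cross-products divided by (n - 1) * sd_x * sd_y.
sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
r = sp / ((len(x) - 1) * sd_x * sd_y)

# Simple regression line Y = intercept + slope * X.
slope = r * sd_y / sd_x
intercept = mean_y - slope * mean_x

def predict(new_x):
    """Predicted score on Y for a given score on X (hypothetical helper)."""
    return intercept + slope * new_x

print(r, slope, intercept, predict(6))
```

The correlation summarizes the strength and direction of the relationship; the regression equation turns it into a prediction for any given X.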

A positive correlation only means that the univariate regression has a positive slope. In a multiple regression, the sign (positive or negative) of a coefficient depends on the other variables in the model.
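A small constructed example can make the sign change concrete. The data below are hypothetical and were chosen so that Y = -X1 + 3·X2 exactly, with X1 and X2 strongly correlated. The two-predictor coefficients are obtained by solving the normal equations on the centered variables; this is a sketch for this special case, not a general regression routine.

```python
def centered_sums(u, v):
    """Sum of cross-products of deviations from the means of u and v."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))

x1 = [1, 2, 3, 4, 5, 6]
x2 = [1, 3, 3, 5, 5, 7]                        # strongly correlated with x1
y = [-a + 3 * b for a, b in zip(x1, x2)]       # y = -x1 + 3*x2 by construction

s11, s22, s12 = centered_sums(x1, x1), centered_sums(x2, x2), centered_sums(x1, x2)
s1y, s2y = centered_sums(x1, y), centered_sums(x2, y)

# Univariate regression of y on x1 alone: the slope has the sign of the correlation.
simple_slope_x1 = s1y / s11

# Multiple regression of y on x1 and x2 (2x2 normal equations, Cramer's rule).
det = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det

print(simple_slope_x1, b1, b2)
```

Taken alone, X1 has a positive slope (and therefore a positive correlation with Y), but its coefficient becomes negative once X2 is included in the model.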

Assumptions

See: Statistics - Assumptions underlying correlation and regression analysis (Never trust summary statistics alone)

Correlation does not imply causation

Correlation does not imply causation, but correlations are useful because they can be used to assess:

Type

There are several types of correlation coefficients, suited to different variable types.

Venn diagrams

Covariance

Venn diagram representation of a correlation between two variables X and Y.

Venn diagrams representing:

The degree to which X and Y correlate is represented by the degree to which these two variance circles overlap. The overlap is the systematic variance in Y that is explained by X; expressed as a proportion of the total variance in Y, it equals the squared correlation coefficient (r²).

The correlation is approaching:

The residual is the unexplained variance in Y. Some of the variance in Y is explained by the model. Some of it is unexplained: that is the residual.
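The split between explained and unexplained variance can be checked numerically. The sketch below fits a simple least-squares line to illustrative data (the variable names and values are assumptions) and decomposes the total sum of squares of Y into a model part and a residual part.

```python
x = [1, 2, 3, 4, 5]   # illustrative data
y = [2, 1, 4, 3, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares line for the simple regression of y on x.
sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
ss_x = sum((a - mean_x) ** 2 for a in x)
slope = sp / ss_x
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * a for a in x]

# Total variance in Y, split into explained (model) and unexplained (residual).
ss_total = sum((b - mean_y) ** 2 for b in y)
ss_model = sum((p - mean_y) ** 2 for p in predicted)
ss_residual = sum((b - p) ** 2 for b, p in zip(y, predicted))

# The overlap of the two circles, as a proportion of the Y circle.
r_squared = ss_model / ss_total

print(ss_total, ss_model, ss_residual, r_squared)
```

Here the model and residual sums of squares add up to the total, and the explained share equals the square of the correlation coefficient for these data.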

Documentation / Reference