Statistics - (Variance|Dispersion|Mean Square) (MS)

1 - About

The variance measures how widely the individual values are spread around the average.

For an estimate, the variance is how much the estimate varies around its average.

It's a measure of consistency. A very large variance means that the data are all over the place, while a small variance means that most of the data lie close to the average.
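A small illustration of the idea, using two made-up data sets that share the same average but differ in consistency. The lists `consistent` and `scattered` are hypothetical; `statistics.pvariance` from the Python standard library divides by the number of values.

```python
from statistics import mean, pvariance

# Hypothetical data: both lists have the same average (50)
consistent = [48, 49, 50, 51, 52]   # values stay close to the average
scattered = [10, 30, 50, 70, 90]    # values are all over the place

var_consistent = pvariance(consistent)
var_scattered = pvariance(scattered)

print(mean(consistent), var_consistent)
print(mean(scattered), var_scattered)
```

The averages are identical, so only the variance reveals that the second list is far less consistent than the first.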

3 - Formula

<MATH> \begin{array}{rrl} Variance & = & \frac{\displaystyle \sum_{i=1}^{\href{sample_size}{N}}{(\href{raw_score}{X}_i- \href{mean}{\bar{X}})^2}}{\displaystyle \href{degree_of_freedom}{\text{Degree of Freedom}}} \\ & = & \frac{\displaystyle \sum_{i=1}^{\href{sample_size}{N}}{(\href{Deviation Score}{\text{Deviation Score}}_i)^2}}{\displaystyle \href{degree_of_freedom}{\text{Degree of Freedom}}} \\ & = & (\href{Standard_Deviation}{\text{Standard Deviation}})^2 \end{array} </MATH>

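where:

  • N = Sample size
  • X_i = Raw score of individual i
  • \bar{X} = Mean
  • X_i - \bar{X} = Deviation score of individual i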
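The formula can be followed term by term in a short sketch. The function name `variance_from_formula` and the `scores` list are hypothetical, chosen only for illustration; the check against `statistics.pstdev` illustrates the last line of the formula (variance = standard deviation squared) for the case where the degree of freedom equals N.

```python
import math
from statistics import pstdev

def variance_from_formula(scores, degrees_of_freedom):
    """Sum of squared deviation scores divided by the degree of freedom."""
    mean = sum(scores) / len(scores)
    deviation_scores = [x - mean for x in scores]
    return sum(d ** 2 for d in deviation_scores) / degrees_of_freedom

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical raw scores X_i

# Using N as the degree of freedom
result = variance_from_formula(scores, len(scores))

# The variance equals the squared standard deviation
squared_sd = pstdev(scores) ** 2
print(result, squared_sd, math.isclose(result, squared_sd))
```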

4 - Addition

<MATH> Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y) </MATH> where:

  • cov = Statistics - Covariance
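The addition rule can be checked numerically. The sketch below assumes population formulas (division by N) throughout; the helper names and the paired observations `x` and `y` are hypothetical.

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    # Population covariance: average product of the paired deviation scores
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def variance(xs):
    # The variance is the covariance of a variable with itself
    return covariance(xs, xs)

x = [1, 2, 3, 4, 5]  # hypothetical paired observations
y = [2, 1, 4, 3, 5]
total = [a + b for a, b in zip(x, y)]

left = variance(total)                                    # Var(X + Y)
right = variance(x) + variance(y) + 2 * covariance(x, y)  # right-hand side

print(left, right, math.isclose(left, right))
```

When X and Y are uncorrelated, the covariance term is zero and the variances simply add.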

5 - Computation

  • For each data point, calculate the deviation score (difference from the average)
  • Square each difference to get rid of the negative values (without squaring, the deviation scores always sum to zero)
  • Calculate a sum of the squared differences
  • The final variance is the sum of squared differences divided by the degree of freedom
  • The degree of freedom is N (the number of data points) for a whole population and N - 1 for a sample

5.1 - Python


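units = [7, 10, 9, 4, 5, 6, 5, 6, 8, 4, 1, 6, 6]

def units_average(units):
    # Arithmetic mean (true division, so the fractional part is kept)
    return sum(units) / len(units)

def units_variance(units, average):
    # Sum of the squared deviation scores
    squared_diff = 0
    for unit in units:
        squared_diff += (unit - average) ** 2
    # Dividing by N gives the population variance
    return squared_diff / len(units)

print(round(units_variance(units, units_average(units)), 3))


4.994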
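As a cross-check, the Python standard library's `statistics` module provides both variants of the degree of freedom: `pvariance` divides by N and `variance` divides by N - 1. This is only a sketch that reuses the `units` list from above; the variable names `population` and `sample` are illustrative.

```python
from statistics import pvariance, variance

units = [7, 10, 9, 4, 5, 6, 5, 6, 8, 4, 1, 6, 6]

population = pvariance(units)  # degree of freedom = N
sample = variance(units)       # degree of freedom = N - 1

print(round(population, 3))  # 4.994
print(round(sample, 3))      # 5.41
```

The sample variance is always slightly larger than the population variance, because the same sum of squared differences is divided by a smaller number.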

