Statistics - Standard Deviation (SD|s|σ|RMS width)


About

The standard deviation measures the typical deviation from the mean in a distribution. It is the square root of the average squared deviation from the mean.

If you have a mean of 95% and a standard deviation of 2% in a normal distribution, about 68% of the data lie between 93% and 97% (within one standard deviation of the mean).
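The 68% figure can be checked numerically. The following is a minimal sketch, assuming Python 3.8 or later for `statistics.NormalDist`; the variable names are illustrative only.

```python
from statistics import NormalDist

# Normal distribution from the example: mean 95, standard deviation 2 (in percent)
dist = NormalDist(mu=95, sigma=2)

# Share of values within one standard deviation of the mean (93 to 97)
within_one_sd = dist.cdf(97) - dist.cdf(93)
print(round(within_one_sd, 4))  # 0.6827
```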

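A larger sample should not systematically change the mean or the standard deviation of the data. What a larger sample reduces is the standard error of the mean, that is, the uncertainty of the estimated mean.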


In a normal distribution, only about 3 units in 1000 fall more than 3 standard deviations either side of the centre line.
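The "3 in 1000" figure follows from the tails of the normal distribution. A minimal sketch, again assuming `statistics.NormalDist` from the Python standard library (3.8+), with illustrative variable names:

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Probability of falling more than 3 standard deviations from the centre (both tails)
outside_3_sd = 2 * standard_normal.cdf(-3)

# Expressed as units per 1000
print(round(outside_3_sd * 1000, 1))  # 2.7
```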

Standard deviation is also called SD, s, or RMS width.

Bias

Standard Deviation is not biased by sample size.

Equation

The standard deviation is a standardized variance, i.e. the square root of the variance. The deviations from the mean are squared when computing the variance because they would otherwise sum to zero; taking the square root brings the result back to the unit of the distribution.

<MATH> \begin{array}{rrl} \text{Standard Deviation (SD)} & = & \sqrt{\href{variance}{Variance}} \\ & = & \sqrt{\frac{\displaystyle \href{Deviation Score}{\text{Sum of Squared (SS)}}}{\displaystyle \href{sample_size}{N}}} \\ & = & \sqrt{\frac{\displaystyle \sum_{i=1}^{\href{sample_size}{N}}{(\href{Residual}{\text{Residual}})^2}}{\displaystyle \href{sample_size}{N}}} \\ & = & \sqrt{\frac{\displaystyle \sum_{i=1}^{\href{sample_size}{N}}{(\href{raw_score}{X}_i- \href{mean}{\bar{X}})^2}}{\displaystyle \href{sample_size}{N}}} \\ \end{array} </MATH>

Computation

Python:

import math

def std_deviation(variance):
    # The standard deviation is the square root of the variance
    return math.sqrt(variance)
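The function above expects a variance that has already been computed. The following minimal sketch starts from raw scores instead and follows the equation step by step (mean, sum of squared deviations, division by N, square root). The function name `std_deviation_from_scores` and the sample data are hypothetical, chosen for illustration.

```python
import math

def std_deviation_from_scores(scores):
    """Population standard deviation (divides by N) of a list of raw scores."""
    n = len(scores)
    if n == 0:
        raise ValueError("at least one score is required")
    mean = sum(scores) / n
    # Sum of squared deviations from the mean (SS)
    sum_of_squares = sum((x - mean) ** 2 for x in scores)
    return math.sqrt(sum_of_squares / n)

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data, mean 5
print(std_deviation_from_scores(scores))  # 2.0
```

For the same data, `statistics.pstdev` from the standard library should give the same result, since it also divides by N. `statistics.stdev` divides by N - 1 and therefore returns a slightly larger value.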




