Statistics is a scientific discipline devoted to the study of data.
Statistics is the art of extracting information from data.
From Data to Information to Knowledge.
No learning.
There are three kinds of lies:
lies, damned lies, and statistics.
Facts are stubborn things, but statistics are more pliable.
Statisticians refer to the entire group that is being studied as a population.
Each member of the population is called a unit.
A statistician studying a population is interested in collecting information about different characteristics of its units. Those characteristics are called variables.
Most of the time, it is extremely difficult or very costly to collect all the information about a population. Because of this, it is common to use a smaller, representative group from the population called a sample.
In statistics, a number that describes the entire population, such as its actual size, is called a parameter.
The number of units in the sample, or any other number that describes the individuals in the sample (like their length, weight, or age), is called a statistic. In general, each statistic is an estimate of a parameter, whose value is not known exactly.
In general, the potential difference between the true parameter and the statistic obtained from using a sample is called sampling error.
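To make the terms parameter, statistic, and sampling error concrete, here is a minimal Python sketch. The population is simulated, and every name and number in it (`population`, `sample_size`, a mean weight of 40 with spread 5) is an illustrative assumption, not data from a real study.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 units, each with a measured weight.
population = [random.gauss(40.0, 5.0) for _ in range(10_000)]

# Parameter: a number describing the whole population
# (known here only because the population is simulated).
parameter = statistics.mean(population)

# Sample: a smaller group drawn at random from the population.
sample_size = 100
sample = random.sample(population, sample_size)

# Statistic: the same summary, computed from the sample only.
statistic = statistics.mean(sample)

# Sampling error: the difference between the statistic and the parameter.
sampling_error = statistic - parameter

print(f"parameter (population mean): {parameter:.2f}")
print(f"statistic (sample mean):     {statistic:.2f}")
print(f"sampling error:              {sampling_error:+.2f}")
```

Changing the seed draws a different sample, so the statistic and the sampling error change while the parameter stays the same.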
The sample could have been chosen in an area where a large number of tortoises tend to congregate (near a food or water source, perhaps). If this sample were used to estimate the number of tortoises in all locations, it may lead to a population estimate that is too high. This type of systematic error in sampling is called bias.
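The tortoise example can be sketched as a small simulation. The study area, the counts per location, and the helper `estimate_total` are all hypothetical, chosen only to show how a sample taken near water overestimates the total.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical study area: 200 locations, 20 of them near a water source.
# Assumption: locations near water hold more tortoises than the rest.
near_water = [random.randint(8, 12) for _ in range(20)]
elsewhere = [random.randint(0, 3) for _ in range(180)]
all_locations = near_water + elsewhere

true_total = sum(all_locations)  # the parameter we want to estimate


def estimate_total(sampled_counts, n_locations):
    """Scale the mean count per sampled location up to the whole area."""
    return statistics.mean(sampled_counts) * n_locations


# Biased sample: 10 locations, all chosen near the water source.
biased_sample = random.sample(near_water, 10)
# Random sample: 10 locations chosen from the whole area.
random_sample = random.sample(all_locations, 10)

biased_estimate = estimate_total(biased_sample, len(all_locations))
random_estimate = estimate_total(random_sample, len(all_locations))

print(f"true total:      {true_total}")
print(f"biased estimate: {biased_estimate:.0f}")
print(f"random estimate: {random_estimate:.0f}")
```

The random sample still has sampling error, but it is not pushed in one direction. The biased sample is too high every time, whatever the seed.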
Descriptive statistics: procedures used to summarize, organize, and simplify data
E.g., Median – describes data but can’t be generalized beyond that
Inferential statistics: procedures that allow for generalizations about population parameters based on sample statistics
E.g., t-test – enables inferences about population beyond our data
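A short sketch of the two kinds of procedure on the same data. The measurements and the reference value `mu_0` are made up for illustration. Only the one-sample t statistic is computed here; turning it into a p-value needs the t distribution, which in practice would come from a library such as SciPy (`scipy.stats.ttest_1samp`).

```python
import math
import statistics

# Hypothetical sample of eight measurements.
data = [5.1, 4.9, 5.6, 5.8, 5.2, 5.4, 5.0, 5.5]

# Descriptive: the median summarizes this data set and nothing more.
median = statistics.median(data)

# Inferential: one-sample t statistic for the null hypothesis that the
# population mean equals mu_0 (an assumed reference value).
mu_0 = 5.0
n = len(data)
sample_mean = statistics.mean(data)
sample_sd = statistics.stdev(data)      # sample standard deviation (n - 1)
standard_error = sample_sd / math.sqrt(n)
t_stat = (sample_mean - mu_0) / standard_error

print(f"median:      {median:.2f}")
print(f"sample mean: {sample_mean:.4f}")
print(f"t statistic: {t_stat:.3f} with {n - 1} degrees of freedom")
```

The median says something only about these eight values. The t statistic is a step toward a statement about the population they were drawn from.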
Approaches to Statistics
P(D|H)
Probability of seeing this data, given the (null) hypothesis
P(H|D)
Probability of a given hypothesis, given this data
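The two quantities are linked by Bayes' theorem, P(H|D) = P(D|H) P(H) / P(D), and they can differ a lot. The sketch below assumes a single alternative to H, and all three input probabilities are invented for illustration. The function name `posterior` is also just a label for this example.

```python
def posterior(p_d_given_h, p_h, p_d_given_not_h):
    """Return P(H|D) from P(D|H), the prior P(H), and P(D|not H)."""
    # Total probability of the data under H and under its alternative.
    p_d = p_d_given_h * p_h + p_d_given_not_h * (1.0 - p_h)
    return p_d_given_h * p_h / p_d


# Hypothetical inputs.
p_d_given_h = 0.05      # P(D|H): the data are unlikely if H is true
p_h = 0.5               # prior P(H): H and its alternative equally plausible
p_d_given_not_h = 0.60  # P(D|not H): the data are likely if H is false

p_h_given_d = posterior(p_d_given_h, p_h, p_d_given_not_h)

print(f"P(D|H) = {p_d_given_h:.3f}")
print(f"P(H|D) = {p_h_given_d:.3f}")
```

P(H|D) depends on the prior and on how well the alternative explains the data, so it cannot be read off P(D|H) alone.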
Data analysis techniques such as:
Statisticians and researchers use two main techniques to form important conclusions about the relationships between variables.