Statistics - (dependent|paired sample) t-test

About

A dependent t-test is appropriate when:

The idea is that one measure depends on the other: the two measures are related.

Is the difference between the means significant, or is it just due to chance because of sampling error?

If the mean of the difference scores is significantly different from zero, we have a significant change.

Assumption

The distribution of the difference scores is normal

Calculation

Analysis

A thorough analysis will include:

Difference score

The same subjects or cases are measured twice, so we can calculate a difference score for each individual subject.

<MATH> \begin{array}{rrl} \text{Difference Score} & = & \href{raw_score}{X}_1 - \href{raw_score}{X}_2 \\ \end{array} </MATH>

where X1 and X2 are the raw scores of the same subject on the first and on the second measurement.
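As a minimal sketch, the difference scores can be computed by subtracting the second measurement from the first, subject by subject. The function name `difference_scores` and the sample data are made up for illustration.

```python
def difference_scores(x1, x2):
    """Return the difference score x1[i] - x2[i] for each subject i."""
    if len(x1) != len(x2):
        raise ValueError("both measurements must cover the same subjects")
    return [first - second for first, second in zip(x1, x2)]


# Illustrative data (assumption): six subjects, each measured twice.
x1 = [10, 12, 9, 14, 11, 13]  # first measurement
x2 = [8, 11, 9, 10, 9, 10]    # second measurement

diffs = difference_scores(x1, x2)
print(diffs)  # [2, 1, 0, 4, 2, 3]
```

The order of the two lists matters: position i in both lists must refer to the same subject.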

t-value

See t-value for mean

<MATH> \begin{array}{rrl} \href{t-value#mean}{\text{t-value}} & = & \frac{\href{mean}{\text{Mean of the Difference Scores}}}{\href{Standard_Error#mean}{\text{Standard Error of the Difference Scores}}} \\ \end{array} </MATH>
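The formula above can be sketched in a few lines, assuming the usual standard error of a mean: the sample standard deviation of the difference scores divided by the square root of the number of subjects. The function name `paired_t_value` and the data are illustrative only.

```python
import math
import statistics


def paired_t_value(diffs):
    """t-value = mean of the difference scores / standard error of the difference scores."""
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    standard_error = sd_diff / math.sqrt(n)
    return mean_diff / standard_error


# Illustrative difference scores (assumption): six subjects.
diffs = [2, 1, 0, 4, 2, 3]
t_value = paired_t_value(diffs)
print(round(t_value, 3))  # 3.464
```

With n subjects, this t-value is compared against a t distribution with n - 1 degrees of freedom (5 in this example).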

p-value

The p-value will be based on:

Effect size

The most appropriate and most common estimate of effect size is Cohen's d.

Because NHST is biased by sample size, we should supplement the analysis with an estimate of effect size.

The effect size is calculated differently than in regression.

Cohen's d is an intuitive measure that expresses the mean of the difference scores in standard deviation units:

<MATH> \begin{array}{rrl} \text{Cohen's d} & = & \frac{\href{mean}{\text{Mean of the Difference Scores}}}{\href{Standard Deviation}{\text{Standard deviation of the Difference Scores}}} \\ \end{array} </MATH>

As you can see:

Why? Because:

A Cohen's d of 1 means that:

A Cohen's d of 0.8 is also a strong effect.
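Cohen's d for the difference scores can be sketched as follows. This is a minimal illustration that uses the sample standard deviation (n - 1 denominator). The function name `cohens_d_paired` and the data are assumptions.

```python
import statistics


def cohens_d_paired(diffs):
    """Cohen's d = mean of the difference scores / standard deviation of the difference scores."""
    return statistics.mean(diffs) / statistics.stdev(diffs)


# Illustrative difference scores (assumption): six subjects.
diffs = [2, 1, 0, 4, 2, 3]
d = cohens_d_paired(diffs)
print(round(d, 3))  # 1.414
```

In this example the mean change is about 1.4 standard deviations of the difference scores. Unlike the t-value, the formula does not divide by the square root of the sample size.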

Confidence Interval

We can also get an interval estimate around the mean of the difference scores rather than just a point estimate.

We take the mean of the difference scores and put an upper bound and a lower bound around it. It's the same method as for sample means or regression coefficients.

<MATH> \begin{array}{rrlcl} \text{Upper bound} & = & \href{Mean}{\text{Mean of the difference scores}} & + & \href{#t-value}{\text{t-value}} \times \href{Standard_Error}{\text{Standard Error}} \\ \text{Lower bound} & = & \href{Mean}{\text{Mean of the difference scores}} & - & \href{#t-value}{\text{t-value}} \times \href{Standard_Error}{\text{Standard Error}} \end{array} </MATH>

The exact t-value depends on:

When the interval does not include zero, it's significant in terms of null hypothesis significance testing.
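The bounds can be sketched as below. The critical t-value is passed in as a parameter because the Python standard library has no t distribution. The value 2.571 used here is the two-sided 95% critical value for 5 degrees of freedom, as read from a t table. The function name `paired_confidence_interval` and the data are illustrative assumptions.

```python
import math
import statistics


def paired_confidence_interval(diffs, t_critical):
    """Return (lower, upper) bounds around the mean of the difference scores."""
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    margin = t_critical * standard_error
    return mean_diff - margin, mean_diff + margin


# Illustrative difference scores (assumption): six subjects, so 5 degrees of freedom.
diffs = [2, 1, 0, 4, 2, 3]

# 2.571: two-sided 95% critical t-value for 5 degrees of freedom (from a t table).
lower, upper = paired_confidence_interval(diffs, t_critical=2.571)
print(round(lower, 2), round(upper, 2))  # 0.52 3.48

# The change is significant at this level when zero lies outside the interval.
significant = not (lower <= 0 <= upper)
print(significant)  # True
```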

Simulation

Build a sampling distribution of the differences.

Pseudo code: loop until the sampling distribution looks approximately normal.

Once the distribution looks normal, use it to calculate the probability of the observed difference.
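One possible reading of this simulation is a bootstrap, which is an assumption here and not necessarily the method the original intends. We resample the difference scores with replacement many times and record the mean of each resample to build the sampling distribution. To estimate a probability under the null hypothesis, the sketch first shifts the difference scores so that their mean is zero. All function names, the number of resamples, the seed and the data are illustrative.

```python
import random
import statistics


def bootstrap_means(diffs, n_resamples=10_000, seed=0):
    """Build a sampling distribution of the mean by resampling with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(diffs)
    return [statistics.fmean(rng.choices(diffs, k=n)) for _ in range(n_resamples)]


def bootstrap_p_value(diffs, n_resamples=10_000, seed=0):
    """Estimate the two-sided probability of a mean at least as extreme as the observed one."""
    observed = statistics.fmean(diffs)
    # Shift the scores so their mean is zero: this imposes the null hypothesis.
    centered = [d - observed for d in diffs]
    null_means = bootstrap_means(centered, n_resamples, seed)
    extreme = sum(1 for m in null_means if abs(m) >= abs(observed))
    return extreme / n_resamples


# Illustrative difference scores (assumption): six subjects.
diffs = [2, 1, 0, 4, 2, 3]

sampling_distribution = bootstrap_means(diffs)
print(statistics.fmean(sampling_distribution))  # close to the observed mean of 2

p_value = bootstrap_p_value(diffs)
print(p_value)
```

More resamples give a smoother sampling distribution, which matches the idea of looping until the distribution looks normal. With only six subjects the result is coarse and should be read as an illustration rather than a reliable test.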

Documentation / Reference