About
A dependent t-test is appropriate when:
- the same people are measured twice,
- the same subjects are being compared (e.g. a pre/post design),
- or two samples are matched at the level of individual subjects (allowing a difference score to be calculated).
The idea is that one measure is dependent on the other: the two are related.
The question is whether the difference between the means is significant, or whether it is just due to chance because of sampling error.
If the mean of the difference scores is significantly different from zero, we have a significant change.
Assumption
The distribution of the difference scores is assumed to be normal.
Calculation
Analysis
A thorough analysis will include:
- a measure of effect size: Cohen's d (because NHST is biased by sample size),
- a confidence interval: an interval estimate, because sample means are just point estimates.
Difference score
The same subjects or cases are measured twice, so we can calculate a difference score for each individual subject.
<MATH> \begin{array}{rrl} \text{Difference Score} & = & \href{raw_score}{X}_1 - \href{raw_score}{X}_2 \\ \end{array} </MATH>
where <MATH> X_1 </MATH> and <MATH> X_2 </MATH> are the raw scores of the same subject on the first and second measurement.
t-value
See t-value for mean.
<MATH> \begin{array}{rrl} \href{t-value#mean}{\text{t-value}} & = & \frac{\href{mean}{\text{Mean of the Difference Scores}}}{\href{Standard_Error#mean}{\text{Standard Error of the Difference Scores}}} \\ \end{array} </MATH>
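As an illustration of the two formulas above, here is a minimal sketch using only the Python standard library. The function name `paired_t_value` and the sample data are hypothetical, chosen for the example; the standard error is taken as the sample standard deviation of the difference scores divided by the square root of N.

```python
import math
import statistics


def paired_t_value(x1, x2):
    """Return (t-value, degrees of freedom) for two paired measurements."""
    if len(x1) != len(x2):
        raise ValueError("paired samples must have the same length")
    # One difference score per subject: X1 - X2
    diffs = [a - b for a, b in zip(x1, x2)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Sample standard deviation (n - 1 in the denominator)
    sd_diff = statistics.stdev(diffs)
    # Standard error of the mean of the difference scores
    se_diff = sd_diff / math.sqrt(n)
    return mean_diff / se_diff, n - 1


# Hypothetical data: five subjects measured twice
x1 = [12, 14, 10, 15, 13]
x2 = [10, 12, 9, 14, 11]
t_value, df = paired_t_value(x1, x2)
print(f"t = {t_value:.3f}, df = {df}")
```

If a statistics library is available, its paired t-test routine should give the same t-value for the same data.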
p-value
The p-value is based on:
- the above t-value and the t-distribution it is compared against (determined by the degrees of freedom, N - 1),
- whether the test is non-directional (two-tailed) or directional (one-tailed).
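To make the two ingredients concrete, the sketch below approximates the p-value by numerically integrating the density of the t-distribution. This is one possible approach under stated assumptions, not the only one: the helper names `t_pdf` and `p_value` are hypothetical, the integration uses Simpson's rule, and the directional case assumes the observed effect lies in the predicted direction. In practice a statistics library would normally supply the t-distribution.

```python
import math


def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def p_value(t_value, df, directional=False, steps=2000):
    """Approximate p-value for a t-value (steps must be even for Simpson's rule)."""
    upper = abs(t_value)
    h = upper / steps
    # Simpson's rule for the area under the density between 0 and |t|
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3
    # The distribution is symmetric, so each half has area 0.5
    one_tail = 0.5 - central_half
    return one_tail if directional else 2 * one_tail


# With 9 degrees of freedom, t = 2.262 sits near the usual 5% two-tailed cutoff
print(round(p_value(2.262, 9), 3))
print(round(p_value(2.262, 9, directional=True), 3))
```

The same t-value gives a p-value half as large in the directional test, which is why the choice between the two has to be made before looking at the data.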
Effect size
The most appropriate and most common estimate of effect size is Cohen's d.
Because NHST is biased by sample size, we should supplement the analysis with this estimate of effect size.
Note that the effect size is calculated differently than in regression.
Cohen's d is an intuitive measure that tells us, in standard deviation units, how much:
- one measurement differs from another (in a dependent t-test),
- one mean differs from another (in an independent t-test).
<MATH> \begin{array}{rrl} \text{Cohen's d} & = & \frac{\href{mean}{\text{Mean of the Difference Scores}}}{\href{Standard Deviation}{\text{Standard Deviation of the Difference Scores}}} \\ \end{array} </MATH>
Note that:
- for the t-value, the denominator is the standard error,
- for d, the denominator is the standard deviation.
Why? Because:
- the standard error depends on N (it is the standard deviation divided by the square root of N),
- whereas the standard deviation does not shrink as N grows.
A Cohen's d of 1 means that:
- scores changed by a whole standard deviation,
- it's a strong effect.
A d of 0.8 is also conventionally considered a strong effect.
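The contrast between the two denominators can be checked with a short sketch. The function name `cohens_d_paired` and the data are hypothetical; the standard deviation is the sample standard deviation of the difference scores.

```python
import math
import statistics


def cohens_d_paired(x1, x2):
    """Cohen's d for paired data: mean difference in standard deviation units."""
    diffs = [a - b for a, b in zip(x1, x2)]
    return statistics.mean(diffs) / statistics.stdev(diffs)


# Hypothetical data: five subjects measured twice
x1 = [12, 14, 10, 15, 13]
x2 = [10, 12, 9, 14, 11]
d = cohens_d_paired(x1, x2)
n = len(x1)

# The t-value equals d multiplied by sqrt(N): same data, larger N, larger t
t_from_d = d * math.sqrt(n)
print(f"d = {d:.3f}, t = {t_from_d:.3f}")
```

Because the t-value is d scaled by the square root of N, a small effect can still reach significance in a large sample, whereas d stays on the same scale whatever the sample size.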
Confidence Interval
We can also get an interval estimate around the mean of the difference scores rather than just a point estimate.
We take the mean of the difference scores and put an upper bound and a lower bound around it. It's the same method as for sample means or regression coefficients.
<MATH> \begin{array}{rrlcl} \text{Upper bound} & = & \href{Mean}{\text{Mean of the difference scores}} & + & \href{#t-value}{\text{t-value}} \times \href{Standard_Error}{\text{Standard Error}} \\ \text{Lower bound} & = & \href{Mean}{\text{Mean of the difference scores}} & - & \href{#t-value}{\text{t-value}} \times \href{Standard_Error}{\text{Standard Error}} \end{array} </MATH>
The exact t-value used here (the critical value) depends on:
- how confident we want to be (for example, a 95% confidence interval versus a 90% confidence interval),
- which sampling distribution of t we use (there is a family of t-distributions), so it depends on the number of subjects in the sample.
When the interval does not include zero, the difference is significant in terms of null hypothesis significance testing.
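A minimal sketch of the interval calculation follows. The function name `paired_confidence_interval` and the data are hypothetical, and the critical t-value is passed in as a parameter rather than computed: 2.776 is the two-tailed 95% value for 4 degrees of freedom as read from a standard t table.

```python
import math
import statistics


def paired_confidence_interval(x1, x2, t_critical):
    """Return (lower, upper) bounds around the mean of the difference scores."""
    diffs = [a - b for a, b in zip(x1, x2)]
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean of the difference scores
    se_diff = statistics.stdev(diffs) / math.sqrt(len(diffs))
    margin = t_critical * se_diff
    return mean_diff - margin, mean_diff + margin


# Hypothetical data: five subjects, so df = 4
x1 = [12, 14, 10, 15, 13]
x2 = [10, 12, 9, 14, 11]
lower, upper = paired_confidence_interval(x1, x2, t_critical=2.776)
print(f"95% CI: [{lower:.3f}, {upper:.3f}]")

# The change is significant at this level if zero lies outside the interval
print("includes zero:", lower <= 0 <= upper)
```

Asking for a 90% interval instead would mean passing a smaller critical value for the same degrees of freedom, which gives a narrower interval.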
Simulation
Build a sampling distribution of the differences
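Pseudo code: loop until the distribution of simulated means is smooth (it should look approximately normal)
- Take the two samples
- Shuffle the observations between the two samples (for paired data, swap the two measurements within each subject at random)
- Calculate and plot the mean of the difference scores
Once the distribution is built, calculate the probability of a mean difference at least as extreme as the one observed in the original data.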
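The loop above can be sketched as follows. This is one interpretation rather than a definitive implementation: because the data are paired, the shuffle is done within each subject, which amounts to randomly flipping the sign of each difference score. The function name `simulated_p_value`, the fixed number of iterations (in place of looping until the histogram looks normal), and the data are all assumptions made for the example, and plotting is left out.

```python
import random


def simulated_p_value(x1, x2, n_iter=10000, seed=0):
    """Two-tailed p-value from a simulated sampling distribution of mean differences."""
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(x1, x2)]
    n = len(diffs)
    observed = sum(diffs) / n
    extreme = 0
    for _ in range(n_iter):
        # Swapping the two measurements of a subject flips the sign of its difference
        shuffled = [d if rng.random() < 0.5 else -d for d in diffs]
        simulated_mean = sum(shuffled) / n
        # Count simulated means at least as extreme as the observed one
        if abs(simulated_mean) >= abs(observed):
            extreme += 1
    return extreme / n_iter


# Hypothetical data: five subjects measured twice
x1 = [12, 14, 10, 15, 13]
x2 = [10, 12, 9, 14, 11]
p_sim = simulated_p_value(x1, x2)
print(f"simulated p-value: {p_sim:.3f}")
```

With only five subjects there are just 32 possible sign patterns, so the simulated p-value cannot go below 2/32; larger samples give a finer-grained distribution.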