logistic regression for a binary outcome.
<MATH> \href{odds#logit}{Logit}(\hat{Y}) = ln(\frac{\hat{Y}}{1-\hat{Y}}) = B_0 + \sum_{i=1}^{k}{(B_iX_i)} </MATH>
where:
As:
The general linear model does not guarantee that the linear combination of the predictors produces a score between 0 and 1. This is why the logit is used on the left-hand side of the formula.
<MATH> \href{odds#logit}{Logit}(\hat{Y}) = ln(\frac{\hat{Y}}{1-\hat{Y}}) </MATH>
The general linear model assumes:
Because the general linear model does not work with a binary outcome variable, a logit transformation must be applied.
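The logit and its inverse can be sketched in a few lines of Python. This is a minimal illustration, not part of the original text: the function names `logit` and `inverse_logit` and the list of scores are chosen here for the example. It shows that any real-valued linear score maps back to a probability strictly between 0 and 1.

```python
import math

def logit(p: float) -> float:
    """Log-odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inverse_logit(z: float) -> float:
    """Map any real-valued linear score z back to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Linear scores can be any real number (illustrative values),
# but the back-transformed predictions stay strictly between 0 and 1.
scores = [-5.0, -1.0, 0.0, 1.0, 5.0]
probabilities = [inverse_logit(z) for z in scores]

for z, p in zip(scores, probabilities):
    print(f"score = {z:+.1f} -> probability = {p:.3f}")
```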
The logit transformation is a feature of an even more “general” mathematical framework in regression called the “Generalized Linear Model”. The name sounds almost exactly like the general linear model, but the two are very different.
For example, take a regression coefficient B_1 of 0.39.
For every 1-unit increase in X, the model predicts:
Example for a 1-unit increase:
<MATH> \begin{array}{rrlrrl} P(X) & = & \frac{Odds}{1+Odds} \\ P(X) & = & \frac{1.88}{2.88} & = & .65 \\ P(X-1) & = & \frac{1.27}{2.27} & = & .56 \end{array} </MATH>
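The arithmetic in this example can be checked with a short Python sketch. It assumes, as the numbers suggest, that the odds of 1.88 at X are the odds of 1.27 at X-1 multiplied by exp(0.39); the variable names are illustrative only.

```python
import math

b1 = 0.39            # regression coefficient from the example
odds_before = 1.27   # odds at X - 1, taken from the example

# Assumption: a one-unit increase in X multiplies the odds by exp(B_1).
odds_ratio = math.exp(b1)
odds_after = odds_before * odds_ratio  # odds at X

def odds_to_probability(odds: float) -> float:
    """Convert odds to a probability: P = Odds / (1 + Odds)."""
    return odds / (1.0 + odds)

p_before = odds_to_probability(odds_before)
p_after = odds_to_probability(odds_after)

print(f"odds ratio = {odds_ratio:.2f}")
print(f"odds at X  = {odds_after:.2f}")
print(f"P(X-1) = {p_before:.2f}, P(X) = {p_after:.2f}")
```

Rounded to two decimals, the probabilities agree with the .56 and .65 shown in the example.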
We can look at:
To test each predictor variable, we're going to look at:
Are they significant?
Odds ratios are more meaningful.
An odds ratio gives the predicted change in the odds for a one-unit increase in X.
It's also possible to report a confidence interval for the odds ratio.
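As a rough sketch, a Wald-type 95% confidence interval for an odds ratio can be obtained by building the interval on the coefficient scale and exponentiating both ends. The coefficient 0.39 comes from the example; the standard error of 0.10 is a hypothetical value used only for illustration.

```python
import math

b1 = 0.39       # regression coefficient from the example
se = 0.10       # hypothetical standard error, for illustration only
z_crit = 1.96   # approximate 97.5th percentile of the standard normal

odds_ratio = math.exp(b1)

# Interval on the log-odds scale, then exponentiated to the odds-ratio scale.
lower = math.exp(b1 - z_crit * se)
upper = math.exp(b1 + z_crit * se)

print(f"odds ratio = {odds_ratio:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

If the interval excludes 1, the predictor changes the odds at the chosen confidence level.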
The Wald test tests the model with the predictor against a model without the predictor. It is very common in logistic regression and in more advanced statistics. It shows how well the model fits with the predictor in, and then with the predictor taken out.
The Wald test is a function of the regression coefficient. A Wald test is calculated for each predictor variable and compares the fit of the model with and without that predictor.
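One common form of the Wald statistic divides the coefficient by its standard error and squares the result, which is then compared to a chi-square distribution with one degree of freedom. The sketch below uses only the standard library; the standard error of 0.10 is again a hypothetical value, not a number from the text.

```python
import math

b1 = 0.39   # regression coefficient from the example
se = 0.10   # hypothetical standard error, for illustration only

z = b1 / se          # Wald z statistic
wald_chi2 = z ** 2   # Wald chi-square statistic with 1 degree of freedom

# Two-sided p-value from the standard normal distribution.
# For 1 degree of freedom this equals the chi-square p-value of wald_chi2.
p_value = math.erfc(abs(z) / math.sqrt(2.0))

print(f"z = {z:.2f}, Wald chi-square = {wald_chi2:.2f}, p = {p_value:.5f}")
```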
How do we assess the overall fit of the model?
Comparison of:
Compare the fit of the model to the fit of the null model.
multiple models (Wald test)
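A comparison with the null model can be sketched as a difference in -2 log-likelihood. This is one common way to make the comparison and not necessarily the one the text has in mind. The outcomes and fitted probabilities below are made up for illustration, and `log_likelihood` is a helper defined here, not a library function.

```python
import math

def log_likelihood(y, p):
    """Bernoulli log-likelihood of 0/1 outcomes y under predicted probabilities p."""
    return sum(
        yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
        for yi, pi in zip(y, p)
    )

# Made-up outcomes and fitted probabilities, for illustration only.
y = [0, 0, 0, 1, 0, 1, 1, 1]
p_model = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]

# The null model (intercept only) predicts the overall proportion for every case.
p_null = [sum(y) / len(y)] * len(y)

ll_model = log_likelihood(y, p_model)
ll_null = log_likelihood(y, p_null)

# Difference in -2 log-likelihood between the null model and the model.
# Larger values mean the predictors improve the fit more.
chi2 = -2.0 * (ll_null - ll_model)

print(f"-2LL null = {-2 * ll_null:.3f}, -2LL model = {-2 * ll_model:.3f}")
print(f"chi-square difference = {chi2:.3f}")
```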
Percentage of cases classified correctly.
See how well the model classifies cases.
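The percentage of correctly classified cases can be computed by turning each predicted probability into a predicted class and comparing it with the observed outcome. The data and the 0.5 cutoff below are assumptions made for this sketch.

```python
# Made-up outcomes and fitted probabilities, for illustration only.
y = [0, 0, 0, 1, 0, 1, 1, 1]
p_model = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]

threshold = 0.5  # assumed cutoff: predict 1 when the probability is at least 0.5

predicted = [1 if p >= threshold else 0 for p in p_model]

# Count the cases where the predicted class matches the observed outcome.
correct = sum(1 for observed, guess in zip(y, predicted) if observed == guess)
accuracy = 100.0 * correct / len(y)

print(f"{correct} of {len(y)} cases classified correctly ({accuracy:.1f}%)")
```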
The main output components are: