About
A big R squared indicates a model that fits the data well. Unfortunately, you can't compare models of different sizes by just taking the one with the biggest R squared: the R squared of a model with three variables isn't comparable to the R squared of a model with eight variables, because the model with more variables will always fit the training data better. The adjusted R squared tries to fix this issue.
With adjusted R squared, we pay a price for having a large model, unlike the classical R squared, where we pay no price for having a large model with a lot of features.
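To make this concrete, here is a minimal Python sketch (with made-up data and hypothetical variable names) that fits the same response with one relevant predictor and then with seven extra noise predictors. The plain R squared can only go up when variables are added, while the adjusted R squared charges for them and can go down.
<code python>
import numpy as np

rng = np.random.default_rng(0)
n = 50
x = rng.normal(size=(n, 1))                 # one relevant predictor
y = 2.0 * x[:, 0] + rng.normal(size=n)      # response depends only on x

def r2_and_adjusted_r2(X, y):
    """R^2 and adjusted R^2 for a least squares fit with an intercept."""
    n, d = X.shape
    X1 = np.column_stack([np.ones(n), X])   # add intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    rss = np.sum((y - X1 @ beta) ** 2)      # residual sum of squares
    tss = np.sum((y - y.mean()) ** 2)       # total sum of squares
    return 1 - rss / tss, 1 - (rss / (n - d - 1)) / (tss / (n - 1))

noise = rng.normal(size=(n, 7))             # seven irrelevant predictors
print(r2_and_adjusted_r2(x, y))                            # 1-variable model
print(r2_and_adjusted_r2(np.column_stack([x, noise]), y))  # 8-variable model
# The first value (plain R^2) never decreases when columns are added;
# the second (adjusted R^2) can decrease because of the n - d - 1 penalty.
</code>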
Formula
A large value of adjusted R squared indicates a model with a small test error.
In order to be able to compare two models of different sizes, adjusted R squared makes you pay a price for having a large model. Adjusted R squared adjusts the R squared so that the values you get are comparable even if the numbers of predictors are different. It does this by dividing RSS by n - d - 1 and TSS by n - 1 in the ratio below.
For a least squares model with d variables, the adjusted R squared statistic is calculated as
<MATH> \text{Adjusted }R^2 = 1 - \frac{ \displaystyle \frac{RSS}{n-d-1} } { \displaystyle \frac{TSS}{n-1} } </MATH>
where:
- RSS is the residual sum of squares
- TSS is the total sum of squares
- d is the number of variables (predictors) in the model
- n is the number of observations (the sample size)
When d is large, n - d - 1 is small, so you're dividing RSS by a small number; the ratio gets bigger and the adjusted R squared gets smaller. Adding a variable therefore only helps if the drop in RSS outweighs the lost degree of freedom.
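As a quick numeric sketch of the formula itself (the RSS, TSS, n, and d values below are made up), a small drop in RSS does not compensate for the extra variables being charged for:
<code python>
def adjusted_r2(rss, tss, n, d):
    """Adjusted R^2 = 1 - (RSS / (n - d - 1)) / (TSS / (n - 1))."""
    return 1 - (rss / (n - d - 1)) / (tss / (n - 1))

# Hypothetical numbers: n = 50 observations, TSS = 100.
print(adjusted_r2(rss=40.0, tss=100.0, n=50, d=3))  # ~0.574 (plain R^2 would be 0.60)
print(adjusted_r2(rss=38.0, tss=100.0, n=50, d=8))  # ~0.546 (plain R^2 would be 0.62)
# RSS went down with 5 more variables, but the adjusted R^2 went down too.
</code>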
Advantage
Compared to other criteria such as Cp, AIC, and BIC, it doesn't require an estimate of sigma squared, so you can also apply it when the number of variables is bigger than n.
That's a really nice advantage of adjusted R squared.
On the other hand, adjusted R squared doesn't really generalize to other types of models, such as logistic regression.