When is it important to use adjusted R-squared instead of R-squared?
R², the Coefficient of Determination, is one of the most useful and intuitive statistics we have in linear regression.
It tells you how well the model predicts the outcome and has some nice properties. But it also has one big drawback.
R²’s nice properties
First, it’s standardized. Every R² is on the scale of 0 to 1. The advantage is that we can look at its actual value to get an idea of how well the model is doing. Your model has an R² of .7? That’s pretty good. An R² of .08? Not so great.
Of course, different fields expect and interpret different values of R² as high or low, so an R² of .7 might not be great for every field or every data set. But once you’re used to the typical values in your field, you can evaluate your model on its own, without worrying about the units of the variables in it.
Second, it’s intuitive. That standardized scale of 0 to 1 represents the proportion of variation in the response variable Y that is attributable to the predictors in the model. The more related those predictors collectively are to the response variable, the higher R² will be; the formula below makes this precise.
Third, you can use it as a measure of effect size for the model as a whole. This makes it particularly useful in sample size calculations.
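For reference, here is the definition behind that “proportion of variation” interpretation (this formula isn’t spelled out in the post above, but it is the standard textbook one):

```latex
R^2 \;=\; 1 - \frac{SS_{\text{res}}}{SS_{\text{tot}}}
    \;=\; 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
```

When the predictors explain none of the variation, SS_res equals SS_tot and R² is 0; when they explain all of it, SS_res is 0 and R² is 1.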
R²’s big drawback
It does have one big drawback, though. In multiple regression, R² gets bigger every time you add a predictor. Because of the way it’s calculated, it can never go down when a predictor is added. That raises a few issues.
First, R² will go up even if those predictors don’t help predict Y. Sure, it won’t go up by much, but it will gradually creep upward with each added predictor.
And model complexity isn’t a good thing in itself. If we’re going to add more predictors, we want to make sure they’re actually helpful.
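To make this concrete, here is a minimal simulation sketch (the data, seed, and use of NumPy are my own illustration, not from the original post): the response depends on only one predictor, yet R² never decreases as pure-noise predictors are appended.

```python
# Minimal sketch: R² can only go up as predictors are added,
# even when the new predictors are pure noise.
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = rng.normal(size=n)            # the one genuinely useful predictor
y = 2.0 * x + rng.normal(size=n)  # Y depends only on x

X = np.column_stack([np.ones(n), x])  # design matrix with intercept
ss_tot = np.sum((y - y.mean()) ** 2)
for k in range(1, 6):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    ss_res = np.sum((y - X @ beta) ** 2)
    print(f"{k} predictor(s): R² = {1 - ss_res / ss_tot:.4f}")
    X = np.column_stack([X, rng.normal(size=n)])  # append a useless predictor
```

On any run, the printed R² values form a non-decreasing sequence, even though predictors 2 through 5 are unrelated to Y: least squares can never fit worse when given an extra column to work with.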
The advantage of Adjusted R-squared
Luckily, there is an alternative: Adjusted R².
Adjusted R² does just what it says: it adjusts the R² value. The adjustment is a penalty subtracted from R², and the size of the penalty grows with the number of predictors and shrinks with the sample size.
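The standard formula, which is what most regression software reports (here n is the sample size and p the number of predictors), is:

```latex
R^2_{\text{adj}} \;=\; 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}
```

The multiplier (n − 1)/(n − p − 1) grows with every added predictor and shrinks toward 1 as the sample gets larger, which is exactly the penalty described above.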
If you add a predictor that is useful in predicting Y, adjusted R² will increase, because the gain in R² outweighs the penalty.
But if you add a predictor that is not useful in predicting Y, adjusted R² will decrease, because the penalty outweighs the tiny gain in R².
In fact, while R² cannot go below 0, adjusted R² can be negative. So it’s a super-useful way to tell whether adding predictors to a model is just adding useless complexity.
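Here is a sketch of how you might check this yourself (assuming statsmodels is installed; the data and variable names are invented for illustration). R² always ticks up when the noise predictor is added, while adjusted R² typically falls or stays flat; the exact numbers depend on the random draw.

```python
# Sketch: compare R² and adjusted R² as useful vs. useless predictors are added.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 60
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)                       # a genuinely useful predictor
noise = rng.normal(size=n)                    # a useless predictor
y = 1.5 * x1 + 1.0 * x2 + rng.normal(size=n)  # Y depends on x1 and x2 only

def r2_pair(*cols):
    """Fit OLS with an intercept; return (R², adjusted R²)."""
    X = sm.add_constant(np.column_stack(cols))
    res = sm.OLS(y, X).fit()
    return res.rsquared, res.rsquared_adj

print("x1 only:          R² = %.3f, adj. R² = %.3f" % r2_pair(x1))
print("x1 + x2:          R² = %.3f, adj. R² = %.3f" % r2_pair(x1, x2))
print("x1 + x2 + noise:  R² = %.3f, adj. R² = %.3f" % r2_pair(x1, x2, noise))
```

Adding x2 raises both statistics. Adding the noise column can only raise R², but adjusted R² drops whenever the penalty outweighs the spurious gain, which is the typical outcome for an unrelated predictor.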
So in multiple regression, where you have more than one predictor, always use Adjusted R².
Dr Subhash Chandra says
Dear Karen: You mention that the Coefficient of Determination R² tells you “how well the model predicts the outcome”. This appears to imply that R² is a measure of the “predictive performance” of the model! I think a model’s predictive performance may be better/more correctly estimated by Lin’s CCC, which jointly considers both precision (in terms of the Pearson correlation r) and accuracy (Cb), whereas R² (like r) measures only precision. What do you think, Karen?
Karen Grace-Martin says
Hi Subhash,
Here’s a quote from Scott Menard’s book: “R² is a ‘proportional reduction in error’ statistic. It measures the proportion by which use of the regression equation reduces the error of prediction relative to predicting the mean.”
So I would say a model predicts better if the predictions have reduced error. But I’m sure the CCC has useful properties also. I have only seen CCC described in situations where there is a gold-standard measurement, X, and a new measurement, Y. I haven’t seen it used in multiple regression, but that doesn’t mean it isn’t useful there. I just haven’t come across that use.
Dr Subhash Chandra says
The “predictive performance” of a model essentially implies how well the model predictions “agree” with observed data – which is what Lin’s CCC does.
Cliff Richard Kikawa says
Dear Dr. Karen and Subhash,
You are both bringing up useful statistics that are usually overlooked, or given no attention at all. I have to be honest that I have not come across Lin’s CCC in regression. I thank Dr Subhash for bringing it up, and I will read more about it; perhaps it has good properties that are worth knowing when performing regression.
Thank you, Dr. Karen, for your VERY useful information to us practicing statisticians.