Adding interaction terms to a regression model has real benefits. It greatly expands your understanding of the relationships among the variables in the model. And you can test more specific hypotheses. But interpreting interactions in regression takes understanding of what each coefficient is telling you.
The example from Interpreting Regression Coefficients was a model of the height of a shrub (Height) based on the amount of bacteria in the soil (Bacteria) and whether the shrub is located in partial or full sun (Sun). Height is measured in cm, Bacteria is measured in thousand per ml of soil, and Sun = 0 if the plant is in partial sun, and Sun = 1 if the plant is in full sun.
The beauty of the Univariate GLM procedure in SPSS is that it is so flexible. You can use it to analyze regressions, ANOVAs, ANCOVAs with all sorts of interactions, dummy coding, etc.
The downside of this flexibility is that it is often confusing what to put where and what it all means.
So here’s a quick breakdown.
The dependent variable I hope is pretty straightforward. Put in your continuous dependent variable.
Fixed Factors are categorical independent variables. It does not matter if the variable is (more…)
One of the most common causes of multicollinearity is multiplying predictor variables to create an interaction term, or a quadratic or higher-order term (X squared, X cubed, etc.).
Why does this happen? When all the X values are positive, higher values produce high products and lower values produce low products. So the product variable is highly correlated with the component variable. (Actually, if they are all on a negative scale, the same thing would happen, but the correlation would be negative.) I will do a very simple example to clarify.
In a small sample, say you have the following values of a predictor variable X, sorted in ascending order:
2, 4, 4, 5, 6, 7, 7, 8, 8, 8
It is clear to you that the relationship between X and Y is not linear, but curved, so you add a quadratic term, X squared (X2), to the model. The values of X squared are:
4, 16, 16, 25, 36, 49, 49, 64, 64, 64
The correlation between X and X2 is .987, which is almost perfect.
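If you want to check that figure yourself, here is a minimal Python sketch. It uses only the ten X values listed above; the `pearson` helper is written out by hand for illustration and is not part of the original analysis.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)

# The ten X values from the example, and their squares
x = [2, 4, 4, 5, 6, 7, 7, 8, 8, 8]
x_sq = [v ** 2 for v in x]

r_raw = pearson(x, x_sq)
print(round(r_raw, 3))  # 0.987
```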
Plot of X vs. X squared
To remedy this, you simply center X at its mean. The mean of X is 5.9. So to center X, I simply create a new variable XCen=X-5.9.
The correlation between XCen and XCen2 is -.54. That is still not 0, but it is much more manageable, and definitely low enough not to cause severe multicollinearity. This works because the low end of the scale now has large absolute values, so their squares become large.
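The same check works for the centered variable. This is a sketch under the same assumptions as the example: the ten X values from above, centered at their mean of 5.9, with a hand-written `pearson` helper that is only there for illustration.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)

x = [2, 4, 4, 5, 6, 7, 7, 8, 8, 8]
mean_x = sum(x) / len(x)              # 5.9

# Center X at its mean, then square the centered values
x_cen = [v - mean_x for v in x]
x_cen_sq = [v ** 2 for v in x_cen]

r_cen = pearson(x_cen, x_cen_sq)
print(round(r_cen, 2))  # -0.54
```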
The scatterplot between XCen and XCen2 is:
Plot of Centered X vs. Centered X squared
If the values of X had been symmetric around their mean rather than skewed, this would be a perfectly balanced parabola, and the correlation would be 0.
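One way to see why symmetry matters: once X is centered, the covariance between the centered variable and its square is proportional to the sum of the cubed deviations. The sketch below compares the example's X with a second set of values, `x_sym`, which is a hypothetical symmetric sample made up purely for illustration.

```python
def sum_cubed_deviations(values):
    """Sum of cubed deviations from the mean; zero when the values are symmetric."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 3 for v in values)

# X from the example: skewed toward the low end
x = [2, 4, 4, 5, 6, 7, 7, 8, 8, 8]

# Hypothetical values, symmetric around a mean of 5 (not from the original post)
x_sym = [2, 3, 4, 4, 5, 5, 6, 6, 7, 8]

skewed_total = sum_cubed_deviations(x)
symmetric_total = sum_cubed_deviations(x_sym)

print(round(skewed_total, 2))   # -43.32
print(symmetric_total)          # 0.0
```

A negative total for the example data matches the negative correlation of -.54, and a total of zero means the centered variable and its square are uncorrelated.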
I was recently asked about whether centering (subtracting the mean) a predictor variable in a regression model has the same effect as standardizing (converting it to a Z score). My response:
They are similar but not the same.
In centering, you are changing the values but not the scale. So a predictor that is centered at the mean has new values: the entire scale has shifted so that the mean now has a value of 0, but one unit is still one unit. The intercept will change, but the regression coefficient for that variable will not. Since the regression coefficient is interpreted as the effect on the mean of Y for each one-unit difference in X, it doesn’t change when X is centered.
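A small sketch can make this concrete. It assumes a simple regression with a single predictor; the `y` values are hypothetical numbers invented for illustration, and `fit_line` is a hand-written least-squares helper rather than any particular package's routine.

```python
def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

x = [2, 4, 4, 5, 6, 7, 7, 8, 8, 8]
y = [10, 14, 15, 17, 20, 21, 23, 24, 26, 25]   # hypothetical outcome values

mean_x = sum(x) / len(x)
x_cen = [v - mean_x for v in x]                # centered predictor

int_raw, slope_raw = fit_line(x, y)
int_cen, slope_cen = fit_line(x_cen, y)

print("slopes:    ", slope_raw, slope_cen)     # the same, up to rounding
print("intercepts:", int_raw, int_cen)         # different
```

With a mean-centered predictor, the intercept becomes the predicted Y at the mean of X, which in a one-predictor model is just the mean of Y.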
And incidentally, despite the name, you don’t have to center at the mean. It is often convenient, but there can be advantages to choosing a more meaningful value that is also toward the center of the scale.
But a Z-score also changes the scale. A one-unit difference now means a one-standard deviation difference. You will interpret the coefficient differently. This is usually done so you can compare coefficients for predictors that were measured on different scales. I can’t think of an advantage for doing this for an interaction.
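To illustrate the change of scale, here is a sketch that fits the same simple regression on raw X and on Z-scored X. As before, the `y` values are hypothetical and `fit_line` is an illustrative helper; the sample standard deviation from Python's `statistics.stdev` is assumed for the Z score.

```python
import statistics

def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

x = [2, 4, 4, 5, 6, 7, 7, 8, 8, 8]
y = [10, 14, 15, 17, 20, 21, 23, 24, 26, 25]   # hypothetical outcome values

mean_x = statistics.mean(x)
sd_x = statistics.stdev(x)                     # sample standard deviation
x_z = [(v - mean_x) / sd_x for v in x]         # Z scores

_, slope_raw = fit_line(x, y)
_, slope_z = fit_line(x_z, y)

# The Z-score slope is the raw slope multiplied by the standard deviation of X
print(slope_raw, slope_z, slope_raw * sd_x)
```

The fit itself is unchanged. Only the unit of the coefficient differs: it now describes the difference in Y per one standard deviation of X.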