Linear Regression

One-tailed and Two-tailed Tests

November 19th, 2008 by

I was recently asked when to use one-tailed and two-tailed tests.

The long answer is:  Use one-tailed tests when you have a specific hypothesis about the direction of your relationship.  For example, you hypothesize that one group mean is larger than the other, that the correlation is positive, or that the proportion is below .5.

The short answer is: Never use one-tailed tests.

Why?

1. Only a few statistical tests can even have one tail: z tests and t tests.  So you’re severely limited.  F tests, chi-square tests, etc. can’t accommodate one-tailed tests because their distributions are not symmetric.  Most statistical methods, such as regression and ANOVA, are based on these asymmetric tests, so you will rarely have the chance to use a one-tailed test.

2. Probably because they are rare, reviewers balk at one-tailed tests.  They tend to assume that you are trying to artificially boost the power of your test.  Theoretically, however, there is nothing wrong with them when the hypothesis and the statistical test are right for them.
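To make the difference concrete, here is a minimal sketch of a one-sample z test for a proportion, using the "proportion is below .5" example from above. The function name `proportion_z_test` and the counts (41 successes out of 100) are invented for illustration, and the test uses the usual normal approximation, so treat it as a sketch rather than a recommended procedure.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(successes, n, p0=0.5):
    """Return (z, one-tailed lower p, two-tailed p) for H0: p = p0.

    Hypothetical helper; relies on the normal approximation.
    """
    p_hat = successes / n
    se = sqrt(p0 * (1 - p0) / n)          # standard error under H0
    z = (p_hat - p0) / se
    p_lower = NormalDist().cdf(z)         # H1: p < p0 (one tail)
    p_two = 2 * min(p_lower, 1 - p_lower) # H1: p != p0 (both tails)
    return z, p_lower, p_two


# Made-up data: 41 successes in 100 trials.
z, p_one, p_two = proportion_z_test(successes=41, n=100)
print(f"z = {z:.2f}, one-tailed p = {p_one:.4f}, two-tailed p = {p_two:.4f}")
```

Because the normal distribution is symmetric, the two-tailed p-value is simply double the one-tailed one. With these invented counts the one-tailed p falls below .05 and the two-tailed p does not, which is exactly the kind of result that makes reviewers suspect a one-tailed test was chosen to boost power.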

 


Regression Through the Origin

November 13th, 2008 by

I just wanted to follow up on my last post about Regression without Intercepts.

Regression through the origin means that you purposely drop the intercept from the model, so the fitted line is forced through the point X=0, Y=0.

The thing to be careful about in choosing any regression model is that it fits the data well.  Pretty much the only time that a regression through the origin will fit better than a model with an intercept is if the point X=0, Y=0 is required by the data.

Yes, leaving out the intercept will increase your residual degrees of freedom (df) by 1, since you’re estimating one fewer parameter.  But unless your sample size is really, really small, it won’t matter.  So dropping the intercept really has no advantages.
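As a rough illustration, the sketch below fits both models by ordinary least squares to a small made-up data set whose true intercept is clearly not zero. The function names and the data are hypothetical; the point is only to compare the residual sum of squares (SSE) and the residual df of the two fits.

```python
def fit_with_intercept(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def fit_through_origin(x, y):
    """Least-squares fit of y = b*x (no intercept); returns b."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def sse(x, y, intercept, slope):
    """Residual sum of squares for the line intercept + slope*x."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Invented data, roughly y = 3 + 2x plus a little noise.
x = [1, 2, 3, 4, 5, 6]
y = [5.1, 6.9, 9.2, 10.8, 13.1, 15.0]

a, b = fit_with_intercept(x, y)
b0 = fit_through_origin(x, y)

sse_int, df_int = sse(x, y, a, b), len(x) - 2      # two parameters estimated
sse_origin, df_origin = sse(x, y, 0.0, b0), len(x) - 1  # one parameter estimated

print(f"with intercept: SSE = {sse_int:.3f}, residual df = {df_int}")
print(f"through origin: SSE = {sse_origin:.3f}, residual df = {df_origin}")
```

The model through the origin gains one residual df, but on data like these its SSE is far larger, because the line is forced through a point the data do not support.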

 


Outliers: To Drop or Not to Drop

September 17th, 2008 by

Should you drop outliers? Outliers are one of those statistical issues that everyone knows about, but most people aren’t sure how to deal with.  Most parametric statistics, like means, standard deviations, and correlations, and every statistic based on these, are highly sensitive to outliers.

And since the assumptions of common statistical procedures, like linear regression and ANOVA, are also based on these statistics, outliers can really mess up your analysis.
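To see how sensitive these statistics are, here is a small sketch that computes the mean, standard deviation, and Pearson correlation on an invented data set, then again after adding a single extreme point. The data and the helper `pearson_r` are made up for illustration only.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient, computed from its definition."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


# Invented data with a strong positive linear trend.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 7.0, 8.2, 8.8]

# The same data plus one extreme, made-up observation.
x_out = x + [9]
y_out = y + [-15.0]

r_clean = pearson_r(x, y)
r_outlier = pearson_r(x_out, y_out)

print(f"clean:        mean = {mean(y):.2f}, sd = {stdev(y):.2f}, r = {r_clean:.3f}")
print(f"with outlier: mean = {mean(y_out):.2f}, sd = {stdev(y_out):.2f}, r = {r_outlier:.3f}")
```

In this made-up example, one added point out of nine pulls the mean down, roughly triples the standard deviation, and turns a correlation near 1 into a negative one.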

Despite all this, as much as you’d like to, it is NOT acceptable to

(more…)