Regression models

Understanding Interactions Between Categorical and Continuous Variables in Linear Regression

May 14th, 2018 by Jeff Meyer

We’ve looked at the interaction effect between two categorical variables. Now let’s make things a little more interesting, shall we?

What if our predictors of interest, say, are a categorical and a continuous variable? How do we interpret the interaction between the two? (more…)

26 comments

When to Use Logistic Regression for Percentages and Counts

April 30th, 2018 by Karen Grace-Martin

One important yet difficult skill in statistics is choosing a type model for different data situations. One key consideration is the dependent variable.

For linear models, the dependent variable doesn’t have to be normally distributed, but it does have to be continuous, unbounded, and measured on an interval or ratio scale.

Percentages don’t fit these criteria. Yes, they’re continuous and ratio scale. The issue is the (more…)

9 comments

Why ANOVA is Really a Linear Regression, Despite the Difference in Notation

April 23rd, 2018 by Karen Grace-Martin

When I was in graduate school, stat professors would say “ANOVA is just a special case of linear regression.” But they never explained why.

And I couldn’t figure it out.

The model notation is different.

The output looks different.

The vocabulary is different.

The focus of what we’re testing is completely different. How can they be the same model?

(more…)

3 comments

Poisson or Negative Binomial? Using Count Model Diagnostics to Select a Model

March 19th, 2018 by Jeff Meyer

How do you choose between Poisson and negative binomial models for discrete count outcomes?

One key criterion is the relative value of the variance to the mean after accounting for the effect of the predictors. A previous article discussed the concept of a variance that is larger than the model assumes: overdispersion.

(Underdispersion is also possible, but much less common).

There are two ways to check for overdispersion: (more…)

10 comments

Link Functions and Errors in Logistic Regression

March 14th, 2018 by Karen Grace-Martin

I recently held a free webinar in our The Craft of Statistical Analysis program about Binary, Ordinal, and Nominal Logistic Regression.

It was a record crowd and we didn’t get through everyone’s questions, so I’m answering some here on the site. They’re grouped by topic, and you will probably get more out of it if you watch the webinar recording. It’s free.

The following questions refer to this logistic regression model: (more…)

12 comments

Getting Accurate Predicted Counts When There Are No Zeros in the Data

March 12th, 2018 by Jeff Meyer

We previously examined why a linear regression and negative binomial regression were not viable models for predicting the expected length of stay in the hospital for people with the flu. A linear regression model was not appropriate because our outcome variable, length of stay, was discrete and not continuous.

A negative binomial model wasn’t the proper choice because the minimum length of stay is not zero. The minimum length of stay is one day. Negative binomial and Poisson models can only be used on data where the observations’ outcome have the possibility of having a zero count.

We need to use a truncated negative binomial model to analyze the expected length of stay of people admitted to the hospital who have the flu. Calculating the expected length of stay is an easy task once we create our model. (more…)

No comments yet