linear regression

The Importance of Including an Exposure Variable in Count Models

November 19th, 2020 by

When our outcome variable is the frequency of occurrence of an event, we will typically use a count model to analyze the results. There are numerous count models. Examples include Poisson, negative binomial, zero-inflated Poisson, and truncated negative binomial.

Each count model has its own requirements, so the models are not interchangeable. But regardless of the model we use, they all share one very important prerequisite.

(more…)


Count Models: Understanding the Log Link Function

November 12th, 2020 by

When we run a statistical model, we are in a sense creating a mathematical equation. The simplest regression model looks like this:

$$Y_i = \beta_0 + \beta_1 X + \varepsilon_i$$

The left side of the equation is the sum of two parts on the right: the fixed component, $\beta_0 + \beta_1 X$, and the random component, $\varepsilon_i$.
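To make the two parts concrete, here is a minimal sketch in Python with NumPy. It simulates an outcome as a fixed component plus a random component and then fits the line by ordinary least squares. The coefficient values, sample size, and error distribution are assumptions chosen only for illustration; they do not come from the post.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Hypothetical "true" values, chosen only for illustration
beta0, beta1 = 2.0, 0.5
n = 500

x = rng.uniform(0, 10, size=n)
fixed = beta0 + beta1 * x        # fixed component: beta0 + beta1 * X
eps = rng.normal(0, 1, size=n)   # random component: one error per observation
y = fixed + eps                  # the outcome is the sum of the two parts

# Ordinary least squares estimates of the intercept and slope
design = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
b0_hat, b1_hat = coef

# Residuals are the estimated version of the random component
residuals = y - design @ coef
```

In this sketch the estimates `b0_hat` and `b1_hat` should land close to the assumed values, and the residuals average out to essentially zero because the model includes an intercept.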

You’ll also sometimes see the equation written (more…)


Member Training: Preparing to Use (and Interpret) a Linear Regression Model

November 1st, 2020 by

You think a linear regression might be an appropriate statistical analysis for your data, but you’re not entirely sure. What should you check before running your model to find out?

(more…)


Same Statistical Models, Different (and Confusing) Output Terms

January 7th, 2020 by

Learning how to analyze data can be frustrating at times. Why do statistical software companies have to add to our confusion?

I do not have a good answer to that question. What I will do is show examples. In upcoming blog posts, I will explain what each output means and how it is used in a model.

We will focus on ANOVA and linear regression models using SPSS and Stata software. As you will see, the biggest differences are not across software, but across procedures in the same software.

(more…)


What is Multicollinearity? A Visual Description

November 20th, 2019 by

Multicollinearity is one of those terms in statistics that is often defined in one of two ways:

1. Very mathematical terms that make no sense — I mean, what is a linear combination anyway?

2. Completely oversimplified in order to avoid the mathematical terms — it’s a high correlation, right?

So what is it really? In English?
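One way to see why "it's a high correlation" falls short is a small simulation. The sketch below, in Python with NumPy, builds a hypothetical predictor `x4` that is an exact linear combination (here, simply the sum) of three unrelated predictors. The variable names and the setup are assumptions made for illustration, not part of the post.

```python
import numpy as np

rng = np.random.default_rng(2)  # fixed seed so the sketch is reproducible
n = 1000

# Three unrelated predictors (hypothetical data)
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)

# A fourth predictor that is an exact linear combination of the others
x4 = x1 + x2 + x3

predictors = np.column_stack([x1, x2, x3, x4])

# Pairwise correlations between the four predictors
corr = np.corrcoef(predictors, rowvar=False)
off_diagonal = corr[~np.eye(4, dtype=bool)]
max_pairwise = np.abs(off_diagonal).max()

# Rank below the number of columns means one column is redundant
rank = np.linalg.matrix_rank(predictors)
```

In this setup no single pair of predictors is very highly correlated, yet `rank` comes out smaller than the number of columns: the fourth predictor adds no information beyond the first three. That is the sense in which a linear combination is a stronger idea than a pairwise correlation.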

(more…)


Linear Regression for an Outcome Variable with Boundaries

July 22nd, 2019 by

The following statement might surprise you, but it’s true.

To run a linear model, you don’t need an outcome variable Y that’s normally distributed. Instead, you need a dependent variable that is:

  • Continuous
  • Unbounded
  • Measured on an interval or ratio scale

The normality assumption is about the errors in the model, which have the same distribution as Y|X. It’s absolutely possible to have a skewed distribution of Y and a normal distribution of errors because of the effect of X. (more…)
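A quick simulation can illustrate that last point. The following is a minimal sketch in Python with NumPy, assuming a right-skewed predictor and normally distributed errors; the coefficients, sample size, and the small `skewness` helper are all hypothetical choices made for illustration.

```python
import numpy as np

def skewness(values):
    """Sample skewness: third central moment over variance to the power 1.5."""
    values = np.asarray(values, dtype=float)
    centered = values - values.mean()
    return (centered ** 3).mean() / (centered ** 2).mean() ** 1.5

rng = np.random.default_rng(1)  # fixed seed so the sketch is reproducible
n = 5000

x = rng.exponential(scale=2.0, size=n)  # strongly right-skewed predictor
errors = rng.normal(0, 1, size=n)       # errors are normal by construction
y = 1.0 + 3.0 * x + errors              # Y inherits its skew from X

# Fit the linear model and look at the residuals
design = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
residuals = y - design @ coef

skew_y = skewness(y)              # clearly positive: Y is skewed
skew_resid = skewness(residuals)  # near zero: residuals look symmetric
```

Here a histogram of `y` alone would look far from normal, while the residuals, which estimate the errors, are roughly symmetric. Checking the residuals rather than the raw outcome is what the normality assumption actually calls for.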