Missing data causes a lot of problems in data analysis. Unfortunately, some of the “solutions” for missing data cause more problems than they solve.
Missing data causes a lot of problems in data analysis. Unfortunately, some of the “solutions” for missing data cause more problems than they solve.
You think a linear regression might be an appropriate statistical analysis for your data, but you’re not entirely sure. What should you check before running your model to find out?
Open data, particularly government open data is a rich source of information that can be helpful to researchers in almost every field, but what is open data? How do we find what we’re looking for? What are some of the challenges with using data directly from city, county, state, and federal government agencies?
Learning how to analyze data can be frustrating at times. Why do statistical software companies have to add to our confusion?
I do not have a good answer to that question. What I will do is show examples. In upcoming blog posts, I will explain what each output means and how they are used in a model.
We will focus on ANOVA and linear regression models using SPSS and Stata software. As you will see, the biggest differences are not across software, but across procedures in the same software.
Our analysis of linear regression focuses on parameter estimates, z-scores, p-values and confidence levels. Rarely in regression do we see a discussion of the estimates and F statistics given in the ANOVA table above the coefficients and p-values.
And yet, they tell you a lot about your model and your data. Understanding the parts of the table and what they tell you is important for anyone running any regression or ANOVA model.
Good graphs are extremely powerful tools for communicating quantitative information clearly and accurately.
Unfortunately, many of the graphs we see today confuse, mislead, or deceive the reader.