There are two main types of factor analysis: exploratory and confirmatory. Exploratory factor analysis (EFA) is data driven, such that the collected data determines the resulting factors. Confirmatory factor analysis (CFA) is used to test factors that have been developed a priori.
Think of CFA as a process for testing what you already think you know.
CFA is an integral part of structural equation modeling (SEM) and path analysis. The hypothesized factors should always be validated with CFA in a measurement model prior to incorporating them into a path or structural model. Because… garbage in, garbage out.
CFA is also a useful tool in checking the reliability of a measurement tool with a new population of subjects, or to further refine an instrument which is already in use.
Elaine will provide an overview of CFA. She will also (more…)
The LASSO (Least Absolute Shrinkage and Selection Operator) is a relatively recent development that allows you to find a good-fitting model in the regression context. It avoids many of the overfitting problems that plague other model-building approaches.
In this Statistically Speaking Training, guest instructor Steve Simon, PhD, explains what overfitting is — and why it’s a problem.
Then he illustrates the geometry of the LASSO model in comparison to other regression approaches, ridge regression and stepwise variable selection.
Finally, he shows you how LASSO regression works with a real data set.
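To give a small taste of the idea before the training, here is the soft-thresholding operator at the heart of LASSO estimation (an illustrative sketch, not material from the training itself). For a single standardized predictor, the LASSO estimate is the least-squares coefficient shrunk toward zero by the penalty, and set exactly to zero when it is small enough, which is how LASSO performs variable selection.

```python
# Soft-thresholding: the building block of LASSO estimation
# (illustrative sketch; names and values here are hypothetical).

def soft_threshold(b, lam):
    """Shrink coefficient b by penalty lam; zero it out if |b| <= lam."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0

ols_coefs = [3.0, -0.4, 1.2, 0.1]
lasso_coefs = [soft_threshold(b, lam=0.5) for b in ols_coefs]
print(lasso_coefs)  # small coefficients are dropped entirely
```

Note how the two small coefficients are not merely shrunk but removed from the model, something ridge regression (which shrinks smoothly toward zero) never does.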
Note: This training is an exclusive benefit to members of the Statistically Speaking Membership Program and part of the Stat’s Amore Trainings Series. Each Stat’s Amore Training is approximately 90 minutes long.
(more…)
Statistically speaking, when we see a continuous outcome variable we often worry about outliers and how these extreme observations can impact our model.
But have you ever had an outcome variable with no outliers because there was a boundary value at which accurate measurements couldn’t be or weren’t recorded?
Examples include:
- Income data where all values above $100,000 are recorded as “$100,000 or greater”
- Soil toxicity ratings where the device cannot measure values below 1 ppm
- Number of arrests where there are no zeros because the data set came from police records where all participants had at least one arrest
These are all examples of data that are truncated or censored. Failing to account for the truncation or censoring will bias your results.
This webinar will discuss what truncated and censored data are and how to identify them.
Several different models are used with this type of data. We will go over each one and discuss which type of data it is appropriate for.
We will then compare the results of models that account for truncation or censoring to those that do not, so you can see the impact the wrong model choice can have on your results.
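As a quick preview of why this matters, the simulation below (an illustrative sketch, not material from the training) generates data with a true slope of 2, caps every outcome at a recording limit, much like income capped at $100,000, and then fits ordinary least squares to both versions. Ignoring the censoring attenuates the slope toward zero.

```python
# Simulated demonstration of censoring bias (hypothetical values).
import numpy as np

rng = np.random.default_rng(42)
n = 1000
x = rng.normal(size=n)
y_true = 1.0 + 2.0 * x + rng.normal(size=n)  # uncensored outcome, true slope = 2
limit = 1.0
y_censored = np.minimum(y_true, limit)       # values above the limit are capped

# OLS via least squares on both versions of the outcome
X = np.column_stack([np.ones(n), x])
slope_full = np.linalg.lstsq(X, y_true, rcond=None)[0][1]
slope_censored = np.linalg.lstsq(X, y_censored, rcond=None)[0][1]

print(f"slope with full data:     {slope_full:.2f}")      # close to 2
print(f"slope ignoring censoring: {slope_censored:.2f}")  # biased toward 0
```

Models designed for censored outcomes (such as the tobit model) recover the relationship by modeling the censoring mechanism explicitly rather than treating the capped values as real measurements.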
(more…)
Survival models describe data representing the time until an event occurs. In many situations the event is death, but it can also be another adverse event, such as cancer relapse or failure of a medical device, or even a positive event such as pregnancy. Often patients are lost to follow-up before the event occurs, but you can still use the information collected while they were in your study to better estimate the survival probability over time.
This is done using the Kaplan-Meier curve, an approach developed by (more…)
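The core of the Kaplan-Meier estimator fits in a few lines: at each observed event time, multiply the running survival probability by the fraction of at-risk subjects who survived that time. The sketch below is illustrative (not the instructor's code); the key point is that censored subjects still count as "at risk" up to the time they leave the study.

```python
# Minimal Kaplan-Meier estimator (illustrative sketch).
# durations: follow-up times; observed: 1 if the event occurred,
# 0 if the subject was censored (lost to follow-up).

def kaplan_meier(durations, observed):
    """Return [(event_time, survival_probability), ...] in time order."""
    data = sorted(zip(durations, observed))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for d, e in data if d == t and e == 1)
        removed = sum(1 for d, e in data if d == t)
        if deaths > 0:
            # survival drops only at event times, by the observed death rate
            survival *= (n_at_risk - deaths) / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed  # censored subjects leave the risk set here
        i += removed
    return curve

# Five subjects; the third was lost to follow-up at time 3.
print(kaplan_meier([1, 2, 3, 4, 5], [1, 1, 0, 1, 1]))
```

Notice that the censored subject contributes no drop in the curve, yet its presence in the risk set through time 3 changes every subsequent survival estimate, which is exactly how Kaplan-Meier uses partial information from dropouts.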
In many fields, the only way to measure a construct of interest is to have someone produce ratings:
- radiologists’ ratings of disease presence or absence on an X-ray
- researchers rate the amount of bullying occurring in an observed classroom
- coders sort qualitative responses into different response categories
It’s well established in research that multiple raters need to rate the same stimuli to ensure ratings are accurate. There are a number of ways to measure the agreement among raters using measures of reliability. These differ depending on a host of details, including: the number of raters; whether ratings are nominal, ordinal, or numerical; and whether one rating can be considered a “Gold Standard.”
In this webinar, we will discuss these and other issues in measures of inter- and intra-rater reliability, the many variations of the kappa statistic, and intraclass correlations.
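As a concrete taste of one of these measures, here is a minimal Cohen's kappa for two raters and nominal categories (an illustrative sketch, not the webinar's code). Kappa corrects raw percent agreement for the agreement you would expect by chance alone, based on each rater's category frequencies.

```python
# Minimal Cohen's kappa for two raters, nominal categories
# (illustrative sketch; the example ratings are hypothetical).
from collections import Counter

def cohens_kappa(ratings1, ratings2):
    n = len(ratings1)
    # proportion of items on which the two raters agree
    observed = sum(a == b for a, b in zip(ratings1, ratings2)) / n
    freq1 = Counter(ratings1)
    freq2 = Counter(ratings2)
    # chance agreement: product of the raters' marginal proportions,
    # summed over categories
    expected = sum(freq1[c] / n * freq2[c] / n for c in freq1)
    return (observed - expected) / (1 - expected)

# Two radiologists rating 4 X-rays: disease present (1) or absent (0).
print(cohens_kappa([0, 0, 1, 1], [0, 1, 1, 1]))  # 0.5
```

Here the raters agree on 3 of 4 images (75%), but because chance alone would produce 50% agreement, kappa credits them with only 0.5 on its -1 to 1 scale. Variations of kappa extend this logic to ordinal categories and more than two raters.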
About the Instructor
Audrey Schnell is a statistical consultant and trainer at The Analysis Factor.
Audrey first realized her love for research and, in particular, data analysis in a career move from clinical psychology to research in dementia. As the field of genetic epidemiology and statistical genetics blossomed, Audrey moved into this emerging field and analyzed data on a wide variety of common diseases believed to have a strong genetic component including hypertension, diabetes and psychiatric disorders. She helped develop software to analyze genetic data and taught classes in the US and Europe.
Audrey has worked for Case Western Reserve University, Cedars-Sinai, University of California at San Francisco and Johns Hopkins. Audrey has a Master’s Degree in Clinical Psychology and a Ph.D. in Epidemiology and Biostatistics.
Not a Member Yet?
It’s never too early to set yourself up for successful analysis with support and training from expert statisticians.
Just head over and sign up for Statistically Speaking.
You'll get access to this training webinar, 130+ other stats trainings, and a pathway to work through the trainings you need, plus expert guidance to build statistical skill through live Q&A sessions and an ask-a-mentor forum.
There are many types and examples of ordinal variables: percentiles, ranks, Likert scale items, to name a few.
These are especially hard to know how to analyze: some people treat them as numerical, others emphatically say not to. Everyone agrees nonparametric tests work, but these are limited to testing only simple hypotheses and designs. So what do you do if you want to test something more elaborate?
In this webinar we’re going to lay out all the options and when each is (more…)