5 Reasons to Run Sample Size Calculations Before Collecting Data

Most of us run sample size calculations when a granting agency or committee requires it. That’s reason 1.

That is a very good reason. But there are others, and it can be helpful to keep these in mind when you’re tempted to skip this step or are grumbling through the calculations you’re required to do.

It’s easy to base your sample size on what is customary in your field (“I’ll use 20 subjects per condition”) or to just use the number of subjects in a similar study (“They used 150, so I will too”).

Sometimes you can get away with doing that.

However, there really are some good reasons beyond funding to do some sample size estimates. And since they’re not especially time-consuming, it’s worth doing them.

Often the most time-consuming part is figuring out and writing the data analysis plan to base the calculations on, but that’s a step you should do anyway.

Reason 2: Many, many published studies have very low power, so they are poor models for choosing your own sample size.

As reported in Keppel (1993), Cohen calculated the power of every study in a psychology journal for a year. The average power was just under 50%.

If power is 50% for a study, it means that the study had a 50% chance of finding significance for a real effect, given the sample size, the effect size, and the statistical test. Because these were published studies, they must have had significant results. But there were probably a lot of other studies (perhaps just as many) that never got published because, without adequate power, they didn’t find significance.
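To make the 50% figure concrete, here is a minimal sketch of a power calculation for comparing two group means. It assumes a two-sided test at alpha = 0.05 and uses a normal approximation rather than an exact t-test, so the numbers are slightly optimistic. The function name `approx_power` and the effect size of d = 0.5 are illustrative choices, not something taken from Keppel or Cohen.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means.

    Assumption: normal approximation to the t-test, equal group sizes.
    d is the standardized mean difference (Cohen's d).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)   # about 1.96 when alpha = 0.05
    shift = d * sqrt(n_per_group / 2)   # expected value of the test statistic
    # Probability that the statistic falls in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


for n in (32, 64):
    print(f"n = {n} per group: power is about {approx_power(0.5, n):.2f}")
```

Under these assumptions, 32 subjects per group gives power of about 0.52 for a medium effect, and doubling to 64 per group raises it to about 0.81.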

If you now attempt to build on one of those studies and you use the same sample size, you only have a 50% chance of replicating it with significant results. Do your own power calculation and raise the sample size, if needed.
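As a rough illustration of "do your own power calculation," the sketch below turns the question around and solves for the sample size. It uses the common normal-approximation formula for two equal groups, n = 2 * ((z_alpha + z_power) / d)^2. The function name `n_per_group` and the example effect sizes are assumptions for illustration, and an exact t-test calculation in dedicated software will usually ask for one or two more subjects per group.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, power=0.80, alpha=0.05):
    """Approximate subjects needed per group for a two-sided comparison of means.

    Assumption: normal approximation, two equal-sized groups.
    d is the standardized mean difference you expect (Cohen's d).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = z.inv_cdf(power)          # quantile matching the target power
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


print(n_per_group(0.5, power=0.50))  # size that gives only a coin-flip chance
print(n_per_group(0.5, power=0.80))  # size for the conventional 80% target
```

With d = 0.5, this gives 31 per group for 50% power and 63 per group for 80% power, so a study that copied the smaller sample would need roughly twice as many subjects.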

Reason 3: A power calculation estimates not only how many participants you need, but how many you don’t need.

You don’t want to spend resources (time, money, and energy) collecting more data than you need. Save those resources for a follow-up study.

Especially if your study creates any risk, or even inconvenience, for your human or animal participants, you don’t want to oversize your study either. You don’t want to expose more participants than necessary to the risk.

Reason 4: When sample size calculations tell you you’re close but don’t have quite enough subjects, you can make adjustments to the study that will increase the power in other ways.

Maybe you can adjust the way you’re measuring some of your variables to add precision or switch your design to something that will give you a little more power. Or make sure you include some controls that will handle some of the random error. All of these increase power without increasing sample size.
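Here is a small sketch of why that works. The standardized effect size is the raw difference divided by the standard deviation, so shrinking the error variance raises the effect size even though the raw difference and the sample size stay the same. All of the numbers (a 5-point difference, standard deviations of 12 and 10, 40 subjects per group) are made up for illustration, and the power calculation again assumes a normal approximation to a two-sided, two-group test.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_raw(raw_diff, sd, n_per_group, alpha=0.05):
    """Approximate power from a raw mean difference and a standard deviation.

    Assumption: normal approximation, two equal-sized groups, two-sided test.
    """
    z = NormalDist()
    d = raw_diff / sd                   # standardized effect size
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = d * sqrt(n_per_group / 2)   # expected value of the test statistic
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Hypothetical study: same 5-point difference, same 40 subjects per group.
noisy = approx_power_raw(raw_diff=5, sd=12, n_per_group=40)
precise = approx_power_raw(raw_diff=5, sd=10, n_per_group=40)
print(f"SD = 12: power is about {noisy:.2f}")
print(f"SD = 10: power is about {precise:.2f}")
```

In this made-up case, trimming the standard deviation from 12 to 10 moves power from about 0.46 to about 0.61 without recruiting a single extra subject.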

Reason 5: The biggest benefit of doing these calculations is to not waste years and thousands of dollars in grants or tuition pursuing an impossible analysis.

If sample size calculations indicate you need a thousand subjects to find significant results but time, money, or ethical constraints limit you to 50, don’t do that study.
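One way to check whether a capped sample makes a study hopeless is to ask what the smallest detectable effect would be. The sketch below rearranges the same normal-approximation formula to solve for the effect size. It assumes two equal groups, a two-sided test at alpha = 0.05, and 80% power. The function name `min_detectable_d` and the split of 50 or 1,000 subjects into two groups are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def min_detectable_d(n_per_group, power=0.80, alpha=0.05):
    """Smallest standardized mean difference detectable at the given power.

    Assumption: normal approximation, two equal-sized groups, two-sided test.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return (z_alpha + z_power) * sqrt(2 / n_per_group)


# 50 subjects in total versus 1,000 in total, split evenly into two groups.
print(f"25 per group:  d of about {min_detectable_d(25):.2f}")
print(f"500 per group: d of about {min_detectable_d(500):.2f}")
```

Under these assumptions, 25 per group can only reliably detect effects near d = 0.79, while an effect that needs 500 per group is closer to d = 0.18. If the effect you expect is that small, 50 subjects won't find it.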

I know it’s painful to go back to square 1, but it’s much better to do it now than after 3 years of work.

Reference:

Keppel, G. (1993). Design and Analysis: A Researcher’s Handbook. Pearson.

Comments

morris olitsky says

January 7, 2024 at 10:49 am

Great post: every statistician and researcher should get this. I wasn’t aware of the study on power; very interesting. Thank you Karen.

Dagnachew GETNET Bizuye says

January 6, 2024 at 11:00 am

how to diagnose for highly heterogeneity panel data to have a significant result of panel regression?

Alex Zajichek says

January 4, 2024 at 9:16 am

Overall these are informative things to keep in mind and the power piece is certainly something to consider, but one issue seems to be the premise that the overarching goal of doing the sample size calculation is to get “significant” results. Particularly, this is a troubling statement: “Because these were published studies, they must have had significant results”, implying that statistical significance is used as a requirement to classify meaningful results. Anyway, thanks for sharing.

- Karen Grace-Martin says
  
  January 23, 2024 at 3:43 pm
  
  True, there is definitely a huge problem with only “statistically significant” results getting published. Perhaps a better reason for sample size calculations is to specify a specific precision you’d like to see in confidence intervals.
  
Arturo says

June 9, 2022 at 3:53 pm

When calculating sample size. Is there any difference between phase II and III trials?

Cutie says

March 30, 2022 at 9:19 am

I would like to know exactly how sample size is calculated can you provide us an example

- Karen Grace-Martin says
  
  March 31, 2022 at 10:24 am
  
  Here are some more resources. Examples are hard because they look very different depending on the specific statistical test you’re doing.
  
  https://www.theanalysisfactor.com/resources/by-topic/effect-size-statistics-power-and-sample-size-calculations/
  
Sigrid Gibson says

October 26, 2020 at 12:38 pm

Thank Karen for a great FREE tutorial on effect sizes. Very helpful in understanding the relationship between various measures and what they are describing, without an “over-heavy” mathematical explanation.

Bashir says

May 27, 2012 at 2:12 am

I am glad with it. However, I am more interested in “second generation of multivariate analysis”, thus STRUCTURAL EQUATION MODELLING(SEM).
If you could delve into this, it enhance the ANALYSIS FACTOR as an organization

- Karen says
  
  June 1, 2012 at 2:13 pm
  
  Hi Bashir,
  
  Thanks for the input. I agree, SEM is a great topic that would benefit a lot of people. We’ll see what we can do.
  
  Karen
  