6.16 Statistical significance and practical significance aka effect size


Suppose that after running an A/B test, the subsequent t-test shows a statistically significant finding. That means a difference this large would be unlikely to arise by chance alone if there were truly no difference between the two variants. The next question to ask is: does this statistically significant finding matter in the real world?

Answering this question of practical significance helps a business decide whether to act on the test results, for two reasons:

  • Statistical significance is affected by sample size – Generally speaking, the larger the sample, the higher the chance of detecting a difference as statistically significant, because more data points give a more precise estimate. As a result, even a very small difference can be detected by an experiment with a large enough sample size.

  • Just because something is statistically significant does not mean it will have a meaningful impact in reality – Suppose you were the principal of a school. A study comparing two methods of teaching math showed that students scored a mean of 85 using the current approach, Method A, and a mean of 86 using a new approach, Method B. Suppose also that, thanks to a large sample size, this difference in mean test scores was statistically significant. You might not rush to retrain all the math teachers unless you were convinced that the time, effort, and financial costs of changing the status quo would ‘shift the needle’ substantially. Pushing that change through may not be worth the cost if the potential payoff is low.
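To make the sample-size point concrete, here is a minimal sketch that reuses the means of 85 and 86 from the example above. The standard deviation of 10, the equal group sizes, and the helper name `two_sample_p_value` are illustrative assumptions, not figures from the study, and a normal approximation stands in for the t-test, which is reasonable only when the groups are large.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_p_value(mean_a, mean_b, sd, n):
    """Two-sided p-value for a difference in means (normal approximation).

    Assumes both groups share the same standard deviation `sd` and the
    same size `n`. These are simplifying assumptions for illustration.
    """
    standard_error = sd * sqrt(2 / n)
    z = (mean_b - mean_a) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Same 1-point difference, assumed SD of 10, increasing sample size.
for n in (50, 500, 5000):
    p = two_sample_p_value(85, 86, sd=10, n=n)
    print(f"n per group = {n:>5}: p = {p:.4f}")
```

Under these assumptions, the same one-point difference is not significant at the 0.05 level with 50 or 500 students per group, but it is with 5,000. The size of the difference never changed; only the amount of data did.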

At this point, the analyst should calculate the effect size to determine whether the difference in test scores has practical significance. An effect size is essentially the magnitude of the difference reported. Sometimes it is reported on a standardized scale such as Cohen’s d, which takes the difference between two means and expresses it in units of standard deviation. Cohen’s d is often reported together with t-test and ANOVA results. Other times, it is reported as a simple effect size, i.e. the difference in sample means along with a measure of statistical uncertainty such as a confidence interval, standard deviation, or standard error (see footnote 5).
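Both styles of reporting can be sketched in a few lines. The scores below are made-up data chosen so that the group means match the 85 and 86 of the school example, and the function names `cohens_d` and `mean_difference_ci` are hypothetical helpers. The sketch uses the pooled-standard-deviation form of Cohen’s d and a normal-approximation confidence interval; with samples this small, a t-based interval would be the more appropriate choice.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def cohens_d(group_a, group_b):
    """Standardized effect size: mean difference divided by the pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_b) - mean(group_a)) / sqrt(pooled_var)


def mean_difference_ci(group_a, group_b, confidence=0.95):
    """Simple effect size: confidence interval for the mean difference.

    Uses a normal critical value, which is only a rough approximation
    for small samples.
    """
    diff = mean(group_b) - mean(group_a)
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return diff - z * se, diff + z * se


# Hypothetical test scores, for illustration only.
method_a = [80, 85, 90, 82, 88]  # mean 85
method_b = [81, 86, 91, 83, 89]  # mean 86

d = cohens_d(method_a, method_b)
low, high = mean_difference_ci(method_a, method_b)
print(f"Cohen's d = {d:.2f}")
print(f"Mean difference = 1.0, 95% CI = ({low:.2f}, {high:.2f})")
```

A common rule of thumb, due to Cohen, treats d of about 0.2 as small, 0.5 as medium, and 0.8 as large, although what counts as practically meaningful ultimately depends on the costs and benefits in the setting at hand.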


5 Transparent Statistics in Human–Computer Interaction working group. 2019. Transparent Statistics Guidelines. (Jun 2019). DOI: http://dx.doi.org/10.5281/zenodo.1186169
(Available at https://transparentstats.github.io/guidelines)