Effect Size – Explanation, Significance & Examples

28.11.22 Effect size Time to read: 7min


Any quantitative academic study or experiment that compares two or more groups or variables requires a means of determining how substantially they differ from, or relate to, one another. The measure that quantifies the size of such a difference or relationship is referred to as the effect size. This article covers statistical and practical significance, as well as how to calculate the effect size.

Effect Size – In a Nutshell

In statistics, the effect size is most commonly calculated in one of two ways, depending on whether the research compares groups or examines the relationship between variables. How large an effect must be before it is considered meaningful varies widely between fields. The most important aspects of the effect size:

  • It has a numeric value.
  • It is distinct from statistical significance.
  • It serves as a measure of practical significance, allowing effective “real-world” comparisons.
  • It can be calculated as Cohen’s d or Pearson’s r.

Definition: Effect size

To give a concise definition, the effect size is, quite simply, the size (or magnitude) of an effect. A large effect size indicates that a finding has practical importance, while a small effect size suggests that its practical implications are limited, even if the result is statistically significant.

Note: Results can be reported in one of several styles; this article follows the APA style guidelines.

Effect size: Significance

Although effect size is often equated with the more general concept of significance, the two are distinct for a number of important reasons. To explore this further, it is useful to look at how significance, and statistical significance in particular, relates to the effect size.

Statistical significance

Statistical significance indicates that an observed effect is unlikely to be due to chance alone. It is assessed by calculating a p-value, or probability value, for the data. The starting point is the null hypothesis, which states that there is no effect and that any observed difference is random. The p-value is the probability of obtaining a result at least as extreme as the one observed if the null hypothesis were true, so a low p-value (conventionally below 0.05) counts as evidence against the null hypothesis.

However, statistical significance alone can be misleading because it depends heavily on the sample size. With a sufficiently large sample, even a very small effect will produce a low p-value, so a significant result says nothing about how large or how important the effect is.
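The interplay between sample size and the p-value can be illustrated with a short sketch. It assumes a simplified two-sample z-test with equal group sizes and a known standard deviation of 1; the function name `two_sided_p_value` and the effect of 0.05 standard deviations are hypothetical choices made for illustration only.

```python
import math

def two_sided_p_value(d, n_per_group):
    """Two-sided p-value of a two-sample z-test for a standardized mean
    difference d, assuming equal group sizes and a known SD of 1."""
    # The standard error of the difference in means is sqrt(2 / n),
    # so the test statistic is d / sqrt(2 / n) = d * sqrt(n / 2).
    z = d * math.sqrt(n_per_group / 2)
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the normal CDF Phi.
    return math.erfc(abs(z) / math.sqrt(2))

small_effect = 0.05  # hypothetical, practically negligible difference
for n in (100, 1_000, 10_000, 100_000):
    print(n, two_sided_p_value(small_effect, n))
```

Under these assumptions the effect stays the same throughout, yet the p-value drops below 0.05 once the groups become large enough. This is why a p-value on its own does not describe the magnitude of an effect.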

Practical significance

Practical significance indicates whether an effect is large enough to matter under “real-world” conditions. It is expressed by the effect size.

Although the effect size itself does not depend on sample size, estimates of it vary less as the sample size increases.

It is important to report the effect size in research papers in order to indicate the practical significance of the results of a given research project. APA guidelines require published research to include effect sizes and confidence intervals (a way of describing the uncertainty inherent in an estimate) whenever possible.

Statistical significance vs. practical significance: Example

A widely cited example is the experiment reported in “Visual Adaptation of the Perception of Causality” (Rolfs et al., 2013).


In this study, research subjects were shown a digital image of one grey disc moving toward another at the center of a screen. The study concluded that the degree to which the two discs overlapped at the point of contact dictated whether research participants perceived the first disc to have impacted or passed under the second.

The results are statistically significant: they are derived from a relatively large sample (10,000 repetitions of the experiment), and comprehensive steps were taken to eliminate outside variables, e.g. by ensuring that no subjects were affected by poor eyesight and by using sound shielding and dim room lighting in the laboratory in which the experiments took place. By convention, statistical significance means that the p-value falls below 0.05.

Reporting an effect size alongside the p-value would indicate how large the effect is, and therefore how much it is likely to matter under “real-world” conditions.

Effect size: Calculation

While there are many different measures of effect size, the two most commonly used are Cohen’s d and Pearson’s r.

In simple terms, Cohen’s d measures the size of the difference between two groups, while Pearson’s r measures the strength of the relationship between two variables.

Calculating the effect size with Cohen’s d

Designed in order to provide researchers with a clear method by which to compare two groups, Cohen’s d measures the effect size as a number of standard deviations.

This is accomplished by subtracting the mean value of group two from the mean value of group one (M1 – M2) and dividing the result by the pooled standard deviation (SD). This may be expressed as an equation as follows:

d = (M1 – M2) / SD_pooled

The result is a single number that expresses the difference between the two group means in units of standard deviation, whereby a larger absolute value indicates a larger difference between the groups. As a rule of thumb proposed by Cohen, a d of around 0.2 is considered small, 0.5 medium, and 0.8 or above large.

The standard deviation used in the denominator may be chosen in one of three ways, depending on the research design:

  • Pooled standard deviation: the standard deviation estimated from both groups combined, with each group weighted by its size.
  • Standard deviation from a control group: the average degree to which individual values differ within the one group that serves as the baseline for the comparison.
  • Standard deviation from the pretest data: the average degree to which individual values differ within a dataset collected by researchers as a preparatory measure before they begin collecting their main dataset.
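The calculation can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: it assumes the pooled standard deviation is computed from the two sample variances weighted by their degrees of freedom, and the function names and the `treatment` and `control` values are invented for the example.

```python
import math
from statistics import mean, variance

def pooled_sd(group1, group2):
    """Pooled standard deviation of two samples, weighting each sample
    variance by its degrees of freedom (n - 1)."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1)
                  + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return math.sqrt(pooled_var)

def cohens_d(group1, group2):
    """Cohen's d: difference in means divided by the pooled SD."""
    return (mean(group1) - mean(group2)) / pooled_sd(group1, group2)

# Hypothetical measurements, made up for illustration.
treatment = [12, 14, 15, 16, 18]
control = [10, 11, 13, 14, 12]
print(round(cohens_d(treatment, control), 2))  # → 1.55
```

Here the means differ by 3 units and the pooled standard deviation is roughly 1.94, so the groups lie about one and a half standard deviations apart.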


Cohen’s d may be used to quantify how strongly the availability of direct sunlight affects the size of apples. Suppose one apple tree stands in an open area, while another stands nearby in the shade of a large agricultural building, which affords it little to no direct sunlight.

Upon harvesting the apples from both trees, researchers would weigh each apple individually, with their hypothesis being that the apples grown in the shaded area will be smaller.

Suppose the results show that both trees produce apples with an average weight of 200g, but that the apples harvested from the tree in the shade weigh between 50g and 500g (admittedly an unlikely scenario), while the apples from the tree in the open area weigh between 180g and 220g.

In this instance, Cohen’s d would be 0, because the difference between the two means (M1 – M2) is 0. The researchers would therefore be unable to confirm their hypothesis, even though the weights of the shaded apples vary far more widely. The greater spread increases the pooled standard deviation in the denominator, but Cohen’s d only departs from 0 when the group means differ.
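The apple scenario can be checked numerically. The individual weights below are hypothetical: they were chosen only so that both samples have a mean of 200g and span the ranges stated above. The helper `cohens_d` is a sketch that assumes a pooled standard deviation in the denominator.

```python
import math
from statistics import mean, stdev, variance

def cohens_d(group1, group2):
    """Cohen's d with a pooled standard deviation in the denominator."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1)
                  + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / math.sqrt(pooled_var)

# Hypothetical apple weights in grams; both samples average 200g.
shade = [50, 100, 150, 200, 500]
open_area = [180, 190, 200, 210, 220]

print(mean(shade), mean(open_area))    # identical means
print(stdev(shade), stdev(open_area))  # very different spreads
print(cohens_d(shade, open_area))      # → 0.0
```

Because the numerator M1 – M2 is zero, d is zero no matter how widely the weights are spread. A difference in variability alone is not captured by Cohen’s d.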

Calculating the effect size with Pearson’s r

The Pearson correlation coefficient, or Pearson’s r, measures the effect size as the strength of a linear relationship between two variables. It indicates whether two variables move in the same direction (a positive correlation) or in opposite directions (a negative correlation).

Pearson’s r therefore ranges from -1 to +1. A value of 0 indicates that there is no linear relationship between the two variables, while values closer to -1 or +1 indicate a stronger relationship.

Because the calculation is tedious to carry out by hand for larger datasets, Pearson’s r is usually computed with statistical software, often alongside a scatter plot that helps in interpreting the data. Pearson’s r may be expressed as a formula as follows:

$$ r = \frac{\sum_{i} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i} (x_i - \bar{x})^2 \, \sum_{i} (y_i - \bar{y})^2}} $$

The various elements included in the formula are as follows:

  • $r$: correlation coefficient
  • $x_i$: value of the x-variable in a sample
  • $\bar{x}$: mean of the values of the x-variable
  • $y_i$: value of the y-variable in a sample
  • $\bar{y}$: mean of the values of the y-variable
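As a minimal sketch, Pearson’s r can be computed directly from its standard definition: the sum of the products of the paired deviations from the means, divided by the square root of the product of the two sums of squared deviations. The function name `pearson_r` and the sample data (`hours`, `scores`) are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Numerator: sum of products of paired deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the squared deviations.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical data: hours studied and test scores.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
print(round(pearson_r(hours, scores), 2))  # → 0.77
print(pearson_r([1, 2, 3], [3, 2, 1]))     # → -1.0
```

The first result indicates a fairly strong positive linear relationship, while the second shows a perfect negative one. In Python 3.10 and later, the standard library function `statistics.correlation` performs the same calculation.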


As discussed above, effect size is the size or magnitude of the relationship between two variables. This relationship is presented as a numeric value.

In order to calculate and assign a numeric effect size value to data, researchers take the difference between the means of two groups and divide it by a standard deviation, for example the pooled standard deviation or the standard deviation of one group in the pair.

Effect size matters because it gives readers a numeric value by which the practical relevance of a result can be reported and assessed.