Standard Deviation: Calculation & Interpretation

Statistics involves many values and formulas to calculate and interpret. Keeping track of every formula can be challenging, but each one has its purpose in research. One of these values is the standard deviation, a measure of how far the values in a dataset deviate from the mean. It is calculated by taking the square root of the variance. This article explains how to calculate the standard deviation and how to interpret it.

Index

1 Standard Deviation in a nutshell
2 Definition: Standard deviation
3 Calculation
4 Interpretation
5 Standard error and standard normal distribution
6 Mean absolute deviation (MAD)
7 FAQs

Standard Deviation in a nutshell

The standard deviation is the square root of the variance. It describes how far the values in a dataset typically deviate from the mean.

Definition: Standard deviation

The standard deviation σ (Greek sigma) is calculated by taking the square root of the variance and describes how far values typically deviate from the mean of a dataset. It helps you judge how reliable the findings of a study are. A high standard deviation means a wide spread, that is, a lot of variability in your data. In that case, further research may be advisable to confirm whether the population itself is widely spread or whether research bias affected your sample.

Calculation

To calculate the standard deviation, take the square root of the variance.

$\sigma = \sqrt{ \sigma^{2} }$

Example

Suppose the variance is already known to be 568.96.

$\sigma = \sqrt{568.96} \approx 23.85$

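If you do not know the variance yet, you can use the following formula, which calculates it directly from the data.

$\sigma = \sqrt{ \frac{ \sum_{i=1}^{n} ( x_{i}- \overline{x})^{2} }{n} }$

n	Number of values in the dataset
$x_{i}$	Individual values in the dataset
$\overline{x}$	Mean
σ	Standard deviation

Dividing by $n$ treats the dataset as the whole population, which is what the worked example below does. If your data are a sample used to estimate the spread of a larger population, divide by $n-1$ instead.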
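The choice of divisor changes the result slightly. As a minimal sketch, the following Python snippet uses the standard library's `statistics` module, where `pstdev` divides by n and `stdev` divides by n-1. The `scores` list is the exam dataset from the worked example that follows; the variable names are only illustrative.

```python
import statistics

# Exam scores from the worked example in this article
scores = [55, 52, 34, 65, 70, 99, 23, 46, 71, 22,
          56, 87, 98, 57, 93, 43, 29, 37, 59, 88]

# Population standard deviation: squared deviations divided by n
population_sd = statistics.pstdev(scores)

# Sample standard deviation: squared deviations divided by n - 1
sample_sd = statistics.stdev(scores)

print(f"divide by n:     {population_sd:.2f}")
print(f"divide by n - 1: {sample_sd:.2f}")
```

For these 20 scores the two results are about 23.85 and 24.47. The gap shrinks as the number of values grows.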

Example

Suppose a study wants to find out the average performance in an exam to see whether the exam is too easy or too difficult. To do so, the researchers calculate the mean and standard deviation of this year’s results. The exam is graded in credits from 1 to 100, and 20 students took the test:

55, 52, 34, 65, 70, 99, 23, 46, 71, 22, 56, 87, 98, 57, 93, 43, 29, 37, 59, 88

Step 1: Calculate the mean

First, calculate the mean by adding up all the values in the set and dividing the sum by the number of values.

Example

$\frac{55 + 52 + 34 + 65 + 70 + 99 + 23 + 46 + 71 + 22 + 56 + 87 + 98 + 57 + 93 + 43 + 29 + 37 + 59 + 88}{20} = \frac{1,184}{20} = 59.2$

Step 2: Subtract the mean from every value, then square

Next, subtract the mean from each value in the set and square the result.

Example

$(22-59.2) ^{2}$	1,383.84
$(23-59.2) ^{2}$	1,310.44
$(29-59.2) ^{2}$	912.04
$(34-59.2) ^{2}$	635.04
$(37-59.2) ^{2}$	492.84
$(43-59.2) ^{2}$	262.44
$(46-59.2) ^{2}$	174.24
$(52-59.2) ^{2}$	51.84
$(55-59.2) ^{2}$	17.64
$(56-59.2) ^{2}$	10.24
$(57-59.2) ^{2}$	4.84
$(59-59.2) ^{2}$	0.04
$(65-59.2) ^{2}$	33.64
$(70-59.2) ^{2}$	116.64
$(71-59.2) ^{2}$	139.24
$(87-59.2) ^{2}$	772.84
$(88-59.2) ^{2}$	829.44
$(93-59.2) ^{2}$	1,142.44
$(98-59.2) ^{2}$	1,505.44
$(99-59.2) ^{2}$	1,584.04

Step 3: Sum up the squares

In this step, add up all the squared deviations calculated in step 2.

Example

$\sum_{i=1}^{n} ( x_{i}- \overline{x})^{2} = 11,379.2$

Step 4: Divide by the number of values

Then divide the sum by the number of values.

Example

$\frac{ 11,379.2}{20}= 568.96$

Step 5: Square root

The result of step 4 is the variance. As a last step, take its square root to obtain the standard deviation.

Example

$\sigma = \sqrt{568.96} \approx 23.85$
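The five steps above can be sketched in a few lines of Python. This is only an illustration with the example's exam scores; the variable names are chosen here and are not part of any fixed convention.

```python
import math

# Exam scores from the example
scores = [55, 52, 34, 65, 70, 99, 23, 46, 71, 22,
          56, 87, 98, 57, 93, 43, 29, 37, 59, 88]

# Step 1: calculate the mean
n = len(scores)
mean = sum(scores) / n

# Step 2: subtract the mean from every value, then square
squared_deviations = [(x - mean) ** 2 for x in scores]

# Step 3: sum up the squares
sum_of_squares = sum(squared_deviations)

# Step 4: divide by the number of values (this is the variance)
variance = sum_of_squares / n

# Step 5: take the square root
standard_deviation = math.sqrt(variance)

print(f"mean               = {mean:.1f}")
print(f"sum of squares     = {sum_of_squares:.2f}")
print(f"variance           = {variance:.2f}")
print(f"standard deviation = {standard_deviation:.2f}")
```

Run as written, this should reproduce the values from the steps above: 59.2, 11,379.2, 568.96 and about 23.85.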

Interpretation

Interpreting the standard deviation can be simple or tricky, depending on your research aim. In many cases, the value of the standard deviation alone already shows whether it is in the expected range. At other times, interpreting it is more difficult. The following three approaches help you interpret the standard deviation more precisely.

Empirical rule

The empirical rule, also called the 68-95-99.7 rule, is used to judge whether the outcome of a study is reliable, assuming a normal distribution. In a normal distribution, about 68% of the values lie within one standard deviation of the mean, about 95% within two standard deviations, and about 99.7% within three. You can therefore check whether your own values follow this pattern.

68%: $\mu \pm 1 \sigma$
95%: $\mu \pm 2 \sigma$
99.7%: $\mu \pm 3 \sigma$

Example

68%: $59.2 \pm 1 \times 23.85 = [35.35;83.05]$
95%: $59.2 \pm 2 \times 23.85 = [11.5;106.9]$
99.7%: $59.2 \pm 3 \times 23.85 = [-12.35;130.75]$

First of all, this example shows that the calculated intervals need interpretation. The example is based on an exam with a maximum of 100 credits, so no result can exceed 100 credits or fall below 0. This does not mean that the calculation is wrong; the intervals are simply not bounded by the grading scale.

However, if you look at the 68% interval, only 55% of the values in the example (11 of 20) lie within one standard deviation of the mean. In cases like this, you need to look for the cause, which in this example is most likely the small sample size. As a rough guideline, if your sample is very small, your results can also be counted as valid if at least 50% of the values lie within one standard deviation of the mean.
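To check the empirical rule against your own data, you can count how many values fall inside each interval. The following Python sketch does this for the exam scores; `share_within` is a hypothetical helper name introduced only for this illustration.

```python
import statistics

# Exam scores from the example
scores = [55, 52, 34, 65, 70, 99, 23, 46, 71, 22,
          56, 87, 98, 57, 93, 43, 29, 37, 59, 88]

mean = statistics.mean(scores)
sd = statistics.pstdev(scores)  # divides by n, as in the worked example


def share_within(values, center, spread, k):
    """Return the fraction of values within k spreads of the center."""
    lower, upper = center - k * spread, center + k * spread
    inside = [v for v in values if lower <= v <= upper]
    return len(inside) / len(values)


# Compare the observed shares with the shares of a normal distribution
for k, normal_share in [(1, 0.68), (2, 0.95), (3, 0.997)]:
    observed = share_within(scores, mean, sd, k)
    print(f"within {k} SD: observed {observed:.0%}, normal {normal_share:.1%}")
```

For the exam data, 55% of the values fall within one standard deviation, and all values fall within two and three standard deviations.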

Z-Score

The z-score, also called the standard score, standardizes the data so that they have a mean of 0 and a standard deviation of 1, which makes them comparable to a standard normal distribution. You can then look at each individual z-score and determine whether the value is an outlier.

|z| < 1 → Typical

1 < |z| < 2 → Unusual

|z| > 2 → Extreme

$z = \frac{x- \mu }{ \sigma }$

Example

Using the example above, start by calculating the z-scores of the highest and lowest values:

$\mid z \mid = \mid \frac{99- 59.2 }{ 23.85 } \mid \approx 1.7$

$\mid z \mid = \mid \frac{22- 59.2 }{ 23.85 } \mid \approx 1.6$

Next, you can work out where your typical values lie and whether they form the majority. The lowest typical value is 37, with a z-score of about −0.9, and the highest is 71, with a z-score of about 0.5. This means that 11 values of the dataset are classified as typical, 9 as unusual, and none as extreme. A result like this would most likely call for a second study, as only just above 50% of the values are considered typical, which is a rather low rate.
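The classification can be sketched in Python as follows. The thresholds are the ones listed in this section; how to treat a z-score of exactly 1 or 2 is not specified there, so assigning those to the higher category is an assumption of this sketch, as is the helper name `classify`.

```python
import statistics

# Exam scores from the example
scores = [55, 52, 34, 65, 70, 99, 23, 46, 71, 22,
          56, 87, 98, 57, 93, 43, 29, 37, 59, 88]

mean = statistics.mean(scores)
sd = statistics.pstdev(scores)  # divides by n, as in the worked example


def classify(z):
    """Label a z-score as typical, unusual or extreme.

    Assumption: |z| of exactly 1 or 2 falls into the higher category.
    """
    if abs(z) < 1:
        return "typical"
    if abs(z) < 2:
        return "unusual"
    return "extreme"


counts = {"typical": 0, "unusual": 0, "extreme": 0}
for x in sorted(scores):
    z = (x - mean) / sd
    label = classify(z)
    counts[label] += 1
    print(f"{x:3d}  z = {z:+.2f}  {label}")

print(counts)
```

For the exam data, this gives 11 typical values, 9 unusual values and no extreme values.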

Coefficient of variation (CV)

The coefficient of variation (CV) is a simple measure calculated by dividing the standard deviation by the mean. It expresses the spread relative to the mean, so you can compare how strongly two or more datasets measured in the same unit vary, even when their means differ. In other words, the CV normalizes different means and standard deviations to one comparable value. The µ in the following formula stands for the mean and can be replaced by $\overline{x}$.

$CV = \frac{ \sigma }{ \mu }$

Example

The CV of the exam example is 23.85/59.2 ≈ 0.40.

The CV would be applicable if you wanted to compare, for example, three classes.

	Class 1	Class 2	Class 3
Mean	59.2	65.5	52.3
SD	23.85	20.56	27.48
CV	0.40	0.31	0.53

You can then set thresholds. For example, a CV below 0.3 could be taken to show very low variability, which in this example might indicate a consistently strong class or an easy test. A CV above 0.5, on the other hand, might hint at a very hard test or a class with very mixed abilities. There are no strict rules for these thresholds; set them individually to suit your study setting.
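As a small sketch of this comparison, the following Python snippet computes the CV for the three classes in the table and applies the example thresholds of 0.3 and 0.5. The thresholds, the labels and the function name are illustrative assumptions, not fixed rules.

```python
# Mean and standard deviation per class, taken from the table above
classes = {
    "Class 1": (59.2, 23.85),
    "Class 2": (65.5, 20.56),
    "Class 3": (52.3, 27.48),
}


def coefficient_of_variation(mean, sd):
    """Return the standard deviation relative to the mean."""
    return sd / mean


def label_cv(cv, low=0.3, high=0.5):
    """Apply illustrative thresholds; adjust them to your own study."""
    if cv < low:
        return "low variability"
    if cv > high:
        return "high variability"
    return "moderate variability"


for name, (mean, sd) in classes.items():
    cv = coefficient_of_variation(mean, sd)
    print(f"{name}: CV = {cv:.2f} ({label_cv(cv)})")
```

With these thresholds, Class 1 and Class 2 fall between the two limits, while Class 3 lies above 0.5.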

Standard error and standard normal distribution

Several standardized values in statistics are easily confused because of their names. The standard deviation is, as already explained, the square root of the variance and describes how far single values typically deviate from the mean. The standard error, on the other hand, estimates how far sample means would deviate from the population mean if you drew many samples.
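For reference, the standard error of the mean is commonly calculated by dividing the standard deviation by the square root of the sample size. Applied to the exam example with 20 students, and using the standard deviation of 23.85 from the calculation section as a stand-in (in practice, the sample standard deviation is usually used), this would give roughly:

$SE = \frac{ \sigma }{ \sqrt{n} } = \frac{23.85}{ \sqrt{20} } \approx 5.33$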

Lastly, the standard normal distribution is a normal distribution with a mean of 0 and a standard deviation of 1. It must not be confused with a normal distribution in general, which is also symmetrical around its mean but can have any mean and standard deviation.

Mean absolute deviation (MAD)

Standard deviation is only one way of measuring variability. You can also use the mean absolute deviation (MAD). It is expressed in the original units of the data, which makes it easy to interpret, and it is simple to calculate. Follow these steps:

Calculate the sample mean.
Find the absolute deviation of each data point from the mean, that is, ignore any negative signs.
Find the average of all absolute deviations.

While MAD has some benefits, the standard deviation is still the most commonly used measure of variability. Because it squares each deviation before averaging, values far from the mean count more heavily than values close to it. An unevenly spread dataset therefore yields a larger standard deviation than an evenly spread one with the same MAD, which lets you tell the two apart. For the same reason, the standard deviation is more sensitive to outliers.
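The three MAD steps can be sketched in Python and compared with the standard deviation of the same data. The exam scores from the calculation section are reused here; `mean_absolute_deviation` is a helper written for this illustration, not a standard library function.

```python
import statistics

# Exam scores from the calculation example
scores = [55, 52, 34, 65, 70, 99, 23, 46, 71, 22,
          56, 87, 98, 57, 93, 43, 29, 37, 59, 88]


def mean_absolute_deviation(values):
    """Average absolute distance of the values from their mean."""
    mean = sum(values) / len(values)                      # step 1: mean
    absolute_deviations = [abs(v - mean) for v in values]  # step 2: drop signs
    return sum(absolute_deviations) / len(values)          # step 3: average


mad = mean_absolute_deviation(scores)
sd = statistics.pstdev(scores)  # divides by n, as in the worked example

print(f"MAD = {mad:.2f}")
print(f"SD  = {sd:.2f}")
```

For the exam data, the MAD is 19.74 and the standard deviation is about 23.85. The standard deviation is the larger of the two because squaring gives the largest deviations more weight.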

FAQs

What does the standard deviation mean in statistics?

The standard deviation indicates how far values typically lie from the mean and therefore how much variability a dataset contains.

What information can you get from low standard deviation?

A low standard deviation means that the data points are clustered around the mean and the variability is low. Usually, low variability is something researchers want to achieve because it means that their treatment has the same effect on most of their participants.

Is a low standard deviation better than a higher one?

Often, yes. A high standard deviation shows that the data are widely spread, which makes them less reliable. It can also indicate that the treatment applied to the sample had no consistent effect, since the values differ widely. In measurements, it can also point to an error in the sampling process or data collection.

What is variance?

Variance is the degree of spread in a dataset. The more spread there is in the dataset, the larger the variance. The standard deviation is the square root of the variance and is used more frequently because it is expressed in the same units as the data, which makes it easier to interpret.
