Central Tendency – Understanding Mean, Median & Mode

When conducting a study, you often end up with a large amount of data that needs to be interpreted. Central tendency, a key concept in statistics, helps you identify the central or typical value in a dataset. Its measures form the backbone of many statistical analyses and give insight into how the data points are distributed. This article explains how the measures of central tendency are used to summarize the results of a study.

Index

1 Central Tendency – In a Nutshell
2 Definition: Central tendency
3 Mode
4 Median
5 Mean
6 Range of data
7 Outlier effect
8 Distributions
9 When to use which
10 FAQs

Central Tendency – In a Nutshell

Measures of central tendency define different types of centers in a dataset or distribution. The mode is the most frequent value, the median is the middle value in a sorted set, and the mean is the average.

Definition: Central tendency

Central tendency is an umbrella term for measures describing a dataset with a single value that represents the middle of the distribution. These include:

The mode – the most frequent value in the set.
The median – the middle value of the set.
The mean – the average of the set.

Some people also count the range of data among these measures, even though it does not measure central tendency. The range is simply the difference between the highest and the lowest value in the dataset.

Mode

The mode is the most frequent value in the dataset. You can find it by counting how often each value occurs in the set. Depending on your sample, you can get one mode, multiple modes or none at all. If you visualize your data in a frequency graph, the highest bar or the highest point on the curve is your mode.

Example

For example, in a fictional study, you ask 21 participants whether they identify as male, female or non-binary. After counting the answers, you find that 7 participants are male, 9 are female and 5 are non-binary. Thus, “female” is the mode, as it is the most frequent answer.
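The counting step can be sketched in Python. This is only an illustration: the list `answers` is rebuilt from the counts given in the example (the order of the answers is made up), and the variable names are assumptions.

```python
from collections import Counter
from statistics import multimode

# Answers rebuilt from the counts in the example (hypothetical ordering).
answers = ["male"] * 7 + ["female"] * 9 + ["non-binary"] * 5

# Count how often each value occurs.
counts = Counter(answers)

# multimode returns every value that shares the highest frequency,
# so it also covers datasets with more than one mode.
modes = multimode(answers)

print(counts)
print(modes)
```

Using `multimode` instead of a single "most common" lookup keeps the sketch honest for samples in which two or more values are tied.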

Median

The median is the middle value in a dataset arranged in ascending or descending order, dividing the distribution in half. To find the median in a set that is too big to simply count, you divide the number of values by two. If the set has an odd number of values, you will get an x.5 number, which you round up (e.g., set with 125 values → 125/2 = 62.5 → median = value at position 63). If the set has an even number of values, you add the two middle values together and divide the result by two (e.g., set with 126 values → 126/2 = 63 → median = (value at position 63 + value at position 64)/2).
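The position rule can be written down as a small helper. This is a minimal sketch; the function name `median_positions` is hypothetical, and positions are counted from 1, as in the text.

```python
def median_positions(n):
    """Return the 1-based position(s) of the middle value(s) in a sorted set of n values."""
    if n <= 0:
        raise ValueError("the set must contain at least one value")
    if n % 2 == 1:
        # Odd count: n / 2 ends in .5, so round up to the single middle position.
        return [n // 2 + 1]
    # Even count: the two middle positions, whose values are averaged.
    return [n // 2, n // 2 + 1]


print(median_positions(125))
print(median_positions(126))
```

For an even count, the helper returns two positions; the median is then the average of the values found at those positions, not the average of the position numbers.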

Example

You continue with the next question, asking the participants how tasty they found the cake you served them, on a scale of 1-5 (or very tasty, tasty, neutral, not tasty, disgusting). In the end, you receive the following answers, coded as numbers:

2,5,3,4,1,2,1,3,2,1,2,3,2,4,3,5,2,1,4,3,2

Now you will have to sort them in ascending order, in order to determine the median:

1,1,1,1,2,2,2,2,2,2,2,3,3,3,3,3,4,4,4,5,5

Since you have 21 participants, you divide that number by 2, which results in 10.5. After rounding that number up, you get position 11 as the position of the median. Now you check which value sits at position 11 in your ordered dataset. In this case, it is the last “2”. This is your median.
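The same steps (sort, find the middle position, read off the value) can be reproduced with the ratings from this example. The sketch below only handles an odd number of values, and the variable names are assumptions.

```python
from statistics import median

ratings = [2, 5, 3, 4, 1, 2, 1, 3, 2, 1, 2, 3, 2, 4, 3, 5, 2, 1, 4, 3, 2]

ordered = sorted(ratings)
n = len(ordered)

# 21 values: 21 / 2 = 10.5, rounded up to position 11
# (index 10 when counting from 0, as Python does).
position = n // 2 + 1
median_by_position = ordered[position - 1]

# Cross-check against the standard library.
assert median_by_position == median(ratings)

print(position, median_by_position)
```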

Mean

This article covers two types of mean, the arithmetic and the geometric mean, which both calculate an average of a dataset.

Arithmetic mean

The arithmetic mean divides the sum of values by the number of values. It describes the average of a dataset and is mostly used with discrete, integral data.

Example

The next question to your participants is how old they are, resulting in the following distribution:

25,30,14,34,22,23,45,19,24,35,43,48,26,24,17,37,42,33,46,23,32

After adding them all together, you end up with a sum of 642, which you then divide by the number of participants (21), resulting in an arithmetic mean of about 30.57.
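As a quick check of this calculation, the sum and the division can be done in a few lines of Python. The variable names are assumptions; the ages are the ones listed above.

```python
from statistics import mean

ages = [25, 30, 14, 34, 22, 23, 45, 19, 24, 35, 43,
        48, 26, 24, 17, 37, 42, 33, 46, 23, 32]

total = sum(ages)                  # sum of all values
count = len(ages)                  # number of values
arithmetic_mean = total / count    # sum divided by count

# The standard library gives the same result.
assert abs(arithmetic_mean - mean(ages)) < 1e-9

print(total, count, round(arithmetic_mean, 2))
```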

Geometric mean

The geometric mean multiplies all values and then takes the n-th root of the product, with n being the number of values. It also describes an average of a dataset, but it requires positive values and is mostly used with continuous, fractional data or data that spans a wide range.

Example

The last question for your participants is how many hours they spend on their phone per day. To simplify the numbers a little, the answers are rounded to quarter hours, with 0.25 = 15 minutes, 0.5 = 30 minutes and 0.75 = 45 minutes.

1.5, 0.75, 3.25, 1, 1.75, 0.5, 1.5, 2.25, 1.25, 2, 0.75, 2.5, 1.5, 2.5, 3, 2.5, 1.75, 1.25, 2.75, 1.5

First you multiply all the numbers together, resulting in around 12,844.69. Then you take the n-th root of that product. The list above contains 20 values, so you take the 20^th root, leaving you with a geometric mean of about 1.60.
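The product-and-root calculation can be sketched as follows. Note the assumption made here: the root index n is taken from the length of the list exactly as it is printed above, and the variable names are hypothetical.

```python
import math
from statistics import geometric_mean

hours = [1.5, 0.75, 3.25, 1, 1.75, 0.5, 1.5, 2.25, 1.25, 2,
         0.75, 2.5, 1.5, 2.5, 3, 2.5, 1.75, 1.25, 2.75, 1.5]

n = len(hours)                    # number of values actually listed
product = math.prod(hours)        # multiply all values together
gm_by_root = product ** (1 / n)   # n-th root of the product

# Library cross-check; geometric_mean requires strictly positive values.
gm_library = geometric_mean(hours)

print(n, round(product, 2), round(gm_by_root, 2), round(gm_library, 2))
```

For long lists, the raw product can become very large or very small, which is why library implementations typically work with logarithms rather than multiplying everything first.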

Range of data

The range of data in a set is technically not a measure of central tendency, but it is still sometimes counted among them. To get the range, you subtract the lowest value from the highest value, which shows how far the data spreads.

Example

Taking the values from the example of the arithmetic mean, the participants’ ages, you first need to find the lowest and highest value, which in this case are 14 and 48. Then you subtract 14 from 48, resulting in a range of 34.
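In code, the range is a one-line subtraction once the minimum and maximum are known. A minimal sketch using the ages from the arithmetic mean example, with assumed variable names:

```python
ages = [25, 30, 14, 34, 22, 23, 45, 19, 24, 35, 43,
        48, 26, 24, 17, 37, 42, 33, 46, 23, 32]

lowest = min(ages)
highest = max(ages)
data_range = highest - lowest   # highest value minus lowest value

print(lowest, highest, data_range)
```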

Outlier effect

An outlier is an extreme value that does not quite fit in with the rest. It can lie either far above or far below the other values. Outliers can affect a dataset in different ways, for example by skewing the distribution, and they can also affect the measures of central tendency.

The mode is generally little affected by outliers, as the most frequent value is often very clear. Outliers only become interesting if they are the mode themselves or almost as frequent as the mode. This is the case, for example, if your dataset consists of numerical answers like 1, 2, 3, 4, 5 and 31, and both 2 and 31 are the most frequent values. However, the interpretation of an outcome like this is very individual and depends on the study and situation.

The median is hardly affected by outliers, as it depends on the position of the values in the sorted set and not on how extreme they are. Both types of mean are clearly affected by outliers, since every value is included in the calculation.
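To make the effect visible, the sketch below takes the ages from the arithmetic mean example and introduces one outlier. The outlier is hypothetical: it assumes a data-entry error in which the oldest participant (48) is recorded as 480.

```python
from statistics import geometric_mean, mean, median

ages = [25, 30, 14, 34, 22, 23, 45, 19, 24, 35, 43,
        48, 26, 24, 17, 37, 42, 33, 46, 23, 32]

# Hypothetical data-entry error: 48 is recorded as 480.
with_outlier = [480 if age == 48 else age for age in ages]

for label, data in (("original", ages), ("with outlier", with_outlier)):
    print(label,
          "| median:", median(data),
          "| arithmetic mean:", round(mean(data), 2),
          "| geometric mean:", round(geometric_mean(data), 2))
```

In this particular case the median stays at the same middle value, because the outlier does not change which value sits in the middle of the sorted set, while both means increase.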

Distributions

In different distributions, you will find the measures of central tendency in different places.

Normal distribution

A normal distribution is symmetrical and has its highest point in the middle of the curve. Thus, the mode, the median, and the mean are all the same.

Skewed distributions

No matter whether the distribution is positively skewed (highest point closer to the y-axis) or negatively skewed (highest point further away from the y-axis), the mode can always be found at the highest point, while the median and the mean usually follow in that order towards the longer tail. In this case, however, the median and mean have to be calculated and can most likely not be read from the graph.
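The typical ordering of the three measures can be checked on small samples. Both datasets below are made up for illustration, and the ordering shown is a common rule of thumb rather than a guarantee for every skewed dataset.

```python
from statistics import mean, median, mode

# Hypothetical positively skewed sample: most values are low, with a long tail to the right.
positive_skew = [1, 2, 2, 2, 3, 3, 4, 5, 9, 12]

# Mirror image of the sample above: long tail to the left (negatively skewed).
negative_skew = [13 - value for value in positive_skew]

for label, data in (("positive skew", positive_skew), ("negative skew", negative_skew)):
    print(label, "| mode:", mode(data), "| median:", median(data), "| mean:", mean(data))
```

In the positively skewed sample the mean is pulled furthest towards the high values, and in the mirrored sample it is pulled furthest towards the low values.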

When to use which

Knowing when to apply which measure of central tendency is important in statistics, since each one gives different insight into your data.

As a short overview of the different types of variables, categorical variables are divided into nominal data, which includes individual values that cannot be ranked, and ordinal data, which is qualitative data that can be ranked. Furthermore, you can divide quantitative data into discrete, meaning countable data, and continuous, meaning that there are infinite values on a line, such as time or length.

The following table will explain when each measure of central tendency can be used and why or why not.

Variable type	Mode	Median	Arithmetic mean	Geometric mean
Nominal	x	-	-	-
Ordinal	x	Only odd-numbered sets	Only if ranks are numbered	-
Discrete	x	x	x	x
Continuous	Only in intervals	x	x	x

Mode

The mode can be used with every type of data, since even with a nominal variable, there will most likely be one value chosen more often than the others. When there are many possible values, the outcome may well be multimodal, but this depends on the individual topic. If, however, there are too many distinct values in the dataset, the mode is less informative.

Continuous variables are generally not suited for defining a mode because they are measured instead of counted. Thus, it is nearly impossible to get the exact same value twice. Only if you divide the values into intervals can you determine a modal interval, in which most of the values lie.
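Grouping continuous values into intervals can be sketched like this. The measurements and the interval width of 1.0 are invented for illustration; a different width can produce a different modal interval.

```python
from collections import Counter

# Hypothetical continuous measurements (e.g. lengths in cm); no value occurs twice.
measurements = [12.31, 14.87, 13.02, 13.55, 15.40,
                13.91, 12.76, 13.18, 16.05, 14.12]
width = 1.0  # interval width, chosen arbitrarily for this sketch

# Assign each value to the interval [k * width, (k + 1) * width).
bins = Counter(int(value // width) for value in measurements)

# The interval that contains the most values is the modal interval.
k, frequency = bins.most_common(1)[0]
modal_interval = (k * width, (k + 1) * width)

print(modal_interval, frequency)
```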

It is also worth noting the cases where you should not use the mode.

You shouldn’t use this measure if all values appear the same number of times.
Also, it shouldn’t be used if there is a very small number of values.

Median

Determining the median is not possible for nominal data, since it cannot be ranked or put into a meaningful order. Furthermore, as the median of an even-numbered set has to be calculated from the two middle values, it is generally not possible for even-numbered ordinal data, as you cannot perform mathematical operations with qualitative data. However, if the ordinal dataset is odd-numbered, it is possible, since the middle value can simply be counted.

For discrete and continuous data, the median can always be calculated, as the values are numerical and can be used in the mathematical operations needed to determine this measure of central tendency.
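For ordinal data with an odd number of answers, counting to the middle value needs no arithmetic on the categories themselves. The sketch below uses the labels from the cake example; the seven answers are hypothetical, and the order of the scale (from best to worst) is an assumption.

```python
# Ordered categories, assumed to run from best to worst.
scale = ["very tasty", "tasty", "neutral", "not tasty", "disgusting"]
rank = {label: position for position, label in enumerate(scale)}

# Hypothetical answers from seven participants (odd count).
answers = ["tasty", "neutral", "very tasty", "tasty",
           "disgusting", "not tasty", "tasty"]

# Sort by rank, not alphabetically, so the order reflects the scale.
ordered = sorted(answers, key=rank.get)

if len(ordered) % 2 == 0:
    raise ValueError("this sketch only handles an odd number of answers")

# With an odd count, the middle element is the median category.
median_category = ordered[len(ordered) // 2]

print(median_category)
```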

Arithmetic mean

With nominal variables, you can never calculate the arithmetic mean, as there is no “average” between qualitative categories. The same applies to ordinal variables in general. If you number the ranks, you can theoretically calculate the arithmetic mean, but this is not common practice in statistics.

With discrete as well as continuous data, the arithmetic mean is a widely used measure and can be calculated without problems.

Geometric mean

The geometric mean can only be applied with quantitative data, meaning discrete and continuous variables, as these are the only types where mathematical operations can be performed.

In theory, calculating the geometric mean with numbered ordinal data would be possible. However, it is never used in such cases, since the arithmetic mean would be the preferred measure of mean.

FAQs

What are the measures of central tendency?

The measures of central tendency include the mean, mode, and median.

What is the best measure of central tendency to use in a strongly skewed distribution?

If the distribution is strongly skewed, you should use the median because it is least affected by outliers, as it depends on the position of the values in the sorted set rather than on the values themselves.

Can I use measures of central tendency on all levels of data?

You can use mode on all levels of data, but median and mean cannot be used on nominal data and in some cases not even with ordinal data.

How do outliers affect the measures of central tendency?

Outliers have hardly any effect on the median and very little on the mode, since these measures depend mostly on the position or frequency of values rather than on how extreme the values are. The mean, however, is highly affected by outliers, as every value influences the result of the calculation.
