The median is an essential concept in statistics and one of the measures of central tendency for a set of numerical data. Unlike the mean or the mode, the median is found by working out where the middle of a set of data lies, also known as the midpoint. This article explains how the median is calculated, why it matters in statistical analysis, and how it behaves as a measure of central tendency for skewed distributions.
The median is the middle number in a set of data once the values have been arranged in order.
In the aforementioned example, the median is six while the mean average would be seven, which is relatively close. However, in another data set – say 3, 5, 6, 91, 120 – the median would still be six while the mean average would be 45.
Note: The median and mean averages can yield very different results for the same data.
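The comparison for the second data set can be checked with a short sketch. It assumes Python and its standard `statistics` module, neither of which the article itself uses; the numbers are the ones quoted above.

```python
from statistics import mean, median

# Data set quoted in the text: two large values pull the mean upwards
values = [3, 5, 6, 91, 120]

middle_value = median(values)   # middle of the sorted values
mean_average = mean(values)     # sum of the values divided by their count

print(middle_value)  # 6
print(mean_average)  # 45
```

The two large values, 91 and 120, raise the mean to 45 but leave the median at 6.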
Whether a set of data is small or large, the calculation method is the same. You still just have to arrange all the data in order and pick the middle value. That’s straightforward for odd-numbered data sets.
However, if there is an even number of values in a given set of statistical data, then there will be no single value in the middle. In that case, the median is taken as the mean of the two middle values.
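A common convention for the even case is to average the two middle values. The sketch below assumes that convention and shows both cases in one function. The name `median_of` and the four-value list are hypothetical, chosen only for illustration.

```python
def median_of(values):
    """Return the median of a non-empty collection of numbers."""
    ordered = sorted(values)          # arrange the data in order
    n = len(ordered)
    if n == 0:
        raise ValueError("median_of() needs at least one value")
    middle = n // 2
    if n % 2 == 1:
        # Odd count: there is a single middle value
        return ordered[middle]
    # Even count: assumed convention, the mean of the two middle values
    return (ordered[middle - 1] + ordered[middle]) / 2


odd_result = median_of([3, 5, 6, 91, 120])   # data set from the text
even_result = median_of([3, 5, 6, 91])       # hypothetical even-sized set

print(odd_result)   # 6
print(even_result)  # 5.5
```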
Let’s imagine a set of data based on UK shoe sizes.
From a survey of recent shoe sales, it is established that on a given day, pairs of shoes in sizes 3, 4 ½, 9, 9 ½, 7, 12, 5, 6, and 5 were sold. In other words, there were nine pairs of shoes sold, with the most frequently occurring being size five.
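The shoe-size example can be worked through as a minimal sketch, again assuming Python's standard `statistics` module. The half sizes are written as decimals (4 ½ as 4.5, 9 ½ as 9.5).

```python
from statistics import median, mode

# UK shoe sizes sold on the day, in the order given in the text
sizes = [3, 4.5, 9, 9.5, 7, 12, 5, 6, 5]

ordered = sorted(sizes)
print(ordered)        # [3, 4.5, 5, 5, 6, 7, 9, 9.5, 12]

median_size = median(sizes)   # fifth of the nine ordered values
most_common = mode(sizes)     # most frequently sold size

print(median_size)    # 6
print(most_common)    # 5
```

With nine values, the median is the fifth value in the ordered list, size 6, while the mode is size 5.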
In another example, using an even number of values, we’ll imagine the finishing times of ten runners who have completed a circuit of a track.
The median in a normal distribution
Normally distributed sets of data can be represented easily on a graph. When a normal distribution forms a bell curve on a graph, the midpoint between the two tails of the bell will be the median value.
Because bell curves formed by normal distributions are symmetrical, the midpoint on the x-axis of such a graph will also coincide with the highest point on the y-axis. As such, when data sets are normally distributed, the midpoint average value will also be the most frequently occurring. In other words, the median, mean, and mode averages will all be the same.
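A small, perfectly symmetrical data set shows the same agreement between the three averages. The values below are hypothetical and only mimic the shape of a bell curve; they are not drawn from a real normal distribution.

```python
from statistics import mean, median, mode

# Hypothetical symmetrical data: counts rise to a peak at 3, then fall away
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]

mean_value = mean(symmetric)
median_value = median(symmetric)
mode_value = mode(symmetric)

print(mean_value, median_value, mode_value)  # 3 3 3
```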
Mean averages are calculated by adding up all the values in a set and dividing that sum by how many values there are, whereas the median is simply the value at the midpoint of the ordered data.
Because the median depends only on the middle of the ordered data, it is a good average to use when extremes at either end of the scale should not dominate the result. In short, it tends to ignore outliers.
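With an odd set of data containing n values, where x_(k) is the k-th value once the data has been arranged in ascending order, the midpoint formula to use is:

$$\text{Median} = x_{\left(\frac{n+1}{2}\right)}$$

while the formula

$$\text{Median} = \frac{x_{\left(\frac{n}{2}\right)} + x_{\left(\frac{n}{2}+1\right)}}{2}$$

should be applied to even sets of data.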
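As a worked illustration, the usual position rules can be applied to the two examples above. Here x_(k) stands for the k-th value of the data in ascending order. For the nine shoe sizes, the rule picks out the fifth value. For the ten runners, it picks out the fifth and sixth finishing times, which are then averaged.

$$n = 9:\quad \frac{n+1}{2} = \frac{9+1}{2} = 5 \;\Rightarrow\; \text{Median} = x_{(5)} = 6$$

$$n = 10:\quad \frac{n}{2} = 5,\ \ \frac{n}{2}+1 = 6 \;\Rightarrow\; \text{Median} = \frac{x_{(5)} + x_{(6)}}{2}$$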