Robust statistics
The other sections describe the use of outlier detection and rejection for dealing with extreme values in data sets. The aim of rejecting outliers is to obtain a more reliable estimate of the mean and standard deviation of a data set. An alternative to outlier rejection is robust statistics. Robust statistics provide a more reliable estimate of the mean and standard deviation in the presence of extreme values (and also provide a sound estimate when no outliers are present). A simple approach to calculating the robust mean and standard deviation is described below. An Excel Add-in capable of calculating more sophisticated estimates is available from the RSC Analytical Methods Committee.
Robust estimate of the population mean
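The median can be used to provide a robust estimate of the population mean. The median is obtained by arranging a data set in ascending order and finding the middle value. The median of n ordered values x1…xn is given by:

$$\tilde{x} = \begin{cases} x_{(n+1)/2} & \text{if } n \text{ is odd} \\ \dfrac{x_{n/2} + x_{n/2+1}}{2} & \text{if } n \text{ is even} \end{cases}$$
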
Compared to the arithmetic mean, the median is influenced less by the presence of outliers.
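The odd and even cases of the median can be sketched in a few lines of Python. The function name `median` and the two small data sets are illustrative assumptions, not taken from the source; they are chosen only to show how a single extreme value moves the mean much more than the median.

```python
def median(values):
    """Return the median of a non-empty sequence of numbers."""
    ordered = sorted(values)          # arrange the data in ascending order
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd n: the single middle value
    # even n: average of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical values for illustration only (not from the source).
clean = [10.1, 10.2, 10.3, 10.4, 10.5]
with_outlier = [10.1, 10.2, 10.3, 10.4, 25.0]

for label, data in (("clean", clean), ("with outlier", with_outlier)):
    arithmetic_mean = sum(data) / len(data)
    print(f"{label}: mean = {arithmetic_mean:.2f}, median = {median(data):.2f}")
```

In this sketch the median stays at 10.3 in both cases, while the mean is pulled towards the extreme value.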
Robust estimate of the standard deviation
The median absolute deviation (MAD) is an easily calculated robust estimate of the standard deviation. The median absolute deviation is obtained as follows:
- calculate the median of the data set
- calculate the absolute difference (deviation) of each data point from the median value
- calculate the median of the absolute deviations.
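For n values, the median absolute deviation is therefore calculated from:

$$\mathrm{MAD} = \operatorname{median}\left(\left|x_i - \tilde{x}\right|\right), \quad i = 1, \dots, n$$

where $\tilde{x}$ represents the median of the data set.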
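The three steps above map directly onto code. The following is a minimal sketch that uses `median` from Python's standard `statistics` module; the function name `median_absolute_deviation` and the example values are assumptions made for illustration.

```python
from statistics import median


def median_absolute_deviation(values):
    """Return the (unscaled) median absolute deviation of the data."""
    centre = median(values)                          # step 1: median of the data set
    deviations = [abs(x - centre) for x in values]   # step 2: absolute deviations
    return median(deviations)                        # step 3: median of the deviations


# Hypothetical values for illustration only (not from the source).
example = [1, 2, 3, 4, 100]
print(median_absolute_deviation(example))
```

For these illustrative values the median is 3 and the absolute deviations are 2, 1, 0, 1 and 97, so the extreme value 100 has no effect on the result.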
For a normal distribution, MAD ≈ 0.674σ.
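The factor arises because, for a normal distribution, half of the values lie within one MAD of the median, so MAD/σ equals the 0.75 quantile of the standard normal distribution. As a quick check, this quantile can be evaluated with `NormalDist` from Python's standard `statistics` module (Python 3.8 or later); the variable name `factor` is an illustrative choice.

```python
from statistics import NormalDist

# 0.75 quantile of the standard normal distribution (mean 0, sigma 1).
factor = NormalDist(mu=0.0, sigma=1.0).inv_cdf(0.75)

print(f"MAD / sigma for a normal distribution = {factor:.4f}")
```

The computed value is close to 0.6745, which the text quotes as 0.674.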
To provide a robust estimate that is directly comparable with the standard deviation of a normal distribution, the MAD value is divided by 0.674. The resulting value is usually referred to as ‘MADE’ (pronounced ‘mad e’):

$$\mathrm{MADE} = \frac{\mathrm{MAD}}{0.674}$$

Example
The data shown below (listed in ascending order) are from a round of a proficiency testing scheme. Calculate the mean, standard deviation, robust mean and robust standard deviation.
| 3.5 | 4.0 | 12.3 | 12.6 | 12.7 | 12.8 | 12.8 | 12.8 | 12.8 | 12.9 | 12.94 | 12.99 | 13.0 | 13.05 | 13.1 | 13.1 | 13.2 |

| Statistic | Value |
|---|---|
| mean | 11.8 |
| standard deviation | 3.04 |
| median | 12.8 |
| MADE | 0.297 |

The median is the middle value when the data set is arranged in order of magnitude.
The MAD value is calculated by determining the median of the absolute deviations from the median value, as shown below:
| Data | Absolute deviation from the median | Deviations arranged in ascending order |
|---|---|---|
| 3.5 | 9.3 | 0 |
| 4.0 | 8.8 | 0 |
| 12.3 | 0.5 | 0 |
| 12.6 | 0.2 | 0 |
| 12.7 | 0.1 | 0.1 |
| 12.8 | 0 | 0.1 |
| 12.8 | 0 | 0.14 |
| 12.8 | 0 | 0.19 |
| 12.8 | 0 | 0.2 |
| 12.9 | 0.1 | 0.2 |
| 12.94 | 0.14 | 0.25 |
| 12.99 | 0.19 | 0.3 |
| 13.0 | 0.2 | 0.3 |
| 13.05 | 0.25 | 0.4 |
| 13.1 | 0.3 | 0.5 |
| 13.1 | 0.3 | 8.8 |
| 13.2 | 0.4 | 9.3 |
| Median (MAD) | | 0.2 |

MADE = MAD/0.674 = 0.2/0.674 = 0.297.
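The worked example can be checked with a short script. This is a sketch using Python's standard `statistics` module rather than the Excel Add-in mentioned earlier; the variable names are illustrative, and the divisor 0.674 is the value used in the text.

```python
from statistics import mean, median, stdev

# Proficiency testing data from the example above, in ascending order.
data = [3.5, 4.0, 12.3, 12.6, 12.7, 12.8, 12.8, 12.8, 12.8,
        12.9, 12.94, 12.99, 13.0, 13.05, 13.1, 13.1, 13.2]

SCALE = 0.674  # divisor used in the text (MAD is about 0.674 sigma for normal data)

robust_mean = median(data)                              # robust estimate of the mean
abs_deviations = [abs(x - robust_mean) for x in data]   # deviations from the median
mad = median(abs_deviations)                            # median absolute deviation
made = mad / SCALE                                      # robust estimate of the standard deviation

print(f"mean               = {mean(data):.1f}")
print(f"standard deviation = {stdev(data):.2f}")
print(f"median             = {robust_mean:.1f}")
print(f"MAD                = {mad:.1f}")
print(f"MADE               = {made:.3f}")
```

Run as written, this should reproduce the tabulated values: the two low results (3.5 and 4.0) inflate the classical standard deviation to about ten times the robust estimate.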
Last modified on 18 August 2009.