Box and Whisker PlotCreate a Box and Whisker Plot using Microsoft® Excel®
A box plot or box and whisker plot can be created using Excel - you just need to know a couple of tricks. First, the box can be created using stacked column charts. Second, the whisker can be created using y-error bars. Excel provides built-in functions that you will need to calculate the quartiles used for the "box" part of the box and whisker plot. You can jump right in by downloading the free Box Plot Template below, but I also suggest you read through the information below which describes how the functions and calculations used to create a box and whisker plot. Box Plot Template for Excel
Disclaimer: This spreadsheet and the information on this page is provided for educational purposes. If you choose to use the box plot spreadsheet to create box and whisker plots for your data, you do so at your own risk. Note the terms of use within the spreadsheet.
Creating a Box and Whisker PlotBox plots are very useful data visualization tools for depicting a number of different summary statistics and especially for graphically comparing multiple data sets. It is much easier to create box plots in Excel if you know how to structure your data. You can take a look at the box plot template as an example.
Fig 1. An example box and whisker plot from the Box Plot Template showing the IQR, whiskers, and max/min outliers. Creating the BoxThe box part of a box and whisker plot represents the central 50% of the data or the Interquartile Range (IQR). The lower edge of the box plot is the first quartile or 25th percentile. The upper edge of the box plot is the third quartile or 75th percentile. You may want to check out my article on percentiles for more details about how percentiles are calculated.
The location of the median line relative to the first and third quartiles indicates the amount of skewness or asymmetry in the data. If the distribution is symmetric, the median will be exactly in the middle. if the median is closer to Q3, the distribution is negatively skewed (or "skewed to the left" meaning the left tail of the distribution is longer). If the median is closer to Q1, the distribution is positively skewed. The box plot in Excel is created using a stacked column chart with 3 series. The first series (bottom column) is Q1 and the border and area properties are set to none so that the column is not visible in the chart. The second series is Q2-Q1. The third series is Q3-Q2. These two series, stacked together make up the interquartile range. The area property is set to none for these two series to create just the outline for the box plot. Creating the WhiskerThe whiskers in a box plot represent the tails of the distribution. The whiskers can be created using error bars in Excel. Because of the ease of calculation, the convention for the length of the whisker that I have used in the box plot template comes from [1]:
Another common convention is that instead of extending the whisker to a calculated value of Q3+1.5(IQR), the whisker is extended to the last data point that is less than or equal to Q3+1.5(IQR), and similarly with the lower whisker. In the box plot template, the whiskers are created by adding Y-error bars to series 1 (Q1) and series 3 (Q3-Q2). OutliersThe biggest problem with creating a box plot in Excel is how to show the outliers - the points that fall outside of the range depicted by the box and whiskers. The reason this is a problem is because you don't necessarily know how many outliers there will be. Therefore, instead of showing a point for each outlier, the box plot template above plots only the max and min values if they are outliers. For reference, the number of upper and lower outliers is given in the table to indicate if there are more outliers than just the max or min. Box Plot VariationsI haven't found a good way to create the following box plot variations in Excel, so I won't describe them in detail, but I've included them here for your information. Most good statistical software like MINITAB® will be able to include these additional features to box plots. Variable-Width Box PlotWhen you are comparing samples with different sample sizes, it is common to represent the sample size by the width of the box plot.
Notched Box PlotA notched box plot is used to show the length of the confidence interval for the median.
Box Plot References
|
|||||
|
|
Become a Fan of Vertex42 |