The box plot chart (aka box and whiskers) is a chart to visually display the distribution of a dataset: the maximum, minimum, median, and the range on either side of the median.
Google Data Studio currently doesn't provide a Box Plot chart type, but you can create something similar using a Stacked Column chart type. It won't look totally standard, but in a pinch can be used as a substitute. Scroll to the end to see a screenshot of the end result. The method for producing this pseudo box chart is as follows.
Step 1: Set Up Your Source Data
The first step is to set up your source data. We'll do this in Google Sheets, which you can then add as a data source to your Data Studio report. If you want to follow along, you can view the sample sheet here (just make a copy if you want to plug in your own data).
Assuming your data lies in cells A1:A20, first calculate the minimum, maximum, lower quartile, median, upper quartile, and maximum for your data set using the following formulas:
Add these formulas into a second table, to produce an output like this:
Next, you need to create a third table that calculate the differences between these metrics, as the differences represent the size of each segment in your box chart. To make things easy for importing into Data Studio, the values should be located at the very top of your sheet, in 5 separate columns, i.e. starting from cell A1 and continuing through column F.
The Minimum and Median should be directly copied in, while the Lower Quartile Δ equals Lower Quartile - Minimum, Upper Quartile Δ = Upper Quartile - Lower Quartile, and Maximum Δ = Maximum - Upper Quartile. For reference, these are the formulas.
Lower Quartile Δ:
='Box Data'!C2-'Box Data'!B2
Upper Quartile Δ:
='Box Data'!E2-'Box Data'!C2
='Box Data'!F2-'Box Data'!E2
The end result should look like this. You can of course name it however you like, but for this example let's name the tab "Box Chart".
Step 2: Import Data into Data Studio
In your Google Data Studio report, click Resource > Manage added data sources > + Add a Data Source. Find and select the Google Sheets connector, and add in the Google Sheet tab ("Box Chart") containing the final table from step 1.
Make sure "Use first row as headers" is checked, and click Connect. You should now see a list of your dimensions and metrics like this:
Click Add to Report and accept the confirmation box. Your Sheets source data is now available for use in Data Studio.
Step 3: Add Source Data to a Combo Chart in Data Studio
To create the box plot chart, click Insert Combo Chart. Click on the chart, and within the Data settings dialog, select the Data Source you just added, Dimension = Name, and Metrics = Minimum, Lower Quartile Δ , Median, Upper Quartile Δ , and Maximum Δ .
At this point it likely looks like a regular bar chart. Click Chart > Line at the top of your settings box, and select "Stacked Combo Chart". Now it should look something like this:
Step 4: Customize the Combo Chart Style
All the data is visible now. To turn this into something resembling a box plot, we just need to customize the appearance. Click on the chart and click the Style tab. Now, do the following:
- Scroll down to the "Background and Border" section, and choose the background color of your chart. For this example we'll choose plain white.
- Your Minimum is shown by Series #1. For Series #1, choose Bars, and change the Series Color so it matches the background color above (white in this example). It's important to match the background color as this will make data from your lowest quartile disappear, meaning your chart will look like a box plot rather than starting all the way at the X-axis like a bar chart. After choosing white for the background and Series #1:
- Series #2 is your lower quartile. Choose Bars, and change the Series Color. It can be anything, but we'll use light green in this example.
- Series #3 is the median. Choose the following options: Line, Line Weight = 0, Show Points, and Show Data Labels. We'll use black.
- Series #4 is the range around the median. Choose Bars, and change the Series Color. We'll use green.
- Series #5 is the upper quartile. Choose the same settings as for Series #2: Bars and light green.
- Under General, verify that you have "Stacked Bars" selected.
- Finally, under "Legend", click "None".
The final result will look like this. For additional clarity, you can add in a text box with labels, as shown here:
5 thoughts on “Add a Pseudo Box Plot to Google Data Studio”
Hi Ana, This is an awesome post, Thank you.
Just one question: Because the bars are stacked, I am not getting faithful numbers (totals) aligned with the Y axis. For example the maximum score in your table is 80. But alongside the Y axis the maximum score displays at about 120. Any ideas on how to fix this? In your example did you manually add the Y axis legend? Thanks again - this has really given me some good ideas.
Hey Doug! Thanks for the comment, I appreciate it. The Y axis should automatically display the correct number, I didn't manually add in anything. Would it be possible for you to share your report so I could take a look?
is there a solution for a time series where the data can sometimes go below zero?
Good question. I don't know any way to do this since this workaround relies on the stacked bar chart type, which only goes from 0 - 100%.
puts a smile on my face when someone comes up with a very creative, hacky solution, nice one Ana
Comments are closed.