A histogram is a type of chart that visually shows the distribution of data. It's useful when looking at the average or median of a data set, to understand whether that data point is common or not. For example, an ecommerce dashboard will typically include an Average Order Value metric. Say this metric is $50. This could mean that your site is usually selling products that cost about $50. But, alternately, it could mean you're selling a lot of $99 products and a lot of $1 products. A histogram will reveal what averages can't tell you.
This post will walk through how to create a histogram using your existing Google Analytics transaction data. The histogram data is calculated in Google Sheets, and can be visualized in either Google Sheets or Google Data Studio. (There is currently no way to skip the intermediary Google Sheets step without changing the tracking code setup in Google Analytics).
Step 1: Import Transaction Data into Google Sheets
For this exercise, we'll use the Google Analytics spreadsheet add-on, as it's the most convenient (and free) way to access your Google Analytics data in Sheets.
Choose your view ID, a start/end date, and set Metrics = ga:transactions, ga:transactionRevenue, and Dimensions=ga:transactionId. It should look like this:
Step 2: Choose your bin size
You want to categorize all your transactions into groupings (aka bin sizes) that make sense. If your data set is pretty small and even, you might be able to do this directly, e.g. if all your orders fall between $1-$100 you could just make bins of $0-9, $10-19, $20-$29, and so on. However, if you have a large data set and orders of all different sizes, you will probably want to use a more systematic method.
This is a simple method that will work for most cases:
- Create a tab in your Google Sheet and add this formula: =max(Transactions!C16:C)
- Take this max and round it up to the closest whole number that divides evenly by 10 or 100. For example, if your max is $62.90, round it up to $70; if it's $833.27, round it up to $900.
- Divide the above number by 10. This is your bin size, and we will have 10 bins of this size. In the examples above, you'd have 10 bins of $7 ($0-$7, $8-$15, $16-$23, etc) or 10 bins of $90 ($0-$90, $91-$180, $181-$270, etc).
If you find that most of your data is in the first couple bins, and you just have a few outliers in the upper bins, I suggest sorting by transaction size and removing those outliers from the =max() formula. Then complete the steps as above, and group those outliers into a bin by themselves (e.g. your bins would be $0-$7, $8-$15, $16-$23, etc. and then one final bin of $71+ to capture all the high outliers together in a single bin).
The above should work for most ecommerce sites, but if you want fewer or more bins, or want to understand the theory behind this, you can check out some more sophisticated bin-sizing techniques. Choose Bin Sizes for Histograms in Easy Steps is a fantastic article that goes into a lot more detail on this.
Step 3: Calculate the frequency
To make this section easy to follow, I've provided the source sample histogram sheet for reference. If you want to use it yourself, you can make a copy and plug in your own view ID to the Report Configuration tab. Just click to access.
The sheet contains the following:
- the Report Configuration tab that sets up your query to pull transactions from Google Analytics
- a Transactions tab that contains the results of the transactions query
- a Bin Sizing tab to determine the size of your bin
- a tab that sets up the table for your histogram.
The above sections described steps #1-3, so this section focuses on #4, the table for your histogram. To make this table, create a tab containing your low and high bounds for each bin in columns A and B. Then add in formulas to automatically produce bin names and calculate the frequency and % of total transactions:
Revenue Bin Name: ="$"&A2&"-$"&B2
% of Total Transactions: =D2/sum(D:D)
All together, the formulas will look like this:
The output will look like this:
Step 4A: Set up the histogram in Google Sheets
(step 4B shows how to do this for Google Data Studio)
Select your data and click Insert Chart.
On the right side, a chart editor will appear. Delete everything except X Axis = Revenue Bin Name, and Y Axis = % of Total Transactions.
Step 4B: Set up the histogram in Google Data Studio
In Google Data Studio, click Resource > Manage added data sources > ADD A DATA SOURCE. Select Google Sheets and your table from above. Click Connect and Add to Report, and accept the verification pop-up that appears.
Now that the histogram data source from Google Sheets is available in Data Studio, click Insert Bar Chart. Choose Dimension = Revenue Bin Name, Metric = % of Total Transactions. To make sure that the bins appear in the correct order, choose Sort by Bucket #.
Step 5: View and customize the completed histograms
The completed histogram will look like this in Google Sheets.
Similarly, it will look like this in Google Data Studio: