DOM Scraping Together a Datalayer for Google Analytics Ecommerce Tracking

Google Analytics' e-commerce tracking is generally the most developer-intensive part of a Google Analytics implementation, because it requires a developer-built data layer on the transaction confirmation page. The data layer collects all the relevant data -- the transaction ID, the checkout total, the product names, the prices, and so on -- and lays them out in a standardized syntax that can be read by Google Tag Manager and sent off to GA's servers.

But what if... you had to implement e-commerce tracking without any outside developer assistance, entirely through Google Tag Manager?

This post will show how to set up e-commerce tracking on your own, entirely through GTM. There will still be an e-commerce data layer, but you'll build it yourself, by scraping the data you need off the page and adding it to a custom HTML tag. Since this type of DOM scraping isn't reliable in the long run, this tracking should usually just be used for prototyping, short-term patches, or cases where there's no other option (it happens). For this example we'll use GA's Standard Ecommerce Tracking, but you could modify the script slightly for Enhanced Ecommerce.

Please note that this endeavor requires the ability to identify CSS selectors and edit some basic JavaScript. I've provided complete code snippets, but you will likely need to adjust them for your own site.

COLLECT THE DATA INTO VARIABLES

These are the required elements for ecommerce tracking, as shown in Google Tag Manager's documentation on the Google Analytics data layer requirements for ecommerce tracking.

dom-scraping-ecommerce-img3

What this means is that, at a minimum, we need variables that collect the transaction ID and the transaction total. All the Product Data values are located inside the transactionProducts array, which is optional. However, since this aspect of ecommerce tracking is usually included, I'll show how to include it here as well. Let's go through the process of collecting these off the page.
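
As a rough sketch of the target, the data layer this post builds toward looks something like the object below. The field names are the ones used later in this post's custom HTML tag. The values (order ID, SKUs, prices) are invented for illustration, and missingRequiredFields is a hypothetical helper for sanity-checking, not part of GTM.

```javascript
// Example of the Standard Ecommerce data layer shape.
// All values are made up for illustration.
var exampleTransaction = {
  transactionId: "1234",        // required
  transactionAffiliation: "",   // optional
  transactionTotal: 38.26,      // required
  transactionTax: 0,            // optional
  transactionShipping: 0,       // optional
  transactionProducts: [        // optional; if present, each item needs sku, name, price, quantity
    { sku: "DD44", name: "T-Shirt", category: "", price: 11.99, quantity: 1 },
    { sku: "AA1243544", name: "Socks", category: "", price: 9.99, quantity: 2 }
  ]
};

// Hypothetical helper: lists which required fields are missing or empty.
function missingRequiredFields(tx) {
  var missing = [];
  if (tx.transactionId == null || tx.transactionId === "") {
    missing.push("transactionId");
  }
  if (tx.transactionTotal == null || tx.transactionTotal === "") {
    missing.push("transactionTotal");
  }
  var products = tx.transactionProducts || [];
  var required = ["sku", "name", "price", "quantity"];
  for (var i = 0; i < products.length; i++) {
    for (var j = 0; j < required.length; j++) {
      var value = products[i][required[j]];
      if (value == null || value === "") {
        missing.push("transactionProducts[" + i + "]." + required[j]);
      }
    }
  }
  return missing;
}
```

Running missingRequiredFields against whatever you end up pushing is one way to spot a selector that silently returned nothing.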

SCRAPING TRANSACTION DATA

The transaction data is very straightforward, because there's only one transaction ID and one total per checkout. To access these, inspect the page to find their selectors. Here I've hovered over the transaction total on an ecommerce site and right-clicked to select Inspect and open the Developer Tools pane, revealing the following:

dom-scraping-ecommerce-img4

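This shows that the Total value is located inside a div with class="order-summary:total". Because a colon normally introduces a pseudo-class in a CSS selector, the colon in the class name has to be escaped with a backslash, and the backslash itself has to be doubled inside a JavaScript string. The value can therefore be accessed via this JS snippet: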

document.querySelector("div.order-summary\\:total span:nth-child(2)").innerText;

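Since that returns a string that may contain a currency symbol or thousands separators (for example "$1,234.50"), we also need to strip out everything except digits, the decimal point, and the minus sign, by adding this to the end: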

.replace(/[^0-9\.-]+/g,"")

Putting it all together, in GTM we navigate to Variables > New > Custom JavaScript and enter the following:

function(){
  return document.querySelector("div.order-summary\\:total span:nth-child(2)").innerText.replace(/[^0-9\.-]+/g,"");
}

The completed Variable should look like this:

dom-scraping-ecommerce-img5

Repeat the above for the transaction ID. Since the transaction ID doesn't have to be a number, you can leave off the .replace(/[^0-9\.-]+/g,"") part of the function.

If you have trouble with the above, you can try this Chrome extension that identifies the selector for you: GTM Variable Builder.
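
One fragile spot in the Transaction Total variable is that document.querySelector returns null when the selector matches nothing, and reading .innerText from null throws an error. A more defensive version might look like the sketch below. The names parseAmount and getTransactionTotal are hypothetical, and the selector is the one from the example site above, so yours will differ.

```javascript
// Strip everything except digits, ".", and "-", then convert to a real number.
// Returns undefined when the input has no usable number in it.
function parseAmount(text) {
  if (typeof text !== "string") {
    return undefined;
  }
  var value = parseFloat(text.replace(/[^0-9.-]+/g, ""));
  return isNaN(value) ? undefined : value;
}

// Null-safe take on the Transaction Total variable.
// Only meaningful in a browser, where `document` exists.
var getTransactionTotal = function () {
  var el = document.querySelector("div.order-summary\\:total span:nth-child(2)");
  return parseAmount(el ? el.innerText : null);
};
```

In GTM itself, the helper would have to live inside the variable's anonymous function, since each Custom JavaScript variable is a single function. Note also that this cleanup assumes a period as the decimal separator; a price written as 1.234,50 would come out wrong and would need its own handling.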

SCRAPING PRODUCT DATA

The product data is a little more complicated as there may be multiple products in a single checkout. As with the transaction data, you find the selector that identifies your products, and then you need to loop through it and push all the relevant information into an array. You will also be using document.querySelectorAll (rather than document.querySelector) because you want to capture ALL the product values, not just the first one. This code snippet should work, but you need to replace ".ItemName" with your site's specific CSS selector.

function(){
  var itemNames = document.querySelectorAll(".ItemName");
  var iLen = itemNames.length;
  var nameArray = [];
  // Collect the visible text of every matching element, in page order
  for (var i = 0; i < iLen; i++) {
    nameArray.push(itemNames[i].innerText);
  }
  return nameArray;
}

The completed Variable should look like this:

dom-scraping-ecommerce-img6

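Repeat this for each of the 4 product variables (names, SKUs, prices, and quantities), modifying only the query selector (the value located in the parentheses after .querySelectorAll). The product prices will also need the same .replace() cleanup used for the transaction total, so that only numeric characters remain.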
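
Since the four product variables differ only in their selector and in whether the text needs numeric cleanup, the shared logic can be sketched as one small helper. This is only an illustration: collectText, toNumber, and getProductPrices are hypothetical names, and ".ItemPrice" is a placeholder selector.

```javascript
// Collect trimmed text from an array-like list of elements (e.g. a NodeList),
// optionally passing each string through a converter function.
function collectText(nodes, convert) {
  var result = [];
  for (var i = 0; i < nodes.length; i++) {
    var text = (nodes[i].innerText || "").trim();
    result.push(convert ? convert(text) : text);
  }
  return result;
}

// Converter for prices: keep digits, "." and "-", then parse as a number.
function toNumber(text) {
  return parseFloat(text.replace(/[^0-9.-]+/g, ""));
}

// How a Product Prices variable might use the helper.
// Only meaningful in a browser, where `document` exists.
var getProductPrices = function () {
  return collectText(document.querySelectorAll(".ItemPrice"), toNumber);
};
```

As with any GTM Custom JavaScript variable, the helper code would need to be pasted inside each variable's anonymous function rather than shared between them.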

As mentioned above, the list of products is optional, so if you find this step difficult, or you don't actually need to track the specific products, you can just leave it out.

Once you've completed scraping the transaction and product data, you should have the following variables:
{{JS - Transaction ID}}
{{JS - Transaction Total}}
{{JS - Product Names}} //optional
{{JS - Product SKUs}} //optional
{{JS - Product Prices}} //optional
{{JS - Product Quantities}} //optional

CUSTOM HTML TAG FOR GOOGLE ANALYTICS ECOMMERCE DATA LAYER

Once you have the variables above, you can copy and paste this code as is into a custom HTML tag:

<script>
var ecomProducts = [];
for (var i = 0; i < {{JS - Product SKUs}}.length; i++) {
  ecomProducts.push({
    'sku' : {{JS - Product SKUs}}[i],
    'name' : {{JS - Product Names}}[i],
    'category' : '',
    'price' : {{JS - Product Prices}}[i],
    'quantity' : {{JS - Product Quantities}}[i]
  });
}  
window.dataLayer = window.dataLayer || [];
dataLayer.push({
  'event' : 'trackTransaction',
  'transactionId' : {{JS - Transaction ID}},
  'transactionAffiliation' : '',
  'transactionTotal' : {{JS - Transaction Total}},
  'transactionTax' : 0,
  'transactionShipping' : 0,
  'transactionProducts' : ecomProducts 
});
</script>
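
The loop in this tag uses the length of the SKU array and assumes the other three arrays line up with it. If one selector matches fewer elements than the others, some products would be pushed with undefined fields. One possible guard, sketched here with a hypothetical buildProducts helper, is to stop at the shortest array:

```javascript
// Combine four parallel arrays into product objects.
// Stops at the shortest array so no product ends up with undefined fields.
function buildProducts(skus, names, prices, quantities) {
  var count = Math.min(skus.length, names.length, prices.length, quantities.length);
  var products = [];
  for (var i = 0; i < count; i++) {
    products.push({
      sku: skus[i],
      name: names[i],
      category: "",
      price: prices[i],
      quantity: quantities[i]
    });
  }
  return products;
}
```

Inside the custom HTML tag, the loop could then be replaced by a single call that passes the four product variables, in the order SKUs, names, prices, quantities. Dropping trailing products is a tradeoff: it keeps the data well-formed, but a mismatch in lengths is also a sign that one of the selectors needs another look.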

If you aren't tracking specific products, just edit out that section of the script and paste in the following snippet instead:

<script>
window.dataLayer = window.dataLayer || [];
dataLayer.push({
  'event' : 'trackTransaction',
  'transactionId' : {{JS - Transaction ID}},
  'transactionAffiliation' : '',
  'transactionTotal' : {{JS - Transaction Total}},
  'transactionTax' : 0,
  'transactionShipping' : 0,
  'transactionProducts' : []
});
</script>

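Attach a page view trigger to this data layer push tag, limited to your transaction complete page, so that it fires only there. Because the tag reads values out of the page, the DOM Ready trigger type is the safer choice, since it waits until the elements being scraped actually exist.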

SET A TRIGGER

In the above code we're pushing a GTM event called trackTransaction to the data layer.

Set a trigger to read this event, with the following settings:

Trigger Type = Custom Event
Event name = trackTransaction
This trigger fires on: All Custom Events

dom-scraping-ecommerce-img2
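
If the trigger doesn't seem to fire, it can help to confirm that the event actually reached the data layer. The snippet below is a small debugging sketch, with findEvents as a hypothetical helper name, that you could paste into the browser console on the confirmation page.

```javascript
// Return every data layer entry whose "event" key matches the given name.
function findEvents(dataLayer, eventName) {
  var matches = [];
  for (var i = 0; i < dataLayer.length; i++) {
    var entry = dataLayer[i];
    if (entry && entry.event === eventName) {
      matches.push(entry);
    }
  }
  return matches;
}

// In the browser console on the confirmation page, you might run:
//   findEvents(window.dataLayer, "trackTransaction");
```

An empty result suggests the custom HTML tag never ran, which points back at its page view trigger rather than at the Custom Event trigger.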

SET THE GOOGLE ANALYTICS TRANSACTION TAG

Set up a transaction tag with the following settings:

Tag Type = Universal Analytics
Track Type = Transaction
Trigger = trackTransaction custom event

dom-scraping-ecommerce-img1

CONCLUSION

In the above steps, we grab all the pertinent ecommerce values off the page using JavaScript and CSS selectors, and collect them into GTM Variables. We then set up a custom HTML tag that builds an ecommerce data layer from those Variables and pushes a custom event, which a Custom Event trigger listens for. Finally, we create a Universal Analytics Transaction tag that reads that data layer and submits the information to Google Analytics to populate the Ecommerce reports. Again, it's not recommended as a long-term solution, since any changes to the page layout can break the CSS selectors, but it demonstrates the power of GTM and is a very convenient option when you need it.

24 thoughts on “DOM Scraping Together a Datalayer for Google Analytics Ecommerce Tracking”

  1. Hi Ana

    Many thanks for this article. I am struggling to get the HTML tag to fire, as its trigger seems to rely on an event existing, that is contained within said HTML tag. So i can't see how it would ever fire? I must be reading something wrong, but would be keen to hear your feedback

      • Hi Simon, thank you for your comments! I just took a look at the post and you're absolutely correct, I neglected to mention that you need to set a page view trigger for the data layer push. Is that what Stu advised?
        It's funny, because I wrote this post over a year ago, and according to GA it's had well over 1000+ views by now, yet you are the first one to notice (or at least the first to comment) that I left out a step. Yikes, I wonder how many people have been confused by this... I'll update the post.

  2. Hi Ana
    Yes that is exactly what he advised... i should have spotted it myself, but sometimes you need another pair of eyes/ears to help! Thanks for updating it 🙂
    Simon

  3. Hi Ana! Thanks for great article!

    I had only one question: on which page do I need to search for the selector for the sku, name, price product, etc.

    I have 3-4 different pages on the site and they have different selectors for these elements. There is a page with the goods, there is a basket, there is an order, there is a purchase page. Where to get information for sku, name, price, quantity? thanks

    • Hi Ilia, these metrics should come from the final confirmation page (the page that says something like "thank you for your order"). This way you'll be populating your datalayer with transaction metrics, which is what you want.

      • Thanks for the answer. And if on this page there is no sku and price variables? How then to be? Is it possible to take them from the product page, but so that they fall into the data layer on the thank you page?

      • No, it has to be on the transaction complete page, since the value needs to be available on the page at the time the tag sends. Even if you thought up some kind of workaround (e.g. storing the sku in a cookie on the product page), that wouldn't work if they went to multiple product pages before checking out. You need the information from the transaction page to ensure you're tracking the correct values.

      • That is, the developer is still needed? 😉 It must display all 6 required variables on the page, so that we can then extract them using Google Tag Manager

      • Well, there are only 2 required variables: transaction ID and transaction total. But yes, DOM scraping requires that values are available to be scraped from the DOM. If you want to track additional product values that aren't on the page, you will still need a developer.

    • That's the same page I linked to and screenshotted in this post, right? Under the heading 'Collect the Data into Variables' I highlighted that transactionProducts is optional. Of course, if you choose to include it, then it has 4 required components.
      Anyway, since it sounds like you definitely want product data, you'll probably need to check with your developers. Though one other possibility would be to fire the transaction tag when the user clicks Submit (presumably the product details are available on that page?). This is a bit less accurate since it will submit the tag even if the transaction doesn't succeed, so it really depends on what types of tradeoffs you're willing to make...

  4. hello,
    i have sent you a message on facebook page,
    if you can help me at something,
    it would be amazing !
    thank you

  5. Such an awesome article!

    Thank you so much, I've been looking for an alternative way to implement Ecommerce Tracking via GTM.

    This was super useful. Could I ask you a question?

    Would this be a more secure or robust method if I were to insert specific CSS IDs to each of the DOM Elements I want to scrape? Then it wouldn't matter if the layout changed slightly over time as long as these IDs still exist, is that correct?

    Thanks for your answer!

    • Thanks for the comment, glad it's useful. And yes, you're exactly right -- this method becomes a lot more reliable if you add in specific IDs. A clear ID like gtm-revenue would work. Though I personally like to use data attributes, since those are specifically intended for containing data rather than changing the layout/design of the site. I wrote one article about tracking with data attributes here.

      • Great, thanks for the quick reply. I highly appreciate that. Have good weekend and stay safe 🙂

  6. Thanks for the article! Before jumping in, can this be accomplished if there isn't a transaction ID associated with a purchase (in or case a booking)?

      • Thanks Ana! Sorry, last question. How do I reference a built-in variable like Random Number in a custom js variable?

      • You don't need to reference your random number variable in a custom JS variable, it's already a variable. So instead of scraping the transaction ID off the page into a custom JS variable, you plug that random number variable into the script where it says {{JS - Transaction ID}}.
