Meta: How Google Analytics Tracks Its Own Analytics

Working as a Google Analytics / Google Tag Manager consultant, I often browse the web with tools like the Google Analytics Debugger and Google Tag Assistant still active in the browser. These tools display information about every tracking call sent to Google Analytics' servers, allowing you to view and debug the GA implementation of any website. While doing so, I noticed that Google Analytics uses Google Analytics on its own site to track itself. This is a bit meta, but I thought it might be worth looking at more deeply -- they created the tool; maybe they're using it in some interesting or non-obvious ways.

GOOGLE ANALYTICS SELF TRACKER REVIEW

GA is sending user information to at least 4 different trackers. The debugging tool showed that there was an attempt to send data to a 5th tracker (UA-38676921-24), but it got blocked with the error message 'Aborting cookie write: Prohibited domain.'.

One account, UA-10005-1, uses the legacy ('classic') GA code, while the other 3 (UA-38676921-2, UA-38676921-4, and UA-38676921-6) use the newer Universal Analytics syntax. Google Tag Manager is also active on the Google Analytics site, but isn't firing GA code.
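
As a rough illustration of how several Universal Analytics trackers can coexist on one page, the sketch below builds the analytics.js command queue for the three UA properties. The tracker names (`t2`, `t4`, `t6`) and the `"auto"` cookie domain are assumptions for illustration; the debugger output doesn't reveal how Google actually names or configures its trackers.

```javascript
// Property IDs observed in the debugger output.
const PROPERTY_IDS = ["UA-38676921-2", "UA-38676921-4", "UA-38676921-6"];

// Build the analytics.js commands needed to create one named tracker
// per property and send a pageview from each.
// The tracker names ("t2", "t4", "t6") are hypothetical.
function buildTrackerCommands(propertyIds) {
  const commands = [];
  for (const id of propertyIds) {
    const name = "t" + id.split("-").pop(); // "UA-38676921-2" -> "t2"
    commands.push(["create", id, "auto", name]);
    commands.push([name + ".send", "pageview"]);
  }
  return commands;
}

const commands = buildTrackerCommands(PROPERTY_IDS);

// In a browser with analytics.js loaded, each command would be passed to ga().
if (typeof ga === "function") {
  commands.forEach((args) => ga(...args));
}
```

Naming each tracker is what lets the same hit go to several properties: a command prefixed with the tracker name only applies to that tracker.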

VIRTUAL PAGEVIEWS (URL RE-NAMING)

The 3 Universal Analytics pageview tags all sent a modified version of the actual URL path to GA.

Original URL path: /analytics/web/#home/a8399080w96273388p100425068/

Revised URL path: ga("set", "page", "/analytics/web/home/")

This does 2 things: it strips the hash symbol (#) out of the URL, and it removes the unique ID string at the end of the URL. That makes sense, since it lets them aggregate all their user data under a single page path.

This convention of simplifying URLs was followed on other pages; for example, /analytics/web/#embed/report-home/a8399080w96273388p100425068/ became /analytics/web/embed/report-home
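
A minimal sketch of how this kind of URL rewriting could be done is shown below. The function name and the regular expression are my own assumptions, inferred from the two example URLs; this is not Google's actual code. Note that this version always keeps the trailing slash, whereas the observed paths were not consistent on that point.

```javascript
// Matches the trailing identifier segment, e.g. "a8399080w96273388p100425068/".
// The pattern is inferred from the example URLs and may not cover every case.
const ID_SEGMENT = /a\d+w\d+p\d+\/?$/;

// Combine the path and hash fragment into a single virtual page path,
// dropping the "#" and the trailing identifier segment.
function toVirtualPath(pathname, hash) {
  const fragment = hash.replace(/^#/, "").replace(ID_SEGMENT, "");
  return pathname + fragment;
}

const virtualPath = toVirtualPath(
  "/analytics/web/",
  "#home/a8399080w96273388p100425068/"
);

// In a browser with analytics.js loaded, the path would be set before the pageview.
if (typeof ga === "function") {
  ga("set", "page", virtualPath);
  ga("send", "pageview");
}
```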

CUSTOM DIMENSIONS AND VARIABLES

Below are the details of the custom dimensions and custom variables set by the four active GA trackers. Custom variables were the precursor to custom dimensions, and are only used by the legacy GA code (UA-10005-1). Custom variables also carry names, while dimensions are identified only by number, which makes it harder to tell what is actually being tracked.

UA-10005-1 (classic)
  Quality_of_service: STANDARD
  Industry_vertical: INTERNET_AND_TELECOM

UA-38676921-2 (Universal)
  user ID: wuL7LlXbeE0xXCVmPZu0dm+hmjo=
  CD 1: No
  CD 2: No
  CD 4: 7
  CD 5: 11
  CD 1680
  CONTENT GROUP 2: ungrouped

UA-38676921-4 (Universal)
  user ID: nFwvLT0/EIYto3JPrxhELmHorDY=
  CD 1: Yes
  CD 2: No
  CD 3: Web
  CD 4: 1
  CD 6: 441
  CD 7: 48
  CD 8: 1106
  CD 13: INTERNET_AND_TELECOM
  CD 28: 1
  CD 29: FOUR
  CD 37: a:13,a:81,a:22,a:211,a:47,a:56,a:78,a:92,a:105,a:113,a:197,a:127,a:130,a:135,a:153,a:192,a:195,a:164,a:162,a:181,a:189,a:212,a:190,a:210,a:234,a:215,a:213,a:222,a:171,a:243,a:255,p:8,p:4,p:7,p:29,p:50,p:59,p:63,p:68,u:3,u:16,u:20,u:25,u:34,u:35,u:36,u:37,u:46,u:54,u:49,u:52,u:59,u:62,u:86,u:101,u:140,u:142,u:159,u:157,u:170,u:169,u:172,u:192,u:175,u:176,u:177,u:178,u:188,u:191,u:202
  CD 53
  CONTENT GROUP 2: ungrouped

UA-38676921-6 (Universal)
  user ID: /JuCA+gWU8pQx15Bmlnc7o1Msu8=
  CD 1: No
  CD 2: No
  CD 3: Web
  CD 4: 5
  CD 5: 1
  CD 12: INTERNET_AND_TELECOM
  CD 15: 0
  CD 16: 0
  CD 17: 0
  CD 18: 0
  CD 19: 0
  CD 20: 0
  CD 21: FOUR
  CD 22: a:13,a:81,a:22,a:211,a:47,a:56,a:78,a:92,a:105,a:113,a:197,a:127,a:130,a:135,a:153,a:192,a:195,a:164,a:162,a:181,a:189,a:212,a:190,a:210,a:234,a:215,a:213,a:222,a:171,a:243,a:255,p:8,p:4,p:7,p:29,p:50,p:59,p:63,p:68,u:3,u:16,u:20,u:25,u:34,u:35,u:36,u:37,u:46,u:54,u:49,u:52,u:59,u:62,u:86,u:101,u:140,u:142,u:159,u:157,u:170,u:169,u:172,u:192,u:175,u:176,u:177,u:178,u:188,u:191,u:202
  CONTENT GROUP 2: ungrouped

Since custom dimensions don't display their names in the tracking calls, I can only compare among accounts to make an educated guess at what most of these dimensions refer to.
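
To show why the names aren't visible, here is a small sketch that pulls custom dimensions out of a hit payload. In the Measurement Protocol format, dimensions travel as `cd<index>=<value>` parameters, so only the index and value are on the wire. The sample payload is a hypothetical one assembled from a few of the values listed above, not a captured request, and the function name is my own.

```javascript
// Extract custom dimensions from a Measurement Protocol style hit payload.
// Dimensions are sent as "cd<index>=<value>", so no name is available.
function extractCustomDimensions(payload) {
  const dimensions = {};
  for (const [key, value] of new URLSearchParams(payload)) {
    const match = /^cd(\d+)$/.exec(key);
    if (match) {
      dimensions[Number(match[1])] = value;
    }
  }
  return dimensions;
}

// Hypothetical payload built from values seen on UA-38676921-6.
const samplePayload =
  "v=1&t=pageview&tid=UA-38676921-6&cd1=No&cd3=Web&cd12=INTERNET_AND_TELECOM";

const dims = extractCustomDimensions(samplePayload);
console.log(dims);
```

The name-to-index mapping lives only in the property's admin settings, which is why an outside observer has to infer the meaning from the values.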

For UA-38676921-2, my best guess is that it's tracking the following:
CD1: Premium (360) or non-Premium GA account
CD4: Number of Properties within the selected Account
CD5: Number of Views within the selected Account
CD12: Industry of the Property

For UA-38676921-4, it's tracking:
CD4: Number of Accounts within that organization (can be >1 for Premium 360 clients)
CD6: Number of Properties associated with my email address
CD7: Number of Accounts associated with my email
CD8: Number of Views associated with my email
CD13: Industry of the current Property
CD29: Number of Views within the Account

For UA-38676921-6, it's tracking:
CD1: Yes = Premium 360 Account, No = non-Premium
CD3: Channel ('Web' or 'App')
CD4: Number of Views within current property
CD12: Industry of the current Property
CD15: 1 = Property linked to Google AdWords, 0 = not linked to AdWords
CD16: 1 = Property linked to Google AdSense, 0 = not linked to AdSense
CD21: Number of Views within the Account

There's also one very long string (CD22 and CD37 in UA-38676921-6 and UA-38676921-4, respectively) but it's unclear what it contains as it's encoded.

EVENT TRACKING

Clicking around reveals what events are being tracked by Google Analytics. All these events appeared to be tracked by all 3 Universal Analytics trackers. These are some of them:

#1 Clicking on a property from the account overview page:

  category: HomeUI
  action: Acquisition Card Tab Switch
  label: Traffic Channel

#2 Expanding the left-side menu:

  category: Left Nav
  action: Section Click
  label: expanded

#3 Clicking into the Audience > Geo > Location report:

  category: Left Nav
  action: Click
  label: visitors-geo

#4 Drilling into the locations report (2 events sent simultaneously):

  category: kpi
  action: performance hit
  label: -

  category: Drilldown
  action: analytics.country
  label: -

Overall, GA is tracking most interactions on the page that would be useful for improving the user interface: navigation clicks, report dropdowns, drilldowns, exports, date changes, etc. They aren't tracking in crazy detail -- quite a few elements like sorting are just grouped into a generic "ec = kpi, ea = performance hit" event. They also aren't tracking every user interaction, such as hovers or each movement of the mouse, just clicks that open a report or produce a new view.
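
For reference, the sketch below shows how an event's category, action, and label map onto the `ec`, `ea`, and `el` parameters that show up in the hit. The helper function is hypothetical; the event values are taken from the examples above.

```javascript
// Convert an event description into Measurement Protocol parameters
// (ec = event category, ea = event action, el = event label).
function toEventParams(event) {
  const params = new URLSearchParams({
    t: "event",
    ec: event.category,
    ea: event.action,
  });
  if (event.label) {
    params.set("el", event.label);
  }
  return params;
}

const geoClick = toEventParams({
  category: "Left Nav",
  action: "Click",
  label: "visitors-geo",
});
const genericHit = toEventParams({ category: "kpi", action: "performance hit" });

console.log(geoClick.toString());
console.log(genericHit.toString());

// Equivalent analytics.js call in a browser with the library loaded.
if (typeof ga === "function") {
  ga("send", "event", "Left Nav", "Click", "visitors-geo");
}
```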

CONTENT GROUPING

GA sets the following groups in the tracking code:

  1. ungrouped (for Discover, Home, Customization, Admin)
  2. content (Behavior reports)
  3. visitors (Audience reports)
  4. acquisition (Acquisition reports)
  5. conversions (Conversion reports)

Maybe this was set up correctly at one point, but for now so much of the content falls into 'ungrouped' that it doesn't seem very useful.
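
One way such a grouping could be assigned is sketched below. It assumes that a report ID begins with its section name, which I've only actually observed for "visitors-geo"; the function name is hypothetical, and Google's real logic is not visible from the outside. The group index 2 matches the "CONTENT GROUP 2" value seen in the hits.

```javascript
// Group names taken from the list above; anything else is "ungrouped".
const KNOWN_GROUPS = new Set(["content", "visitors", "acquisition", "conversions"]);

// Assumption: a report ID starts with its section name, as in "visitors-geo".
function contentGroupFor(reportId) {
  const prefix = reportId.split("-")[0];
  return KNOWN_GROUPS.has(prefix) ? prefix : "ungrouped";
}

const group = contentGroupFor("visitors-geo");

// In a browser with analytics.js loaded, the group would be set before the pageview.
if (typeof ga === "function") {
  ga("set", "contentGroup2", group);
}
```

With a fallback like this, any page whose ID doesn't match a known section quietly lands in 'ungrouped', which would be consistent with what the hits show.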

CUSTOM METRICS

Property UA-38676921-4 appears to have the most built-out tracking, and is the only one that includes custom metrics. For now it's not clear what these are actually counting -- I'll update this if I can figure it out.

UA-38676921-4
  CM 2: 128792
  CM 3: 190113
  CM 4: 3

CONCLUSION

Google Analytics tracks itself using a fairly conventional implementation. It uses multiple trackers, custom dimensions, custom metrics, and events, and it tracks a user ID. However, it doesn't fire its GA tags through Google Tag Manager, the event tracking is not especially detailed, content grouping wasn't set up for all pages, the events aren't named in a particularly logical fashion (imo), and there's no cross-domain tracking. They also aren't tracking much user detail. That is probably because they can match up that info server-side based on the user ID and their internal clickstream reports, and don't need to pass it into GA explicitly -- in the end I'm sure Google isn't tracking *less* user data than other sites.

So while it was fun to look at Google's own implementation of Google Analytics, it didn't really display an especially advanced or well-thought-out implementation, and the tracking seems to have been deteriorating over time. Overall it gives the impression that Google employees aren't actually using Google Analytics as their primary tool for analyzing usage of Google Analytics, and I suspect most analysis occurs elsewhere.