Exclude Selenium WebDriver traffic from Google Analytics


For sites being tested by automated UI testing frameworks like Selenium WebDriver or Watir, developers may find it useful to exclude internal automated testing traffic from Google Analytics.

However, before discussing how to exclude such traffic, the first question is why such traffic exists at all. If it comes from unknown sources, i.e. other people are running Selenium WebDriver against the website, there isn't much that can be done unless the IP/ISP or user agent can be uniquely identified. If it comes from the site developers' own Selenium WebDriver tests, the question becomes why Selenium tests are being run against production. Normally, production servers only need some manual exploratory or smoke testing, not intensive automated testing that messes with live databases and slows down the servers.

When Selenium is used only against testing environments, excluding its traffic is relatively easy. One approach is to use a different Google Analytics token from production, so that the analytics data are kept separate. Another is to avoid executing the Google Analytics snippet at all when the site is deployed on testing servers.
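
Both approaches can be sketched as a small server-side helper, assuming a Ruby-backed site. The environment names, the placeholder tracking IDs and the helper names `ga_tracking_id` and `ga_snippet` are all hypothetical and not part of any library; the emitted markup is abbreviated and omits the analytics.js loader.

```ruby
# Hypothetical mapping from deployment environment to GA tracking ID.
# The IDs are placeholders, not real properties.
GA_TRACKING_IDS = {
  'production' => 'UA-XXXXXXX-1',
  'staging'    => 'UA-XXXXXXX-2' # separate property keeps test data apart
}.freeze

# Returns the tracking ID for the environment, or nil when no
# analytics should be loaded (e.g. on test servers).
def ga_tracking_id(env)
  GA_TRACKING_IDS[env.to_s]
end

# Returns the (abbreviated) snippet markup to embed, or an empty
# string when the environment has no tracking ID configured.
def ga_snippet(env)
  id = ga_tracking_id(env)
  return '' if id.nil?
  "<script>ga('create', '#{id}', 'auto'); ga('send', 'pageview');</script>"
end
```

With this layout, an environment that is simply missing from the mapping gets no snippet at all, which is the safer default for test servers.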

If it is really necessary to run Selenium against production for some reason, here are a few solutions that might help exclude internal traffic from Google Analytics. Some are general approaches for excluding certain traffic on the Google Analytics side, while others involve updating the Selenium test code explicitly.

General solutions

Exclude IP/ISP

A custom filter on a certain IP address, IP range or ISP can be created to filter out internal traffic. See the 'Exclude internal traffic' section of the official Google Analytics documentation for details.

  1. Go to the ADMIN section of Google Analytics and select a website view from your account.
  2. Choose Filters.
  3. Click the ADD FILTER button.
  4. Select Create new filter and give the filter a name.
  5. Choose filter type: Custom.
  6. Select the Exclude radio button.
  7. Choose "IP Address" from Filter Field and enter the IP address pattern in Filter Pattern.
  8. Apply this filter to the views you want to filter.

The filter pattern is a regular expression. For example, if the single IP address is 176.168.1.1, then enter 176\.168\.1\.1. If the range of IP addresses is 176.168.1.1-15 and 10.0.0.1-14, then enter ^176\.168\.1\.([1-9]|1[0-5])$|^10\.0\.0\.([1-9]|1[0-4])$
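
Before saving a filter, it can be worth checking the pattern locally against a few addresses. The sketch below does that in Ruby with the patterns from the example above; the helper name `excluded?` and the constant names are made up for illustration, and Ruby's regex engine is only assumed to behave like Google Analytics' for patterns this simple.

```ruby
# Patterns copied from the example above.
SINGLE_IP = /176\.168\.1\.1/
IP_RANGES = /^176\.168\.1\.([1-9]|1[0-5])$|^10\.0\.0\.([1-9]|1[0-4])$/

# True when the address matches the exclusion pattern.
def excluded?(ip, pattern = IP_RANGES)
  !pattern.match(ip).nil?
end

%w[176.168.1.15 176.168.1.16 10.0.0.14 10.0.0.15].each do |ip|
  puts "#{ip}: #{excluded?(ip) ? 'excluded' : 'kept'}"
end
```

Note that under Ruby's partial matching the unanchored single-address pattern also matches 176.168.1.10 through 176.168.1.19, so adding `^` and `$` is the safer choice when only one address should be excluded.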

Figure: Create IP Filter on Google Analytics

However, this may also filter out data that was not generated by Selenium WebDriver. Moreover, if the tests run on distributed CI systems or in environments with dynamic IPs, maintaining those IP addresses in the Google Analytics settings might be too much of a hassle.

Edit hosts file

Instead of changing Google Analytics settings on the data-receiving end, another direction is to block the traffic from being sent to Google Analytics' servers in the first place. Editing the hosts file on the machines where the tests run is a straightforward way to achieve this. However, it requires certain permissions on the testing environment and blocks Google Analytics traffic for every site visited from those machines, so it might not be as good as it sounds.
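
As a rough illustration, the entries to append to the hosts file (/etc/hosts on Linux and macOS, C:\Windows\System32\drivers\etc\hosts on Windows) could be generated as below. The list of hostnames is an assumption about which domains the site's tracking snippet contacts, and `hosts_entries` is a hypothetical helper name.

```ruby
# Hostnames assumed to serve Google Analytics scripts and beacons;
# adjust to whatever the site's snippet actually requests.
GA_HOSTS = %w[
  www.google-analytics.com
  ssl.google-analytics.com
  google-analytics.com
].freeze

# Builds one hosts-file line per hostname, pointing it at the loopback address.
def hosts_entries(hosts = GA_HOSTS, sink = '127.0.0.1')
  hosts.map { |host| "#{sink} #{host}" }
end

# Print the lines so they can be reviewed before being added by hand.
puts hosts_entries
```

Printing rather than writing the file directly keeps the privileged step manual, which suits shared test machines.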

Several other posts explain in more detail how to exclude Google Analytics by editing the hosts file.

Selenium specific solutions

Disable JavaScript

Since Google Analytics' tracking code is just a piece of JavaScript, disabling JavaScript in the browser seems to be one reasonable solution.

Seriously? Nope. Modern websites rely heavily on JavaScript, so disabling it isn't feasible unless the target site doesn't use JavaScript at all. Selenium WebDriver also requires JavaScript to function properly; as a result, starting Selenium with JavaScript disabled might cause tests to behave strangely.

Although a previous article, Disable JavaScript using Selenium WebDriver, shows that this is achievable in some browsers, it should not be encouraged because of the side effects.

Set custom user agents

As most browsers' user agents can be set through Selenium, another possible approach is to wrap the site's Google Analytics snippet in an if-statement that ignores certain user agents.

For example, the Google Analytics snippet below is only executed if the browser's user agent doesn't contain "PhantomJS" (matched case-insensitively).

if (!/PhantomJS/i.test(navigator.userAgent)) {
    // Google Analytics's tracking snippet
}

If the testing is done with browsers like Chrome or Firefox, an additional step is needed: set a special testing user agent when starting the browser through Selenium WebDriver. A previous article, Set user agent using Selenium WebDriver C# and Ruby, provides a few sample snippets on how to set user agents. Note that the user agent can't be set for IE using Selenium WebDriver, so this approach is not going to work for IE[1].

Taking Firefox as an example, add something identifiable (e.g. "Selenium") to the user agent when launching the browser, then wrap Google Analytics' tracking code in an if-statement that ignores any user agent containing the word "Selenium", as above.

# Environment Tested:
# Windows 10, Ruby 2.3.3p222, Selenium 3.0.5, GeckoDriver 0.13, Firefox 50.1
require 'selenium-webdriver'

profile = Selenium::WebDriver::Firefox::Profile.new

original_ua = 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:26.0) Gecko/20100101 Firefox/26.0'
profile['general.useragent.override'] = "Selenium #{original_ua}"

driver = Selenium::WebDriver.for :firefox, :profile => profile
# Now user agent is 'Selenium Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:26.0) Gecko/20100101 Firefox/26.0'
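
The article performs the user agent check in the site's JavaScript. If the site happens to be served by a Ruby application, the same decision could be made on the server before the snippet is rendered at all. This is only a sketch of that variation: `TEST_AGENT_PATTERN`, `tagged_user_agent` and `test_traffic?` are hypothetical names, not part of Selenium or any framework.

```ruby
# Markers that identify automated test browsers (assumed convention).
TEST_AGENT_PATTERN = /Selenium|PhantomJS/i

# Builds the overridden user agent the same way the test code above does.
def tagged_user_agent(original_ua, tag = 'Selenium')
  "#{tag} #{original_ua}"
end

# True when the request comes from a tagged test browser, in which case
# the Google Analytics snippet should not be rendered.
def test_traffic?(user_agent)
  !TEST_AGENT_PATTERN.match(user_agent.to_s).nil?
end

ua = 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:26.0) Gecko/20100101 Firefox/26.0'
puts test_traffic?(tagged_user_agent(ua))
puts test_traffic?(ua)
```

Keeping the tag in one shared constant makes it harder for the test code and the site-side check to drift apart.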

Opt-out plugins

Google officially provides a browser add-on to opt out of Google Analytics. With this add-on installed in the browser, no Google Analytics data will be collected and used by Google. It supports IE 11, Chrome, Firefox, Safari and Opera, which unfortunately means that it won't work for PhantomJS. Additionally, Selenium WebDriver cannot start IE with custom add-ons installed[2], so this won't work for IE unless the add-on is pre-installed manually.

Chrome

The Chrome extension can be found in the Chrome Web Store and downloaded using third-party sites like Chrome Extension Downloader. Save it locally and start Chrome using the following Selenium WebDriver code.

# Environment Tested:
# Windows 10, Ruby 2.3.3p222, Selenium 3.0.5, ChromeDriver 2.27, Chrome 55.0
require 'selenium-webdriver'

profile = Selenium::WebDriver::Chrome::Profile.new
profile.add_extension("./Google-Analytics-Opt-out-Add-on-(by-Google)_v1.1.crx")

driver = Selenium::WebDriver.for :chrome, :profile => profile

Firefox

To start Firefox with this add-on installed, first download and save the latest version of the Google Analytics Opt-out Add-on, then use the following code to start Firefox.

# Environment Tested:
# Windows 10, Ruby 2.3.3p222, Selenium 3.0.5, GeckoDriver 0.13, Firefox 50.1
require 'selenium-webdriver'

profile = Selenium::WebDriver::Firefox::Profile.new
profile.add_extension("./gaoptoutaddon_0.9.8.xpi")

driver = Selenium::WebDriver.for :firefox, :profile => profile

Use a proxy

BrowserMob Proxy allows manipulating HTTP requests and responses, capturing HTTP content, and exporting performance data as a HAR file. It supports blacklisting, which can be used to block data sent to Google Analytics.

Below is a simple example of how it's done using the Ruby client for the BrowserMob Proxy.

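require 'browsermob/proxy'
require 'selenium-webdriver'

# Start the BrowserMob Proxy server from the local distribution
server = BrowserMob::Proxy::Server.new("./browsermob-proxy-2.1.4/bin/browsermob-proxy", :log => true)
server.start

# Answer every request to Google Analytics with a 404.
# Single quotes keep the backslashes, so the dots stay escaped in the regex.
proxy = server.create_proxy
proxy.blacklist('https?://www\.google-analytics\.com/.*', 404)

# Route Firefox through the proxy
profile = Selenium::WebDriver::Firefox::Profile.new
profile.proxy = proxy.selenium_proxy

driver = Selenium::WebDriver.for :firefox, :profile => profile

# Record the traffic of one page load
proxy.new_har "browsermob"
driver.get 'http://yizeng.me/'

# Print the first recorded URL and save the full HAR for inspection
har = proxy.har
puts har.entries.first.request.url
har.save_to "./browsermob.har"

# Shut down the proxy, the browser, then the proxy server
proxy.close
driver.quit
server.stop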
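
A blacklist pattern can be sanity-checked in plain Ruby before it is handed to the proxy. The sketch below assumes the proxy matches the pattern against the whole URL and that Ruby's regex engine agrees with the proxy's for a pattern this simple; `GA_BLACKLIST` and `blacklisted?` are hypothetical names used only for this check.

```ruby
# Single-quoted so the backslashes reach the regex engine unchanged.
GA_BLACKLIST = 'https?://www\.google-analytics\.com/.*'

# True when the whole URL matches the blacklist pattern.
def blacklisted?(url, pattern = GA_BLACKLIST)
  !Regexp.new("\\A(?:#{pattern})\\z").match(url).nil?
end

[
  'https://www.google-analytics.com/collect?v=1',
  'http://www.google-analytics.com/analytics.js',
  'http://yizeng.me/'
].each do |url|
  puts "#{url}: #{blacklisted?(url) ? 'blocked' : 'allowed'}"
end
```

The same check also makes it easy to confirm that requests to the site under test are left alone.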

Comparison

| Solution | Access required (GA Admin / Server / Source code / Testing code) | Pros | Cons |
|---|---|---|---|
| Exclude IP/ISP | GA Admin | Straightforward; no code changes | High maintenance costs for environments using dynamic IPs; may exclude traffic other than Selenium |
| Edit hosts | Server | No code changes | Needs permissions to set up on the testing environment; may exclude traffic other than Selenium |
| Disable JavaScript | Testing code | Straightforward; accurate | Makes Selenium WebDriver barely usable; only supported by a few browsers |
| Use special UA | Source code, Testing code | Accurate; controllable | Source code changes required; need to set a custom user agent in Selenium code; not possible for IE |
| Opt-out plugins | Testing code | Accurate; official | Need to load plugins in Selenium code; only possible for Chrome and Firefox |
| Use a proxy | Testing code | Accurate; works regardless of browser type | Needs third-party libraries; needs to be set up together with Selenium |

[1]: IEDriver developer Jim Evans' comment at github.com/seleniumhq/selenium/issues/759.

[2]: IEDriver developer Jim Evans' comment on Selenium Mailing List.