Friday, September 28, 2018

Google Analytics Filters - How to exclude CIDR IP Ranges with Regular Expressions?



In the last post, we saw how to automatically add a list of rules and channel groups to multiple Google Analytics views. In this post, let's look at Filters in Google Analytics and work through one specific case: excluding all the traffic from a given IP range (in CIDR format) using Regular Expressions.

What are Filters?


This is pretty straightforward. Filters in Google Analytics give you a way to include or exclude traffic that satisfies a set of predefined rules. Let's say you work for a big organization whose employees visit the company's website often and do a lot of activity on it. In this case, your data gets skewed: your analytics show a very low bounce rate and high page views, which do not reflect real customer behaviour from a business standpoint. So, what you need to do is exclude the traffic coming from your organization's machines. Another scenario is when you want to analyze only the visitors coming from a specific city. In both scenarios, you need to create filters and then create standalone views where these filters are applied.



How and where to create a filter?


Navigate to the Admin section of Google Analytics, find the property and the view under which you want to create the new filter, and click on Filters as shown in the picture below. In this blog post, let's focus on creating a filter that excludes all the traffic coming from a range of IP addresses when that range has been given to you in CIDR notation.

Picture: where to find the Filters section
Picture: clicking Add Filter to create a new filter
Picture: filter configuration
Picture: selecting "IP Address" in the drop-down
Picture: the place to add your filter expression

Let's say your IP range in CIDR notation is "110.40.240.16/22". You cannot enter CIDR notation in the filter expression, so you first need to convert it to a plain range with a starting IP and an ending IP. To get this, you can use any CIDR-to-IP converter. You can also read my blog post on Geekonation on writing JavaScript code to convert CIDR to IP addresses to get a better understanding. If you convert the above CIDR, you will get the starting IP as "110.40.240.0" and the ending IP as "110.40.243.255". So, you need to come up with a Regular Expression that covers all these IPs in a single shot.
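
If you would rather not rely on an online converter, the conversion can be sketched in a few lines of Python using the standard `ipaddress` module. This is only an illustration and not the JavaScript approach from the Geekonation post; the variable names are my own.

```python
import ipaddress

# strict=False allows an address with host bits set (the ".16" here);
# the module then normalises it to the enclosing /22 network.
network = ipaddress.ip_network("110.40.240.16/22", strict=False)

start_ip = str(network.network_address)    # first address in the range
end_ip = str(network.broadcast_address)    # last address in the range
total = network.num_addresses              # how many addresses the range holds

print(start_ip, end_ip, total)
```

The start and end addresses printed here are the two values you need before writing the regular expression.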

Writing the Regular Expression


  1. Try to be as generic as possible by finding all the patterns
  2. Escape all the special characters, especially the dot, which otherwise matches any character
  3. When writing a regex for IP addresses, give priority to the bigger numbers first. We will understand this in a while
  4. The filter expression has a character limit of 255. So if your regex grows beyond 255 characters, split it into multiple parts and create multiple filters
  5. Important regex constructs to remember while dealing with numbers: [0-9] matches any single digit between 0 and 9, and the pipe symbol | denotes "OR"
  6. Remember that each part of an IP address can only range from 0 to 255
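
To see why rule 2 matters, here is a small Python sketch (Python's `re` module is used only for illustration; the sample strings are made up). An unescaped dot matches any character, so the pattern accepts strings that are not the IP prefix you meant.

```python
import re

unescaped = re.compile(r"^110.40$")   # "." matches ANY single character
escaped = re.compile(r"^110\.40$")    # "\." matches only a literal dot

# The unescaped pattern wrongly accepts "110x40".
loose_match = unescaped.search("110x40") is not None
strict_match = escaped.search("110x40") is not None
literal_match = escaped.search("110.40") is not None

# re.escape can do the escaping of a fixed prefix for you.
escaped_prefix = re.escape("110.40.")

print(loose_match, strict_match, literal_match, escaped_prefix)
```
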
Now, let's start writing the expression. IP addresses have four parts delimited by dots. If you observe the starting and ending IPs, the first two parts are identical. So, our regex starts like this: "110\.40\.someregex". The third part runs from 240 to 243. So, how do you write it? Look at the pattern: the first two digits remain the same and the third digit ranges from 0 to 3. So, we can write 24[0-3], which matches 240, 241, 242 and 243. Now our regex looks like this: "110\.40\.24[0-3]\.someregex". The last part has to cover all the numbers from 0 to 255 in 110.40.240.X, 110.40.241.X, 110.40.242.X and 110.40.243.X.

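So, our range of numbers is from 0 to 255. Remember the 3rd point: write the regex for the bigger numbers first and then go to the smaller ones, in this order: 250-255, 200-249, 100-199, 10-99, 0-9. This translates to 25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9], which covers all the numbers from 0 to 255. Because the pipe applies to everything on either side of it, wrap these alternatives in parentheses so that they apply only to the last part; otherwise a branch such as [0-9] on its own would match any IP address that contains a digit. Filter patterns can also match partial strings, so it is safer to anchor the expression with ^ and $. Now let's write everything together, which comes down to

^110\.40\.24[0-3]\.(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])$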
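
Before pasting an expression into the filter, it is worth checking it against the real range. The sketch below does that in Python, assuming the last-part alternatives are grouped in parentheses and the pattern is anchored with ^ and $. The sample addresses outside the range are my own picks, and Python's regex engine may differ in small ways from the one Google Analytics uses.

```python
import ipaddress
import re

# Last part: 250-255, 200-249, 100-199, 10-99, 0-9 (bigger numbers first).
OCTET = r"(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])"
PATTERN = re.compile(r"^110\.40\.24[0-3]\." + OCTET + r"$")

# Every address in the /22 should match.
network = ipaddress.ip_network("110.40.240.0/22")
inside = [str(ip) for ip in network]
missed = [ip for ip in inside if not PATTERN.search(ip)]

# Neighbouring or malformed addresses should not match.
outside = ["110.40.239.255", "110.40.244.0", "10.40.240.1", "110.40.240.256"]
false_hits = [ip for ip in outside if PATTERN.search(ip)]

# Without the parentheses, the lone [0-9] branch matches almost anything.
UNGROUPED = re.compile(
    r"110\.40\.24[0-3]\.25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9]"
)
ungrouped_hit = UNGROUPED.search("8.8.8.8") is not None

print(len(inside), missed, false_hits, ungrouped_hit)
```

An empty `missed` list and an empty `false_hits` list mean the expression covers the range and nothing around it.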

This is the final expression you need to add to the filter. If you have another CIDR, say "14.12.72.8/17", come up with an expression in the same way. This one covers 14.12.0.0 to 14.12.127.255, so the third part has to match 0-127, which is 12[0-7]|1[01][0-9]|[1-9][0-9]|[0-9]. Put a pipe symbol between the two expressions, wrap each group of alternatives in parentheses, add the result to the filter like this, and save the filter. You are done. You should now stop seeing traffic from these IP addresses in the view you added the filter to.

^(110\.40\.24[0-3]\.(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])|14\.12\.(12[0-7]|1[01][0-9]|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9]))$
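
A combined expression is easy to get wrong, so here is a hedged Python sketch that builds one for both CIDRs, checks it against every address in the two ranges, and confirms that it fits the 255-character limit. The grouping, the anchors and the helper names are my own choices for the illustration.

```python
import ipaddress
import re

# 0-255 for the last part; 0-127 for the third part of the /17 range.
OCTET = r"(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])"
THIRD_0_127 = r"(12[0-7]|1[01][0-9]|[1-9][0-9]|[0-9])"

combined = (
    r"^(110\.40\.24[0-3]\." + OCTET
    + r"|14\.12\." + THIRD_0_127 + r"\." + OCTET + r")$"
)
pattern = re.compile(combined)

networks = [
    ipaddress.ip_network("110.40.240.16/22", strict=False),
    ipaddress.ip_network("14.12.72.8/17", strict=False),
]

# Every address in both ranges should match.
missed = [str(ip) for net in networks for ip in net if not pattern.search(str(ip))]

# Addresses just outside either range should not match.
outside = ["14.12.128.0", "14.13.0.1", "110.40.244.0", "110.40.239.255"]
false_hits = [ip for ip in outside if pattern.search(ip)]

# The filter field accepts at most 255 characters.
fits_limit = len(combined) <= 255

print(missed, false_hits, len(combined), fits_limit)
```

If `fits_limit` ever comes out false for your own ranges, that is the point where you split the expression across multiple filters.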

I hope you now have an understanding of how to write a regex for IP ranges. Let me know in the comments if you run into any difficulties. Keep reading and support!


4 comments :

  1. Excellent post. Under the latest GDPR guidelines, you may be setting the 'ipanonymize' field to True in the GTM tag, so the last 8 bits are masked before the IP is sent to GA. In those scenarios, when building a range, should we use a static value of .0 for the last part?

    One more question: if we apply an IP-based range filter, is that allowed under GDPR norms?

    Replies
    1. Hi Neha! Yes, as per GDPR, an IP address is considered PII, but only if you collect it in a dimension or something similar to show in your reports. By default, GA collects users' IPs for geolocation data. Here we are just telling GA not to show any data related to those IP ranges, so it doesn't violate GDPR. Coming to ipanonymize, it is purely optional. If you enable ipanonymize, writing the IP range becomes much simpler, as you only write one IP instead of 256 IPs. But if you want to block one particular IP, ipanonymize will do more harm, since it also eliminates the other 255 IPs that you don't want to eliminate.

    2. Looks like I was wrong. From the GDPR guidelines, it seems that GA shouldn't collect the IP address without the user's consent in the EEA. So, yes, if your organization wants to be on the safer side, you can enable IP anonymization. GA doesn't seem to have proper documentation on this case. I will get back with an article on this with more information after some research.
