Abstract: The European legislation that regulates the gathering and processing of personally identifiable information (PII) directly affects how web analytics services (which track website traffic) must handle EU citizens' data. The legislation requires the data controller (the website owner) to obtain consent from every website user before any PII is sent for processing, and/or before anything is stored on or retrieved from the user’s device. Technology companies keep developing or adopting technologies that would let them continue to identify a user without asking for permission. However, the legal opinion appears settled: website owners must either ensure that no personal data is gathered and processed, or lawfully obtain consent upfront. Both strategies are achievable with Google Analytics (GA).
There are two main pieces of legislation governing the use of third-party data processors like GA: (1) the Privacy and Electronic Communications Regulations (PECR); and (2) the General Data Protection Regulation (GDPR). The first mainly governs storing information on, or gaining access to information stored on, a user's device; the second defines what constitutes personal data and sets the conditions under which such data can be gathered and processed.
To provide reliable analytics of the behaviour of an average user, GA needs to identify each individual user. In its default setup, GA aims to obtain as much data as possible about each user, such as their IP address, browser type, operating system, and geographic location. Additionally, GA stores information on the user’s computer in the form of a file called a 'cookie'. For this to be legal under the EU GDPR, the website owner needs consent from the user, obtained upfront and given actively. That means solutions such as a pre-ticked box or an opt-out offer are insufficient. The user must actively choose to allow their data to be gathered and processed, either by selecting one of two options of the same size and visibility or by ticking the box themselves.
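The consent rule described above can be written down as a small check. This is a minimal sketch, not a legal test: `ConsentRecord`, its fields, and `may_track` are hypothetical names introduced here for illustration, and they assume the website records how the visitor answered the consent prompt.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ConsentRecord:
    """Hypothetical record of how a visitor answered the consent prompt."""
    accepted: bool         # the visitor's final answer
    user_interacted: bool  # True only if the visitor clicked or ticked something
    pre_ticked: bool       # True if the box was already ticked when it was shown


def may_track(record: Optional[ConsentRecord]) -> bool:
    """Return True only for consent that was given upfront and actively."""
    if record is None:
        # No answer yet: tracking must not start before consent is obtained.
        return False
    # A pre-ticked box or an untouched prompt does not count as active consent.
    return record.accepted and record.user_interacted and not record.pre_ticked
```

Under these assumptions, a visitor who has not answered yet, or who merely left a pre-ticked box untouched, is treated the same as a visitor who refused.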
Studies suggest that only around 9% of visitors are willing to consent to being tracked, which renders the statistics provided by GA unreliable. Several techniques have been suggested as ways around the need to obtain consent: (1) anonymizing the IP address, (2) replacing cookies with HTML5 LocalStorage, and (3) using browser fingerprinting instead of any storage at all. However, none of these techniques is compliant with the legislation.
IP anonymization was not actually intended as a workaround on its own. To eventually comply with EU regulations, Google had to offer a way to anonymize the IP address whilst still offering a high level of statistical accuracy. Why doesn't it work on its own? GA still gathers a lot of private information and stores it locally on the user's computer.
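To make concrete what IP anonymization amounts to, here is a minimal sketch. It assumes the commonly documented approach of zeroing the last octet of an IPv4 address and the last 80 bits of an IPv6 address; `anonymize_ip` is a hypothetical helper written for this article, not part of GA.

```python
import ipaddress


def anonymize_ip(address: str) -> str:
    """Zero the host part of an address (assumed scheme: /24 for IPv4, /48 for IPv6)."""
    ip = ipaddress.ip_address(address)
    # Keep the first 24 bits of IPv4 (drop the last octet) or the first
    # 48 bits of IPv6 (drop the last 80 bits).
    keep_bits = 24 if ip.version == 4 else 48
    network = ipaddress.ip_network(f"{ip}/{keep_bits}", strict=False)
    return str(network.network_address)


print(anonymize_ip("203.0.113.77"))  # 203.0.113.0
```

The truncated address no longer points to a single connection, but it is only one of the data points collected, which is why this step alone does not remove the need for consent.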
Before the introduction of HTML5, cookie files were the only way to store information during a user's visit to a website (a user session). Because the web is stateless, storing some information locally in a user's browser is the only way to accurately identify that the same browser was used to browse different pages of the same website. When the GDPR was introduced, requiring website owners to inform their users about the gathering and processing of PII, cookies were often mentioned as the biggest issue, and some developers suggested switching to the LocalStorage offered by HTML5. The problem is that regulation 6 of the PECR states that "[...] person shall not [store or] gain access to information stored, in the terminal equipment of a subscriber or user unless [GDPR rules] are met." Cookies were mentioned simply because they were the only technology at the time that allowed storing information in the "terminal equipment of a subscriber or user", but any technology doing so must comply with the regulations. There are GDPR-compliant situations in which consent is not required for using the browser's local storage (or cookies), described in an article by the UK's Information Commissioner, but using the storage for tracking does require the user’s consent.
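The point that the storage technology is irrelevant can be sketched as a decision function. This is an illustrative simplification, not legal advice: `Mechanism`, `Purpose`, and `storage_permitted` are hypothetical names, and the two purposes shown are an assumed reduction of the exemptions the regulator describes.

```python
from enum import Enum


class Mechanism(Enum):
    COOKIE = "cookie"
    LOCAL_STORAGE = "localStorage"


class Purpose(Enum):
    STRICTLY_NECESSARY = "strictly necessary"  # assumed example of an exempt purpose
    TRACKING = "tracking"


def storage_permitted(mechanism: Mechanism, purpose: Purpose, has_consent: bool) -> bool:
    """Decide whether information may be stored on the user's device."""
    # The mechanism is deliberately ignored: the rule targets any storage on
    # the terminal equipment, not cookies specifically.
    if purpose is Purpose.STRICTLY_NECESSARY:
        return True
    # Storage used for tracking always needs the user's consent.
    return has_consent
```

Swapping `Mechanism.COOKIE` for `Mechanism.LOCAL_STORAGE` never changes the result, which mirrors why moving tracking data to LocalStorage does not avoid the consent requirement.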
Supporters of browser fingerprinting suggest that replacing locally stored information with dynamic fingerprinting should be compliant with EU legislation. With fingerprinting, no information is stored locally; instead, every time a user visits the page, they are identified by the metadata they share (e.g. IP address, browser type, language settings). However, this by itself would still be illegal, as gaining access to information stored in the terminal equipment of a subscriber or user still requires active user consent according to the GDPR. More about this topic can be found here.
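How fingerprinting identifies a returning visitor without storing anything can be shown with a minimal sketch. It assumes only the three metadata fields named above; real fingerprinting scripts typically combine many more signals, and `fingerprint` is a hypothetical function introduced for illustration.

```python
import hashlib


def fingerprint(ip_address: str, user_agent: str, accept_language: str) -> str:
    """Derive a stable identifier from metadata the browser sends on each visit."""
    # Join the fields with a separator so that different field boundaries
    # cannot produce the same input string.
    material = "\n".join([ip_address, user_agent, accept_language])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


first_visit = fingerprint("203.0.113.77", "ExampleBrowser/1.0", "en-GB")
second_visit = fingerprint("203.0.113.77", "ExampleBrowser/1.0", "en-GB")
print(first_visit == second_visit)  # True
```

Nothing is written to the device, yet the same visitor yields the same identifier on every visit, so the user is still being identified from information read from their equipment.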
There are four parts to being GDPR and PECR compliant without bothering website users with a banner. The first three are well described in an article by Chris Shuptrine; the fourth is shown below and concerns GA settings.