Data protection and anonymisation procedures

More data protection through anonymisation

Anonymisation is one of the most important measures for protecting personal data, because it strips the data of any direct or indirect reference to a person. Recital 26 of the GDPR states: "The principles of data protection should [...] not apply to anonymous information, i.e. information which does not relate to an identified or identifiable natural person, or personal data which has been rendered anonymous in such a way that the data subject cannot or can no longer be identified". At the same time, the data should remain useful for analysis after anonymisation.

Anonymisation removes identifying information such as names, telephone numbers and e-mail addresses, or modifies records to make them less precise, for example by adding "noise" to the data. A distinction is made between absolutely anonymous and factually anonymous data.
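The two operations just described, removing identifying fields and adding noise, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a complete anonymisation procedure: the field names (`name`, `phone`, `email`, `age`, `city`) and the noise scale are hypothetical and chosen only for the example.

```python
import random

# Hypothetical field names, chosen for illustration only.
DIRECT_IDENTIFIERS = {"name", "phone", "email"}

def anonymise_record(record, noise_scale=2.0, rng=random):
    """Drop direct identifiers and perturb the numeric 'age' field."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if "age" in cleaned:
        # Zero-mean Gaussian noise: the exact value is no longer stored,
        # but averages over many records stay roughly the same.
        cleaned["age"] = round(cleaned["age"] + rng.gauss(0, noise_scale))
    return cleaned

record = {"name": "Jane Doe", "phone": "+49 30 1234567",
          "email": "jane@example.org", "age": 34, "city": "Berlin"}
result = anonymise_record(record, rng=random.Random(0))
print(result)  # only 'age' (perturbed) and 'city' remain
```

Whether the remaining fields still allow someone to be singled out depends on the data set, so a step like this is a starting point rather than proof of anonymity.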

Absolutely anonymous data have been coarsened and stripped of features to such an extent that identification is prevented. Data are referred to as factually anonymous if de-anonymisation cannot be completely ruled out, but the data can "only be allocated with a disproportionate expenditure of time, cost and labour", as defined in Section 16 (6) of the German Federal Statistics Act. This provision, however, covers only data used for scientific projects by scientific institutions.

Many procedures are not user-friendly

There are several anonymisation procedures. Common ones either reduce information (for example, aggregation and class formation) or alter it (for example, swapping). A problem, however, is the poor usability of many Internet anonymisation solutions. Privacy-friendly techniques will only conquer the mass market if they are enabled by default and do not significantly degrade service quality, for example through higher latency or lower bandwidth.
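The three procedures named above can be illustrated with a small Python sketch. The record layout (`age`, `region`, `income`), the ten-year class width and all function names are assumptions made for this example. Real statistical disclosure control tools apply more refined rules.

```python
import random
from collections import Counter

def age_class(age, width=10):
    """Class formation: map an exact age to a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def aggregate(records):
    """Aggregation: publish counts per (age class, region) instead of rows."""
    return Counter((age_class(r["age"]), r["region"]) for r in records)

def swap_column(records, column, rng):
    """Swapping: permute one column across records.

    The overall distribution of the column is unchanged, but a value
    no longer reliably belongs to the record it appears in.
    """
    values = [r[column] for r in records]
    rng.shuffle(values)
    return [{**r, column: v} for r, v in zip(records, values)]

records = [
    {"age": 34, "region": "North", "income": 41000},
    {"age": 37, "region": "North", "income": 52000},
    {"age": 52, "region": "South", "income": 38000},
    {"age": 58, "region": "South", "income": 61000},
]
counts = aggregate(records)
swapped = swap_column(records, "income", random.Random(1))
print(counts)
print(swapped)
```

Aggregation and class formation lose detail on purpose. Swapping keeps every value but breaks the link between a value and the person it came from.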

Data masking is a special form of anonymisation. It preserves the format of the data but changes the values to prevent the identification of persons. The data are obfuscated or replaced by fictitious data. Alternatively, data can be truncated so that they no longer reveal anything about specific persons. Static masking removes the personal reference from stored data, while dynamic masking alters data in near real time so that no personal data are stored.
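A rough Python sketch of static masking might look as follows. It assumes that "preserving the format" means keeping the length, the separators and the character classes (letter or digit) of each value. The function names and sample values are hypothetical.

```python
import random
import string

def mask_value(value, rng):
    """Replace each letter and digit with a random one of the same kind."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(rng.choice(pool))
        else:
            out.append(ch)  # keep separators so the format survives
    return "".join(out)

def truncate_postcode(postcode, keep=2):
    """Shortening: keep only the leading characters of a postcode."""
    return postcode[:keep] + "*" * (len(postcode) - keep)

rng = random.Random(0)
print(mask_value("+49 30 1234567", rng))        # same layout, different digits
print(mask_value("Jane.Doe@example.org", rng))  # still looks like an e-mail address
print(truncate_postcode("07743"))
```

Purely random replacement like this can by chance reproduce an original character, and it does not map the same input to the same output across tables. Masking tools usually address both points, which this sketch deliberately leaves out.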

Various tools are available, especially for anonymising existing databases after the fact. However, it is better to avoid collecting personal data in the first place as far as possible. In market research this is usually possible, because the subject of study is groups of consumers rather than the individual consumer. "Especially the development of anonymisation and pseudonymisation procedures as privacy-by-default solutions represent an important contribution to the protection of data privacy", said the Federal Data Protection Commissioner Andrea Voßhoff.

Anonymisation can be cracked

However, even anonymisation does not offer a one hundred percent guarantee of privacy. Researchers from Imperial College London and the Université catholique de Louvain have developed a machine learning model that estimates how easily people can be re-identified in an anonymised data set comprising postcode, gender and date of birth. The study shows how far anonymisation techniques lag behind the ability to crack them.
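The underlying risk can be made tangible with a much simpler check than the researchers' model. The following Python sketch is not their method. It only counts how many records in a sample are unique on the three attributes mentioned above. The field names and the sample rows are invented for illustration.

```python
from collections import Counter

# Hypothetical column names for the three attributes.
QUASI_IDENTIFIERS = ("postcode", "gender", "birth_date")

def unique_fraction(records, keys=QUASI_IDENTIFIERS):
    """Fraction of records whose combination of key values occurs exactly once."""
    combos = Counter(tuple(r[k] for k in keys) for r in records)
    unique = sum(1 for r in records if combos[tuple(r[k] for k in keys)] == 1)
    return unique / len(records)

records = [
    {"postcode": "07743", "gender": "f", "birth_date": "1985-03-12"},
    {"postcode": "07743", "gender": "f", "birth_date": "1985-03-12"},
    {"postcode": "07743", "gender": "m", "birth_date": "1990-07-01"},
    {"postcode": "07745", "gender": "f", "birth_date": "1985-03-12"},
]
print(unique_fraction(records))               # 0.5
print(unique_fraction(records, ("gender",)))  # 0.25
```

A record that is unique on these attributes can be linked to a named person by anyone who knows those three facts about them, even though the data set contains no name.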

Companies are therefore advised to use differential privacy. This is a mathematical framework that allows organisations to share aggregated data about user habits while protecting each individual's identity. Differential privacy aims to keep the answers to database queries as accurate as possible without revealing the records used to answer them. Attackers then cannot determine whether a particular person is contained in a database. The technique will undergo a first major test in 2020, when it will be used to secure the US census database.
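One standard building block of differential privacy is the Laplace mechanism, sketched here in Python for a simple counting query. The function names, the sample data and the choice of epsilon are assumptions for illustration, and nothing here is claimed to match what any particular organisation deploys.

```python
import random

def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def private_count(records, predicate, epsilon, rng):
    """Answer a counting query with the Laplace mechanism.

    Adding or removing one person changes a count by at most 1
    (sensitivity 1), so noise with scale 1/epsilon is used.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1 / epsilon, rng)

records = [{"city": "Berlin"}] * 40 + [{"city": "Jena"}] * 60
rng = random.Random(42)
answer = private_count(records, lambda r: r["city"] == "Jena", epsilon=1.0, rng=rng)
print(answer)  # close to 60, but not exactly 60
```

A smaller epsilon means more noise and stronger protection, while a larger epsilon gives more accurate answers. Production systems also have to track the total privacy budget across queries and guard against floating-point weaknesses in the noise sampling, neither of which this sketch does.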

Companies should use anonymisation procedures wherever possible. The best and most user-friendly solutions are those that are enabled by default. It should be noted, however, that anonymisation can be broken after the fact. Companies should account for this in their data protection concept from the outset in order to guard against it.

Caroline Schwabe