Data protection and anonymisation procedures

More data protection through anonymisation

Anonymisation is one of the most important measures for protecting personal data, because it strips the data of any direct or indirect reference to a person. Recital 26 of the GDPR states: "The principles of data protection should [...] not apply to anonymous information, i.e. information which does not relate to an identified or identifiable natural person, or personal data which has been rendered anonymous in such a way that the data subject cannot or can no longer be identified". At the same time, the data should remain useful for analysis after anonymisation.

Anonymisation removes identifying information such as names, telephone numbers and e-mail addresses, or modifies the records to make them less precise, for example by adding "noise" to the data. A distinction is made between absolutely anonymous and factually anonymous data.
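
The two basic operations mentioned here, removing identifiers and adding noise, can be illustrated with a minimal Python sketch. The field names, the sample record and the noise scale are assumptions made for this example only and are not taken from any particular tool.

```python
import random

# Hypothetical field names; adapt them to the actual schema.
DIRECT_IDENTIFIERS = {"name", "phone", "email"}

def anonymise_record(record, rng, noise_scale=2.0):
    """Drop direct identifiers and add random noise to the numeric 'age' field."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if "age" in cleaned:
        # Gaussian noise makes the value less precise; the scale is an assumption.
        cleaned["age"] += round(rng.gauss(0, noise_scale))
    return cleaned

record = {
    "name": "Erika Mustermann",
    "phone": "+49 30 0000000",
    "email": "erika@example.org",
    "age": 37,
    "city": "Berlin",
}
result = anonymise_record(record, random.Random(42))
print(result)  # only 'age' (with noise) and 'city' remain
```

Removing direct identifiers alone is often not sufficient, because remaining attributes such as the city can still narrow down who a record belongs to.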

Absolutely anonymous data have been coarsened and stripped of features to such an extent that identification is no longer possible. Data are referred to as factually anonymous if de-anonymisation cannot be completely ruled out, but the data can "only be allocated with a disproportionate expenditure of time, cost and labour", as defined in Section 16 (6) of the German Federal Statistics Act. However, this provision only covers data used by scientific institutions for scientific projects.

Many procedures are not user-friendly

There are several anonymisation procedures. Common ones either reduce information (for example, aggregation and class formation) or alter it (for example, the swapping procedure, which exchanges values between records). However, the poor usability of many anonymisation solutions for the Internet is a problem. Privacy-friendly techniques will only conquer the mass market if they are enabled by default and do not significantly degrade service quality, for example through higher latency or lower bandwidth.
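
As a rough illustration of these procedures, the following Python sketch shows class formation (exact ages become age ranges), aggregation (counting records per class) and a simple form of swapping (shuffling one attribute across records). The function names, the class width and the sample data are hypothetical.

```python
import random
from collections import Counter

def age_class(age, width=10):
    """Class formation: replace an exact age with a range such as '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"

def aggregate_by_class(records):
    """Aggregation: publish only the number of records per age class."""
    return Counter(age_class(r["age"]) for r in records)

def swap_field(records, field, rng):
    """Swapping: shuffle one attribute across all records.

    The overall distribution of the attribute is kept, but a value is no
    longer reliably tied to its original row (a shuffle may leave some
    values in place).
    """
    values = [r[field] for r in records]
    rng.shuffle(values)
    return [{**r, field: v} for r, v in zip(records, values)]

records = [
    {"age": 23, "income": 2100},
    {"age": 37, "income": 3400},
    {"age": 31, "income": 2900},
    {"age": 58, "income": 4100},
]
print(aggregate_by_class(records))
print(swap_field(records, "income", random.Random(1)))
```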

A special form of anonymisation is data masking. It preserves the format of the data but changes the values so that individuals cannot be identified. The data is obfuscated or replaced by fictitious data. Alternatively, data can be truncated so that it no longer reveals anything about specific persons. Static masking removes the personal reference from stored data, whereas dynamic masking alters data in near real time so that no personal data is stored.
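
A minimal sketch of static masking in Python could look as follows. It assumes that "preserving the format" means keeping length, punctuation and character types; the helper names `mask_value` and `truncate` and the sample values are hypothetical.

```python
import random
import string

def mask_value(value, rng):
    """Replace every digit with a random digit and every letter with a random
    letter of the same case; all other characters are kept unchanged."""
    masked = []
    for ch in value:
        if ch.isdigit():
            masked.append(rng.choice(string.digits))
        elif ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            masked.append(rng.choice(pool))
        else:
            masked.append(ch)
    return "".join(masked)

def truncate(value, keep, filler="*"):
    """Shorten a value to its first `keep` characters and pad it with a filler."""
    return value[:keep] + filler * (len(value) - keep)

rng = random.Random(0)
print(mask_value("+49 170 1234567", rng))  # same layout, fictitious digits
print(truncate("10115", keep=2))           # 10***
```

Random replacement can occasionally reproduce an original character, so a sketch like this does not by itself prove that the masked data is anonymous.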

A range of tools is available, especially for anonymising existing databases after the fact. However, it is better to avoid collecting personal data in the first place wherever possible. In market research, this is usually feasible because the object of study is groups of consumers rather than the individual consumer. "In particular, the development of anonymisation and pseudonymisation procedures as privacy-by-default solutions represents an important contribution to the protection of data privacy", said Federal Data Protection Commissioner Andrea Voßhoff.

Anonymisation can be cracked

However, even anonymisation does not guarantee one hundred percent protection of privacy. Researchers from Imperial College London and the Université catholique de Louvain have developed a machine learning model that estimates how easily individuals can be re-identified in an anonymised data set from attributes such as postcode, gender and date of birth. The study shows how far anonymisation techniques lag behind the ability to break them.
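
The underlying risk can be made tangible with a much simpler check than the researchers' model. The following Python sketch is not their method; it merely counts how many records in a data set are unique with respect to a chosen combination of attributes. The function name and the sample records are hypothetical.

```python
from collections import Counter

def unique_fraction(records, quasi_identifiers):
    """Return the share of records whose combination of the given
    attributes occurs exactly once in the data set."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    unique = sum(1 for key in keys if counts[key] == 1)
    return unique / len(records)

records = [
    {"postcode": "10115", "gender": "f", "birth_date": "1980-01-01"},
    {"postcode": "10115", "gender": "f", "birth_date": "1980-01-01"},
    {"postcode": "10115", "gender": "m", "birth_date": "1975-06-30"},
    {"postcode": "80331", "gender": "m", "birth_date": "1975-06-30"},
]
print(unique_fraction(records, ["gender"]))                            # 0.0
print(unique_fraction(records, ["postcode", "gender", "birth_date"]))  # 0.5
```

In this toy data set nobody is unique by gender alone, but half of the records become unique once postcode and date of birth are added. A record that is unique in such a combination can be matched against any other source that contains the same attributes together with a name.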

Companies are therefore advised to use differential privacy. This is a mathematical framework that allows organisations to share aggregated data about user habits while protecting the identity of individuals. Differential privacy aims to keep the answers to database queries as accurate as possible without revealing the individual records used to answer them, typically by adding calibrated random noise to the results. Attackers then cannot reliably determine whether a particular person is contained in a database. The technique will undergo a first major test in 2020, when it will be used to protect the US census data.
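
One common building block of differential privacy is the Laplace mechanism, which adds random noise to a query result. The following Python sketch applies it to a simple counting query. It assumes a privacy parameter `epsilon` chosen by the operator and a sensitivity of 1 (adding or removing one person changes a count by at most 1); the function names and sample data are hypothetical, and the code is not hardened for production use.

```python
import random

def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponential samples."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def private_count(records, predicate, epsilon, rng):
    """Answer 'how many records match?' with Laplace noise of scale 1/epsilon.

    A smaller epsilon means more noise and therefore stronger privacy,
    at the cost of a less accurate answer.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)

records = [{"age": a} for a in (23, 37, 31, 58, 44)]
rng = random.Random(7)
print(private_count(records, lambda r: r["age"] >= 35, epsilon=1.0, rng=rng))
```

Each call returns a slightly different answer around the true count of 3, so a single response does not reveal whether one specific person is in the data. Repeated queries consume the privacy budget, which a real deployment has to track.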

Companies should use anonymisation procedures wherever possible. The best and most user-friendly solutions are those that are enabled by default. It should be noted, however, that anonymisation can be broken after the fact. Companies should account for this risk in their data protection concept from the outset.

Ulrich Hottelet
