Data protection and AI: EU regulators address important questions

In a recent opinion, the European Data Protection Board has commented on key data protection issues arising from the development and use of AI models.

On 18 December 2024, the European Data Protection Board (EDPB) published an Opinion on the processing of personal data in the context of the development and deployment of AI models. The opinion is based on a request from the Irish Data Protection Commission (DPC) pursuant to Art. 64(2) GDPR, made in connection with two proceedings initiated by the DPC against Meta and X regarding the use of personal data for the development and deployment of AI models. The aim of the request was to harmonise the legal assessment of the processing of personal data in connection with AI models by data protection authorities across the EU.

The opinion provides guidance for the European data protection authorities (the EDPB refers to it as a "framework") and will shape their future approach with regard to AI models. In line with the DPC's request, the scope of the EDPB's opinion is limited to the following four questions:

  1. Under what conditions can AI models be considered anonymous?
  2. Under what conditions can the development of AI models be justified on the legal basis of legitimate interest?
  3. Under what conditions can the deployment of AI models be justified on the legal basis of legitimate interest?
  4. What consequences do data protection violations during the development phase of an AI model have for its subsequent deployment?

The most important statements of the EDPB in a nutshell

In view of the considerable complexity of the matter – the EDPB rightly points to the great variety of AI models and their rapid development – the opinion largely refrains from generalised statements on specific questions. Instead, the EDPB repeatedly states that the supervisory authorities must always consider the circumstances of each individual case when assessing processing activities in connection with AI models. The main benefit of the opinion is that the EDPB highlights some major data protection issues which can arise in connection with the development and deployment of AI models within the scope of the GDPR.

The EDPB answers the questions raised by the DPC as follows:

1. Under what conditions can AI models be considered anonymous?

According to the EDPB, AI models should only be considered anonymous if it is very unlikely that personal data can be extracted from them. In addition to this abstract standard, which is based on sentence 4 of recital 26 GDPR, the opinion also contains specific requirements enabling the national supervisory authorities to verify whether an AI model can be considered anonymous.

If a controller wishes to invoke the anonymity of its AI model, the "burden of proof" lies with the controller: it must convince the supervisory authority that no personal data can be extracted from its AI model on the basis of the tests it has carried out and the documentation it has prepared. The opinion also contains specifications regarding the documentation to be prepared by the controller and the tests to be carried out.

2. Under what conditions can the development and deployment of AI models be justified on the legal basis of legitimate interest?

The second and third questions are dealt with jointly by the EDPB. The opinion states that the development and deployment of AI models can in principle be justified on the legal basis of legitimate interest (Art. 6(1)(f) GDPR). However, certain conditions must be met. The opinion lists a significant number of factors that must be taken into account as part of the assessment pursuant to Art. 6(1)(f) GDPR – always depending on the individual case. One example is the requirement to strictly limit the scope of the training data to what is necessary (principle of data minimisation).

The question of how to source the training data is noteworthy as well: In the past, the Dutch data protection authority in particular has been very critical of the practice of scraping data from the internet for AI training purposes. The ECJ took a different stance in its decision of 4 October 2024 (C-621/22), and the EDPB is not fundamentally opposed to web scraping, either. Nevertheless, it is clear from the opinion that the controller must examine very carefully whether the data subjects reasonably expect their data to be scraped for the development of AI models. The EDPB also points out that there is a fundamental obligation to inform the data subjects and that they must be able to object to the processing of their data.

Finally, the controller must also take into account the potential for misuse of the AI model (e.g. the creation of deepfakes in the case of generative AI) when weighing up the different interests, according to the opinion. The EDPB provides examples of various measures that controllers can take to minimise the existing risks and influence the balancing of interests in their favour.

According to a separate press release from the EDPB, it is currently working on guidelines that will inter alia address the issues related to web scraping in connection with AI models in more detail. It remains to be seen whether these guidelines will provide additional clarity.

3. What consequences do data protection violations during the development of an AI model have for its subsequent deployment?

Finally, the EDPB deals with the extremely important question of what consequences a breach of data protection regulations during the development of an AI model has for its subsequent deployment.

The EDPB states that there is no risk of negative consequences if the AI model itself can be considered anonymous. Where this is not the case, the supervisory authorities should examine the appropriate legal consequences on a case-by-case basis. This means that legal violations during the development phase can, in a worst-case scenario, render a model's deployment non-compliant as well. If the operator of an AI model is different from its developer, however, the EDPB appears to take a more lenient view, provided the operator has done everything in its power to verify that the model was developed in compliance with applicable data protection regulations.

Conclusion

In conclusion, the EDPB opinion is a good first attempt to address some particularly relevant data protection issues in connection with the development and deployment of AI models. The opinion strengthens the position of data protection professionals, who must hold their ground against other departments within their organisation when weighing up the opportunities and risks of developing and deploying AI models within the scope of the GDPR. It is also clear that the processing of personal data in connection with AI – regardless of its fundamental permissibility in individual cases – entails a considerable amount of work in terms of both legal review and documentation.

On a positive note, legitimate interest pursuant to Art. 6(1)(f) GDPR can, according to the EDPB, generally be considered a sufficient legal basis. Otherwise, the development of AI models in particular would hardly be conceivable within the scope of the GDPR. The finding that data protection violations during the development of an AI model do not necessarily jeopardise its subsequent deployment in all cases also increases legal certainty.