On 27 September 2024, Hamburg Regional Court handed down a long-awaited judgment. For the first time, a German court has ruled on the question of whether reproductions of copyright-protected content that are made in connection with the training of AI models are permitted under copyright law even without the consent of the holder of the rights. The court found that they are permitted, basing its decision primarily on the provision in section 60d German Copyright Act (UrhG) (Hamburg Regional Court, judgment dated 27 September 2024 – 310 O 227/23).
The AI copyright dispute between a photographer and the LAION organisation, which has attracted a great deal of media attention, has come to a provisional end with this judgment. However, because the dispute turns on central questions of copyright law, the case is likely to continue through further instances, possibly including the German Federal Court of Justice and the Court of Justice of the European Union.
LAION downloaded a third-party photograph during a review and analysis process
The defendant organisation, LAION, made a data set that can be used to train AI systems publicly available on the internet free of charge. The data set consisted of 5.85 billion image-text pairs, each comprising a hyperlink to a publicly accessible image on the internet and a description of that image in text form.
To create this data set, LAION drew on a pre-existing data set created by third parties, which also contained image-text pairs with hyperlinks to images. LAION downloaded the images and used software to check whether the text descriptions in the existing data set matched the corresponding images. Image-text pairs for which this was not the case were removed. The hyperlinks and text descriptions of the remaining images were then added to the new data set.
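The review and analysis process described above can be pictured roughly as follows. This is only a minimal sketch to illustrate the steps (download, compare, discard or keep the link and text); it is not LAION's actual software. The names ImageTextPair, filter_pairs, fetch_image and match_score, the threshold value and the toy stand-in functions are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class ImageTextPair:
    url: str          # hyperlink to a publicly accessible image
    description: str  # text description taken from the pre-existing data set


def filter_pairs(
    pairs: Iterable[ImageTextPair],
    fetch_image: Callable[[str], bytes],         # hypothetical downloader
    match_score: Callable[[bytes, str], float],  # hypothetical image/text comparison
    threshold: float,
) -> List[ImageTextPair]:
    """Keep only the pairs whose description matches the downloaded image."""
    kept = []
    for pair in pairs:
        image = fetch_image(pair.url)  # the download is the reproduction at issue
        if match_score(image, pair.description) >= threshold:
            kept.append(pair)          # only the hyperlink and text are retained
    return kept


# Toy stand-ins so the sketch runs without network access (assumptions only).
def fake_fetch(url: str) -> bytes:
    return url.rsplit("/", 1)[-1].encode()


def fake_score(image: bytes, description: str) -> float:
    subject = image.decode().split(".")[0]
    return 1.0 if subject in description.lower() else 0.0


pairs = [
    ImageTextPair("https://example.org/cat.jpg", "A cat on a sofa"),
    ImageTextPair("https://example.org/dog.jpg", "A red car"),
]
result = filter_pairs(pairs, fake_fetch, fake_score, threshold=0.5)
```

In this sketch the image itself exists only briefly during the comparison, while the resulting collection holds nothing but hyperlinks and descriptions. That mirrors the distinction between the download and the published data set that matters for the legal assessment.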
One item in the data set that LAION created was a photograph by the photographer who brought this legal action against LAION. He had licensed the photograph to a photo agency, which had uploaded the photo to its website with a watermark.
The following note was on the photo agency's website:
"RESTRICTIONS
YOU MAY NOT:
(…)
18. Use automated programs, applets, bots or the like to access the XXX.com website or any content thereon for any purpose, including, by way of example only, downloading Content, indexing, scraping or caching any content on the website." (Note: "XXX.com" is the anonymised version of the domain name of the photo agency website)
The photographer argued that downloading his photograph as part of LAION's analysis process constituted an unauthorised reproduction pursuant to section 16 German Copyright Act (UrhG). LAION, for its part, relied on the copyright limitation provisions in sections 44b and 60d German Copyright Act (UrhG), which in its view permit the reproduction of copyrighted works in the context of AI training.
Hamburg Regional Court permits reproduction of the photographer's photo pursuant to section 60d German Copyright Act (UrhG)
The question at the heart of the judgment was whether the text and data mining limitations permit reproductions of copyrighted content in the context of AI training. Both section 44b and section 60d German Copyright Act (UrhG) permit reproductions of lawfully accessible works for the purpose of text and data mining. The term "text and data mining" is defined in section 44b (1) German Copyright Act (UrhG) and means:
"...the automated analysis of individual or multiple digital or digitised works in order to obtain information about, in particular, patterns, trends and correlations."
Hamburg Regional Court ruled that the reproduction of the photographer's photo during the creation of LAION's data set was permitted by the provision in section 60d German Copyright Act (UrhG) as the reproduction served the purpose of obtaining information on "correlations". Hamburg Regional Court held that this means the reproduction is classified as a measure for the purpose of text and data mining.
The information obtained by LAION about "correlations" consisted of whether or not the images and their text descriptions matched. In this context, Hamburg Regional Court held it to be irrelevant that the data set could ultimately also be used to train AI systems, as creating the data set was at most a process upstream of training AI. It held that having the mere intention to obtain AI-generated content in future is not an adequate criterion on the basis of which the legal admissibility of the creation of the training data set itself can be assessed.
Hamburg Regional Court further clarified that LAION was not pursuing any commercial purposes and that it was even creating the reproduction for scientific purposes – an additional requirement imposed by the provision in section 60d German Copyright Act (UrhG).
Finally, Hamburg Regional Court explained that LAION probably could not rely on the provision in section 44b German Copyright Act (UrhG) as a defence.
Similarly to section 60d German Copyright Act (UrhG), section 44b German Copyright Act (UrhG) also permits the reproduction of copyrighted content for the purpose of text and data mining, but unlike section 60d German Copyright Act (UrhG), it applies regardless of whether the reproduction was for scientific research or non-commercial purposes.
However, reproduction is not permitted if the holder of the rights has declared that they reserve the right of use pursuant to section 44b (3) German Copyright Act (UrhG), which in the case of works that are accessible online must be in "machine-readable form".
Hamburg Regional Court was inclined to categorise the note on the photo agency's website as such a reservation of use. Such a reservation can also be issued by holders of derived rights, such as a photo agency. The court indicated that the requirement for "machine readability" is also met if the reservation is written in natural language, as AI applications can now understand and interpret natural language themselves.
Hamburg Regional Court did not conclusively decide whether section 44b German Copyright Act (UrhG) was relevant as it granted permission for reproduction purely on the basis of section 60d German Copyright Act (UrhG).
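To make the debate about natural-language reservations more concrete, the following minimal sketch shows the crudest conceivable automated check: scanning the text of a website's terms for wording that hints at a reservation of use. This is an illustration under stated assumptions, not a legal test and not the approach the court had in mind, since the court referred to AI applications that interpret natural language rather than to keyword lists. The function name find_reservation_hints and the pattern list are hypothetical.

```python
import re
from typing import List

# Hypothetical hint patterns; a keyword match is not a legal assessment.
HINT_PATTERNS = {
    "automated programs": r"\bautomated programs?\b",
    "bots": r"\bbots?\b",
    "scraping": r"\bscrap(e|ing)\b",
    "caching": r"\bcach(e|ing)\b",
}


def find_reservation_hints(terms_text: str) -> List[str]:
    """Return the labels of all hint patterns found in the terms text."""
    text = terms_text.lower()
    return [label for label, pattern in HINT_PATTERNS.items()
            if re.search(pattern, text)]


# Wording taken from the anonymised note quoted in the judgment.
note = (
    "Use automated programs, applets, bots or the like to access the XXX.com "
    "website or any content thereon for any purpose, including, by way of "
    "example only, downloading Content, indexing, scraping or caching any "
    "content on the website."
)
hints = find_reservation_hints(note)
```

A check like this can at most flag a page for closer review. Whether a particular wording actually amounts to an effective reservation under section 44b (3) German Copyright Act (UrhG) remains a legal question that the judgment leaves open.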
Crucial copyright issues raised in connection with the training of AI remain unanswered
The decision by Hamburg Regional Court only concerns an early stage of AI training, namely the creation of a training data set. Moreover, the data set in this case did not contain the works themselves that would be used for training later, but instead only the works indirectly as a collection of hyperlinks. The decision therefore cannot be generalised to later stages of AI training.
In particular, there is no clear answer as to how direct use of the training data set to train AI should be assessed under copyright law and whether the resulting reproductions of protected content are permitted by the text and data mining limitations. This is because, in such cases, the copyrighted content is not just used to determine "correlations" between this content. Rather, the aim is to extract the essential characteristics of the content as the basis for determining the weights of the artificial neural network. In our opinion, this goes far beyond the purposes of text and data mining, which is aimed at simply determining meta-information. The training of the AI is also much closer to the output of the AI, which in turn can negatively affect how the copyright holder can normally exploit the works used to train the AI.
Even after this judgment, crucial copyright issues raised in connection with the training of AI have not yet been answered.
The statements by Hamburg Regional Court on the requirements for a reservation of use pursuant to section 44b (3) German Copyright Act (UrhG) should also be treated with caution. Firstly, the judgment is not based on these considerations. Secondly, in our opinion, the term "machine readability" must be interpreted very broadly in light of the underlying interests, so that a reservation of use written in natural language also meets the requirements of section 44b (3) German Copyright Act (UrhG). Legal certainty here will probably only be established by decisions of the German Federal Court of Justice or the Court of Justice of the European Union.
Using copyrighted content for AI training without the consent of the rights holders still entails risks
Against this background, there are still copyright risks when training AI models, even after the decision by Hamburg Regional Court. Although the decision will have some relevance in future legal discussions, it is worth emphasising once again that the court did not examine the question of whether the training of AI models itself also falls under the exemptions of sections 44b and 60d German Copyright Act (UrhG). The actions of the defendant examined in the proceedings concern a process that clearly predates the AI training itself and differs significantly from training in technical terms. These differences call for a legal distinction that could lead to a different assessment. The judgment by Hamburg Regional Court therefore does not give carte blanche under copyright law for AI training.
Nor has the last word been spoken on the situation that the decision did address, namely the creation of a data collection for the purpose of training AI. The legal dispute is likely to go to the next instance. Hamburg Regional Court has also already indicated that it considers the case to be suitable for referral to the Court of Justice of the European Union (CJEU).
Until then – as is already the case – reservations of use on websites should be observed when training AI, even if they are formulated in natural language.
We will continue to closely monitor the progress of these proceedings and other court decisions – including overseas – in connection with AI training.