Illustration 249271074 © Alexey Novikov | Dreamstime.com
Text-to-art has generated more drama. A collective of artists and photographers, including photographer Jingna Zhang and illustrators Sarah Andersen, Hope Larson, and Jessica Fink, has brought Google to court in the Northern District of California. The creatives have accused the tech giant of copyright infringement through its use of the Imagen AI image generator.
The crux of the issue lies in Imagen’s dataset. As the lawsuit details, the artists discovered that their work was included in the initial training data for Imagen after Google released a paper in 2022 acknowledging the use of a publicly available dataset called LAION-400M. This massive repository, named for the Large-scale Artificial Intelligence Open Network (LAION), contains 400 million image-caption pairs used to train image generators, and according to the lawsuit, these include the artists’ own copyrighted works.
The row isn’t the first of its kind for these artists. They were previously involved in a similar case against the creators of Midjourney and Stable Diffusion, also citing the use of the LAION-400M dataset. Interestingly, when the Midjourney and Stable Diffusion case progressed through the courts, encountering partial dismissal and a refiling in late 2023, Google quickly released an update to Imagen that curiously omitted any mention of LAION datasets as its training source.
The creatives allege this omission was intentional. They contend Google hoped to “avoid being named as a defendant in a lawsuit over the legality of training on mass quantities of copyrighted works without consent, credit, or compensation.” This belief, the suit suggests, is partly based on Google’s hiring of Romain Beaumont, a French AI researcher who played a key role in developing the LAION datasets.
This case highlights a growing concern: how copyright law applies in the age of AI. While other AI companies have faced similar accusations, this complaint goes a step further. The plaintiffs argue that current copyright laws haven’t adapted to this new technology, allowing companies to potentially classify the use of copyrighted materials in training datasets as “fair use.” The lawsuit aims to challenge this interpretation and establish a legal precedent that requires explicit consent, credit, or compensation for the use of protected works within AI training data.
[via Futurism and Law.com]