OpenAI Accuses New York Times Of ‘Manipulating’ ChatGPT To Plagiarize Articles
By Mikelle Leow, 09 Jan 2024
Photo 204505446 © Seemanta Dutta | Dreamstime.com
OpenAI and The New York Times are crossing swords. The artificial intelligence research giant has countered a damning lawsuit brought by the paper, dismissing its allegations as “without merit.”
The Times accused OpenAI of violating copyright law by training its GPT-4 model, which powers the popular ChatGPT, on millions of its articles spanning nearly a century, without authorization. As a result, the chatbot could not only mimic its unique journalistic voice but also generate passages word for word, the media behemoth claimed.
Evidence of OpenAI infringing NYT's copyrighted content verbatim: https://t.co/NQh5zSMOt8 pic.twitter.com/HMrAbVR9Dh
— Nicole Miller (@JOSourcing) December 27, 2023
The lawsuit is significant, as it could potentially reshape the relationship between AI technology and the news publishing industry. The Times provided substantial evidence showing instances where OpenAI and Microsoft’s AI products displayed nearly exact excerpts from its articles, beyond what is typically expected from standard search results.
In a detailed blog post, OpenAI calls out the New York Times for “not telling the full story” and says the newspaper “intentionally manipulated” prompts to make ChatGPT produce responses closely mirroring its stories.
The Microsoft-backed firm emphasizes that such instances of verbatim reproduction are rare and result from a bug in the learning process, and insists that it has implemented measures to prevent such “inadvertent memorization.”
“We also expect our users to act responsibly; intentionally manipulating our models to regurgitate is not an appropriate use of our technology and is against our terms of use,” OpenAI cautions.
The Times has argued that the supposed plagiarism also threatens its business model. Being able to reproduce exact excerpts means users can access its paywalled content for free and that its journalists don’t get paid.
OpenAI has rebutted these claims, characterizing the complaint as unfounded. The company reveals that prior discussions with the Times—with communication as recently as December 19—had been progressing towards a potential partnership focused on real-time content display with attribution in ChatGPT.
The AI company expresses surprise and disappointment at the lawsuit, stating that the Times had refused to provide examples of content regurgitation despite OpenAI’s willingness to investigate and address any such issues.
OpenAI stands by its stance that using publicly available internet materials—including articles from the Times—for AI training constitutes fair use, a doctrine that permits copyrighted works to be used in creating secondary works, provided the use is transformative.
The outcome of this legal battle is poised to be a landmark decision, potentially setting a new precedent in AI development and the use of copyrighted materials in machine learning.
[via The Guardian and Hollywood Reporter, images via various sources]