A federal judge has dismissed a lawsuit filed by news outlets Raw Story and AlterNet, which accused OpenAI of misusing their copyrighted content to train its AI language model ChatGPT.
On November 7, U.S. District Judge Colleen McMahon in New York granted OpenAI’s request to dismiss the complaint in its entirety, ruling that the plaintiffs failed to demonstrate the concrete injury required for legal standing under Article III of the U.S. Constitution.
The decision marks one of the first major legal wins for an AI company facing copyright infringement allegations from news publishers.
Newsweek contacted OpenAI and the publisher of both Raw Story and AlterNet via email for comment.
“Plaintiffs have not alleged any actual adverse effects stemming from this alleged DMCA (Digital Millennium Copyright Act) violation,” McMahon wrote in her decision. “No concrete harm, no standing.”
She added that the plaintiffs did not provide specific examples of ChatGPT reproducing their copyrighted content without attribution, making the likelihood of such an occurrence “remote.”
The Lawsuit’s Core Allegations
Raw Story Media, Inc., which owns both Raw Story and AlterNet, filed the lawsuit on February 28, 2024. The complaint alleged that OpenAI violated Section 1202(b)(1) of the DMCA by removing copyright management information (CMI)—such as authors’ names, article titles and copyright notices—from thousands of their articles when training ChatGPT.
Raw Story argued that this removal of CMI trained ChatGPT to generate responses that do not acknowledge copyrights or provide proper attribution, effectively facilitating plagiarism. The company sought statutory damages of at least $2,500 per violation and an injunction requiring OpenAI to remove its works from the training datasets.
“Raw Story’s copyright-protected journalism is the result of significant efforts of human journalists who report the news,” Raw Story publisher Roxanne Cooper said in February. “Rather than license that work, OpenAI taught ChatGPT to ignore journalists’ copyrights and hide its use of copyright-protected material.”
“It is time that news organizations fight back against Big Tech’s continued attempts to monetize other people’s work,” Raw Story CEO and founder John Byrne said at the time.
“For 20 years, Raw Story has spent millions of dollars in efforts to help Americans make important decisions about their leaders and their lives. Big Tech has decimated journalism. It’s time that publishers take a stand,” he added.
McMahon’s Ruling in Favor of OpenAI
In her ruling, McMahon agreed with OpenAI’s argument that Raw Story and AlterNet’s claim lacked standing because it failed to allege a concrete injury resulting from the alleged DMCA violation. She said that the plaintiffs did not demonstrate any actual adverse effects or a substantial risk of future harm.
“Given the quantity of information contained in the repository, the likelihood that ChatGPT would output plagiarized content from one of Plaintiffs’ articles seems remote,” wrote the judge. “Plaintiffs have nowhere alleged that the information in their articles is copyrighted, nor could they do so.”
“And while Plaintiffs provide third-party statistics indicating that an earlier version of ChatGPT generated responses containing significant amounts of plagiarized content, Plaintiffs have not plausibly alleged that there is a ‘substantial risk’ that the current version of ChatGPT will generate a response plagiarizing one of Plaintiffs’ articles,” she added.
The judge also said that the plaintiffs’ true grievance appeared to be the unlicensed use of their articles to develop ChatGPT without compensation, rather than the removal of CMI. “Let us be clear about what is really at stake here,” she said.
Despite the dismissal, Raw Story and AlterNet have the opportunity to replead their case. McMahon expressed skepticism about their ability to allege a tangible injury caused by OpenAI but said she was open to considering an amended complaint.
“We do intend to continue the case,” Byrne said. “We’re confident that we can address the court’s concerns in an amended complaint.”
Matt Topic, a partner at Loevy & Loevy, the law firm representing Raw Story Media, told Reuters, “[we’re] certain we can address the concerns the court identified through an amended complaint.”
Implications for Other AI Copyright Cases
“We build our AI models using publicly available data, in a manner protected by fair use and related principles, and supported by long-standing and widely accepted legal precedents,” an OpenAI spokesperson said in response to the judge’s decision.
The company had argued that the plaintiffs failed to offer proof that ChatGPT was trained on their material or that any harm resulted from it. OpenAI maintains that its use of publicly available data is lawful and falls under fair use provisions.
The dismissal could have broader implications for other copyright cases involving AI companies and content creators. OpenAI and other tech firms are currently facing a wave of lawsuits from authors, visual artists, music publishers and news organizations over the data used to train their generative AI systems.
Notably, The New York Times filed a lawsuit against OpenAI and its partner Microsoft in December 2023, alleging that “millions” of its articles were used without permission to train ChatGPT.
That case has since been consolidated with April 2024 lawsuits from eight Alden Global Capital-owned publications, including the New York Daily News. The publishers are currently searching OpenAI’s training data under secure conditions to find instances of their copyrighted work being used.
“We’ve spent billions of dollars gathering information and reporting news at our publications, and we can’t allow OpenAI and Microsoft to expand the Big Tech playbook of stealing our work to build their own businesses at our expense,” Frank Pine, the executive editor of Alden’s newspapers, said in a statement at the time.
Ultimately, McMahon’s ruling against Raw Story’s complaint, while skeptical of the publishers’ ability to demonstrate concrete harm, leaves open the possibility that other legal theories might better address the fundamental issue of compensation for content used in AI training. “Whether there is another statute or legal theory that does elevate this type of harm remains to be seen,” she wrote.