Published on July 25, 2025 5:01 PM GMT
A class action over pirated books exposes the 'responsible' AI company to penalties that could bankrupt it — and reshape the entire industry
This is the full text of a post first published on Obsolete, a Substack where I write about the intersection of capitalism, geopolitics, and artificial intelligence. I’m a freelance journalist and the author of a forthcoming book called Obsolete: Power, Profit, and the Race to Build Machine Superintelligence. Consider subscribing to stay up to date with my work.
Anthropic, the AI startup that’s long presented itself as the industry’s safe and ethical choice, is now facing legal penalties that could bankrupt the company. Damages resulting from its mass use of pirated books would likely exceed a billion dollars, with the statutory maximum stretching into the hundreds of billions.
Last week, William Alsup, a federal judge in San Francisco, certified a class action lawsuit against Anthropic on behalf of nearly every US book author whose works were copied to build the company’s AI models. This is the first time a US court has allowed a class action of this kind to proceed in the context of generative AI training, putting Anthropic on a path toward paying damages that could ruin the company.
The class action certification came just one day after Bloomberg reported that Anthropic is fundraising at a valuation potentially north of $100 billion — nearly double the $61.5 billion investors pegged it at in March. According to Crunchbase, the company has raised $17.2 billion in total. However, much of that funding has come in the form of Amazon and Google cloud computing credits — not real money.
Santa Clara Law professor Ed Lee warned in a blog post that the ruling means “Anthropic faces at least the potential for business-ending liability.” He separately wrote that if Anthropic loses at trial, it will need to post a bond for the full amount of damages during any appeal, unless the judge grants an exception. In most cases, this bond needs to be backed by 100 percent collateral, plus a 1-2 percent annual premium.
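To put those bond terms in perspective, here is a minimal sketch of the carrying cost, assuming a hypothetical $1.5 billion judgment (the low-end damages scenario discussed below); actual terms would vary by surety and court:

```python
# Rough cost of an appeal bond under the terms Lee describes
# (hypothetical judgment amount; actual terms vary by surety and court).
judgment = 1_500_000_000       # assumed $1.5B award, per the low-end scenario below

collateral = judgment * 1.00   # bond typically backed by ~100% collateral
premium_low = judgment * 0.01  # 1% annual premium
premium_high = judgment * 0.02 # 2% annual premium

print(f"collateral tied up: ${collateral:,.0f}")  # $1,500,000,000
print(f"annual premium: ${premium_low:,.0f} to ${premium_high:,.0f}")  # $15M to $30M
```

In other words, even a "small" loss would lock up the better part of Anthropic's cash just to preserve the right to appeal.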
Lee wrote in another post that Judge Alsup “has all but ruled that Anthropic’s downloading of pirated books is [copyright] infringement,” leaving “the real issue at trial… the jury’s calculation of statutory damages based on the number of copyrighted books/works in the class.”
On Thursday, the company filed a motion to stay — a request to essentially pause the case — in which it acknowledged that the books covered likely number “in the millions.” Anthropic’s lawyers also warned of “the specter of unprecedented and potentially business-threatening statutory damages against the smallest one of the many companies developing [large language models] with the same books data.”
The company could settle, but doing so could still cost billions given the scope of potential penalties.
Anthropic, for its part, told Obsolete it “respectfully disagrees” with the decision, arguing the court “failed to properly account for the significant challenges and inefficiencies of having to establish valid ownership millions of times over in a single lawsuit,” and said it is “exploring all avenues for review.”
The plaintiffs’ lawyers did not reply to a request for comment.
From “fair use” win to catastrophic liability
Just a month ago, Anthropic and the rest of the industry were celebrating what looked like a landmark victory. Alsup had ruled that using copyrighted books to train an AI model — so long as the books were lawfully acquired — was protected as “fair use.” This was the legal shield the AI industry had been banking on, and it would have let Anthropic, OpenAI, and others off the hook for the core act of model training.
But Alsup split a very fine hair. In the same ruling, he found that Anthropic’s wholesale downloading and storage of millions of pirated books — via infamous “pirate libraries” like LibGen and PiLiMi — was not covered by fair use at all. In other words: training on lawfully acquired books is one thing, but stockpiling a central library of stolen copies is classic copyright infringement.
Thanks to Alsup’s ruling and subsequent class certification, Anthropic is now on the hook for a class action encompassing five to seven million books — although only works with registered US copyrights are eligible for statutory damages, and the precise number remains uncertain. A significant portion of these datasets consists of non-English titles, many of which were likely never published in the US and may fall outside the reach of US copyright law. For example, an analysis of LibGen’s holdings suggests that only about two-thirds are in English.
Assume that only two-fifths of the five million books are covered and the jury awards the statutory minimum of $750 per work, and you still end up with $1.5 billion in damages. And as we saw, the company’s own lawyers just said the number is probably in the millions (though it’s worth noting they have an incentive to amplify the stakes when arguing to the judge).
And at the statutory maximum, with five million books covered? $150,000 per work, or $750 billion total — a figure Anthropic’s lawyers called “ruinous.” No jury will award that, but it gives you a sense of the range.
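To make the range concrete, here is a back-of-the-envelope sketch of that statutory damages math (illustrative only, since both the class size and the per-work award remain unknown):

```python
# Back-of-the-envelope statutory damages math (illustrative, not a legal estimate).
# Per-work statutory damages under US copyright law run from $750 to $150,000.
MIN_PER_WORK = 750
MAX_PER_WORK = 150_000

def total_damages(covered_works: int, per_work_award: int) -> int:
    """Total award = number of covered works times the per-work damages."""
    return covered_works * per_work_award

# Low-end scenario from above: 2/5 of 5 million books covered, at the minimum.
low = total_damages(5_000_000 * 2 // 5, MIN_PER_WORK)
# Ceiling scenario: all 5 million books covered, at the maximum.
high = total_damages(5_000_000, MAX_PER_WORK)

print(f"low:  ${low:,}")   # low:  $1,500,000,000   ($1.5 billion)
print(f"high: ${high:,}")  # high: $750,000,000,000 ($750 billion)
```

The jury's job, in effect, is to pick two numbers: how many works are in the class, and where in that $750-to-$150,000 band each one lands.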
The previous record for a case like this was set in 2019, when a federal jury found Cox Communications liable for $1 billion after the nation’s biggest music labels accused the company of turning a blind eye to rampant piracy by its internet customers. That verdict was overturned on appeal years later and is now under review by the Supreme Court.
But even that historic sum could soon be eclipsed if Anthropic loses at trial.
The decision to treat AI training as fair use was widely covered as a win for the industry — and, to be fair, it was. But the existential threat Anthropic now faces has received barely a mention. Outside of the legal and publishing press, only Reuters and The Verge have covered the class certification ruling, and neither discussed the fact that this case could spell the end for Anthropic.
Respecting copyright is “not doable”
The legal uncertainty now facing the company comes as the industry continues an aggressive push in Washington to reshape the rules in its favor. In comments submitted earlier this year to the White House’s “AI Action Plan,” Meta, Google, and OpenAI all urged the administration to protect AI companies’ access to vast training datasets — including copyrighted materials — by clarifying that model training is unequivocally “fair use.” Ironically, Anthropic was the only leading AI company not to mention copyright in its White House submission.
At the Wednesday launch of the AI Action Plan, President Trump dismissed the idea that AI firms should pay to use every book or article in their training data, calling strict copyright enforcement “not doable” and insisting that “China’s not doing it.” In spite of these comments, Trump’s actual plan is conspicuously silent on copyright and administration officials told the press the issue should be left to the courts.
Anthropic made some mistakes
Anthropic isn’t just unlucky to be up first. The judge described this case as the “classic” candidate for a class action: a single company downloading millions of books in bulk, all at once, using file hashes and ISBNs to identify the works. The lawyers suing Anthropic are top-tier, and the judge has signaled he won’t let technicalities slow things down. A single trial will determine how much Anthropic owes; a jury could choose any number between the statutory minimum and maximum.
Crucially, the order makes clear that every act of downloading a pirated book is its own copyright violation — even if Anthropic later bought a print copy or only used part of the book for training.
And the company’s handling of the data after the piracy isn’t winning it any sympathy.
As detailed in the court order, Anthropic didn’t just download millions of pirated books; it kept them accessible to its engineers, sometimes in multiple copies, and apparently used the trove for various internal tasks long after training. Even when pirate sites started getting taken down, Anthropic scrambled to torrent fresh copies. After a company co-founder discovered a mirror of “Z-Library,” a database shuttered by the FBI, he messaged his colleagues: “[J]ust in time.” One replied, “zlibrary my beloved.”
That made it much easier for the judge to say: this is “Napster” for the AI age, and copyright law is clear.
Anthropic is separately facing a major copyright lawsuit from the world’s biggest music publishers, who allege that the company’s chatbot Claude reproduced copyrighted lyrics without permission — a case that could expose the firm to similar per-work penalties for anywhere from thousands to millions of songs.
Ironically, Anthropic appears to have tried harder than some better-resourced competitors to avoid using copyrighted materials without any compensation. Starting in 2024, the company spent millions buying books, often in used condition — cutting them apart, scanning them in-house, and pulping the originals — to feed its chatbot Claude, a step no rival has publicly matched.
Meta, despite its far deeper pockets, skipped the buy-and-scan stage altogether — damning internal messages show engineers calling LibGen “obviously pirated” data and revealing that the approach was approved by Mark Zuckerberg.
Why the other companies should be nervous
If Anthropic settles, it could end up the only AI company forced to pay out — that is, if judges in other copyright cases follow Meta’s preferred approach and treat downloading and training as a single act that potentially qualifies as fair use.
Right now, Anthropic’s only real hope is to win on appeal by convincing a higher court to treat the downloading and the training as a single act — the act the judge already deemed fair use. (The judge in Meta’s case assessed them together that way, which gives Anthropic a shot.) But appeals usually have to wait until after a jury trial — so the company faces a brutal choice: settle for potentially billions, or risk a catastrophic damages award and years of uncertainty. If Anthropic goes to trial and loses on appeal, the resulting precedent could drag Meta, OpenAI, and possibly even Google into similar liability.
OpenAI and Microsoft now face 12 consolidated copyright suits — a mix of proposed class actions by book authors and cases brought by news organizations (including The New York Times) — in the Southern District of New York before Judge Sidney Stein.
If Stein were to certify an authors’ class and adopt an approach similar to Alsup’s ruling against Anthropic, OpenAI’s potential liability could be far greater, given the number of potentially covered works.
What’s next
A trial is set for December 1st. Unless Anthropic can pull off a legal miracle, the industry is about to get a lesson in just how expensive “move fast and break things” can be when the thing you’ve broken is copyright law — a few million times over.
A multibillion-dollar settlement or jury award would be a death knell for almost any four-year-old company, but the AI industry is different. The cost to compete is enormous, and the leading firms are already raising multibillion-dollar rounds multiple times a year.
That said, Anthropic has access to less capital than its rivals at the frontier — OpenAI, Google DeepMind, and, now, xAI. Overall, company-killing penalties may be unlikely, but they’re still possible, and Anthropic faces the greatest risk at the moment.
And some competitors seem to have functionally unlimited capital. To build out its new superintelligence team, Meta has been poaching rival AI researchers with nine-figure pay packages, and Zuckerberg recently said his company would invest “hundreds of billions of dollars” into its efforts.
To keep up with its peers, Anthropic recently decided to accept money from autocratic regimes, despite earlier misgivings. On Sunday, CEO Dario Amodei issued a memo to staff saying the firm will seek investment from Gulf states, including the UAE and Qatar. The memo, which was obtained and reported on by Kylie Robison at WIRED, admitted the decision would probably enrich “dictators” — something Amodei called a “real downside.” But, he wrote, the company can’t afford to ignore “a truly giant amount of capital in the Middle East, easily $100B or more.”
Amodei apparently acknowledged the perceived hypocrisy of the decision, which comes after his October essay/manifesto “Machines of Loving Grace” extolled the importance of democracies winning the AI race.
In the memo, Amodei wrote, “Unfortunately, I think ‘No bad person should ever benefit from our success’ is a pretty difficult principle to run a business on.”
The timing is striking: the note to staff went out only days after the class action certification suddenly presented Anthropic with potentially existential legal risk.
The question of whether generative AI training can lawfully proceed without permission from rights-holders has become a defining test for the entire industry.
OpenAI and Meta may still wriggle out of similar exposure, depending on how their judges rule and whether they can argue that the core act of AI training is protected by fair use. But for now, it’s Anthropic — not OpenAI or Meta — that’s been forced onto the front lines, while the rest of the industry holds its breath.
Edited by Sid Mahanta, with inspiration and review from my friend Vivian.
If you enjoyed this post, please subscribe to Obsolete.