Anthropic owes authors $1.5B for pirating work, but the claims process is a Kafkaesque mess
Earlier this year, the author Maureen Johnson was fighting with Anthropic. Specifically, she was wrestling with the Anthropic copyright settlement website.
Johnson is the author of 18 books, most of them YA and many of them bestsellers. The AI company Anthropic owes her an estimated $3,000 per book (to be split 50-50 with her publisher) for several of them. The payouts are part of a first-of-its-kind settlement that was handed down last fall, in which Anthropic admitted that it downloaded millions of pirated, copyrighted books to train its AI models without authors' permission. (According to the New York Times, "As part of the settlement, Anthropic said it did not use any pirated works to build A.I. technologies that were publicly released.") A judge found that the use of those books without authorial permission constituted fair use, but the piracy did not. Similar suits are pending against Meta and OpenAI. (Disclosure: Vox's Future Perfect is funded in part by the BEMC Foundation, whose major funder was also an early investor in Anthropic; they don't have any editorial input into our content.)
- Anthropic owes a class of half a million authors $1.5 billion as a legal settlement for downloading pirated books to train its AI model.
- However, Anthropic's data set was so buggy that authors had a hard time navigating the website set up to administer the claim.
- Plus, that $1.5 billion works out to a very small amount for each individual author in the class, particularly after they've split the payout with their publishers.
- The settlement will go to court for a fairness hearing on May 14.
The class-action lawsuit was intended to even the playing field between individual authors and one of the most valuable companies in the world.
To distribute the money to authors, Anthropic and the plaintiff's lawyers worked with a claims administrator (a company that specializes in managing compensation claims) to set up a website that authors can use to access a small piece of the record-breaking $1.5 billion payout.
But Johnson, like other authors who spoke to Vox, quickly hit a snag: The claims site is glitchy and unreliable, forcing people to jump through endless hoops to collect the money they're owed.
By March, she had already submitted claims for her 14 eligible titles twice, spending 90 minutes each time to painstakingly fill out the forms. Now, the claims administrator was telling her they couldn't find either of her entries. They escalated her through several layers of management, each of whom repeated the same thing. "It was getting more and more surreal, how little this system worked," Johnson said.
Eventually, Johnson connected with an employee who she said spent the entire call giggling. He told her that he had found her first claim submission from February, but not the new one. "This system is really fluky," Johnson said she told him. "It's just not well-programmed." In response, Johnson said the employee giggled again. "Coding is hard," he told her.
Johnson is not alone in her frustrating experience. Authors had six months to register their claims for Anthropic's payout, and a lot of them struggled to do so.
Anthropic regularly touts its ethical and philanthropic bona fides. (The company is here to serve humanity's long-term well-being! It's the safe and responsible AI company! Claude helped NASA's Perseverance rover travel on Mars!) But the good it is doing is based on stolen work, and the people who created that work are having trouble getting the very small recourse that they are owed.
All of the popular large language models were trained on books; that was the only way to get them enough high-quality text to start generating their own.
Most of those books were downloaded from pirate libraries, in at least one instance on the grounds that it would simply be too expensive to pay for each title. As it became increasingly clear that this was the case, the class-action lawsuits began rolling in.
Bartz et al. v. Anthropic PBC was the first to be settled. In September 2025, a judge approved a $1.5 billion settlement between Anthropic and the nearly half a million writers it had determined belonged to the class.
Things got tricky, however, when it came time to determine who those half a million writers were. They had to be authors of books that appeared in one of the three pirated databases Anthropic used in 2021. But trying to create a comprehensive list from those databases proved difficult. Anthropic hadn't created its own records as it fed pirated books into its training corpus, so lawyers on both sides had to rely on the pirate sites' own data. And they had to do it quickly, because the trial came with strict deadlines.
"It's, like, crowdsourced pirate library metadata," Dave Hansen, executive director of the advocacy group Authors Alliance, told Vox. (Authors Alliance has filed amicus briefs in the Bartz case and published extensive technical explainers for authors.) "I wouldn't rely on that for almost anything, much less administering legal claims in a large and important lawsuit. But that was kind of the best that they had given the data sources being used."
"I think everyone agrees it's not the best data, but it's the best that they could do on the time frame," publishing industry reporter Jane Friedman told Vox. "I think it was just the reality for class counsel. The judge was really expediting matters, and so they did the best they could in the time that they had."
Neither Anthropic, its lawyers, the class counsel for this case, nor the claims administrator responded to a request for comment from Vox.
But it appears that the plaintiff's lawyers and the claims administrator worked together to narrow down Anthropic's starting list of 7 million books to only titles that were under US copyright in 2022.
"Then they used a bunch of other industry sources to enrich that data so that they had more information about current publishers, and then used that to generate contact info," Hansen said. "At that scale, it's really hard to get 100 percent accuracy." He added, "One of my bigger criticisms of how this settlement and process has gone is the data. They just haven't been very transparent about it."
From there, the claims administrator and class counsel used that wonky list to build their glitchy website, which is how Maureen Johnson eventually found herself on the phone with a giggling man who told her coding was hard.
Other authors were in a similar boat. "I have 19 titles in the database," said Christopher Moore, the author of zany comedic novels like Lamb: The Gospel According to Biff, Christ's Childhood Pal. After he had done the paperwork for 18 of them, he had to walk away from his computer. When he came back the next day to finish the paperwork for the 19th book, everything had been deleted.
A month went by after he submitted the form a second time, Moore said. "And I got another notice: what about these other titles?" Most of the titles belonged to one of the four other Christopher Moores working as authors. One was actually his, Moore said, "but it showed it with some weird Texas copyright." He filed the claim anyway and is still waiting to hear back.
April Henry, who writes YA mysteries, also found unusual copyright holders on her books. "One of the books on the list appeared to be an audiobook and showed the narrator as one of the copyright holders," she said. Meanwhile, she is struggling to figure out how to handle the seven of her 22 books that she wrote with a co-author.
"No one ever had it in their contract that you're going to split the rights to a legal settlement," Henry said. "You know what I mean?"
And as authors struggle to navigate the claims process, they're doing so with mixed emotions.
Johnson is still furious about her experience with the claims administrator's website. "Your AI monster ate all of our work," she said, addressing Anthropic. "Now you're trying to pay us off with this […] piece of garbage that doesn't w…