Menlo Park, CA, February 17, 2025 – Newly unsealed court documents from the case Kadrey v. Meta shed light on internal discussions among Meta employees about the use of copyrighted materials to train the company’s artificial intelligence models.
The filings, submitted by plaintiffs who include prominent authors, indicate that Meta staffers debated how to incorporate copyrighted content (such as books and online data) obtained through legally questionable means into training sets for models in the company’s Llama family.
According to the documents, internal work chats show that some Meta employees advocated an “ask forgiveness, not for permission” approach when considering the use of copyrighted works. In one discussion, research engineer Xavier Martinet suggested acquiring e-books at retail prices as an alternative to negotiating licensing deals with publishers. He argued that direct licensing negotiations could be time-consuming and noted that many startups were likely already using pirated content for similar purposes.
Senior manager Melanie Kambadur and colleagues also discussed potential data sources, including Libgen—a website known for providing access to copyrighted works without authorization. One chat highlighted that some within the team viewed using Libgen as essential for achieving state-of-the-art model performance, despite its controversial legal status. To mitigate legal exposure, proposals were made to remove data marked as pirated and to refrain from publicly citing the use of such datasets.
The filings further reveal that Meta’s internal strategy included tuning AI models to “avoid IP risky prompts,” such as requests to reproduce extensive excerpts from copyrighted texts. Additional conversations touched on the possibility of revisiting previous decisions on training sets, with some team members arguing that Meta’s proprietary data from its social platforms was insufficient to meet the growing demands for training material.
Meta maintains that training its models on copyrighted works falls under “fair use,” a position the plaintiffs contest. The plaintiffs, who include well-known authors Sarah Silverman and Ta-Nehisi Coates, argue that Meta’s practices violate copyright law. In response, Meta has bolstered its legal team with Supreme Court litigators from the law firm Paul Weiss.
The case, pending in the U.S. District Court for the Northern District of California, continues to raise complex questions about the balance between technological innovation, intellectual property rights, and the legal frameworks governing AI training data.