Menlo Park, CA, February 17, 2025 – Newly unsealed court documents from the case Kadrey v. Meta shed light on internal discussions among Meta employees about the use of copyrighted materials to train the company’s artificial intelligence models.
The filings, submitted by plaintiffs who include prominent authors, indicate that Meta staffers debated how to incorporate copyrighted content obtained through legally questionable means, such as books and online data, into training sets for models in the company’s Llama family.
According to the documents, internal work chats show that some Meta employees advocated an “ask forgiveness, not for permission” approach to using copyrighted works. In one discussion, research engineer Xavier Martinet suggested acquiring e-books at retail prices as an alternative to negotiating licensing deals with publishers, arguing that direct negotiations could be time-consuming. He also noted that many startups were likely already using pirated content for similar purposes.
Senior manager Melanie Kambadur and colleagues also discussed potential data sources, including Libgen—a website known for providing access to copyrighted works without authorization. One chat highlighted that some within the team viewed using Libgen as essential for achieving state-of-the-art model performance, despite its controversial legal status. To mitigate legal exposure, proposals were made to remove data marked as pirated and to refrain from publicly citing the use of such datasets.
The filings further reveal that Meta’s internal strategy included tuning AI models to “avoid IP risky prompts,” such as requests to reproduce extensive excerpts from copyrighted texts. Additional conversations touched on the possibility of revisiting previous decisions on training sets, with some team members arguing that Meta’s proprietary data from its social platforms was insufficient to meet the growing demands for training material.
Meta maintains that training its models on copyrighted works falls under “fair use,” a position the plaintiffs contest. The plaintiffs, who include well-known authors Sarah Silverman and Ta-Nehisi Coates, argue that Meta’s practices violate copyright law. In response, Meta has bolstered its legal team with Supreme Court litigators from the law firm Paul Weiss.
The case, pending in the U.S. District Court for the Northern District of California, continues to raise complex questions about the balance between technological innovation, intellectual property rights, and the legal frameworks governing AI training data.