
Generative AI and Intellectual Property: Where Do We Draw the Line?

By Adam Philipp, founder of AEON Law

Plato’s allegory of the cave has taken on renewed relevance in the age of generative AI.

We are surrounded by shadows—artworks, stories, songs, and voices that resemble the real but may not have originated with any person. These outputs are often uncannily convincing. But the deeper question isn’t how realistic they seem. It’s what lies behind them—and what rights, if any, remain with the creators whose works helped train the machines.

As courts and policymakers begin to engage with the complex questions surrounding AI-generated content, the core issue is becoming clearer: creators are being systematically written out of the systems built on their labor. The legal framework, while evolving, is still far behind the technology.

It’s time to draw the line. The challenge is figuring out where.

Legal Flashpoints: Authors, Actors, and the Shadow Economy of Training Data

One of the clearest signs of legal momentum came in the recent Anthropic lawsuit. A class of authors alleged that their copyrighted books had been used—without permission—to train Anthropic’s Claude AI. A federal district court allowed the case to move forward, finding that the company had saved pirated copies of as many as seven million works. Though the court granted partial summary judgment in favor of the company on certain fair use claims, it declined to dismiss the core allegations and ordered a trial on damages.

That trial was scheduled for December 2025—until Anthropic abruptly settled, signing a binding term sheet with the plaintiffs. The company, notably, had just closed a funding round valuing it at $183 billion.

That juxtaposition is not lost on observers: a company accused of large-scale copyright infringement becoming one of the most highly valued AI firms in the world, while authors learned of their eligibility for the class action via email.

Other disputes are unfolding along similar lines. In Lehrman v. LOVO, voice actors sued over alleged misappropriation of their voices in AI-generated content. There, a federal judge allowed state law claims—including right of publicity and unfair competition—to proceed even after dismissing federal copyright claims. The ruling underscores how generative AI raises harms that don’t map neatly onto copyright law.

And the lawsuits keep coming. Major record labels have accused Anthropic of training on copyrighted lyrics. Reddit has alleged unauthorized scraping of user content. OpenAI and Meta face lawsuits from authors and media companies. Each case addresses a different facet of the same fundamental problem: the systematic use of protected works—whether books, code, music, or voices—to fuel AI systems that increasingly shape the digital landscape.

The Law Is Fragmented and Unclear

These cases are making their way through courts, but they are not resolving the underlying uncertainty.

The U.S. Copyright Office, in a recent report, acknowledged the ambiguity surrounding how copyright law applies to AI training and output. The Office confirmed that using copyrighted works to train AI systems “may implicate the reproduction right” but declined to offer firm guidance on where the line is drawn. It did, however, reiterate that outputs “that are substantially similar to existing works” could infringe.

In Europe, the legal landscape is somewhat more structured. The European Union’s AI Act imposes transparency obligations on model providers and requires them to honor the text-and-data-mining opt-out that rights holders enjoy under the EU Copyright Directive. But even there, enforcement mechanisms remain limited, and rights holders have few tools to meaningfully monitor or control how their works are used.

Complicating matters is the patchwork nature of state laws in the U.S. While copyright law is federal, state laws govern misappropriation of voice, likeness, and identity. This creates a fragmented framework where the legal outcome may depend on whether the harm sounds more like “copying” or more like “identity theft.”

Philosophical and Economic Tensions

At the heart of these disputes is a philosophical divide: is the use of copyrighted content to train AI systems transformative and innovative, or is it exploitative and parasitic?

Proponents of broad fair use argue that training an AI model is akin to a researcher reading books or listening to music in order to produce new ideas. The model doesn’t memorize or reproduce exact works (they claim), but instead learns patterns and structures in the aggregate. From this perspective, requiring licenses for all training data would stifle development and entrench the incumbents who can afford to pay for massive datasets.

Critics—particularly creators—view this as a dangerous oversimplification. AI systems are not merely “reading” in the human sense. They ingest and encode millions of works, sometimes replicating style, structure, or specific language. In some cases, outputs have been shown to reproduce copyrighted content nearly verbatim. Even when outputs are technically “original,” they often compete with the works that trained them.

The economic incentives are also clear. Training on copyrighted material is faster, cheaper, and more effective than building models from licensed or public domain content. The companies doing the training, however, frequently disclaim responsibility for how the content was obtained or used.

Toward a Coherent Framework

What is needed now is not more litigation alone, but a coherent framework that acknowledges both technological realities and the rights of creators, like the clients we advise at AEON Law.

Such a framework would likely include:

  • Disclosure of Training Data: AI developers should be required to identify, at least in general terms, the nature and source of training datasets. Transparency is a prerequisite for accountability.
  • Collective Licensing Options: Just as musicians and authors benefit from performance rights organizations and licensing collectives, similar structures could facilitate licensing for training datasets.
  • Clear Rules for Output Liability: Courts and lawmakers will need to define when AI-generated outputs cross the line into infringement. This may involve new standards that account for imitation, not just duplication.
  • Protection Beyond Copyright: Voice, likeness, and persona-based harms must be addressed by updating or enforcing existing state laws and creating consistent federal standards.
  • Differentiation of Use Cases: The law should distinguish between training for research, for commercial deployment, and for derivative exploitation. Not all uses are created equal.

Drawing the Line

The question is no longer whether generative AI will disrupt intellectual property law. That disruption is well underway. The real question is how society will adapt—and whose interests the new rules will serve.

Without clear boundaries, creators will lose control of their work, audiences will be flooded with synthetic content of uncertain origin, and innovation itself may become a game of who can train faster and ask forgiveness later.

Drawing the line is not about protecting legacy industries or stifling innovation. It’s about recognizing that creativity has value—even when it’s inconvenient to the bottom line of those building machines to replace it.

As Plato taught, shadows are not the truth. But they still come from something real.
______

About the Author

Adam Philipp is the founder of AEON Law, an intellectual property law firm in Seattle. Recognized by Chambers USA, IAM Patent 1000, and IP Stars, Adam helps clients in high-tech industries protect their creations and profit from them.
