Copyright is one of the biggest points of AI criticism. And while some users protest AI tools because they don’t feel using AI is ethically sound, affected companies are taking more drastic actions. Encyclopedia Britannica and its Merriam-Webster subsidiary recently filed a joint lawsuit against OpenAI for copyright infringement.
The publishers are accusing the AI company of training its ChatGPT models on massive amounts of scraped copyrighted data. The complaint states that OpenAI illegally scraped nearly 100,000 protected articles and dictionary definitions without ever asking for permission.
But the problem runs deeper than sheer LLM training. According to the legal filing, ChatGPT frequently produces near-verbatim copies of original encyclopedia entries when answering user prompts. The plaintiffs argue that this aggressive content generation directly cannibalizes their web traffic.
This perfectly fits the trend where AI overviews in general, not those inside ChatGPT only, are causing massive traffic drops for publishers all across the internet. The difference between this case and the case of millions of other publishers is that Encyclopedia Britannica has the financial and logistical means to pursue justice in court.
But that’s not all, as the lawsuit also accuses OpenAI of serious trademark infringement. The publishers claim that ChatGPT regularly generates completely fabricated information and falsely attributes those hallucinations directly to Britannica. These hallucinated summaries threaten to damage the long-standing reputation of the brand.
Reuters reports that an OpenAI representative responded to the new Manhattan filing by claiming their training data is publicly available and completely protected under fair use.
This new court battle is just the latest in a wave that seems like an outright battle between AI companies and legacy media. For example, the Chinese company ByteDance was recently forced to adjust its highly-capable Seedance 2.0 AI video model because it produced clips with copyrighted characters.