Since April 3, 2025, a case has been sitting on the docket of the EU Court of Justice that puts the relationship between generative AI and copyright law under sharp scrutiny: Like Company v. Google Ireland (C-250/25). The Hungarian publisher Like Company accuses Google's chatbot Gemini (formerly Bard) of displaying substantial fragments of its news articles, including pieces about the singer Kozsó, without permission or compensation when users ask for them.
From search result to chat output
Anyone reading the eighteen-page request for a preliminary ruling sees how classic copyright law collides with the workings of large language models. According to Google, Gemini is not a database: it breaks texts into tokens and does not "remember" complete articles. Like Company counters that tokenization changes nothing about the fact that the model made copies during training, and that the resulting chat output undermines the economic value of journalistic content.
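To see why the tokenization argument matters, it helps to look at what tokenization actually does: text becomes a reversible sequence of integer token IDs. Here is a minimal sketch using OpenAI's open-source tiktoken tokenizer; Gemini uses its own, different tokenizer, so this is purely illustrative:

```python
# Illustrative only: tiktoken is OpenAI's tokenizer, not Gemini's,
# but every mainstream LLM pipeline starts with a step like this.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

sentence = "Kozsó gave an exclusive interview about the concert."
token_ids = enc.encode(sentence)

print(token_ids)                          # a list of integers, not words
print(enc.decode(token_ids) == sentence)  # True: the encoding is lossless
```

The losslessness is precisely Like Company's point: because token IDs decode straight back into the original text, "we only store tokens" does not by itself rule out that copies were made during training.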
The case perfectly illustrates the tension between traditional copyright concepts and modern AI technology. Where it used to be clear when a reproduction occurred (think of photocopying an article), the question becomes far harder with AI systems. The model "reads" millions of texts, processes them into statistical patterns, and then generates new text that sometimes bears a striking resemblance to the original material.
Four questions that could reshape the playing field
The referring court, the Budapest Környéki Törvényszék, wants to know from the Court of Justice in particular:
- Is displaying longer press fragments by a chatbot a "communication to the public"?
  - This touches the core of how AI output should be qualified legally
  - An affirmative answer would mean that every chatbot response containing substantial protected content is a copyright-relevant act
- Can training an LLM on openly available web material be considered reproduction?
  - This question goes to the fundamentals of how AI models learn
  - The answer determines whether training without explicit permission remains possible at all
- If so, can such reproduction fall under the EU exception for text and data mining (Article 4 of the DSM Directive)?
  - Article 4 of the DSM Directive allows TDM, but with important limitations
  - The open question is whether commercial AI training is covered by this exception
- Does the concrete display of such a fragment in the chat interface constitute a further reproduction by the provider?
  - This concerns the ultimate responsibility of AI companies for their output
  - An affirmative answer would force providers to implement much stricter output filtering
An affirmative answer to one or more of these questions would mean that LLM developers must obtain explicit licenses or respect opt-out signals from publishers. Conversely, a negative answer would open the door further to large-scale model training on public web material.
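What such an opt-out signal can look like in practice: one emerging convention is the TDM Reservation Protocol (TDMRep) from a W3C Community Group, which lets a publisher express a machine-readable reservation under Article 4(3) DSM, for instance via a /.well-known/tdmrep.json file. A sketch, with placeholder path and policy URL:

```json
[
  {
    "location": "/articles/*",
    "tdm-reservation": 1,
    "tdm-policy": "https://publisher.example/licensing/tdm-policy"
  }
]
```

A reservation value of 1 signals "rights reserved: text and data mining requires a license", and the optional policy URL tells miners where a license can be obtained.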
The broader impact on AI development in Europe
Whichever way the ruling falls, it directly touches the transparency and due-diligence rules of the EU AI Act. That regulation requires providers of generative models, from mid-2025, to publish a "sufficiently detailed summary" of the content used for training.
If the Court later rules that training is indeed copyright reproduction, that summary will likely need to become more detailed and verifiable, so that rights holders can file claims. This could lead to:
- Mandatory license databases in which AI companies precisely track which content they use (a hypothetical record is sketched after this list)
- Automatic compensation mechanisms for publishers and authors
- Geographic restrictions on AI models that don't comply with EU copyright requirements
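No such database exists in EU law today, but the first bullet is easy to picture. Below is a hypothetical sketch, not an existing standard, of the provenance record an AI company could keep per ingested document so that a rights holder's claim can be matched to a concrete legal basis:

```python
# Hypothetical schema (not an existing standard): one provenance
# record per ingested document in a training-data license database.
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass(frozen=True)
class ProvenanceRecord:
    source_url: str             # where the text was fetched
    fetched_on: date            # when it was fetched
    legal_basis: str            # "license" | "tdm_exception_art4" | "public_domain"
    license_ref: Optional[str]  # contract/license ID when legal_basis == "license"
    opt_out_checked: bool       # was a machine-readable reservation looked for?

record = ProvenanceRecord(
    source_url="https://publisher.example/articles/kozso-interview",
    fetched_on=date(2024, 11, 2),
    legal_basis="tdm_exception_art4",
    license_ref=None,
    opt_out_checked=True,
)
```

The more detailed and verifiable the Court requires the AI Act summary to be, the closer it moves to exactly this kind of per-document bookkeeping.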
If tokenization is not seen as reproduction, the AI sector can interpret that transparency requirement more loosely, but the chatbot output itself will remain under strict scrutiny.
Practical steps before the ruling arrives
Don't wait until 2026 to act. For AI developers, companies deploying AI tools, and content publishers alike, there are concrete steps to take:
For AI developers:
- Document now which datasets you use and on what legal basis (license or TDM exception) you use them
- Implement opt-out mechanisms that publishers can use to exclude their content
- Build product functionality that prevents users from retrieving entire articles with a single prompt (a naive sketch follows below)
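The last bullet can be prototyped with something as crude as an n-gram overlap check between the model's draft answer and a corpus of protected articles. This is a deliberately naive sketch; production systems would use fingerprinting or embeddings at scale, and the window size and threshold here are arbitrary placeholders:

```python
# Naive output filter: block a chatbot reply that reproduces
# long verbatim runs from a protected article.
def ngrams(text: str, n: int = 8) -> set:
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def looks_like_reproduction(draft: str, article: str,
                            n: int = 8, threshold: float = 0.3) -> bool:
    draft_grams = ngrams(draft, n)
    if not draft_grams:
        return False
    shared = draft_grams & ngrams(article, n)
    return len(shared) / len(draft_grams) >= threshold

# Usage: screen the draft against every article a publisher has flagged.
protected_articles = ["(full text of a protected article goes here)"]
draft_answer = "(model output before it reaches the user)"
if any(looks_like_reproduction(draft_answer, a) for a in protected_articles):
    draft_answer = "I can't reproduce that article, but I can summarize it."
```

Overlap on fixed-length word windows is only a rough proxy for verbatim copying; where the legal line lies is exactly what the Court has been asked to decide.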
For companies using AI tools:
- Review contracts with external model suppliers: get it in writing which permissions they hold or which exception they rely on
- Implement internal guidelines for using AI-generated content
- Ensure transparency to customers about the use of AI in your services
For publishers and content creators:
- Consider robots.txt adjustments to keep AI crawlers out (an example follows after this list)
- Explore licensing models for AI training on your content
- Actively monitor whether your content surfaces in AI outputs
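For the robots.txt bullet: the major AI crawlers identify themselves with distinct user-agent tokens that can be refused individually, so a site can stay visible to ordinary search while opting out of training. A minimal example; check each vendor's current documentation, since the tokens evolve:

```
# robots.txt: opt out of AI training crawlers,
# keep ordinary search indexing untouched.

# OpenAI's training crawler
User-agent: GPTBot
Disallow: /

# Google's control token for use of content in Gemini training
User-agent: Google-Extended
Disallow: /

# Common Crawl, widely used as LLM training data
User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
```

Bear in mind that robots.txt only binds crawlers that choose to honor it, and whether it qualifies as a valid machine-readable reservation under Article 4(3) DSM is itself debated; hence the parallel interest in dedicated signals such as TDMRep, sketched earlier.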
A case to follow
Like Company v. Google Ireland is not just a conflict between a news publisher and a tech giant. It's the litmus test for whether Europe can combine an open, innovative AI ecosystem with robust protection of intellectual property.
The ruling, expected in 2026, will likely set the standard for how AI companies worldwide deal with copyrighted content. For Europe, this means a chance to position itself as the region that finds the balance between innovation and rights protection.
Those who invest today in transparent data pipelines and copyright-aware model architecture will be in a stronger legal and strategic position tomorrow. The question is not whether regulation is coming, but how quickly companies adapt to a new reality in which AI and copyright must go hand in hand.