Google says no to training AI on its search results

Google is suing SerpApi, a web-scraping company that provides its customers with an API that mimics human searching, the latest salvo in the battle over access to data for training and operating AI large language models.

Many of the large language models powering AI services today were trained on data scraped from websites, often without the knowledge or permission of the sites’ owners. Now, copyright holders are fighting back, suing AI companies or their suppliers, and striking licensing deals worth millions or even billions of dollars.

Google is on both sides of that fight: collecting and curating one of the world’s largest datasets, while simultaneously training its own family of LLMs, Gemini, and integrating them into its services, including search. Now other companies are seeking to access that dataset to build competing AI products, and Google sees it as a threat.

SerpApi is “circumventing security measures protecting others’ copyrighted content that appears in Google search results,” Google General Counsel Halimah DeLaine Prado wrote in a blog post announcing the lawsuit. “We did this to ask a court to stop SerpApi’s bots and their malicious scraping, which violates the choices of websites and rightsholders about who should have access to their content,” she wrote.

While Google obtains most of its search results by scraping websites itself, Prado said Google’s lawsuit specifically targets SerpApi’s access to content Google has licensed or created. “SerpApi deceptively takes content that Google licenses from others (like images that appear in Knowledge Panels, real-time data in Search features and much more), and then resells it for a fee. In doing so, it willfully disregards the rights and directives of websites and providers whose content appears in Search,” she wrote.

SerpApi denied wrongdoing, saying that it provides developers, researchers, and businesses with access to public search data, the same information anyone can access from their browser. “We believe this lawsuit is an effort to stifle competition from the innovators who rely on our services to build next-generation AI, security, browsers, productivity, and many other applications,” it said in a written statement. “As we state on our website, ‘The crawling and parsing of public data is protected by the First Amendment of the United States Constitution.’ We work closely with our attorneys to ensure our services comply with all applicable laws, including fair use principles. SerpApi stands firmly behind its business model and will vigorously defend itself in court.”

Google must be particularly concerned about the help that its competitors are receiving from SerpApi. In August, The Information reported that OpenAI and Perplexity were customers of SerpApi.

No free ride

Some see the lawsuit as an indication that the free ride for AI firms is coming to an end. “AI development is moving extremely fast precisely because the legal framework around content usage is unclear,” said Martin Jeffrey, founder of AI search optimization consultancy Harton Works. “Companies are optimizing for AI discovery now rather than waiting for permission or clarity, and maybe this is why Google is making these kinds of moves.”

Matt Hasan, CEO of AI marketing firm aiResults, concurs. “The period where AI developers could move quickly with little pushback from content providers is clearly ending. As legal and regulatory constraints tighten, we should expect a slowdown in experimentation, more cautious product development, and a shift toward defensible, licensed, or vertically integrated data strategies. That doesn’t stop AI progress, but it does reshape who can afford to participate and how fast they can move.”

Google’s action will certainly help the company with the continuing development of its own AI offering, said Jeffrey. “Google fell behind a bit with Gemini. They’re catching up now and are implementing Gemini into everything,” he said. He’s curious to see what Google does after its action against SerpApi: “If they win that, will they tackle larger firms? It looks like they’re going after the small guy first; it’s a shot across the bow.”

There are already signs that some of Gemini’s competitors are beginning to feel the impact of Google’s strides in the AI market. Earlier this month, OpenAI CEO Sam Altman declared a ‘Code Red’ alert in an attempt to maintain the company’s market-leading position against Google’s incursions into the market.

The lawsuit against SerpApi is not Google’s first attempt to limit how its AI rivals can use its data. In October it limited search queries to just 10 results per request, where previously it would provide up to 100. This action forced companies scraping its site to considerably scale up their crawling efforts to achieve the same results.