Computerworld NZ

SerpApi fights back against Google lawsuit

The web scraping wars have just intensified. In December, Google announced that it was taking action against web scraping company SerpApi, whose API lets customers’ scrapers mimic human searching, claiming that the company’s tool was “circumventing security measures” that protect its search results to feed the voracious appetite for training data required by many AI large language models, often without the knowledge or permission of the sites’ owners.

Now SerpApi is fighting back. On February 20, it filed a motion in a California court to have the action dismissed, saying that Google is attempting to “weaponize the Digital Millennium Copyright Act” to prevent others from doing what it has done: mass web scraping.

In a written statement, the company said: “Google thinks it owns the internet. That’s the subtext of its lawsuit against SerpApi, the quiet part that it’s suddenly decided to shout out loud. The problem is, no one owns the internet. And the law makes that clear.”

Law is not clear

However, despite SerpApi’s claims, the law is not very clear.

“While outright facts (like the idea that the sky is blue) are not copyrightable, courts have often found that compilations of such facts are copyrightable to a certain extent. For example, an encyclopedia or phone book might be copyrightable to a certain extent (for example, with respect to their arrangement), even if both fundamentally contain basic facts,” said intellectual property (IP) lawyer Kirk Sigmon.

“In other words, there’s an open question as to whether Google has a copyright on the search results/summaries/information it generates and provides. If the work is defined in that manner, SerpApi will face a much tougher battle,” he noted.

But some observers think that the scrap between Google and SerpApi is already quaintly old-fashioned.

“From my perspective, it feels like a lawsuit that’s outdated. Crawling and scraping has moved on from what it was a couple of years ago,” said Martin Jeffrey, founder of AI search optimization consultancy Harton Works. “Since October, we’ve seen a massive wave of China search traffic that’s being routed through Singapore in an attempt to disguise its origins. We’ve also seen instances of AppleBot massively increase. The industry has moved [on] from SerpApi.”

In addition, he noted, “these instances from China, they could have an effect on [corporate] sites that use WordPress, or sites that are badly maintained. These businesses could find that they have intellectual property or previously hidden intranet messages now being used in AI language models.”

Google is alone in mass scraping of training data for AI, he added; other AI businesses are taking a different approach to acquiring it.

“Anthropic and OpenAI used to do a lot of scraping,” he said, “but that has changed in the last year. ChatGPT still relies heavily on scraping, but is now reducing it. And we’re seeing a massive reduction in Anthropic’s use; it’s not absolutely clear what Claude is doing, but it looks like they’re not scraping whole websites, but selecting individual pages.”

Nevertheless, IP lawyer Sigmon noted that it’s not yet possible to say what’s going to happen in the court case.

“Big picture, despite the internet being around for quite some time, there’s a bit of a dearth of good case law on web scraping, especially in the manner it’s conducted today,” he said. “SerpApi’s argument might help the court begin to chew on some of those nuances, but I wouldn’t necessarily characterize it as an easy win.”
