For years, AI companies have been quietly scraping content from the internet to train their large language models (LLMs). Publishers, already grappling with declining revenues, have cried foul, demanding payment for what they see as theft of their intellectual property. Now, Perplexity AI, an AI-powered search engine, is stepping into this fray with a potential olive branch. This month, they’re set to launch an ad revenue-sharing program with web publishers, promising to compensate them for content cited in Perplexity’s results. But as the dust settles from accusations of data theft and copyright infringement, a crucial question emerges: Is Perplexity’s new initiative a lifeline for struggling news organizations, or just another way for AI companies to exploit journalistic content?
The Battle Over How AI Uses News Content
Perplexity hasn’t exactly been on publishers’ nice list lately. The company recently found itself accused of copyright infringement by multiple organizations, including Forbes and CNBC. Other publications, such as WIRED, have called out Perplexity for unauthorized data scraping and for ignoring robots.txt opt-outs. Worse yet, security experts were able to trace an IP address from a data-scraping bot back to an AWS server being used by Perplexity. Amazon announced that it is investigating.
Aravind Srinivas, the CEO of Perplexity, came to the defense of his company in an X comment to John Paczkowski, executive editor of tech & innovation at Forbes. When responding to accusations of plagiarism and bypassing paywalls, Srinivas defended Perplexity’s Pages product. “It has rough edges, and we are improving it