Perplexity is Taking Website Data Even After Being Blocked



Highlights

  • Perplexity, an AI (artificial intelligence) chat product available globally, has been accused of taking data from websites even after being blocked from doing so.
  • Cloudflare said in a blog post that Perplexity's bots crawl and take information from websites even when those websites have blocked them.
  • According to Cloudflare, Perplexity uses stealth tactics to scrape data even from websites that have explicitly blocked its declared bots, Perplexity-User and PerplexityBot.


Perplexity, an AI (artificial intelligence) chat product available globally, has been accused of taking data from websites even after being blocked from doing so. The accusation comes from Cloudflare, a global web security services company. In a blog post, Cloudflare said that Perplexity's bots crawl and take information from websites even when those websites have blocked them. According to Cloudflare, Perplexity uses stealth tactics to scrape data even from websites that have explicitly blocked its two declared bots, Perplexity-User and PerplexityBot. The researchers at Cloudflare say they confirmed this by creating new test domains.





These test domains were not indexed by any search engine and were not otherwise made publicly accessible or discoverable. The researchers also placed a robots.txt file on each domain instructing bots not to access any part of the website. When the researchers then asked Perplexity about these domains, it was still able to retrieve data from them and present it as if no bot had ever been blocked.
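To illustrate what honouring such a robots.txt file looks like, here is a minimal sketch using Python's standard `urllib.robotparser` module. The file contents, the `example.test` URL, and the `is_allowed` helper are assumptions made for illustration; the article does not reproduce the exact file Cloudflare used.

```python
from urllib.robotparser import RobotFileParser

# Assumed contents: a blanket rule asking every bot to stay away from the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical test URL; a compliant crawler would skip it.
    print(is_allowed("PerplexityBot", "https://example.test/secret-page"))
```

A crawler that respects the file runs a check like this before every request and skips the page when the answer is negative. robots.txt is only a request, though: nothing in the protocol stops a client that chooses not to ask.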


As per Cloudflare, Perplexity's bots, or user agents, took several steps to bypass the websites' directives. Even when the robots.txt file denied permission to crawl the website, Perplexity went ahead and took the information. If the website had implemented a web application firewall (WAF) rule to block the declared bots, Perplexity switched to a generic browser user agent that impersonates Google Chrome on macOS. Cloudflare also claims that this undeclared bot uses multiple IP addresses that are not listed in Perplexity's official IP range, which makes it harder for the website to recognise.
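The evasion described above can be sketched as a naive blocking rule and the request that slips past it. Everything here is hypothetical: the rule logic, the function names, the Chrome-style user-agent string, and the IP ranges (reserved documentation addresses standing in for a crawler's published range). It is not Cloudflare's detection method or Perplexity's actual configuration.

```python
import ipaddress

# Tokens a site owner might block; the names come from the declared bots.
BLOCKED_AGENT_TOKENS = ("perplexitybot", "perplexity-user")

# Placeholder for a crawler's published IP range (a reserved documentation network).
DECLARED_RANGES = [ipaddress.ip_network("192.0.2.0/24")]


def blocked_by_user_agent(user_agent: str) -> bool:
    """Block requests whose User-Agent header names a declared bot."""
    ua = user_agent.lower()
    return any(token in ua for token in BLOCKED_AGENT_TOKENS)


def from_declared_range(ip: str) -> bool:
    """Check whether the client IP falls inside the published range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in DECLARED_RANGES)


def naive_waf_blocks(user_agent: str, ip: str) -> bool:
    """A simple rule: block on a declared bot name or a declared IP."""
    return blocked_by_user_agent(user_agent) or from_declared_range(ip)


# Illustrative generic browser string of the kind a stealth crawler could send.
GENERIC_CHROME_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)

if __name__ == "__main__":
    print(naive_waf_blocks("PerplexityBot/1.0", "192.0.2.10"))   # declared bot
    print(naive_waf_blocks(GENERIC_CHROME_UA, "198.51.100.7"))  # disguised request
```

A rule keyed only on the declared name and address catches the honest crawler but not a request that presents itself as an ordinary browser from an unlisted IP, which is why detecting such traffic needs signals beyond these two fields.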

Cloudflare says it succeeded in identifying and blocking the undeclared bots from Perplexity. Once they were blocked, Perplexity's answers about these domains were not as good.


Reported By

Tanuja is a passionate technology and telecom buff who has been following the telecom industry for several years now.
