Dark Visitors
Date: 2023-01-01

ChatGPT-User
12% blocked

ChatGPT-User is dispatched by OpenAI's ChatGPT in response to user prompts. ChatGPT's answers usually summarize the website's content rather than relaying it to the user directly.

AI Assistant
cohere-ai
3% blocked

cohere-ai is an unconfirmed agent possibly dispatched by Cohere's AI chat products in response to user prompts when they need to retrieve content on the internet.

AI Assistant
anthropic-ai
6% blocked

anthropic-ai is an unconfirmed agent possibly used by Anthropic to download training data for its LLMs (Large Language Models) that power AI products like Claude.

AI Data Scraper
Bytespider
3% blocked

Bytespider is a web crawler operated by ByteDance, the Chinese owner of TikTok. It's allegedly used to download training data for its LLMs (Large Language Models), including those powering ChatGPT competitor Doubao.

AI Data Scraper
CCBot
17% blocked

CCBot is a web crawler used by Common Crawl to maintain an open source repository of web crawl data that is available for anyone to use. This repository has been used to train many LLMs (Large Language Models), including OpenAI's GPTs.

AI Data Scraper
Diffbot
0% blocked

Diffbot is an intelligent web crawler used to understand, aggregate, and ultimately sell structured website data for real-time monitoring and training other AI models.

AI Data Scraper
FacebookBot
4% blocked

FacebookBot is a web crawler used by Meta to download training data for its AI speech recognition technology.

AI Data Scraper
Google-Extended
16% blocked

Google-Extended is a web crawler used by Google to download AI training content for its AI products like the Gemini assistant and its Vertex AI generative APIs.

AI Data Scraper
GPTBot
32% blocked

GPTBot is a web crawler used by OpenAI to download training data for its LLMs (Large Language Models) that power AI products like ChatGPT.

AI Data Scraper
omgili
2% blocked

Omgili is a web crawler used by Webz.io to maintain a repository of web crawl data that it sells to other companies, including those using it to train AI models.

AI Data Scraper
Amazonbot
3% blocked

Amazonbot is a web crawler used by Amazon to index search results that allow the Alexa AI Assistant to answer user questions. Alexa's answers normally contain references to the website.

AI Search Crawler
Applebot
2% blocked

Applebot is a web crawler used by Apple to index search results that allow the Siri AI Assistant to answer user questions. Siri's answers normally contain references to the website.

AI Search Crawler
PerplexityBot
1% blocked

PerplexityBot is a web crawler used by Perplexity to index search results that allow their AI Assistant to answer user questions. The assistant's answers normally contain references to the website as inline sources.

AI Search Crawler
YouBot
1% blocked

YouBot is a web crawler used by You.com to index search results that allow their AI Assistant to answer user questions. The assistant's answers normally contain references to the website as inline sources.

AI Search Crawler
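
The agent names in this listing are the tokens a site operator would match in a robots.txt file. The listing does not say how the "blocked" percentages are measured, so the following is only a minimal sketch, assuming the goal is to disallow whole categories of agents through robots.txt. The helper names `build_robots_txt` and `is_allowed` and the `example.com` URL are hypothetical, and compliance with robots.txt is voluntary on the crawler's side.

```python
from urllib.robotparser import RobotFileParser

# Agent tokens copied from the listing above, grouped by its category labels.
AGENTS = {
    "AI Assistant": ["ChatGPT-User", "cohere-ai"],
    "AI Data Scraper": [
        "anthropic-ai", "Bytespider", "CCBot", "Diffbot",
        "FacebookBot", "Google-Extended", "GPTBot", "omgili",
    ],
    "AI Search Crawler": ["Amazonbot", "Applebot", "PerplexityBot", "YouBot"],
}


def build_robots_txt(blocked_categories):
    """Return robots.txt text that disallows every agent in the given categories.

    Hypothetical helper: one rule group per category, each ending in "Disallow: /".
    """
    lines = []
    for category in blocked_categories:
        lines.append(f"# {category}")
        for agent in AGENTS[category]:
            lines.append(f"User-agent: {agent}")
        lines.append("Disallow: /")
        lines.append("")  # a blank line ends the rule group
    return "\n".join(lines)


def is_allowed(robots_txt, agent, url="https://example.com/"):
    """Check the generated rules with the standard-library robots.txt parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    rules = build_robots_txt(["AI Data Scraper"])
    print(rules)
    for name in ("GPTBot", "ChatGPT-User", "Applebot"):
        print(name, "allowed:", is_allowed(rules, name))
```

Grouping by category keeps the policy readable: a site could block the data scrapers while still allowing assistants and search crawlers that link back to the source.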
